Geographic information systems - Data Acquisition and Management
Understand GIS data structures, acquisition methods (primary and secondary), and the impact of coordinate systems and data quality on spatial analysis.
Summary
Geospatial Data Management and Modeling
Introduction to GIS Databases
At the heart of any geographic information system (GIS) is a database that stores representations of geographic phenomena—the real-world features we care about studying, such as buildings, rivers, climate zones, or population centers. These databases must capture both the geometry (where things are located) and attributes (what those things are and their properties).
Geographic data can be organized in two main ways: as separate files (like individual shapefiles or rasters) or within a single spatially enabled relational database. Modern GIS increasingly uses the database approach because it allows for more efficient storage, querying, and management of large, complex datasets.
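As a minimal sketch of the database approach, the snippet below uses Python's built-in sqlite3 module to hold geometry and attributes in one table. Plain SQLite is not spatially enabled, so this only imitates the idea: the table name, columns, and sample coordinates are hypothetical, geometry is stored as well-known text, and bounding-box columns stand in for a real spatial index.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE features (
        id INTEGER PRIMARY KEY,
        name TEXT,   -- attribute: what the feature is called
        kind TEXT,   -- attribute: feature category
        wkt TEXT,    -- geometry as well-known text
        min_x REAL, min_y REAL, max_x REAL, max_y REAL  -- bounding box
    )
    """
)

rows = [
    ("Town Hall", "building", "POINT (10 20)", 10, 20, 10, 20),
    ("Mill Creek", "river", "LINESTRING (0 0, 50 40)", 0, 0, 50, 40),
    ("Central Park", "park",
     "POLYGON ((60 60, 80 60, 80 90, 60 90, 60 60))", 60, 60, 80, 90),
]
conn.executemany(
    "INSERT INTO features (name, kind, wkt, min_x, min_y, max_x, max_y) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)


def features_in_window(conn, x0, y0, x1, y1):
    """Return names of features whose bounding box intersects the window."""
    cur = conn.execute(
        "SELECT name FROM features "
        "WHERE max_x >= ? AND min_x <= ? AND max_y >= ? AND min_y <= ? "
        "ORDER BY id",
        (x0, x1, y0, y1),
    )
    return [name for (name,) in cur]


print(features_in_window(conn, 0, 0, 30, 30))
```

One query returns both the location test and the attribute lookup, which is the main convenience over juggling separate files. A production system would more likely rely on a spatial extension with true geometry types and indexes.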
The diagram above shows the fundamental workflow: real-world phenomena are captured and converted into raw data, which is then processed into a data model suitable for analysis and visualization. This process is the focus of this lesson.
Understanding Geographic Phenomena
To model geographic data effectively, we need to recognize that the real world contains two fundamentally different types of phenomena:
Discrete Objects are distinct, individual things with clear boundaries. Examples include houses, roads, trees, and city boundaries. When you map these features, you're identifying specific things that exist at particular locations.
Continuous Fields represent phenomena that vary across space without clear boundaries between units. Rainfall amount, temperature, population density, and elevation are all continuous fields—they exist everywhere across a region and change gradually from one location to another.
This distinction is critical because it affects how you'll choose to represent and analyze the data. Discrete objects work well in vector data structures (discussed next), while continuous fields are often better represented as rasters.
Data Structures: How Geographic Data is Stored
Once you've identified whether you're dealing with discrete objects or continuous fields, you need a way to store that information digitally. GIS uses three primary data structures:
Raster Data represents the world as a regular grid of cells (also called pixels). Each cell contains a single value—perhaps the temperature at that location, the elevation, or a land cover classification. Raster data is naturally suited for continuous phenomena because every location has a value. The resolution (cell size) determines the level of detail you capture. A raster with 1-meter cells will show more detail than one with 100-meter cells, but will require much more storage.
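The trade-off between resolution and storage can be sketched with a little arithmetic. The example below assumes a hypothetical 10 km by 10 km study area and a grid whose origin is its top-left corner; the function names are illustrative only.

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover an extent at a given resolution."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols


def world_to_cell(x, y, origin_x, origin_y, cell_size_m):
    """Map a world coordinate to (row, col); origin is the top-left corner."""
    col = int((x - origin_x) // cell_size_m)
    row = int((origin_y - y) // cell_size_m)
    return row, col


# The same hypothetical study area at two resolutions.
coarse = cell_count(10_000, 10_000, 100)  # 100 x 100 cells
fine = cell_count(10_000, 10_000, 1)      # 10,000 x 10,000 cells
print(coarse, fine, fine // coarse)
```

Shrinking the cell size by a factor of 100 multiplies the number of cells by 10,000, which is why fine-resolution rasters grow so quickly.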
Vector Data represents the world using discrete geometric shapes:
Points mark specific locations (a building, a tree, a GPS waypoint)
Lines connect points to represent linear features (roads, rivers, power lines)
Polygons are closed shapes that represent areas (a park boundary, a forest, a neighborhood)
Each geometric feature in vector data is associated with attributes—additional information stored in a database table. For example, a road line might have attributes for road name, speed limit, and surface type.
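A rough sketch of how a vector feature pairs geometry with attributes is shown below. The Feature class, its field names, and the sample road and park are hypothetical; real GIS libraries use richer geometry types, but the pairing of coordinates with an attribute record is the same idea.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Feature:
    geometry_type: str   # "Point", "LineString", or "Polygon"
    coordinates: list    # list of (x, y) tuples
    attributes: dict = field(default_factory=dict)


def line_length(feature):
    """Planar length of a LineString feature, in coordinate units."""
    pts = feature.coordinates
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))


def polygon_area(feature):
    """Planar area of a simple closed polygon via the shoelace formula."""
    pts = feature.coordinates
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


road = Feature(
    "LineString",
    [(0, 0), (3, 4), (3, 10)],
    {"name": "Elm Street", "speed_limit": 50, "surface": "asphalt"},
)
park = Feature(
    "Polygon",
    [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
    {"name": "Central Park"},
)

print(road.attributes["name"], line_length(road))
print(park.attributes["name"], polygon_area(park))
```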
Point Clouds combine three-dimensional coordinate data with color information (RGB values) to create three-dimensional colored point images. These are generated from technologies like lidar (light detection and ranging), where sensors measure distances to millions of points to create a detailed 3D representation of terrain and structures.
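To make the structure concrete, the sketch below treats a point cloud as a list of (x, y, z, r, g, b) tuples and bins it into a coarse grid, keeping the highest elevation per cell. This is a simplified illustration rather than a description of how lidar software works, and the sample points are invented.

```python
def grid_max_elevation(points, cell_size):
    """Bin (x, y, z, r, g, b) points into cells; keep the highest z per cell."""
    surface = {}
    for x, y, z, _r, _g, _b in points:
        key = (int(x // cell_size), int(y // cell_size))
        if key not in surface or z > surface[key]:
            surface[key] = z
    return surface


# Hypothetical points: coordinates in meters, colors as 0-255 RGB values.
cloud = [
    (0.2, 0.3, 101.5, 120, 130, 90),
    (0.8, 0.1, 104.0, 110, 125, 85),
    (1.4, 0.6, 99.8, 200, 200, 200),
    (1.9, 1.7, 100.2, 60, 140, 70),
]

surface = grid_max_elevation(cloud, cell_size=1.0)
print(surface)
```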
The choice between raster and vector depends on your data type, the precision you need, and the analyses you plan to perform. Vector data excels at precise representation of discrete objects and efficient spatial queries, while raster data is simpler for continuous phenomena and grid-based analysis.
Data Acquisition: Getting Data Into Your GIS
Geographic data doesn't appear in your GIS by magic—it must be captured through one of three methods:
Primary Data Capture involves direct measurement of the real world in the field. This includes GPS measurements, surveying, and remote sensing.
Secondary Data Capture converts existing geographic information (like paper maps or aerial photos) into digital form through digitization or scanning.
Data Transfer involves importing geographic data that already exists in digital form, such as datasets from government agencies or commercial providers.
Each method has different costs, accuracy characteristics, and appropriate use cases. Understanding the strengths and limitations of each will help you choose the right data sources for your project.
Primary Data Capture: Direct Field Measurement
Primary data capture methods involve going into the field and measuring geographic phenomena directly, creating new digital geographic data from scratch.
Survey Instruments and Coordinate Geometry
Traditional surveying—using instruments like theodolites and total stations to measure angles and distances—remains a precise method for capturing geographic data. Modern digital survey instruments can record measurements using coordinate geometry techniques, which directly convert angles and distances into precise coordinates. These coordinates can be entered directly into a GIS, allowing surveyors to create accurate digital maps of properties, utilities, and infrastructure.
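The conversion from angles and distances to coordinates can be sketched as a simple traverse calculation. The example assumes azimuths measured in degrees clockwise from north and a hypothetical starting point; the function name is illustrative.

```python
import math


def traverse(start, legs):
    """Compute coordinates along a traverse.

    start: (easting, northing); legs: list of (azimuth_deg, distance),
    with azimuth measured clockwise from north.
    """
    easting, northing = start
    points = [(easting, northing)]
    for azimuth_deg, distance in legs:
        az = math.radians(azimuth_deg)
        easting += distance * math.sin(az)
        northing += distance * math.cos(az)
        points.append((easting, northing))
    return points


# A hypothetical rectangular parcel walked east, north, west, then south.
legs = [(90.0, 100.0), (0.0, 50.0), (270.0, 100.0), (180.0, 50.0)]
points = traverse((1000.0, 5000.0), legs)

# Misclosure: how far the last point lands from the starting point.
misclosure = math.dist(points[0], points[-1])
print(points[1], misclosure)
```

In field data the misclosure is rarely zero, and its size gives a quick check on the quality of the measurements.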
Global Navigation Satellite System (GNSS) Positioning
The Global Positioning System (GPS) and other global navigation satellite systems allow users to determine their precise location anywhere on Earth by receiving signals from satellites. You've likely used GPS on a smartphone, but professional-grade GNSS equipment provides much higher accuracy—down to centimeters in some cases. Positions obtained from GNSS can be imported directly into a GIS. This makes GNSS incredibly valuable for tasks like mapping property boundaries, tracking vehicle movements, or recording wildlife observations.
Mobile Data Collection
Modern field work increasingly uses field computers and mobile devices (tablets and smartphones) that enable live editing of GIS data. A surveyor or field worker can view GIS data on their device, make edits or add new observations, and synchronize those changes back to a central database through wireless connections. Some systems support disconnected editing, where field workers collect data offline and synchronize later when they reconnect—essential for remote areas without consistent cellular coverage.
Remote Sensing Platforms
Remote sensing refers to collecting data about the Earth from a distance, typically using sensors on aircraft or satellites. These sensors include:
Cameras that collect visible light imagery (like aerial photography)
Digital scanners that record multiple wavelengths of light, revealing information not visible to the human eye (like vegetation health)
Lidar (Light Detection and Ranging), which uses laser pulses to measure distances and create detailed three-dimensional images of terrain and surface features
Unmanned aerial vehicles (UAVs), commonly called drones, have revolutionized remote sensing by providing high-resolution aerial imagery at lower cost than traditional aircraft or satellites. Drone imagery can be processed to create orthophotos (geometrically corrected photos that can be used as map bases), digital elevation models, and other GIS datasets.
Secondary Data Capture: Converting Existing Information
Often, the geographic information you need already exists—on paper maps, in scanned documents, or in other analog formats. Secondary data capture converts these sources into digital form suitable for GIS analysis.
Digitization Techniques
Digitization is the process of converting hard-copy maps or survey plans into digital data by tracing features on screen using specialized software. A digitizer might use computer-aided design (CAD) programs or GIS software to carefully trace roads, boundaries, or other features from a scanned map image. Geo-referencing—linking the digital data to a real-world coordinate system—ensures that the digitized features align with actual locations on Earth.
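Geo-referencing a scanned image often comes down to an affine transformation from pixel positions to map coordinates. The sketch below assumes the six parameters are already known (in practice they would be estimated from control points); the values describe a hypothetical north-up scan with 0.5-meter pixels.

```python
def pixel_to_world(col, row, a, b, c, d, e, f):
    """Six-parameter affine transform from pixel (col, row) to map (x, y).

    x = a*col + b*row + c
    y = d*col + e*row + f
    """
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical parameters: 0.5 m pixels, no rotation, and pixel (0, 0)
# located at map coordinate (500000, 4200000). The y pixel size is
# negative because image rows increase downward while northing increases up.
params = dict(a=0.5, b=0.0, c=500000.0, d=0.0, e=-0.5, f=4200000.0)

print(pixel_to_world(0, 0, **params))
print(pixel_to_world(200, 100, **params))
```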
Scanning and Raster-to-Vector Conversion
A simpler way to digitize existing maps is to scan them, creating a raster image file. The scanned raster can then be processed using software algorithms that automatically trace features and convert them into vector format. This approach is faster than manual digitization for large map-conversion projects, though the results typically require manual editing for accuracy.
Accuracy Considerations: Relative vs. Absolute
When capturing data, you must decide what level of accuracy you need, which significantly affects cost and effort:
Relative accuracy means features are geometrically correct relative to each other within the dataset, even if they don't align perfectly with real-world coordinates. This is sufficient for many applications where you care about relationships between features more than their absolute position.
Absolute accuracy means features are positioned correctly according to a real-world coordinate system. This requires more careful measurement and is essential for applications like property boundaries, construction, or surveying.
The difference matters: creating a digitized map with good relative accuracy might cost far less than creating one with high absolute accuracy, but relative accuracy is insufficient for property records.
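The two notions of accuracy can be told apart numerically. In the sketch below, a hypothetical digitized dataset is shifted by the same offset everywhere: distances between features are preserved (good relative accuracy) while every position is 15 meters from its true location (poor absolute accuracy). The coordinates and function names are invented for illustration.

```python
import math

# Hypothetical true positions, in meters.
truth = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]

# Hypothetical digitized positions: every point shifted by (12, -9) m.
digitized = [(x + 12.0, y - 9.0) for x, y in truth]


def absolute_rmse(measured, reference):
    """Root-mean-square positional error against reference coordinates."""
    squared = [math.dist(m, r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))


def max_relative_error(measured, reference):
    """Largest discrepancy in inter-point distances between two datasets."""
    worst = 0.0
    for i in range(len(reference)):
        for j in range(i + 1, len(reference)):
            d_measured = math.dist(measured[i], measured[j])
            d_reference = math.dist(reference[i], reference[j])
            worst = max(worst, abs(d_measured - d_reference))
    return worst


print(absolute_rmse(digitized, truth))
print(max_relative_error(digitized, truth))
```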
Post-Capture Editing and Topology
Once data is entered into a GIS, it typically requires editing to remove errors and ensure topological correctness. In a road network, for example, lines should connect at nodes wherever roads actually intersect; undershoots (lines that stop short of the junction) and overshoots (lines that extend past it) need to be removed. In a polygon dataset, adjacent polygon boundaries should not overlap or leave gaps. This editing process ensures that the spatial relationships in your data match the real world, making analysis meaningful.
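One simple topology check is to look for dangling line ends, meaning endpoints that touch no other line. The sketch below counts how many line ends share each coordinate and then pairs up dangling ends that sit within a snapping tolerance. The network, tolerance, and function names are hypothetical, and genuine dead ends would be flagged too, so results still need human review.

```python
import math
from collections import Counter


def dangling_endpoints(lines):
    """Return endpoints used by exactly one line end."""
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1
        counts[line[-1]] += 1
    return sorted(pt for pt, n in counts.items() if n == 1)


def snap_candidates(points, tolerance):
    """Pairs of dangling endpoints close enough to be the same node."""
    return [
        (p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
        if math.dist(p, q) <= tolerance
    ]


# A hypothetical loop of four lines whose last line stops 0.4 m short.
lines = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(10.0, 0.0), (10.0, 10.0)],
    [(10.0, 10.0), (0.0, 10.0)],
    [(0.0, 10.0), (0.0, 0.4)],
]

dangles = dangling_endpoints(lines)
print(dangles)
print(snap_candidates(dangles, tolerance=0.5))
```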
Coordinate Systems, Projections, and Datum Models
To store geographic data in a database, every location must have a coordinate. This requires agreeing on how to represent Earth's positions—the role of coordinate systems, datums, and projections.
Earth Models: Sphere, Ellipsoid, and Datum Models
The Earth is not a perfect sphere. It is more closely approximated by an ellipsoid, slightly flattened at the poles and bulging at the equator. But even an ellipsoid model isn't perfectly accurate everywhere. Datum models are mathematical representations that define a reference surface and provide a coordinate system for locating any point on Earth's surface.
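The amount of flattening can be worked out from two numbers that define a reference ellipsoid: the semi-major (equatorial) axis and the inverse flattening. The sketch below uses the published defining values for the WGS84 ellipsoid and for GRS80, the ellipsoid underlying NAD83; the function name is illustrative.

```python
def semi_minor_axis(semi_major_m, inverse_flattening):
    """Polar radius b = a * (1 - f), where f = 1 / inverse_flattening."""
    flattening = 1.0 / inverse_flattening
    return semi_major_m * (1.0 - flattening)


# (semi-major axis in meters, inverse flattening)
WGS84 = (6378137.0, 298.257223563)
GRS80 = (6378137.0, 298.257222101)

b_wgs84 = semi_minor_axis(*WGS84)
b_grs80 = semi_minor_axis(*GRS80)

# Difference between equatorial and polar radius, in meters.
bulge_m = WGS84[0] - b_wgs84
print(round(b_wgs84, 3), round(bulge_m, 1))
```

The polar radius comes out roughly 21 km shorter than the equatorial radius, and the two ellipsoids differ from each other by far less than a millimeter in this parameter.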
Different regions often use different datum models optimized for accuracy in that area:
The North American Datum of 1983 (NAD83) is the standard datum for mapping in the United States and Canada
The World Geodetic System 1984 (WGS84) is used globally and is the reference for GPS data
Geographic Coordinate Systems
When data is expressed in latitude and longitude without projection (as angles measured from the equator and prime meridian), it's described as being in a geographic coordinate system. You might see this described as "Geographic Coordinate System: North American Datum 1983" or "Geographic Coordinate System: WGS84."
Datum Transformations
A critical challenge in GIS work is that different datasets may be based on different datums. Converting coordinates from one datum to another requires a datum transformation. The most common approach is a Helmert transformation, a seven-parameter method that accounts for rotation, translation, and scale differences between datums. Sometimes, when two datums differ only by a constant offset, a simple translation (shifting all coordinates by a fixed amount) is sufficient.
Failing to account for datum differences can introduce errors of tens to hundreds of meters—enough to make analysis meaningless or dangerous in applications like construction or surveying.
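A Helmert transformation can be sketched in a few lines. The version below is the small-angle form applied to geocentric (X, Y, Z) coordinates, using the position-vector sign convention for rotations; other conventions flip the rotation signs. The parameter values and the test point are hypothetical and do not correspond to any real datum pair.

```python
def helmert(point, tx, ty, tz, rx, ry, rz, scale_ppm):
    """Seven-parameter Helmert transformation (small-angle approximation).

    point: geocentric (X, Y, Z) in meters
    tx, ty, tz: translations in meters
    rx, ry, rz: rotations in radians (position-vector convention)
    scale_ppm: scale difference in parts per million
    """
    x, y, z = point
    m = 1.0 + scale_ppm * 1e-6
    x2 = tx + m * (x - rz * y + ry * z)
    y2 = ty + m * (rz * x + y - rx * z)
    z2 = tz + m * (-ry * x + rx * y + z)
    return x2, y2, z2


# Hypothetical geocentric point, in meters.
p = (4_000_000.0, 3_000_000.0, 4_000_000.0)

# With zero rotation and scale, the transformation is a simple translation.
shifted = helmert(p, 100.0, -50.0, 25.0, 0.0, 0.0, 0.0, 0.0)

# A rotation of one microradian about the Z axis moves the point by meters.
rotated = helmert(p, 0.0, 0.0, 0.0, 0.0, 0.0, 1e-6, 0.0)
print(shifted, rotated)
```

Setting the rotations and scale to zero shows why a plain translation is the special case of the same formula.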
Data Quality: The Foundation of Reliable Analysis
A critical principle in GIS: geographic data should be sufficiently close to reality to produce results that correspond to real-world processes. No geographic dataset is perfectly accurate, but it must be accurate enough for your intended use.
Sources of Positional Accuracy Variation
Not all position data is equally accurate. Consider GPS:
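A smartphone GPS is typically accurate to about 5-10 meters under open sky, largely because of its small antenna and consumer-grade receiver operating without correction data
High-end professional GNSS equipment can achieve centimeter-level accuracy using techniques such as differential correction and real-time kinematic (RTK) positioning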
Similarly, digital terrain models (representations of Earth's elevation) and aerial imagery are available at many levels of spatial precision. A freely available worldwide elevation dataset might have 30-meter resolution, while specialized lidar data from an airborne survey might have 1-meter resolution. Your choice of data source directly affects the quality of your results.
Understanding Error Propagation
Here's a crucial insight: all geographic data contain inherent inaccuracies, and these errors propagate through GIS operations in ways that are difficult to predict. If you measure building locations with ±5-meter accuracy and then identify areas within 100 meters of those buildings, the buffer boundary inherits that positional uncertainty, and any result derived from it (such as which features fall inside) carries an error that cannot simply be read off as ±5 meters.
The challenge is that different types of operations propagate error differently. A buffer operation (creating a zone around features) might increase error in one way, while an overlay operation (combining multiple layers) might increase it differently. This means you cannot always predict how accurate your final results will be, even if you know your input data accuracy.
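One way to get a feel for error propagation is to simulate it. The sketch below is a Monte Carlo experiment under stated assumptions: the building's position error is treated as Gaussian with a 5-meter standard deviation on each axis (one possible reading of "±5 meters"), and a site is tested against a 100-meter buffer many times. The function name, distances, and trial count are hypothetical.

```python
import math
import random


def buffer_hit_rate(true_distance, buffer_radius, sigma, trials=10_000, seed=0):
    """Share of trials in which a site falls inside a noisy building's buffer."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Perturb the building position around its true location (0, 0);
        # the site sits on the x-axis at true_distance from the building.
        bx = rng.gauss(0.0, sigma)
        by = rng.gauss(0.0, sigma)
        if math.hypot(true_distance - bx, by) <= buffer_radius:
            hits += 1
    return hits / trials


near_edge = buffer_hit_rate(98.0, 100.0, sigma=5.0)   # truly inside by 2 m
well_inside = buffer_hit_rate(50.0, 100.0, sigma=5.0)
well_outside = buffer_hit_rate(150.0, 100.0, sigma=5.0)
print(near_edge, well_inside, well_outside)
```

Sites far from the buffer edge are classified the same way in every trial, while a site near the edge flips between inside and outside, so the reliability of the result depends on the geometry of the question as well as on the input accuracy.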
Understanding these limitations is critical for responsible GIS work. You must consider data quality when interpreting results and communicating findings to others.
Summary
Geospatial data management and modeling form the foundation of GIS work. Databases store both geometric and attribute information about geographic phenomena—either discrete objects or continuous fields. Data arrives through primary capture (direct measurement), secondary capture (digitizing existing sources), or data transfer. Once in your system, data must be registered to a coordinate system and datum, edited for quality, and carefully assessed for accuracy before use in analysis. Understanding these fundamentals ensures that your GIS work produces reliable, meaningful results.
Flashcards
What two primary components of geographic phenomena are modeled and stored within a geographic database?
Geometry and attributes
In what two ways can geographic data be stored?
As separate files
Within a single spatially enabled relational database
What are the two most common types of geographic phenomena?
Discrete objects (e.g., houses, roads)
Continuous fields (e.g., rainfall, population density)
How do raster images store geographic data?
As a grid of cells
Which geometric feature types does vector data use, each linked to attribute information?
Points, lines, and polygons
What are the three main groups of spatial data acquisition methods?
Primary data capture (direct field measurement)
Secondary data capture (digitizing existing sources)
Data transfer (copying external GIS data)
What technique is used to enter survey data directly into a GIS from digital instruments?
Coordinate geometry
What are the two ways field computers can edit GIS data during mobile data collection?
Wireless connections or disconnected editing sessions
What is the purpose of digitization in a GIS context?
Transferring hard-copy maps or survey plans into a digital medium
What type of data is initially produced by scanning a map before it is converted to vector data?
Raster data
What are the two types of accuracy that influence the cost and interpretability of data capture?
Relative accuracy (consistency within the dataset)
Absolute accuracy (alignment to a real-world coordinate system)
What is the primary goal of post-capture editing regarding the relationship between features?
Ensuring topological correctness
What are three common ways the Earth is represented to provide coordinates?
Sphere
Ellipsoid
Datum models
Which datum model is specifically used for United States measurements?
North American Datum of 1983 (NAD 83)
What is the term for a coordinate system where data is expressed in latitude and longitude without projection?
Geographic coordinate system
How does high-end GNSS equipment generally compare to typical smartphone GPS?
It provides greater positional accuracy
What phenomenon occurs when inherent inaccuracies in data move through GIS operations?
Error propagation
Quiz
Geographic information systems - Data Acquisition and Management Quiz Question 1: What process transfers hard‑copy maps or survey plans into a digital GIS format using CAD programs and geo‑referencing?
- Digitization of maps or survey plans (correct)
- Photogrammetric stereo imaging of aerial photographs
- Scanning to produce raster data only
- Direct GPS data collection in the field
Quiz Question 2: Which Earth representation provides coordinates for any location and may be a sphere, ellipsoid, or more complex model?
- A datum model (correct)
- A geographic coordinate system
- A map projection
- A topographic layer
Quiz Question 3: According to GIS data‑quality principles, what condition must data meet to be useful for analysis?
- It must be sufficiently close to reality (correct)
- It must be completely error‑free
- It must be constantly updated in real time
- It must be stored only as raster layers
Quiz Question 4: Which pair correctly identifies the two primary types of geographic phenomena used in GIS?
- Discrete objects and continuous fields (correct)
- Raster images and vector layers
- Elevation models and satellite imagery
- Points and polygons only
Quiz Question 5: What is the main distinction between relative accuracy and absolute accuracy in GIS data capture?
- Relative accuracy is consistency within the dataset; absolute accuracy aligns data to real‑world coordinates (correct)
- Relative accuracy measures color fidelity; absolute accuracy measures file size
- Relative accuracy refers to temporal precision; absolute accuracy refers to attribute completeness
- Relative accuracy is about projection type; absolute accuracy is about datum selection
Quiz Question 6: When a paper map is scanned to be used in a GIS, what is the primary data format produced?
- Raster image (correct)
- Vector shapefile
- Attribute table
- Digital elevation model
Quiz Question 7: What type of data combines three‑dimensional points with RGB information to produce three‑dimensional color images?
- Point clouds (correct)
- Raster images
- Vector polygons
- Digital elevation models
Quiz Question 8: When data are expressed in latitude and longitude without projection, which type of coordinate system is being used?
- Geographic coordinate system (correct)
- Projected coordinate system
- Local Cartesian coordinate system
- Universal Transverse Mercator (UTM) system
Quiz Question 9: Which of the following sensors can be mounted on aircraft or satellites to collect remotely sensed data for a geographic information system?
- Lidar (correct)
- Soil moisture probe
- Seismic sensor
- Ground‑based weather station
Quiz Question 10: What term describes the way inherent inaccuracies in geographic data affect subsequent GIS analyses?
- Error propagation (correct)
- Data redundancy
- Spatial indexing
- Attribute normalization
Quiz Question 11: What advantage does using coordinate geometry (COGO) on digital survey instruments provide for GIS data entry?
- It allows direct capture of precise spatial coordinates into the GIS (correct)
- It automatically classifies land‑cover types from imagery
- It generates three‑dimensional terrain models without field measurements
- It streams real‑time satellite weather data into the GIS
Quiz Question 12: If two geographic datums differ only by a constant offset, which conversion method is usually sufficient?
- Simple translation (correct)
- Helmert transformation
- Map projection change
- Raster resampling
Quiz Question 13: Which category of spatial data acquisition involves converting existing maps or documents into digital GIS layers?
- Secondary data capture (correct)
- Primary data capture
- Data transfer
- Real‑time sensor acquisition
Quiz Question 14: During post‑capture editing, which operation is performed to eliminate dangling line ends?
- Connecting line endpoints at network nodes (correct)
- Converting raster imagery to vector format
- Applying color correction to aerial photos
- Compressing the dataset for storage efficiency
Quiz Question 15: Which data storage option in a GIS enables efficient management of large datasets and supports complex spatial queries?
- Spatially enabled relational database (correct)
- Separate individual files for each layer
- Plain text spreadsheets
- Image raster files without attribute tables
Quiz Question 16: What feature of field computers and mobile devices allows GIS data to be edited when a network connection is unavailable?
- Disconnected editing sessions (correct)
- Real‑time satellite imagery streaming
- Automatic attribute generation
- Built‑in coordinate transformation
Quiz Question 17: What does the acronym GNSS stand for, whose positions can be imported into a GIS?
- Global Navigation Satellite System (correct)
- Geographic Network Survey System
- Geospatial Numerical Sampling Service
- Generalized Navigation Signal Standard
Quiz Question 18: What term describes how close a GNSS‑measured position is to its true ground location?
- Positional accuracy (correct)
- Relative precision
- Temporal resolution
- Semantic consistency
Key Concepts
Geospatial Data Types
Raster data
Vector data
Geographic data quality
Geographic Information Systems
Geographic information system (GIS)
Data digitization
Map projection
Spatial Reference Systems
Global Navigation Satellite System (GNSS)
Coordinate reference system
Geodetic datum
Remote sensing
Definitions
Geographic information system (GIS)
A computer-based system for capturing, storing, analyzing, managing, and visualizing spatial or geographic data.
Raster data
A grid-based data structure where each cell (pixel) holds a value representing information such as imagery or continuous phenomena.
Vector data
A spatial data model that represents geographic features as points, lines, and polygons, each linked to attribute information.
Remote sensing
The acquisition of information about an object or area from a distance, typically using aircraft or satellite-mounted sensors.
Global Navigation Satellite System (GNSS)
A constellation of satellites that provides positioning, navigation, and timing information to GNSS receivers worldwide.
Coordinate reference system
A framework that defines how geographic coordinates relate to positions on the Earth, including geographic and projected systems.
Geodetic datum
A mathematical model of the Earth’s shape used as a reference for measuring locations and defining coordinate systems.
Map projection
A systematic transformation of geographic coordinates from a curved surface to a flat map, preserving certain spatial properties.
Data digitization
The process of converting analog maps, plans, or images into digital formats through scanning, tracing, or automated extraction.
Geographic data quality
The assessment of accuracy, precision, completeness, and reliability of spatial data relative to real‑world conditions.