Geographic information system - Data Acquisition and Management
Understand primary and secondary GIS data acquisition methods, data types and quality considerations, and how spatial databases manage and model geographic information.
Summary
Primary Data Capture for Geographic Information Systems
A Geographic Information System (GIS) is fundamentally a tool for storing, analyzing, and visualizing geographic data. Before you can do anything useful with a GIS, you need data to put into it. There are several ways to capture this data, and understanding these methods is essential to appreciating how GIS data is created and where it comes from.
Direct Field Measurement Methods
The most straightforward way to get geographic data into a GIS is to measure locations directly in the field. Modern digital survey instruments support coordinate geometry (COGO), a technique that lets surveyors record positions and enter them directly into a GIS in digital form. This approach is particularly useful for creating highly accurate representations of infrastructure, property boundaries, or other features that require precision.
Another primary method uses a Global Navigation Satellite System (GNSS), of which the Global Positioning System (GPS) is the best-known example. A receiver uses signals from orbiting satellites to determine its latitude, longitude, and elevation. Surveyors collect these positions in the field and import them directly into the GIS.
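As a rough illustration of importing GNSS positions, the sketch below parses a small text export into point records. The CSV layout, the column names, and the read_gnss_points helper are assumptions made for this example, not the format of any particular receiver.

```python
import csv
import io

# Hypothetical export from a GNSS receiver; the column names are assumptions.
GNSS_CSV = """point_id,latitude,longitude,elevation_m
BM1,40.7128,-74.0060,10.2
BM2,40.7130,-74.0055,10.9
"""

def read_gnss_points(text):
    """Parse GNSS fixes into simple point records with x, y, z coordinates."""
    points = []
    for row in csv.DictReader(io.StringIO(text)):
        lat = float(row["latitude"])
        lon = float(row["longitude"])
        # Reject values that cannot be valid geographic coordinates.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"invalid coordinate for {row['point_id']}")
        points.append({
            "id": row["point_id"],
            "x": lon,  # common GIS convention: x is longitude
            "y": lat,  # and y is latitude
            "z": float(row["elevation_m"]),
        })
    return points

points = read_gnss_points(GNSS_CSV)
```

Validating the coordinate ranges at import time is a cheap way to catch swapped or corrupted fields before they reach the GIS.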
Mobile Data Collection
Field computers and mobile devices have revolutionized how geographic data is captured. Instead of collecting data in the field and processing it back in an office, workers can now edit GIS data in real time using wireless connections. Alternatively, they can work in disconnected mode (without internet) and synchronize their data later when connectivity is available. This approach dramatically reduces the time between data collection and availability for analysis.
Remote Sensing: Data from Above
Remote sensing refers to capturing geographic information from a distance, typically from aircraft or satellites. Remote sensing platforms carry specialized sensors that collect data without touching the ground:
Cameras and digital scanners capture visible light imagery, similar to aerial photography
Lidar (Light Detection and Ranging) bounces laser pulses off the earth's surface to measure elevation with exceptional precision
These sensors produce massive amounts of data covering large geographic areas
Unmanned Aerial Vehicles (drones) have become increasingly important for capturing high-resolution aerial imagery. Because drones can fly at lower altitudes than traditional aircraft, they produce imagery detailed enough to build GIS data for specific sites such as farms, construction sites, or disaster zones.
Secondary Data Capture for Geographic Information Systems
Not all geographic data comes from new field measurements. Often, valuable information already exists in printed maps or survey plans. The process of converting these paper sources into digital GIS data is called secondary data capture.
Digitization: Converting Maps to Digital Format
Digitization is the process of transferring information from hard-copy maps into digital format. A technician traces features from a printed map using specialized software (often computer-aided design programs) and records their coordinates. This process must also include georeferencing: linking the digitized features to a real-world coordinate system so they align properly with other geographic data.
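One common way to georeference a scanned or traced map is to fit an affine transformation from a few control points whose map coordinates are known. The sketch below assumes exactly three control points and solves for the six coefficients with Cramer's rule; the function names and the sample coordinates are hypothetical, and real workflows usually use more control points with a least-squares fit.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve the 3x3 linear system a * x = b using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for i in range(3):
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][i] = b[r]
        solution.append(det3(replaced) / d)
    return solution

def fit_affine(control_points):
    """Fit X = a*px + b*py + c and Y = d*px + e*py + f from three
    ((px, py), (X, Y)) pairs linking pixel and map coordinates."""
    a = [[px, py, 1.0] for (px, py), _ in control_points]
    xs = [world[0] for _, world in control_points]
    ys = [world[1] for _, world in control_points]
    return solve3(a, xs), solve3(a, ys)

def to_world(px, py, coeffs):
    """Apply the fitted transformation to one pixel position."""
    (a, b, c), (d, e, f) = coeffs
    return (a * px + b * py + c, d * px + e * py + f)

# Hypothetical control points: pixel (column, row) -> map (easting, northing).
control = [
    ((0.0, 0.0), (500000.0, 4100000.0)),
    ((1000.0, 0.0), (502000.0, 4100000.0)),
    ((0.0, 1000.0), (500000.0, 4098000.0)),
]
coeffs = fit_affine(control)
center = to_world(500.0, 500.0, coeffs)
```

With these sample points each pixel covers 2 map units, and northing decreases as the row number increases, which is typical for scanned images whose origin is the top-left corner.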
Scanning and Raster Conversion
An alternative to manual digitization is scanning, which produces a raster image: a grid of cells, each containing a single value (like a photograph). Scanned maps are stored as raster data, but a raster records only cell values, so individual features such as roads or parcels cannot be selected or given attributes. To make scanned maps more useful in a GIS, technicians can process the raster to extract vector data (points, lines, and polygons), though this conversion is rarely perfect and requires careful quality control.
Accuracy Considerations in Data Capture
When capturing geographic data, projects must decide between two types of accuracy:
Relative accuracy ensures data is consistent within itself—features align properly with each other within the dataset—but may not match real-world coordinates perfectly. This is less expensive to achieve.
Absolute accuracy requires that recorded locations match actual ground positions and align with standard coordinate systems. This is more costly but essential when combining data from multiple sources.
This choice significantly affects both the cost of data capture and how useful the data will be in analysis.
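The difference between the two kinds of accuracy can be seen with a small numeric sketch. It assumes a made-up dataset in which every captured point is shifted by the same offset from its true position; the point names, coordinates, and helper functions are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical true ground positions, in meters.
truth = {"A": (100.0, 200.0), "B": (160.0, 280.0), "C": (100.0, 280.0)}

# Captured positions: every point shifted by the same (+3, -4) offset.
captured = {name: (x + 3.0, y - 4.0) for name, (x, y) in truth.items()}

def absolute_errors(captured, truth):
    """Distance between each captured point and its true position."""
    return {name: math.dist(captured[name], truth[name]) for name in truth}

def relative_errors(captured, truth):
    """Error in the distance between each pair of points."""
    errors = {}
    for a, b in combinations(sorted(truth), 2):
        measured = math.dist(captured[a], captured[b])
        actual = math.dist(truth[a], truth[b])
        errors[(a, b)] = abs(measured - actual)
    return errors

abs_err = absolute_errors(captured, truth)
rel_err = relative_errors(captured, truth)
```

Here every point is 5 m from its true position, so absolute accuracy is poor, yet all distances between points are preserved, so relative accuracy is perfect. Such a dataset works on its own but will not line up with data from other sources.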
Post-Capture Editing
Raw captured data almost always contains errors. After digitization or scanning, editing removes mistakes and ensures the data meets quality standards. A critical aspect of editing is enforcing topology, that is, ensuring that geographic features connect properly. For example, the segments of a street network should meet at shared endpoints (called nodes) at intersections rather than stopping short of, or overshooting, the adjacent streets.
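A simple topology check for a street network can be sketched as follows. It assumes each street is a list of coordinate pairs and that properly connected streets share exactly identical endpoint coordinates; the function names, the tolerance value, and the sample streets are hypothetical. Endpoints used by only one line are flagged, and pairs of such endpoints that lie very close together are reported as likely digitizing errors.

```python
import math
from collections import Counter
from itertools import combinations

def dangling_endpoints(lines):
    """Return endpoints that belong to exactly one line."""
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1
        counts[line[-1]] += 1
    return sorted(point for point, n in counts.items() if n == 1)

def near_misses(lines, tolerance):
    """Pairs of dangling endpoints closer together than the tolerance."""
    dangles = dangling_endpoints(lines)
    return [(a, b) for a, b in combinations(dangles, 2)
            if math.dist(a, b) <= tolerance]

streets = [
    [(0.0, 0.0), (10.0, 0.0)],    # Main St, west segment
    [(10.0, 0.0), (20.0, 0.0)],   # Main St, east segment
    [(10.0, 0.0), (10.0, 10.0)],  # Oak Ave, meets Main St at a node
    [(20.0, 0.4), (20.0, 10.0)],  # Elm Ave, digitized 0.4 units short
]

suspects = near_misses(streets, tolerance=0.5)
```

Not every dangling endpoint is an error, since dead-end streets are legitimate. That is why the check only reports dangling endpoints that almost touch another one, which an editor can then snap together.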
Data Types and Geographic Phenomena
Understanding Different Data Representations
The same real-world information can be represented in different ways within a GIS, and understanding these representations is fundamental to effective geographic analysis.
Raster data organizes information as a grid of cells, like a photograph or checkerboard. Each cell holds a single value—elevation, temperature, land type, or any other measurement. Raster data works well for continuous phenomena like rainfall patterns or elevation changes across a landscape.
Vector data stores features as geometric shapes with associated information. Points represent discrete locations (a specific address, a utility pole), lines represent routes (roads, rivers, power lines), and polygons represent areas (building footprints, administrative boundaries). Each feature references an attribute table—a spreadsheet-like structure containing descriptive information about that feature.
Tabular data consists of traditional spreadsheet information—names, measurements, counts—that can be linked to spatial features through unique identifiers. For instance, census data (population counts by neighborhood) can be joined to polygon data (neighborhood boundaries) using a shared ID number.
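The join between tabular and spatial data can be sketched with plain dictionaries. The feature layout, the field names, and the join_attributes helper below are assumptions for illustration; GIS software performs the same operation on attribute tables.

```python
# Hypothetical neighborhood polygons, each with an ID and a boundary ring.
neighborhoods = [
    {"id": "N1", "ring": [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]},
    {"id": "N2", "ring": [(4, 0), (8, 0), (8, 3), (4, 3), (4, 0)]},
]

# Hypothetical census table keyed by the same neighborhood ID.
census_rows = [
    {"id": "N1", "population": 1200},
    {"id": "N3", "population": 800},
]

def join_attributes(features, rows, key="id"):
    """Attach matching table rows to features; unmatched features are kept."""
    lookup = {row[key]: row for row in rows}
    joined = []
    for feature in features:
        extra = lookup.get(feature[key], {})
        merged = dict(feature)
        merged.update({k: v for k, v in extra.items() if k != key})
        joined.append(merged)
    return joined

joined = join_attributes(neighborhoods, census_rows)
```

This behaves like a left join: N2 keeps its geometry but gains no population value, and the census row for N3 is dropped because no polygon carries that ID. Checking for such mismatches is a useful quality step after any join.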
Types of Geographic Phenomena
Not all geographic information represents the same kind of reality. Understanding what you're modeling is crucial for choosing the right approach:
Discrete objects are distinct, separate things you can count and locate individually. Houses, trees, cars, and road segments are discrete objects. They have clear boundaries and are most naturally represented as vector data.
Continuous fields represent phenomena that vary across space without gaps. Temperature, rainfall, elevation, and population density are continuous—they exist at every location on the landscape. These are often best represented as raster data, though they can be stored as vectors using lines connecting points of equal value (contour lines showing elevation, for example).
Confusing these two types is a common source of error in GIS analysis. A forest is a discrete object with a boundary, while forest density (a continuous measurement of how much of an area is forested) is a continuous field.
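A continuous field such as elevation is commonly stored as a raster, where any location maps to exactly one cell. The sketch below assumes a north-up grid with square cells whose origin is the top-left corner; the Raster class and the elevation values are hypothetical.

```python
class Raster:
    """A minimal north-up raster: rows run from north to south."""

    def __init__(self, values, x_min, y_max, cell_size):
        self.values = values        # list of rows, each a list of cell values
        self.x_min = x_min          # x coordinate of the left edge
        self.y_max = y_max          # y coordinate of the top edge
        self.cell_size = cell_size  # width and height of one cell

    def value_at(self, x, y):
        """Return the value of the cell that contains the point (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise IndexError("point lies outside the raster")
        return self.values[row][col]

# Hypothetical elevations in meters on a 3 x 3 grid of 10 m cells.
elevation = Raster(
    values=[[120, 125, 130],
            [115, 118, 124],
            [110, 112, 119]],
    x_min=0.0, y_max=30.0, cell_size=10.0,
)

top_middle = elevation.value_at(15.0, 25.0)
bottom_left = elevation.value_at(5.0, 5.0)
```

Every point inside the grid returns a value, which is what makes a raster a natural fit for phenomena that exist everywhere. The cell size sets the finest detail the field can represent.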
Major Geographic Data Sources
When starting a GIS project, you rarely need to capture all data from scratch. Several major sources provide geographic data that's already compiled and ready to use.
Government and Public Data
The U.S. Census Bureau is a major geographic data source, providing demographic and socioeconomic information (population counts, income levels, employment data) linked to geographic boundaries such as census tracts and counties. This data is freely available and essential for many planning and analysis projects.
The U.S. Geological Survey (USGS) provides extensive geographic data including elevation models, land cover maps, and remotely sensed imagery. These datasets have set standards for geographic data quality and availability in the United States.
Crowdsourced Data
OpenStreetMap is a global collaborative project that creates free vector data for roads, buildings, public services, and points of interest. While its accuracy and completeness vary by region (typically well developed in cities, sparse in remote areas), it represents an important alternative to commercial geographic data sources and has become increasingly comprehensive.
Data Quality and Database Management
Evaluating Geographic Data Quality
Before using geographic data, you need to assess its quality. Two types of accuracy matter:
Positional accuracy measures how close a recorded location is to its true ground position. A point recorded within 5 meters of its actual location is more positionally accurate than one recorded 50 meters away. Positional accuracy requirements vary—a road network needs high accuracy, while a map showing general regions of forest cover can tolerate lower accuracy.
Attribute accuracy evaluates whether the descriptive information attached to features is correct. If a building is labeled as a hospital when it's actually a school, or a road is marked as two lanes when it's actually four, the attribute data is inaccurate even if the building's location and road's path are positioned correctly.
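Positional accuracy is often summarized as a root-mean-square error (RMSE) between recorded locations and higher-accuracy reference locations. The sketch below assumes planar coordinates in meters and matched point lists; the horizontal_rmse function and the sample points are hypothetical.

```python
import math

def horizontal_rmse(recorded, reference):
    """Root-mean-square horizontal distance between matched point pairs."""
    if not recorded or len(recorded) != len(reference):
        raise ValueError("need two non-empty lists of equal length")
    squared = [(rx - tx) ** 2 + (ry - ty) ** 2
               for (rx, ry), (tx, ty) in zip(recorded, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical check points: surveyed reference vs. positions in the dataset.
reference = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
recorded = [(3.0, 4.0), (100.0, 5.0), (96.0, 103.0), (0.0, 100.0)]

rmse = horizontal_rmse(recorded, reference)
```

Three of the four points are 5 m off and one is exact, giving an RMSE of about 4.33 m. A single summary figure like this makes it easier to compare a dataset against a project's accuracy requirement.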
Database Architecture
The foundation of any GIS is a database—a structured storage system that maintains both the geometric information (coordinates, shapes) and attribute information about geographic features. These databases can be organized in different ways:
File-based systems store geographic data as separate files on a computer
Spatially enabled relational databases store both geometry and attributes in a single integrated database management system
Spatially enabled databases are increasingly common in professional GIS work because they provide spatial indexing (organizing data for fast retrieval), enforce topology rules (preventing invalid spatial relationships), and support versioning (tracking changes over time).
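To show why spatial indexing speeds up retrieval, the sketch below implements a uniform grid index for points. This is a deliberately simple stand-in: production spatial databases often use tree-based indexes such as R-trees, and the GridIndex class and hydrant data here are hypothetical.

```python
from collections import defaultdict

class GridIndex:
    """A uniform grid index: points are bucketed by the cell they fall in."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, feature_id, x, y):
        self.cells[self._cell(x, y)].append((feature_id, x, y))

    def query(self, x_min, y_min, x_max, y_max):
        """Return IDs of points inside the box, visiting only nearby cells."""
        c0, r0 = self._cell(x_min, y_min)
        c1, r1 = self._cell(x_max, y_max)
        hits = []
        for c in range(c0, c1 + 1):
            for r in range(r0, r1 + 1):
                for feature_id, x, y in self.cells.get((c, r), []):
                    # Exact test, since a cell may extend beyond the box.
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append(feature_id)
        return sorted(hits)

index = GridIndex(cell_size=10.0)
index.insert("hydrant_1", 5.0, 5.0)
index.insert("hydrant_2", 15.0, 5.0)
index.insert("hydrant_3", 95.0, 95.0)
index.insert("hydrant_4", 12.0, 18.0)

nearby = index.query(0.0, 0.0, 20.0, 20.0)
```

The query never examines hydrant_3 because its cell is outside the search box. With millions of features, skipping most of the data in this way is what keeps map display and spatial queries responsive.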
For larger organizations and distributed work, distributed GIS architectures enable geographic data to be shared across networks and the internet, allowing multiple users to access and edit data simultaneously from different locations.
<extrainfo>
Point Clouds: Three-Dimensional Data
Modern remote sensing, particularly lidar, produces point clouds—massive collections of three-dimensional points with associated color information (RGB values). These can be processed into three-dimensional models and color images, representing terrain and structures with exceptional detail. However, point clouds are a specialized data type requiring different analytical approaches than traditional raster and vector data.
</extrainfo>
Putting It All Together: The GIS Data Workflow
Understanding how data flows through a GIS brings together all these concepts. Geographic information systems operate by transforming real-world phenomena into digital representations that can be analyzed and visualized.
The complete data acquisition process includes three complementary methods:
Primary data capture: Direct field measurement using surveys, GPS, mobile devices, or remote sensing
Secondary data capture: Digitizing existing maps and documents
Data transfer: Obtaining geographic data from external sources (government agencies, OpenStreetMap, commercial providers)
Raw data from any of these sources requires quality control—editing, validation, and accuracy assessment—before it can reliably support analysis and decision-making.
The choice of which data capture method to use depends on your project's requirements: How accurate must the data be? How quickly is it needed? What budget is available? A project mapping property boundaries might require expensive, highly accurate GPS surveys, while a regional climate analysis might use freely available satellite imagery and weather station data.
Flashcards
What are the three main groups of spatial data acquisition methods?
Primary data capture
Secondary data capture
Data transfer
What technique allows survey data to be entered directly from digital instruments into a GIS?
Coordinate geometry (COGO)
What common satellite-based system provides position data for import into a GIS?
Global Navigation Satellite System (GNSS)
Which process transfers hard-copy maps into a digital medium using computer-aided design programs?
Digitization
What type of data is initially produced by scanning a map before it is converted to vector data?
Raster data
Which type of accuracy ensures a dataset is aligned to a real-world coordinate system?
Absolute accuracy
What is the primary goal of post-capture editing in a GIS?
Removing errors and ensuring topological correctness
How does the raster data model store geographic information?
As a grid of cells
What are the three geometric types used to store features in vector data?
Points
Lines
Polygons
Which metric measures how close a recorded location is to its true position on the ground?
Positional accuracy
Which metric evaluates the correctness of non-spatial information attached to geographic features?
Attribute accuracy
Which type of geographic phenomenon represents distinct entities like houses or roads?
Discrete objects
Which type of geographic phenomenon represents variables that vary over space, such as rainfall or population density?
Continuous fields
What data structure combines 3D points with RGB information to create color 3D images?
Point clouds
What three functions do spatial database management systems enforce to manage geographic data?
Spatial indexing
Topology rules
Versioning
Which organization provides demographic and socioeconomic data linked to geographic boundaries in the US?
The Census Bureau
Which crowd-sourced platform offers global vector data for roads and points of interest?
OpenStreetMap
Quiz
Question 1: When a hard‑copy map is scanned, what type of GIS data is produced first?
- Raster data (correct)
- Vector data
- Tabular data
- Point‑cloud data
Question 2: Which statement best describes raster data in a geographic information system?
- It stores information as a grid of cells. (correct)
- It stores features as points, lines, and polygons.
- It links attribute tables to spatial features via IDs.
- It combines three‑dimensional points with RGB values.
Question 3: What process transfers hard‑copy maps or survey plans into a digital GIS format using CAD programs and georeferencing?
- Digitization (correct)
- Georeferencing
- Spatial interpolation
- Metadata cataloging
Question 4: Which crowd‑sourced project provides worldwide vector data for roads, buildings, and points of interest?
- OpenStreetMap (correct)
- US Census Bureau
- Google Maps API
- National Geodetic Survey
Question 5: Which type of sensors mounted on aircraft and satellites are commonly used to collect data for geographic information systems?
- Cameras, digital scanners, and LiDAR (correct)
- Ground‑penetrating radars, magnetometers, and seismographs
- Infrared thermometers, barometers, and hygrometers
- Acoustic microphones, radio transmitters, and GPS receivers
Question 6: Which description best characterizes absolute accuracy in GIS data capture?
- Data are aligned to a real‑world coordinate system (correct)
- Measurements are consistent within the same dataset
- Attribute values are verified for correctness
- Positional errors are minimized relative to other features
Question 7: What function do spatial database management systems provide to ensure GIS data integrity?
- Enforcement of spatial indexing, topology rules, and versioning (correct)
- Generation of three‑dimensional visualizations
- Remote sensing data acquisition from satellites
- Management of user authentication for web map services
Question 8: Which GIS data structure combines three‑dimensional points with RGB values to produce colored 3‑D images?
- Point clouds (correct)
- Raster grids
- Vector polygons
- Digital elevation models
Question 9: GNSS devices provide location data that can be loaded into a GIS. Which of the following is the most common format for these positions?
- Coordinate (X,Y) pairs (correct)
- Raster image tiles
- Polygon shapefiles
- Attribute tables without geometry
Question 10: In GIS terminology, what does positional accuracy refer to?
- How close a recorded location is to its true ground position (correct)
- The correctness of attribute values attached to a feature
- The visual resolution of a raster image
- The consistency of data formatting across layers
Question 11: Which technique allows survey measurements to be entered directly into a GIS without separate digitization?
- Coordinate geometry on digital survey instruments (correct)
- Scanning paper maps to create raster images
- Manual entry of coordinates from field notes after the survey
- Importing GPS log files and then digitizing them
Question 12: What capability do field computers and mobile devices provide for GIS data collection in the field?
- Live editing of GIS data via wireless or disconnected sessions (correct)
- Automatic generation of 3‑D terrain models from satellite imagery
- Batch uploading of scanned paper maps after the fieldwork
- Real‑time weather forecasting for project planning
Key Concepts
Geospatial Technologies
Geographic Information System (GIS)
Global Navigation Satellite System (GNSS)
Remote Sensing
Unmanned Aerial Vehicle (UAV)
Spatial Data Types
Digitization (GIS)
Raster Data
Vector Data
Spatial Database Management System (SDBMS)
OpenStreetMap
Positional Accuracy
Definitions
Geographic Information System (GIS)
A computer system for capturing, storing, analyzing, and visualizing spatial or geographic data.
Global Navigation Satellite System (GNSS)
A constellation of satellites that provides worldwide positioning, navigation, and timing information.
Remote Sensing
The acquisition of information about the Earth’s surface from a distance using aircraft or satellite sensors.
Unmanned Aerial Vehicle (UAV)
A remotely piloted aircraft used to collect high‑resolution aerial imagery and other geospatial data.
Digitization (GIS)
The process of converting analog maps or survey documents into digital spatial data.
Raster Data
Spatial data represented as a grid of cells, each containing a value such as imagery or elevation.
Vector Data
A spatial data model that represents geographic features as points, lines, and polygons with associated attributes.
Spatial Database Management System (SDBMS)
A database system optimized for storing, indexing, and querying spatial data.
OpenStreetMap
A collaborative project that creates a free, editable map of the world using crowd‑sourced vector data.
Positional Accuracy
A measure of how closely a recorded geographic coordinate matches its true ground location.