Geographic information system - Data Acquisition and Management

Understand primary and secondary GIS data acquisition methods, data types and quality considerations, and how spatial databases manage and model geographic information.
Summary

Primary Data Capture for Geographic Information Systems

A Geographic Information System (GIS) is fundamentally a tool for storing, analyzing, and visualizing geographic data. Before you can do anything useful with a GIS, you need data to put into it. There are several ways to capture this data, and understanding them shows how GIS data is created and where it comes from.

Direct Field Measurement Methods

The most straightforward way to get geographic data into a GIS is to measure locations directly in the field. Modern survey instruments support coordinate geometry (COGO), which lets surveyors record positions and transfer them straight into digital format. This approach is particularly useful for creating highly accurate representations of infrastructure, property boundaries, or other features that require precision.

Another primary method uses a Global Navigation Satellite System (GNSS), of which the Global Positioning System (GPS) is the best-known example. Receivers use signals from orbiting satellites to determine latitude, longitude, and elevation. Surveyors collect GNSS positions in the field and import them directly into the GIS.

Mobile Data Collection

Field computers and mobile devices have changed how geographic data is captured. Instead of collecting data in the field and processing it back in an office, workers can edit GIS data in real time over wireless connections. Alternatively, they can work in disconnected mode (without internet) and synchronize their data later when connectivity is available. This approach shortens the time between data collection and availability for analysis.

Remote Sensing: Data from Above

Remote sensing refers to capturing geographic information from a distance, typically from aircraft or satellites.
Remote sensing platforms carry specialized sensors that collect data without touching the ground:

- Cameras and digital scanners capture visible-light imagery, similar to aerial photography.
- Lidar (Light Detection and Ranging) bounces laser pulses off the earth's surface to measure elevation with high precision.

These sensors produce massive amounts of data covering large geographic areas.

Unmanned Aerial Vehicles (UAVs, or drones) have become increasingly important for capturing high-resolution aerial imagery. Because drones can fly at lower altitudes than traditional aircraft, they produce fine-grained images suitable for building detailed GIS data for specific areas such as farms, construction sites, or disaster zones.

Secondary Data Capture for Geographic Information Systems

Not all geographic data comes from new field measurements. Often, valuable information already exists in printed maps or survey plans. The process of converting these paper sources into digital GIS data is called secondary data capture.

Digitization: Converting Maps to Digital Format

Digitization is the process of transferring information from hard-copy maps into digital format. A technician traces features from a printed map using specialized software (often computer-aided design programs) and records their coordinates. This process must also include geo-referencing: linking the digitized features to a real-world coordinate system so they align properly with other geographic data.

Scanning and Raster Conversion

An alternative to manual digitization is scanning, which produces a raster image: a grid of cells, each containing a single value (like a photograph). Scanned maps are stored as raster data, which records only cell values and does not identify individual features such as a road or a parcel. To make scanned maps more useful in a GIS, technicians can process the raster to extract vector data (points, lines, and polygons), though this conversion is rarely perfect and requires careful quality control.
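Geo-referencing, whether for traced features or a scanned raster, comes down to a mapping from image positions to map coordinates. The sketch below assumes the common six-parameter affine form used by world files; the class name `AffineGeoref` and the parameter values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffineGeoref:
    """Six-parameter affine transform from pixel (col, row) to map (x, y).

    Parameter names follow the usual world-file convention. The values
    passed in are the caller's responsibility; nothing here is specific
    to any particular GIS package.
    """
    a: float  # map units per pixel in x (column direction)
    d: float  # rotation term affecting y, usually 0
    b: float  # rotation term affecting x, usually 0
    e: float  # map units per pixel in y, negative when rows run north to south
    c: float  # x coordinate of the centre of the upper-left pixel
    f: float  # y coordinate of the centre of the upper-left pixel

    def pixel_to_world(self, col: float, row: float) -> tuple[float, float]:
        """Return the map coordinates of the pixel at (col, row)."""
        x = self.a * col + self.b * row + self.c
        y = self.d * col + self.e * row + self.f
        return (x, y)


# Hypothetical scan: 2 m pixels, no rotation, upper-left pixel centre
# at (500000, 4100000) in some projected coordinate system.
georef = AffineGeoref(a=2.0, d=0.0, b=0.0, e=-2.0, c=500000.0, f=4100000.0)
print(georef.pixel_to_world(10, 20))  # (500020.0, 4099960.0)
```

In practice the six parameters are usually estimated from control points whose image and ground positions are both known; the sketch takes them as given.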
Accuracy Considerations in Data Capture

When capturing geographic data, projects must decide which of two types of accuracy to target:

- Relative accuracy ensures the data is internally consistent: features align properly with each other within the dataset but may not match real-world coordinates exactly. This is less expensive to achieve.
- Absolute accuracy requires that recorded locations match actual ground positions and align with standard coordinate systems. This is more costly but essential when combining data from multiple sources.

This choice significantly affects both the cost of data capture and how useful the data will be in analysis.

Post-Capture Editing

Raw captured data almost always contains errors. After digitization or scanning, editing removes mistakes and ensures the data meets quality standards. A critical aspect of editing is enforcing topology, that is, ensuring that geographic features connect properly. For example, in a street network the endpoints of street segments should meet at intersections (called nodes) rather than float unconnected to adjacent streets.

Data Types and Geographic Phenomena

Understanding Different Data Representations

The same real-world information can be represented in different ways within a GIS, and understanding these representations is fundamental to effective geographic analysis.

Raster data organizes information as a grid of cells, like a photograph or checkerboard. Each cell holds a single value: elevation, temperature, land type, or any other measurement. Raster data works well for continuous phenomena like rainfall patterns or elevation changes across a landscape.

Vector data stores features as geometric shapes with associated information. Points represent discrete locations (a specific address, a utility pole), lines represent routes (roads, rivers, power lines), and polygons represent areas (building footprints, administrative boundaries).
Each feature is linked to a row in an attribute table, a spreadsheet-like structure containing descriptive information about that feature.

Tabular data consists of traditional spreadsheet information (names, measurements, counts) that can be linked to spatial features through unique identifiers. For instance, census data (population counts by neighborhood) can be joined to polygon data (neighborhood boundaries) using a shared ID number.

Types of Geographic Phenomena

Not all geographic information represents the same kind of reality. Understanding what you're modeling is crucial for choosing the right approach:

- Discrete objects are distinct, separate things you can count and locate individually. Houses, trees, cars, and road segments are discrete objects. They have clear boundaries and are most naturally represented as vector data.
- Continuous fields represent phenomena that vary across space without gaps. Temperature, rainfall, elevation, and population density are continuous: they have a value at every location on the landscape. These are often best represented as raster data, though they can be stored as vectors using lines connecting points of equal value (contour lines showing elevation, for example).

Confusing these two types is a common source of error in GIS analysis. A forest is a discrete object with a boundary, while forest density (a continuous measurement of how much of an area is forested) is a continuous field.

Major Geographic Data Sources

When starting a GIS project, you rarely need to capture all data from scratch. Several major sources provide geographic data that's already compiled and ready to use.

Government and Public Data

The U.S. Census Bureau is a major geographic data source, providing demographic and socioeconomic information (population counts, income levels, employment data) linked to geographic boundaries such as census tracts and counties. This data is freely available and essential for many planning and analysis projects.

The U.S. Geological Survey (USGS) provides extensive geographic data including elevation models, land cover maps, and remotely sensed imagery. These datasets have set standards for geographic data quality and availability in the United States.

Crowdsourced Data

OpenStreetMap is a global collaborative project that creates free vector data for roads, buildings, public services, and points of interest. While its accuracy and completeness vary by region (well-developed in cities, sparse in remote areas), it represents an important alternative to commercial geographic data sources and has become increasingly comprehensive.

Data Quality and Database Management

Evaluating Geographic Data Quality

Before using geographic data, you need to assess its quality. Two types of accuracy matter:

- Positional accuracy measures how close a recorded location is to its true ground position. A point recorded within 5 meters of its actual location is more positionally accurate than one recorded 50 meters away. Positional accuracy requirements vary: a road network needs high accuracy, while a map showing general regions of forest cover can tolerate lower accuracy.
- Attribute accuracy evaluates whether the descriptive information attached to features is correct. If a building is labeled as a hospital when it's actually a school, or a road is marked as two lanes when it's actually four, the attribute data is inaccurate even if the building's location and the road's path are positioned correctly.

Database Architecture

The foundation of any GIS is a database: a structured storage system that maintains both the geometric information (coordinates, shapes) and attribute information about geographic features.
These databases can be organized in different ways:

- File-based systems store geographic data as separate files on a computer.
- Spatially enabled relational databases store both geometry and attributes in a single integrated database management system.

Spatially enabled databases are increasingly common in professional GIS work because they provide spatial indexing (organizing data for fast retrieval), topology rules (preventing invalid spatial relationships), and versioning (tracking changes over time).

For larger organizations and distributed work, distributed GIS architectures enable geographic data to be shared across networks and the internet, allowing multiple users to access and edit data simultaneously from different locations.

<extrainfo>
Point Clouds: Three-Dimensional Data

Modern remote sensing, particularly lidar, produces point clouds: massive collections of three-dimensional points, often with associated color information (RGB values). These can be processed into three-dimensional models and color images, representing terrain and structures in fine detail. However, point clouds are a specialized data type requiring different analytical approaches than traditional raster and vector data.
</extrainfo>

Putting It All Together: The GIS Data Workflow

Understanding how data flows through a GIS brings together all these concepts. Geographic information systems operate by transforming real-world phenomena into digital representations that can be analyzed and visualized.
The complete data acquisition process includes three complementary methods:

- Primary data capture: direct field measurement using surveys, GPS, mobile devices, or remote sensing
- Secondary data capture: digitizing existing maps and documents
- Data transfer: obtaining geographic data from external sources (government agencies, OpenStreetMap, commercial providers)

Raw data from any of these sources requires quality control (editing, validation, and accuracy assessment) before it can reliably support analysis and decision-making.

The choice of data capture method depends on your project's requirements: How accurate must the data be? How quickly is it needed? What budget is available? A project mapping property boundaries might require expensive, highly accurate GPS surveys, while a regional climate analysis might use freely available satellite imagery and weather station data.
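The accuracy assessment step of quality control can be made concrete with a small calculation. One common summary of positional accuracy is the root-mean-square error (RMSE) between recorded positions and independently surveyed control points. The sketch below assumes both sets are in the same projected coordinate system with units of meters; the function name `horizontal_rmse` and the sample coordinates are hypothetical.

```python
import math


def horizontal_rmse(recorded, control):
    """Root-mean-square horizontal error between paired (x, y) positions.

    `recorded` holds positions from the dataset being assessed and
    `control` holds the matching higher-accuracy reference positions.
    Both are assumed to be in the same projected coordinate system.
    """
    if not recorded or len(recorded) != len(control):
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [
        (rx - cx) ** 2 + (ry - cy) ** 2
        for (rx, ry), (cx, cy) in zip(recorded, control)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical check points: each recorded position is offset 3 m east
# and 4 m north of its control point, i.e. 5 m away.
recorded_points = [(103.0, 204.0), (503.0, 804.0)]
control_points = [(100.0, 200.0), (500.0, 800.0)]
print(horizontal_rmse(recorded_points, control_points))  # 5.0
```

A project would compare the resulting figure with its own accuracy requirement; the threshold itself depends on the use case, as the questions above suggest.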
Flashcards
What are the three main groups of spatial data acquisition methods?
Primary data capture, secondary data capture, and data transfer
What technique allows survey data to be entered directly from digital instruments into a GIS?
Coordinate geometry (COGO)
What common satellite-based system provides position data for import into a GIS?
Global Navigation Satellite System (GNSS)
Which process transfers hard-copy maps into a digital medium using computer-aided design programs?
Digitization
What type of data is initially produced by scanning a map before it is converted to vector data?
Raster data
Which type of accuracy ensures a dataset is aligned to a real-world coordinate system?
Absolute accuracy
What is the primary goal of post-capture editing in a GIS?
Removing errors and ensuring topological correctness
How does the raster data model store geographic information?
As a grid of cells
What are the three geometric types used to store features in vector data?
Points, lines, and polygons
Which metric measures how close a recorded location is to its true position on the ground?
Positional accuracy
Which metric evaluates the correctness of non-spatial information attached to geographic features?
Attribute accuracy
Which type of geographic phenomenon represents distinct entities like houses or roads?
Discrete objects
Which type of geographic phenomenon represents variables that vary over space, such as rainfall or population density?
Continuous fields
What data structure combines 3D points with RGB information to create color 3D images?
Point clouds
What three functions do spatial database management systems enforce to manage geographic data?
Spatial indexing, topology rules, and versioning
Which organization provides demographic and socioeconomic data linked to geographic boundaries in the US?
The Census Bureau
Which crowd-sourced platform offers global vector data for roads and points of interest?
OpenStreetMap

Key Concepts
Geospatial Technologies
Geographic Information System (GIS)
Global Navigation Satellite System (GNSS)
Remote Sensing
Unmanned Aerial Vehicle (UAV)
Spatial Data Types
Digitization (GIS)
Raster Data
Vector Data
Spatial Database Management System (SDBMS)
OpenStreetMap
Positional Accuracy