
Geographic information systems - Data Acquisition and Management

Understand GIS data structures, acquisition methods (primary and secondary), and the impact of coordinate systems and data quality on spatial analysis.

Summary

Geospatial Data Management and Modeling

Introduction to GIS Databases

At the heart of any geographic information system (GIS) is a database that stores representations of geographic phenomena: the real-world features we care about studying, such as buildings, rivers, climate zones, or population centers. These databases must capture both the geometry (where things are located) and attributes (what those things are and their properties).

Geographic data can be organized in two main ways: as separate files (like individual shapefiles or rasters) or within a single spatially enabled relational database. Modern GIS increasingly uses the database approach because it allows for more efficient storage, querying, and management of large, complex datasets.

The diagram above shows the fundamental workflow: real-world phenomena are captured and converted into raw data, which is then processed into a data model suitable for analysis and visualization. This process is the focus of this lesson.

Understanding Geographic Phenomena

To model geographic data effectively, we need to recognize that the real world contains two fundamentally different types of phenomena:

Discrete Objects are distinct, individual things with clear boundaries. Examples include houses, roads, trees, and city boundaries. When you map these features, you're identifying specific things that exist at particular locations.

Continuous Fields represent phenomena that vary across space without clear boundaries between units. Rainfall amount, temperature, population density, and elevation are all continuous fields: they exist everywhere across a region and change gradually from one location to another.

This distinction is critical because it affects how you'll choose to represent and analyze the data. Discrete objects work well in vector data structures (discussed next), while continuous fields are often better represented as rasters.
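The difference between a discrete object and a continuous field can be sketched in a few lines of Python. This is only an illustration: the `Feature` class, the `temperature_field` function, and its formula are hypothetical names and values invented for the example, not part of any GIS library.

```python
from dataclasses import dataclass, field
import math


@dataclass
class Feature:
    """A discrete object: geometry says where it is, attributes say what it is."""
    geometry: tuple[float, float]  # (x, y) location of a point feature
    attributes: dict = field(default_factory=dict)


def temperature_field(x: float, y: float) -> float:
    """A continuous field: returns a value for ANY location.

    The formula is made up for illustration (temperature drops with
    distance from the origin); a real field would come from measurements.
    """
    return 20.0 - 0.001 * math.hypot(x, y)


# The house exists at one specific place and carries its own properties.
house = Feature(geometry=(350.0, 1200.0), attributes={"type": "house", "floors": 2})
print(house.attributes["type"], house.geometry)

# The field can be queried anywhere, including places with no objects at all.
print(temperature_field(0.0, 0.0), temperature_field(3000.0, 4000.0))
```

The object is a record you look up; the field is something you evaluate at a location. That is the same split that later shows up as vector versus raster storage.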
Data Structures: How Geographic Data is Stored

Once you've identified whether you're dealing with discrete objects or continuous fields, you need a way to store that information digitally. GIS uses three primary data structures:

Raster Data represents the world as a regular grid of cells (also called pixels). Each cell contains a single value, such as the temperature at that location, the elevation, or a land cover classification. Raster data is naturally suited for continuous phenomena because every location has a value. The resolution (cell size) determines the level of detail you capture. A raster with 1-meter cells will show more detail than one with 100-meter cells, but will require much more storage.

Vector Data represents the world using discrete geometric shapes:

Points mark specific locations (a building, a tree, a GPS waypoint)
Lines connect points to represent linear features (roads, rivers, power lines)
Polygons are closed shapes that represent areas (a park boundary, a forest, a neighborhood)

Each geometric feature in vector data is associated with attributes: additional information stored in a database table. For example, a road line might have attributes for road name, speed limit, and surface type.

Point Clouds combine three-dimensional coordinate data with color information (RGB values) to create three-dimensional colored point images. These are generated from technologies like lidar (light detection and ranging), where sensors measure distances to millions of points to create a detailed 3D representation of terrain and structures.

The choice between raster and vector depends on your data type, the precision you need, and the analyses you plan to perform. Vector data excels at precise representation of discrete objects and efficient spatial queries, while raster data is simpler for continuous phenomena and grid-based analysis.
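The storage cost of finer raster resolution is easy to underestimate, because the cell count grows with the square of the refinement. The sketch below works through the 1-meter versus 100-meter comparison for an assumed 10 km by 10 km study area; the extent and the `raster_cell_count` helper are hypothetical and chosen only for illustration.

```python
import math


def raster_cell_count(width_m: float, height_m: float, cell_size_m: float) -> int:
    """Number of square cells needed to cover a rectangular extent."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols


extent = (10_000.0, 10_000.0)  # assumed study area: 10 km x 10 km

coarse = raster_cell_count(*extent, cell_size_m=100.0)  # 100 m cells
fine = raster_cell_count(*extent, cell_size_m=1.0)      # 1 m cells

print(f"100 m cells: {coarse:,}")
print(f"1 m cells:   {fine:,}")
print(f"ratio:       {fine // coarse:,}x")
```

Making the cells 100 times smaller on each side multiplies the number of cells by 100 squared, so the 1-meter raster needs 10,000 times as many values as the 100-meter one before any compression.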
Data Acquisition: Getting Data Into Your GIS

Geographic data doesn't appear in your GIS by magic. It must be captured through one of three methods:

Primary Data Capture involves direct measurement of the real world in the field. This includes GPS measurements, surveying, and remote sensing.

Secondary Data Capture converts existing geographic information (like paper maps or aerial photos) into digital form through digitization or scanning.

Data Transfer involves importing geographic data that already exists in digital form, such as datasets from government agencies or commercial providers.

Each method has different costs, accuracy characteristics, and appropriate use cases. Understanding the strengths and limitations of each will help you choose the right data sources for your project.

Primary Data Capture: Direct Field Measurement

Primary data capture methods involve going into the field and measuring geographic phenomena directly, creating new digital geographic data from scratch.

Survey Instruments and Coordinate Geometry

Traditional surveying, which uses instruments like theodolites and total stations to measure angles and distances, remains a precise method for capturing geographic data. Modern digital survey instruments can record measurements using coordinate geometry techniques, which directly convert angles and distances into precise coordinates. These coordinates can be entered directly into a GIS, allowing surveyors to create accurate digital maps of properties, utilities, and infrastructure.

Global Navigation Satellite System (GNSS) Positioning

The Global Positioning System (GPS) and other global navigation satellite systems allow users to determine their precise location anywhere on Earth by receiving signals from satellites. You've likely used GPS on a smartphone, but professional-grade GNSS equipment provides much higher accuracy, down to centimeters in some cases. Positions obtained from GNSS can be imported directly into a GIS. This makes GNSS incredibly valuable for tasks like mapping property boundaries, tracking vehicle movements, or recording wildlife observations.

Mobile Data Collection

Modern field work increasingly uses field computers and mobile devices (tablets and smartphones) that enable live editing of GIS data. A surveyor or field worker can view GIS data on their device, make edits or add new observations, and synchronize those changes back to a central database through wireless connections. Some systems support disconnected editing, where field workers collect data offline and synchronize later when they reconnect. This is essential for remote areas without consistent cellular coverage.

Remote Sensing Platforms

Remote sensing refers to collecting data about the Earth from a distance, typically using sensors on aircraft or satellites. These sensors include:

Cameras that collect visible light imagery (like aerial photography)
Digital scanners that record multiple wavelengths of light, revealing information not visible to the human eye (like vegetation health)
Lidar (Light Detection and Ranging), which uses laser pulses to measure distances and create detailed three-dimensional images of terrain and surface features

Unmanned aerial vehicles (UAVs), commonly called drones, have revolutionized remote sensing by providing high-resolution aerial imagery at lower cost than traditional aircraft or satellites. Drone imagery can be processed to create orthophotos (geometrically corrected photos that can be used as map bases), digital elevation models, and other GIS datasets.

Secondary Data Capture: Converting Existing Information

Often, the geographic information you need already exists on paper maps, in scanned documents, or in other analog formats. Secondary data capture converts these sources into digital form suitable for GIS analysis.
Digitization Techniques

Digitization is the process of converting hard-copy maps or survey plans into digital data by tracing features on screen using specialized software. A digitizer might use computer-aided design (CAD) programs or GIS software to carefully trace roads, boundaries, or other features from a scanned map image. Geo-referencing, which links the digital data to a real-world coordinate system, ensures that the digitized features align with actual locations on Earth.

Scanning and Raster-to-Vector Conversion

A simpler way to digitize existing maps is to scan them, creating a raster image file. The scanned raster can then be processed using software algorithms that automatically trace features and convert them into vector format. This approach is faster than manual digitization for large-scale map conversion projects, though the results typically require manual editing for accuracy.

Accuracy Considerations: Relative vs. Absolute

When capturing data, you must decide what level of accuracy you need, which significantly affects cost and effort:

Relative accuracy means features are geometrically correct relative to each other within the dataset, even if they don't align perfectly with real-world coordinates. This is sufficient for many applications where you care about relationships between features more than their absolute position.

Absolute accuracy means features are positioned correctly according to a real-world coordinate system. This requires more careful measurement and is essential for applications like property boundaries, construction, or surveying.

The difference matters: creating a digitized map with good relative accuracy might cost far less than creating one with high absolute accuracy, but relative accuracy is insufficient for property records.

Post-Capture Editing and Topology

Once data is entered into a GIS, it typically requires editing to remove errors and ensure topological correctness. In a road network, for example, lines representing roads that meet should connect at shared nodes (intersections) rather than simply crossing without a connection. In a polygon dataset, polygon boundaries should not overlap or leave gaps. This editing process ensures that the spatial relationships in your data match the real world, making analysis meaningful.

Coordinate Systems, Projections, and Datum Models

To store geographic data in a database, every location must have a coordinate. This requires agreeing on how to represent positions on Earth, which is the role of coordinate systems, datums, and projections.

Earth Models: Sphere, Ellipsoid, and Datum Models

The Earth is not a perfect sphere. It is better approximated by an ellipsoid, slightly flattened at the poles and bulging at the equator. But even an ellipsoid model isn't perfectly accurate everywhere. Datum models are mathematical representations that define a reference surface and provide a coordinate system for locating any point on Earth's surface. Different regions often use different datum models optimized for accuracy in that area:

The North American Datum of 1983 (NAD83) is the standard datum for mapping in the United States and Canada
The World Geodetic System 1984 (WGS84) is used globally and is the standard for GPS data

Geographic Coordinate Systems

When data is expressed in latitude and longitude without projection (as angles measured from the equator and prime meridian), it's described as being in a geographic coordinate system. You might see this described as "Geographic Coordinate System: North American Datum 1983" or "Geographic Coordinate System: WGS84."

Datum Transformations

A critical challenge in GIS work is that different datasets may be based on different datums. Converting coordinates from one datum to another requires a datum transformation. A common approach is a Helmert transformation, which accounts for rotation, translation, and scale differences between datums.
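As a rough sketch of what a Helmert transformation computes, the Python below applies the seven-parameter, small-angle form to geocentric Cartesian (X, Y, Z) coordinates. Several things here are assumptions rather than anything stated in this lesson: the rotation signs follow the "position vector" convention (the "coordinate frame" convention flips them), the `HelmertParams` and `helmert` names are invented for the example, and the parameter values are hypothetical and do not describe any real pair of datums.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelmertParams:
    tx: float         # translation along X, meters
    ty: float         # translation along Y, meters
    tz: float         # translation along Z, meters
    rx: float         # rotation about X, radians (small angle)
    ry: float         # rotation about Y, radians (small angle)
    rz: float         # rotation about Z, radians (small angle)
    scale_ppm: float  # scale difference, parts per million


def helmert(xyz, p):
    """Apply a small-angle 7-parameter Helmert transformation.

    Uses the position-vector sign convention (an assumption; check which
    convention your published parameters follow before using them).
    """
    x, y, z = xyz
    m = 1.0 + p.scale_ppm * 1e-6
    return (
        p.tx + m * (x - p.rz * y + p.ry * z),
        p.ty + m * (p.rz * x + y - p.rx * z),
        p.tz + m * (-p.ry * x + p.rx * y + z),
    )


point = (4_000_000.0, 3_000_000.0, 4_000_000.0)  # hypothetical geocentric point

# Hypothetical parameters: with zero rotation and scale, only a shift remains.
shift_only = HelmertParams(100.0, -50.0, 25.0, 0.0, 0.0, 0.0, 0.0)
print(helmert(point, shift_only))

# Hypothetical full set: a tiny rotation about Z plus a 1 ppm scale change.
full = HelmertParams(100.0, -50.0, 25.0, 0.0, 0.0, 1e-6, 1.0)
print(helmert(point, full))
```

Setting the rotations and scale to zero reduces the formula to a plain coordinate shift. Real work would use published parameters through a projection library rather than hand-typed values.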
Sometimes, for nearby datums, a simple translation (shifting all coordinates by a fixed amount) is sufficient. Failing to account for datum differences can introduce errors of tens to hundreds of meters, enough to make analysis meaningless or dangerous in applications like construction or surveying.

Data Quality: The Foundation of Reliable Analysis

A critical principle in GIS: geographic data should be sufficiently close to reality to produce results that correspond to real-world processes. No geographic dataset is perfectly accurate, but it must be accurate enough for your intended use.

Sources of Positional Accuracy Variation

Not all position data is equally accurate. Consider GPS:

A smartphone GPS might be accurate to roughly 5-10 meters because it relies on a small, low-cost antenna and simpler signal processing
High-end professional GNSS equipment might achieve centimeter-level accuracy using advanced techniques such as real-time kinematic (RTK) correction

Similarly, digital terrain models (representations of Earth's elevation) and aerial imagery are available at many levels of spatial precision. A freely available worldwide elevation dataset might have 30-meter resolution, while specialized lidar data from an airborne survey might have 1-meter resolution. Your choice of data source directly affects the quality of your results.

Understanding Error Propagation

Here's a crucial insight: all geographic data contain inherent inaccuracies, and these errors propagate through GIS operations in ways that are difficult to predict. If you measure building locations with ±5-meter accuracy, then calculate areas within 100 meters of buildings, the accuracy of your results won't simply be ±5 meters; it will generally be worse.

The challenge is that different types of operations propagate error differently. A buffer operation (creating a zone around features) might increase error in one way, while an overlay operation (combining multiple layers) might increase it differently.
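One way to get a feel for error propagation is a small Monte Carlo simulation. The sketch below is an illustration under stated assumptions, not a general method: it treats the ±5-meter figure as the standard deviation of independent Gaussian errors in each coordinate, and it looks at one simple derived quantity, the distance between two measured points that are truly 100 meters apart. The function name and all values are hypothetical.

```python
import math
import random
import statistics


def simulate_distance_error(true_a, true_b, sigma_m, trials=20_000, seed=0):
    """Standard deviation of the error in a distance computed from noisy points."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_distance = math.dist(true_a, true_b)
    errors = []
    for _ in range(trials):
        # Perturb every coordinate of both points independently.
        ax, ay = (c + rng.gauss(0.0, sigma_m) for c in true_a)
        bx, by = (c + rng.gauss(0.0, sigma_m) for c in true_b)
        measured = math.hypot(bx - ax, by - ay)
        errors.append(measured - true_distance)
    return statistics.stdev(errors)


spread = simulate_distance_error((0.0, 0.0), (100.0, 0.0), sigma_m=5.0)
print(f"input sigma per coordinate: 5.00 m")
print(f"sigma of derived distance:  {spread:.2f} m")
```

Under these assumptions the spread of the derived distance comes out near 5 times the square root of 2 (about 7.1 meters), which is larger than the 5 meters of either input. A buffer or an overlay would need its own simulation, since each operation combines the input errors differently.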
In practice, you cannot always predict how accurate your final results will be, even if you know the accuracy of your input data. Understanding these limitations is critical for responsible GIS work. You must consider data quality when interpreting results and communicating findings to others.

Summary

Geospatial data management and modeling form the foundation of GIS work. Databases store both geometric and attribute information about geographic phenomena, whether discrete objects or continuous fields. Data arrives through primary capture (direct measurement), secondary capture (digitizing existing sources), or data transfer. Once in your system, data must be registered to a coordinate system and datum, edited for quality, and carefully assessed for accuracy before use in analysis. Understanding these fundamentals ensures that your GIS work produces reliable, meaningful results.
Flashcards
What two primary components of geographic phenomena are modeled and stored within a geographic database?
Geometry and attributes
In what two ways can geographic data be stored?
As separate files; within a single spatially enabled relational database
What are the two most common types of geographic phenomena?
Discrete objects (e.g., houses, roads); continuous fields (e.g., rainfall, population density)
How do raster images store geographic data?
As a grid of cells
What types of geometric features does vector data use, each linked to attributes?
Points, lines, and polygons
What are the three main groups of spatial data acquisition methods?
Primary data capture (direct field measurement); secondary data capture (digitizing existing sources); data transfer (copying external GIS data)
What technique is used to enter survey data directly into a GIS from digital instruments?
Coordinate geometry
What are the two ways field computers can edit GIS data during mobile data collection?
Wireless connections or disconnected editing sessions
What is the purpose of digitization in a GIS context?
Transferring hard-copy maps or survey plans into a digital medium
What type of data is initially produced by scanning a map before it is converted to vector data?
Raster data
What are the two types of accuracy that influence the cost and interpretability of data capture?
Relative accuracy (consistency within the dataset); absolute accuracy (alignment to a real-world coordinate system)
What is the primary goal of post-capture editing regarding the relationship between features?
Ensuring topological correctness
What are three common ways the Earth is represented to provide coordinates?
Sphere; ellipsoid; datum models
Which datum is the standard for mapping in the United States and Canada?
North American Datum of 1983 (NAD83)
What is the term for a coordinate system where data is expressed in latitude and longitude without projection?
Geographic coordinate system
How does high-end GNSS equipment generally compare to typical smartphone GPS?
It provides greater positional accuracy
What phenomenon occurs when inherent inaccuracies in data move through GIS operations?
Error propagation

Key Concepts
Geospatial Data Types
Raster data
Vector data
Geographic data quality
Geographic Information Systems
Geographic information system (GIS)
Data digitization
Map projection
Spatial Reference Systems
Global Navigation Satellite System (GNSS)
Coordinate reference system
Geodetic datum
Remote sensing