Chapter 6
Preprocessing
Preprocessing procedures are used to convert a dataset into a form suitable for permanent storage within the GIS database. Often, a large proportion of the data entered into a GIS requires some kind of processing and manipulation in order to make it conform to a data type, georeferencing system, and data structure that is compatible with the system. The end result of the preprocessing phase is a coordinated set of thematic data layers.
The essential preprocessing procedures include:
format conversion;
data reduction and generalization;
error detection and editing;
merging of points into lines and of points and lines into polygons, where appropriate;
edge matching;
rectification/registration;
interpolation; and
photointerpretation.
We'll examine each of these in turn in this chapter. At the same time, many of these procedures are valuable at other stages in the end-to-end spatial analysis problem. We will point out these additional uses as they come up in later chapters.
6.1 Format Conversion
Format conversion covers many different problems, but can be discussed in terms of two families: conversion between different digital data structures and conversion between different data media. The former is the problem of modifying one data structure (such as those discussed in Chapter 4) into another. The latter typically involves converting source material such as paper maps, photographic prints, and printed tables into a useful computer-compatible form. Note that the reverse of data medium conversion -- changing the data stored within the digital database into maps, prints and tables -- is the problem of generating output products, which we discuss in Chapter 9.
6.1.1 Data Structure Conversions
In Chapter 4, we discussed several different data structures. There are many times when different datasets gathered for the same project are expressed in different data structures. Part of the cause of this lack of homogeneity is the nature of the datasets themselves, some data structures being more suitable to some kinds of data than others. This problem is becoming more acute as increasing amounts of current data are created and maintained in various digital forms, while historical records are almost universally stored on paper or film. Another kind of problem that arises frequently is when we have raster datasets, such as digitized photography and multispectral scanner data (discussed in Chapter 10), and our GIS is based on a vector data structure. In these cases, we must be able to inter-convert between the data structures. Similar conversions will be required in order to develop final output products as well (see Chapter 9).
The simplest forms of conversion are between members of a family of structures. For example, there are several common formats for raster data. Typical raster GIS datasets include arrays of elevation, rainfall, and classes of land cover. This is also the kind of data that is produced by multispectral scanners, which are common sensors on both aircraft and spacecraft platforms (some applications of these systems are discussed in Chapters 10 and 12). The data produced by such systems may be thought of as an array of brightness values for each wavelength band in the sensor. These systems generate datasets that are comparable to any other multivariate collection of raster data, including problems of geometric registration between the wavelength bands or between different dates of acquisition (Welch, 1985).
There are, however, several ways to organize such datasets. Keeping each variable's data (for example, elevation and annual rainfall, or the different spectral channels from a multispectral scanner) as a separate array is one common method. This method is often called band sequential (BSQ), since each array is kept as a separate file on the magnetic disk or tape. In this case, one data file would contain the elevation array, and a separate file would contain rainfall values. A common alternative, called band interleaved by pixel (BIP), places all of the different measurements from a single pixel together. This organization may be thought of as a single array containing multivariate pixels (Figure 6.1). The first element in this second format would contain the elevation value for the pixel in the first row and column; the second element would contain the rainfall value for the same pixel.
When operations on the data involve a single theme or layer at a time, a BSQ raster database can be the most efficient organization. This is because the specific theme of interest (or spectral channel) can be analyzed and manipulated as a physically independent entity. Conversely, when working with more than one data theme at a time, the BIP organization can be the most efficient. For example, consider a raster dataset with two themes: elevation data points and classes of forest cover. If a principal activity is to operate on a single data layer at a time, such as deriving slope from the elevation data points, the BSQ organization makes the elevation data directly available, without having to read the data files past forest-cover values. If, on the other hand, an analytic operation requires comparison of both themes on a pixel-by-pixel basis, such as finding the location of the highest elevation for each of several forest-type classes, the BIP organization makes good sense, as the values of the two themes for each pixel are adjacent in the database.
The band interleaved by line (BIL) raster organization is a middle-ground between the extremes of BSQ and BIP. In this form, adjacent ground locations (in the row direction) for a single theme are adjacent in the data file, and subsequent themes are then recorded in sequence for that same line. In this way, the different themes corresponding to a row in the file are relatively near each other in the file. Thus, one expects that its performance on specified tasks will fall between those of the pure band-sequential and pure pixel-interleaved forms. This intermediate type of multivariate raster is used in some commercial raster systems.
For those readers with a data processing background, we will briefly add to the complexity. There are two common physical data organizations for BIL-structured data. In one, the physical records hold all the themes from a single row in the array, ordered as described in the previous paragraph. Thus, the number of physical records is the same as the number of rows in the array. In the other common BIL format, a physical record corresponds to a single thematic category. Thus in this second case, the number of physical records in the dataset is the product of the number of rows in the array times the number of unique themes.
The problem of converting between the different raster data formats described above is relatively simple. Typically, a portion of the data in one format is read into a memory-based storage array, and the appropriate pointers created to extract the data values in whatever sequence is required for the new format. Optimizing such conversion software on a given computer is straightforward. We discuss another family of data format complications in Chapter 10.
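The three interleaving schemes amount to different orderings of the same three indices (theme, row, column), so conversion is just a regrouping of values. A minimal sketch in Python, using invented elevation and rainfall themes (the variable names and values are ours, not from any particular system):

```python
# Hypothetical 2-theme, 2x3 raster stored band sequential (BSQ):
# one array (file) per theme, each rows x cols.
elevation = [[310, 322, 305],
             [298, 301, 315]]
rainfall  = [[12, 14, 11],
             [13, 15, 12]]
bsq = [elevation, rainfall]           # themes, then rows, then columns

def bsq_to_bip(bsq):
    """Regroup so each pixel holds all of its theme values together."""
    themes, rows, cols = len(bsq), len(bsq[0]), len(bsq[0][0])
    return [[[bsq[t][r][c] for t in range(themes)]
             for c in range(cols)] for r in range(rows)]

def bsq_to_bil(bsq):
    """For each row, record every theme's values for that row in sequence."""
    themes, rows = len(bsq), len(bsq[0])
    return [[bsq[t][r] for t in range(themes)] for r in range(rows)]

bip = bsq_to_bip(bsq)
bil = bsq_to_bil(bsq)
# The pixel at row 0, column 1 carries elevation 322 in all three layouts:
print(bsq[0][0][1], bip[0][1][0], bil[0][0][1])   # 322 322 322
```

In a production system the in-memory buffer would hold only a portion of the array at a time, with pointers selecting the output sequence, but the index regrouping is the same.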
There are a wide variety of problems that develop when converting datasets between different vector data structures. As we noted in Chapter 4, there are many different vector organizations. As a guiding principle, it is expensive to generate topological information when it is not explicitly present in the vector data structure. To illustrate this point, converting data from an arc-node organization to a relational one is very easy. As we observed in the last chapter, these are very similar data structures in terms of the way topological information about spatial objects is organized. In effect, these two data structures store exactly the same semantic information, with a slightly different syntax.
At the other end of the spectrum, consider the problem of converting data in a whole polygon structure to an arc-node structure. In a dataset stored in whole polygon structure, there is very little explicitly identified topology. The list of nodes that form the boundaries of each individual polygon is stored. Consider just the problem of extracting the arc-node node list. We must go through the entire list of polygons, and create a list of the unique nodes. This might require a double-sort of all the points in the polygon file, and then a pass through the sorted list to identify the unique nodes. Creating the arc list requires another pass through the whole polygon file, this time cross-referencing edges of each polygon to the corresponding elements in the node list and generating the appropriate pointers. Furthermore, to identify all the polygons that border a given vector requires another complex sorting operation to identify all the shared edges.
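The first step of this conversion -- sorting all polygon vertices to find the unique nodes, then re-expressing each polygon as references into that node list -- can be sketched as follows. The two polygons are hypothetical; a real system would go on to build arc records and adjacency pointers as well.

```python
def build_node_list(polygons):
    """From a whole-polygon dataset (each polygon a list of (x, y) vertices),
    recover the unique node list and re-express each polygon as node indices,
    the first step toward an arc-node structure."""
    # Sorting brings duplicate vertices (from shared boundaries) together;
    # the set keeps a single copy of each.
    nodes = sorted(set(pt for poly in polygons for pt in poly))
    index = {pt: i for i, pt in enumerate(nodes)}
    as_indices = [[index[pt] for pt in poly] for poly in polygons]
    return nodes, as_indices

# Two hypothetical polygons sharing the edge from (2, 0) to (2, 2):
left  = [(0, 0), (2, 0), (2, 2), (0, 2)]
right = [(2, 0), (4, 0), (4, 2), (2, 2)]
nodes, polys = build_node_list([left, right])
print(len(nodes))   # 6 unique nodes, not 8: shared vertices are stored once
```

Even this simplified version requires touching every vertex of every polygon, which is why topology that is not stored explicitly is expensive to regenerate.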
Converting vector data into a raster data structure is conceptually straightforward, although practically difficult. For point data elements, the cell or pixel in the raster array whose center is closest to the geographic coordinate of the point is coded with the attribute of the point. Thus, the elevation value from a surveyed benchmark is transferred to the raster cell whose location is closest to that of the original point. Of course, this operation usually changes the stored location of the point -- it is unlikely that the original point location exactly coincides with the center of a raster cell. This approach also ignores the problem of different objects occupying the same cell. Because of these important limitations, the conversion from vector to a raster data structure is not normally reversible: we cannot retrieve the original data points from the derived raster data without error. For some operations, this can be a fatal flaw.
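A minimal sketch of this nearest-cell assignment, with an invented grid origin and cell size (not any particular system's conventions):

```python
def rasterize_point(x, y, origin, cell_size, grid, value):
    """Code the attribute of a point into the raster cell whose center is
    nearest the point's coordinate. origin is the (x, y) of the center of
    cell [0][0]; rows index y and columns index x."""
    col = round((x - origin[0]) / cell_size)
    row = round((y - origin[1]) / cell_size)
    grid[row][col] = value

# A 4x4 raster with 10-unit cells, cell [0][0] centered at (0, 0).
grid = [[None] * 4 for _ in range(4)]
rasterize_point(21.0, 8.0, (0.0, 0.0), 10.0, grid, value=317)  # benchmark elevation
print(grid[1][2])   # 317 -- the point now "lives" at the cell center (20, 10)
```

Note that reversing the operation returns the cell center (20, 10), not the original location (21, 8): this is the irreversibility described above.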
For linear data elements, the data structure conversion can be visualized by overlaying the vector or linear element on the raster array (see Figure 6.2a). The simplest conversion strategy would be to identify those raster elements that are crossed by the line, and to then code these cells with the attribute or class value associated with the line. For lines that are not oriented along the rows or columns of the array, the raster representation shows a stair-step distortion (see Figure 6.2, as well as the discussion of aliasing in Chapter 9). In this first discussion we have ignored the problem of specifying the thickness or width of rasterized line (Peuquet, 1981b).
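One simple way to implement this strategy, sketched below, is to sample the segment at sub-cell steps and mark every cell a sample falls in. Production systems use exact cell-crossing tests (or Bresenham-style algorithms) instead, but the stair-step effect is the same. The grid size and coordinates are invented for illustration.

```python
def rasterize_line(p0, p1, value, grid, cell_size=1.0):
    """Mark every cell the segment passes through by sampling the line
    densely -- a simple stand-in for an exact crossing test."""
    (x0, y0), (x1, y1) = p0, p1
    steps = max(2, int(4 * max(abs(x1 - x0), abs(y1 - y0)) / cell_size))
    for i in range(steps + 1):
        t = i / steps
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        grid[int(y // cell_size)][int(x // cell_size)] = value

grid = [[0] * 8 for _ in range(8)]
rasterize_line((0.5, 0.5), (6.5, 3.5), value=1, grid=grid)
for row in reversed(grid):            # print with y increasing upward
    print("".join(".#"[v] for v in row))
```

The printed pattern shows the stair-step distortion: the diagonal line becomes short horizontal runs of cells offset row by row.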
Polygons can be converted to a raster structure in two steps. First, the line segments that form the boundaries of the polygon can be converted as in the last example, producing what is sometimes called the skeleton or hull of the polygon. Second, those raster elements contained by the polygonal boundaries are recoded to the appropriate attribute value (Figure 6.2b).
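The second (filling) step can be sketched with a standard even-odd point-in-polygon test applied to each cell center. This is one of several possible fill strategies, not a description of any specific GIS; the polygon and grid are invented.

```python
def point_in_polygon(x, y, verts):
    """Even-odd ray-casting test against a closed ring of vertices."""
    inside = False
    n = len(verts)
    for i in range(n):
        (x0, y0), (x1, y1) = verts[i], verts[(i + 1) % n]
        if (y0 > y) != (y1 > y):
            # x-coordinate where this edge crosses the horizontal ray at y
            xc = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if xc > x:
                inside = not inside
    return inside

def rasterize_polygon(verts, grid, value):
    """Second step of the two-step conversion: code every cell whose
    center falls inside the polygon boundary."""
    for r, row in enumerate(grid):
        for c in range(len(row)):
            if point_in_polygon(c + 0.5, r + 0.5, verts):
                row[c] = value

grid = [[0] * 6 for _ in range(5)]
rasterize_polygon([(1, 1), (5, 1), (5, 4), (1, 4)], grid, value=7)
print(sum(v == 7 for row in grid for v in row))   # 12 interior cells (4 x 3)
```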
Converting raster data to a vector data structure can be a great deal more complex. To keep the discussion brief, we will first examine data where object boundaries are a single pixel wide, though this is rarely the case in real life. A simplistic approach for a trivial binary image (where the data are either in class 0 or class 1 and lines are only a single pixel wide) is illustrated in Figure 6.3. Consider that each raster cell can be represented by a point in its center. We can then draw vectors between all the non-zero 4-connected raster elements. Such an algorithm models all possible vectors as orthogonal vectors, parallel to the rows or columns of the raster. Further, this algorithm constrains all vectors to be of discrete lengths that are equal to an integer multiple of the raster spacing. Of course, vectors in real life do not have either of these restrictions.
This approach will not be able to recover all vectors that have been converted to a raster. Consider the raster-coded vector in Figure 6.2a. Since the straight-line nature of this data element has been lost in the process of conversion, we will not be able to recover the straight line without ancillary information. However, there are algorithms that can be used to extract straight lines from raster data sets under restrictive circumstances, at an increased computational cost.
To illustrate a somewhat more sophisticated approach to raster/vector conversion, let’s start with data that might have come from a flatbed digitizing tablet. The following table describes the (x-y) coordinates of three graphic objects, which we will convert to a raster, and then attempt to convert back to vectors. By returning to a vector representation, we can begin to develop an understanding of the limitations of the algorithms.
Object #l -- a vector, defined by connecting (x-y) points:
1,62
12,62
12,51
1,51
Object #2 -- a vector, connecting (x-y) points:
16,57
20,60
27,54
34,57
31,48
22,52
15,52
Object #3 -- a complex closed region, bounded by the closed (x-y) pattern:
8,46
21,46
11,38
15,48
20,38
These three objects -- two simple open paths and a complex closed region -- are plotted in Figure 6.4a. We'll explore these data by starting with the three vector objects, and converting them to a raster representation as in Figure 6.2. Converting these intermediate raster data back to a vector form, based on the 4-connected neighborhood approach described in Figure 6.3, gives us Figure 6.4b. Notice that the upper left object in the figures is perfectly reconstructed by a 4-connected algorithm, but the other two objects are badly distorted. In particular, elemental line segments between adjacent raster cells that are not parallel to the raster axes have disappeared.
A more complete (but still simple) approach would be to search an 8-connected neighborhood around every raster element, searching for possible vector connections. In addition to the unit vectors orthogonal to the raster axes, this new algorithm permits diagonal vectors to be found. The output of this algorithm is shown in Figure 6.4c. The general characteristics of all three objects are retained. In this case, all the connections between the original points are preserved. However, this algorithm finds many connections between raster cell elements that were not in the original vector dataset.
A further improvement, involving significantly more computation, would be to draw only diagonal lines between elements in any 3-by-3 8-connected region when they are not already connected by 4-connected orthogonal vectors. This additional rule gives us the objects in Figure 6.4d. Where two lines derived from the rasterized boundaries of the objects are close, this algorithm draws an extra line, compared to the original (Figure 6.4a). However, this is a relatively successful recovery of the original objects.
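The rule just described -- orthogonal links between 4-connected cells, plus diagonal links only where no orthogonal path already joins the pair -- can be sketched as follows. Coordinates are (row, column), and the segment representation is ours for illustration.

```python
def extract_vectors(raster):
    """Recover unit vectors from a binary raster. Orthogonal links join
    4-connected cells; a diagonal link is added only when the two cells
    are not already joined through a shared 4-connected neighbor."""
    on = {(r, c) for r, row in enumerate(raster)
          for c, v in enumerate(row) if v}
    segments = set()
    for r, c in on:
        # 4-connected orthogonal unit vectors (each pair found once)
        for nb in [(r + 1, c), (r, c + 1)]:
            if nb in on:
                segments.add(((r, c), nb))
        # diagonals, suppressed when an orthogonal path already connects them
        for nb in [(r + 1, c + 1), (r + 1, c - 1)]:
            if nb in on:
                corner_a, corner_b = (r, nb[1]), (nb[0], c)
                if corner_a not in on and corner_b not in on:
                    segments.add(((r, c), nb))
    return segments

# A rasterized diagonal line: only diagonal links should be recovered.
segs = extract_vectors([[1, 0, 0],
                        [0, 1, 0],
                        [0, 0, 1]])
print(sorted(segs))   # [((0, 0), (1, 1)), ((1, 1), (2, 2))]
```

Run on an L-shaped pair of orthogonal cells, the same function yields only the two orthogonal links and suppresses the redundant diagonal, which is exactly the extra-line behavior discussed above.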
Beyond these theoretical models, practical datasets require several additional functions. As Peuquet (1981a) explains, operational systems require two general functions for converting a raster to a vector dataset. The first, skeletonizing or thinning (Figure 6.5), is required because the input data are not generally as simple as those we have discussed: single pixels for point data and unit-width vectors for both vectors and polygon boundaries. Algorithms for determining the skeleton of an object are sometimes described as a peeling process, where the outside edges of thick lines are "peeled" away, ultimately leaving a unit-width vector. A symmetrical alternative approach is to expand the areas between lines, with the same ultimate goal. In either of these cases, the process is a sequence of passes through the data with each pass producing narrower vector outlines. A third alternative, the medial axis approach, is designed to directly identify the center of a line, by finding the set of interior pixels that are farthest from the outside edges of the original line.
After the raster data have had such thinning operations applied, the vectors implicitly stored in the raster are extracted. The extraction process may be based on the models discussed above. Finally, the topological structure of the lines is determined, by recognizing line junctions and assembling the separate segments into connected vectors and polygons.
Crapper (1984) presented an interesting analysis of the relations between vector and raster thematic data. His concern was understanding the accuracy of thematic maps. Consider an original dataset that is a map or photograph. Overlaying a grid on the original data and assigning thematic categories is relatively simple for cells in the interior of a homogeneous polygon. The difficulty arises at the boundaries between polygons, where single grid cells cover more than one category. As we have mentioned, we could use a plurality rule to assign classes to the boundary cells, or alternatively, we could label them as a unique class. Crapper derives a relationship to estimate the number of boundary cells in this process, and thus gives us insight into the total area of boundary cells.
6.1.2 Data Medium Conversion
Most of the spatial data available today are not in computer-compatible formats. These include maps of many kinds and scales, printed manuscripts, and imagery (based on photographic processes, or generated by non-photographic instruments). Converting these materials into a format compatible with a digital geographic information system can be very expensive and time-consuming. According to the U.S. Geological Survey's Technology Exchange Working Group (U.S. Department of the Interior, 1985, p. iii):
The digitizing of conventional cartographic data is perhaps the most resource intensive phase of constructing a digital cartographic data base or utilizing a geographic information system.
The most common means of converting maps and other graphic data to a digital format is to use a digitizing tablet (Figure 6.6a). A digitizing tablet system consists of several parts, among which is a flat surface on which the map or graphic to be digitized is placed. This flat surface is typically from 1 to 20 square feet in area, and may be back-lit or even transparent to permit digitizing from transparencies. The user traces the features of interest with either a pen-like stylus or a flat cursor. The electronics in the tablet system convert the position of the stylus or cursor to a computer-compatible digital signal, with a typical precision of 100 to 1000 points per inch.
There are several technologies used in commercial digitizing systems. Acoustic systems are relatively low in cost, and often able to work with large-format materials. Such systems use acoustic transducers to triangulate positions on the map or graphic. Electromagnetic and electrostatic systems are also available, and are generally preferred when high accuracy and precision are required. When the tablet has a cursor for tracing data elements, there are often buttons or switches on the cursor itself. This is particularly helpful, since it permits the analyst to select functions without moving from the map or graphic to a computer keyboard.
When it is necessary to digitize a map or other graphic with very high precision, the dimensional stability of the medium can become important. Photographic films are considered very stable with respect to changes in temperature and humidity, with distortions below 0.2 percent (Wolfe, 1983). In contrast, paper shrinkage or expansion can range up to 3 percent, depending on paper type and thickness, as well as processing methods. These may be important considerations.
When beginning a session with a digitizing tablet, the user must specify a number of attributes of the map, as well as the map's location on the digitizing tablet. Typically, the user will be prompted by the system for information about a map's scale and projection; menus with common choices can help the user to enter this information quickly and accurately (see Figure 6.7). After entering this information, the tablet or stylus is used to specify both georeferencing information (for example, by placing the cursor at locations of known latitude/longitude) and a region of interest. For well-known map projections on most systems, these procedures permit any subsequent location of the stylus or cursor to be converted unambiguously into a geodetic location.
One of the functions a user should be able to select is the mode of digitizing. In point mode, individual locations on the map (such as elevation benchmarks, road intersections, or water wells) can be entered by placing the cursor over the relevant location and pressing a button. In line mode, straight line segments (such as short segments along political boundaries, straight road sections, or lines of constant bearing on appropriate map projections) are entered by moving the cursor to one end of the line, pressing a button on the cursor, then moving to the other end and pressing a button again. The system automatically converts these two entered points into an appropriate vector. Digitizing curved line segments in this manner can be very exacting work. In stream mode, the location of the cursor on the map surface is determined automatically at equal intervals of time, or after a specified displacement of the cursor (so that points are approximately evenly spaced; Nagy and Wagle, 1979). Stream mode is particularly useful when digitizing curved line segments, such as the boundaries of waterways. However, in stream mode it is often too easy to create very large data files, since data points are entered into the system so quickly. Further, stream mode can be very demanding on the operator.
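The displacement-based variant of stream mode amounts to a simple filter on the incoming cursor positions. A sketch of the idea (the curve and threshold are invented for illustration, not any tablet's actual firmware):

```python
import math

def stream_filter(points, min_step):
    """Record a streamed cursor position only after the cursor has moved
    at least min_step map units from the last recorded point, so stored
    points are approximately evenly spaced."""
    kept = [points[0]]
    for pt in points[1:]:
        if math.dist(pt, kept[-1]) >= min_step:
            kept.append(pt)
    return kept

# A densely sampled stream along a gentle curve:
stream = [(0.05 * i, math.sin(0.05 * i)) for i in range(200)]
kept = stream_filter(stream, min_step=1.0)
print(len(stream), "->", len(kept))
```

A filter of this kind addresses the file-size problem noted above: the raw stream is recorded at the device rate, while only points separated by the chosen displacement reach the data file.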
Neglecting the question of accuracy, digitizing tablets have finite resolution. We normally consider these devices as operating fundamentally for vector input, since we can ostensibly locate any point on the surface of the tablet. However, the finite limits on the precision of these devices provide an underlying raster-like limiting resolution element.
Scan digitizing systems, generally called optical scanners or scanning densitometers, are typically larger and more expensive than digitizing tablets (Figure 6.6b). With many high-precision systems, the map (or graphic) to be digitized is attached to a drum. A light source and an optical detector are focused on the surface of the drum. The drum rotates at a constant speed around its major axis. Because of the rotation of the drum, the detector traces a line across the map, and the electronics in the associated computer system record numerical values that represent brightness (or color) on the map. This traced line across the map corresponds to a single row in a raster of data values. The detector then steps along the axis of the drum, in effect moving the detector to a new row in the raster, and the process repeats. In this way, the original map is converted to a raster of brightness values.
DIGPOL -- Digitize Polygons, Vectors, Points          Version 7.2.06.67
Copyright (C) 1986 ERDAS, Inc. All rights reserved.
Installation: U. of California (Santa Barbara) Remote Sensing Unit
--------------------------------------------------------------------------------------------------
Enter Output filename: test1
Setup NEW map or use PREVIOUS setup? (N, P) [New] :
Select type of coordinates to use:
 - UTM            - Longitude / latitude
 - State Plane    - Other
(U, S, L, O) [UTM] : UTM
Select SCALE of this map:
 1) 1:24,000 (1"=2000 Feet)     5) 1:63,360 (1"=1 Mile)
 2) 1:25,000                    6) 1:100,000
 3) 1:50,000                    7) 1:125,000
 4) 1:62,500                    8) 1:250,000 (1"=Approx 4 Miles)
 9) Other (i.e. 1: )
? 4
Enter UTM X of Reference Point ? 315360
Enter UTM Y of Reference Point ? 1625475
Digitize LEFT Reference Coordinate
Digitize RIGHT Reference Coordinate
* Digitize BOTTOM Reference Coordinate
-------------------------------------------------------------------------
Number of Digitized Points = 11
Data Value = 10
POLYGON Mode
X, Y = 317141.25 1624303.25
Use "A", "B" or "C" on Keypad to indicate new data value.
  Polygon Mode = AnnnA    Vector = BnnnB    Point = CnnnC
Digitize points by pressing button "1". Back up by pressing the "E" button. After the last point, press button "2".
If "A", "B" or "C" not entered, the previous data value is used.
Continue digitizing polygons using buttons "1" and "2".
Signal end of job with key "D".
Figure 6.7 A digitizing menu. (Courtesy ERDAS, Inc. Copyright 1986. Used by permission.) Setting up a new session requires specifying the map coordinate system (UTM in this case), scale (1:62,500), and then indicating a control point coordinate location. Left, right, and bottom reference coordinates then are indicated to calibrate the system for any relative rotation between the digitizer surface and the map's coordinate axes. Points, lines, and polygons are then digitized, with the user controlling the system from the 16-button cursor on the tablet.
An alternate mechanism uses a line array of photodetectors, which sweeps across the map in a direction perpendicular to the array axis (comparable to the system illustrated in Figure 10.4). Such a device has a much simpler mechanical design than the drum systems mentioned above. In modern systems of either type, the map or graphic to be scanned can be on the order of one meter square, and the scanning step size as small as 20 micrometers. Based on these extreme values, the resulting raster dataset for a one-meter-square map could contain 2.5 × 10⁹ pixels, before additional operations are begun. The resulting scanned raster of brightness (or color) may be used directly, or software can convert this initial raster dataset into other forms.
A scanning system may be sensitive to 100 or more shades of brightness or color in the source document. These shades of brightness, along with information about spatial patterns in the raster, can be processed by the appropriate software so that the various graphic objects in the source documents can be distinguished. When required, the system may convert the raster dataset to a specified vector format (as discussed in section 6.1.1) as well as compress the dataset size in various ways.
A normal sequence of steps in the use of a scanning digitizer would start with the actual scanning process. According to one set of figures, scanning a 24-by-36 inch document at 20 lines per millimeter takes 90 minutes. The raster files are then interactively edited, to ensure that the skeletonizing process accurately extracts the graphic elements in the original dataset. Next, the preprocessed raster data are converted to a vector dataset. The vector data are then structured, to build whatever composite elements (e.g., chains as connected line segments) and topological relations (e.g., containment and adjacency) are required in the ultimate datasets. Finally, the data are interactively edited for quality assurance requirements, and any additional attributes entered and verified.
In many cases, there is tremendous redundancy in the raster digital data developed by an automated scanner. One way to store the data with less redundancy is to use run-length encoding, which we discussed in Chapter 4. For example, if we were to scan a page from this book, there would be a long sequence of identical white pixels from the row at the extreme top of the page, since there is no text at the top. Rather than placing many such sequences of identical pixel values in the data file, it is more efficient to use a run-length encoded file, where we store a single copy of the brightness value and the number of repeated pixels of this value.
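A run-length encoder and decoder for one raster row can be sketched in a few lines (the row of pixel values is invented):

```python
def rle_encode(row):
    """Collapse a raster row into (value, run_length) pairs."""
    runs = []
    for v in row:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    return [v for v, n in runs for _ in range(n)]

# A scanned row: white margin, a short black mark, more white.
row = [255] * 12 + [0] * 3 + [255] * 9
runs = rle_encode(row)
print(runs)                     # [(255, 12), (0, 3), (255, 9)]
print(rle_decode(runs) == row)  # True -- unlike vector-to-raster, RLE is lossless
```

Here 24 stored pixel values collapse to three pairs, and the run boundaries fall exactly where the row's brightness changes, which is the "side benefit" noted below.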
Map-like images are often successfully compressed in this way, due to two characteristics of maps: (1) there are generally few unique brightness (or color) values, and (2) there are often horizontally homogeneous sections. In contrast, this strategy rarely works as efficiently with photograph-like images, since the broader dynamic range and generally high texture of these data types provide a smaller opportunity for many horizontally homogeneous runs. A side benefit of a run-length encoded image is that this data structure explicitly codes for the boundaries in a raster array.
Peuquet (1981a) presents case studies of several systems that have been used operationally for converting data from conventional maps to digital datasets. In these systems, the overall conversion process includes scanning the maps (which converts the continuous maps into a discrete raster), extracting the line segments, and developing the topologically structured dataset.
In section 6.8, we briefly mention more sophisticated means of extracting three-dimensional data from groups of photographs. These techniques are the domain of photogrammetry, and beyond the scope of this text.
6.2 Data Reduction and Generalization
An input record may require data reduction of various kinds. For example, field crew records of tree-crown diameter may contain several measurements at different points around the circumference of the tree. In a given application, we may need to average the measurements of a single tree, and enter the average into our database. As another example, existing datasets may record vegetation by species, and our requirements may be satisfied by simply recording by species groupings (i.e., deciduous versus coniferous trees).
A more complex form of data reduction involves changes of scale in spatial data. We may need to assemble property ownership records in an area, and the original surveyor's records may be more detailed than we require. There are two obvious options: either accept the level of detail (and thus, incorporate a greater volume of data than necessary, with the attendant increased processing and storage costs), or develop a less precise representation from the original source data. The latter is called generalization (Figure 6.8). For vector data, consider a locus of points, connected by straight line segments, which describe a coastline. In this case, a naive generalization procedure could involve simply replacing pairs of neighboring points with their average geographic coordinates. In this way, we reduce the total number of points by a factor of two, and in the process, reduce the level of detail in the dataset.
A more sophisticated approach would involve modeling the behavior of a modest number of successive points by a numerical algorithm. For example, a polynomial curve of specified order can be least-squares fitted to a sequence of data points, and then a smaller number of points along the path of the fitted polynomial recorded in their place. Another alternative is to eliminate points from the original dataset that are too close together for a specified purpose, or fall along a straight line (within a specified tolerance). The former procedure is a common operation in many computer graphics systems, when points in the original dataset are too close to resolve on the desired graphics device. In a plotter, for example, the limiting resolution could be a simple function of the width of a pen and the repeatability of the plotter's positioning machinery. By repeating these processes, depending on the dataset and the desired resolution, we can derive a lower-resolution representation of the original vector dataset (Figure 6.8b). This approach is often termed thinning.
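The two elimination rules just described -- minimum spacing and collinearity within a tolerance -- can be combined in a simple single-pass sketch. The tolerances and "coastline" coordinates below are invented for illustration; operational thinning algorithms (such as Douglas-Peucker) are more careful about preserving shape.

```python
def thin(points, min_dist, collinear_tol):
    """Drop a point when it is closer than min_dist to the last kept point,
    or when it lies within collinear_tol of the straight line joining the
    last kept point and its successor."""
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        x0, y0 = kept[-1]
        x1, y1 = points[i]
        x2, y2 = points[i + 1]
        if ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 < min_dist:
            continue                            # too close to resolve
        # perpendicular distance of the middle point from the chord
        chord = ((x2 - x0) ** 2 + (y2 - y0) ** 2) ** 0.5
        area2 = abs((x2 - x0) * (y0 - y1) - (x0 - x1) * (y2 - y0))
        if chord > 0 and area2 / chord < collinear_tol:
            continue                            # effectively on a straight line
        kept.append(points[i])
    kept.append(points[-1])
    return kept

coast = [(0, 0), (1, 0.02), (2, 0), (3, 1), (4, 1), (4.05, 1), (6, 1), (7, 0)]
print(thin(coast, min_dist=0.1, collinear_tol=0.05))
```

Here the eight original points reduce to five: the nearly collinear and nearly coincident points disappear while the corners that carry the shape survive.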
For continuous raster data (such as millimeters of rainfall per year, or elevation), we can similarly create a more generalized dataset in two steps. First, compute the average value of the attribute in a two-by-two neighborhood. Second, record this average value in a new raster cell at the geographic location of the point shared by the four original raster cells. This kind of procedure, called resampling, can also be used when the required new raster cells are not of a length that is an integer multiple of the initial cell length. This is directly related to the problem of rectification discussed in section 6.6; see Moik (1980) for further details. For nominal and ordinal raster data, rules for aggregation must be developed. Common aggregation rules include determining the majority or plurality class in the averaging neighborhood, as well as defining hierarchies of classes (for example, aggregating records of individual plant species at one raster resolution into species groupings at a coarser resolution).
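Both aggregation styles can be sketched for 2-by-2 blocks; the elevation and forest-cover values below are invented:

```python
from collections import Counter

def resample_mean(raster):
    """Average each 2x2 block of a continuous raster (e.g. elevation)
    into one cell of a half-resolution raster."""
    rows, cols = len(raster), len(raster[0])
    return [[sum(raster[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1)) / 4.0
             for c in range(0, cols, 2)] for r in range(0, rows, 2)]

def resample_plurality(raster):
    """For nominal data (e.g. cover classes), assign each 2x2 block the
    most common class within the block."""
    rows, cols = len(raster), len(raster[0])
    return [[Counter(raster[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1)
                     ).most_common(1)[0][0]
             for c in range(0, cols, 2)] for r in range(0, rows, 2)]

elev = [[100, 104, 120, 124],
        [102, 110, 118, 122]]
cover = [["fir", "fir", "oak", "oak"],
         ["fir", "oak", "oak", "oak"]]
print(resample_mean(elev))        # [[104.0, 121.0]]
print(resample_plurality(cover))  # [['fir', 'oak']]
```

A full resampler must also handle new cell sizes that are not integer multiples of the old, which requires the interpolation machinery discussed under rectification in section 6.6.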
6.3 Error Detection and Editing
In any information system, facilities must be provided to detect and correct errors in the database. Different kinds of errors are common in different data sources. To illustrate some of the common varieties, we will discuss some of the errors that must be detected when generating a vector dataset, whether developed from a manual session with a digitizing tablet or an automated scanner.
Figure 6.9 illustrates some common errors of the digitizing process. Polygonal areas, by definition, have closed boundaries. If a graphic object has been encoded as a polygon (rather than a vector or point), the boundary must be continuous. Software should be able to detect that a polygon is not closed. Causes for this kind of error include encoding error (the object is a vector, rather than a polygon) and digitizing error (either points along the boundary of the polygon, or connections between the points, are missing).
With some systems, the boundaries between adjacent polygons may have to be digitized twice -- once for each polygon on either side of the shared vector. This also happens on more advanced systems when the document cannot be digitized in a single session. Such systems create slivers and gaps. Slivers occur when the adjacent polygons apparently overlap; gaps occur when there is apparent space between adjacent polygons. Similar problems occur when several maps are required to cover the area of interest. In this case, spatial objects may be recorded twice because of overlapping regions of coverage. An arc-node database structure avoids this problem, since points and lines may be recorded once and then referenced by many different spatial objects.
Errors of several kinds can appear in the polygon attributes. An attribute may be missing or may be inappropriate in a specified context. The latter kind of error is often related to problems in coding and transcription, such as recording an elevation value into a land-use data layer. Software to detect such errors is vital, and it can be very straightforward to develop and use.
In the practical use of a geographic information system, where there is a great deal of attribute information to include in the database, an interactive capability of correcting errors is vital. A development effort at one of the U.S. federal agencies several years ago suggested that the quality of the editing subsystem (in terms of the user interface, speed, and capabilities) is as important to the overall flow of information through a GIS as the choice of a digitizing system.
6.4 Merging
With some modern geographic information systems, the essential elements to digitize from a source map or graphic are the points. If the system describes lines and polygons hierarchically, based on points, we must be able to build these hierarchically-defined spatial objects. Merging is the process of building more complex objects from the elemental points. Based on the data acquired during a digitizing session, vectors are constructed by connecting the appropriate points, and polygons constructed by linking the appropriate vectors. During this process, attribute values and non-spatial information can often be entered.
An important component of the merging process is the identification of objects that are very close to each other. For example, during digitizing we may have entered a line and a point that are less than 1 millimeter apart. By specifying a tolerance of 1 millimeter, we are instructing the system to “snap” the point to the line. Similarly, we may wish to have the system connect line segments that have end nodes within a specified tolerance. These processes have unfortunate side effects, however. Consider snapping the ends of two lines together: the result is that the location of the joining node is between the original line ends. If we now identify a third line, whose end is within the snapping tolerance of the same joining node, the system may again move the stored location of the joining node. Thus, the node locations may migrate through time!
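The snapping behavior, and its node-migration side effect, can be made concrete with a short sketch. The function name and the midpoint rule are illustrative assumptions; real systems may instead snap to the first-entered location or use weighted averages.

```python
def snap(node, target, tolerance):
    """If two line-end nodes fall within tolerance of each other,
    return the midpoint as the new shared node; otherwise None.
    (Illustrative sketch of one possible snapping rule.)"""
    dx, dy = target[0] - node[0], target[1] - node[1]
    if (dx * dx + dy * dy) ** 0.5 > tolerance:
        return None  # too far apart: no snap
    return ((node[0] + target[0]) / 2.0, (node[1] + target[1]) / 2.0)

# Two line ends snap to a node at their midpoint...
joint = snap((0.0, 0.0), (0.8, 0.0), tolerance=1.0)   # (0.4, 0.0)
# ...then a third line end within tolerance moves the node again,
# illustrating how node locations can migrate through time.
joint = snap(joint, (1.2, 0.0), tolerance=1.0)        # (0.8, 0.0)
```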
Many of the consistency checks mentioned in section 6.3 can occur at this point in the operation of a geographic information system. This is often the procedure for systems based on arc-node types of vector data structures. This process can be considered developing a topological structure from the raw input data. The importance of an effective man-machine interface, where the computer hardware and software provide a productive working environment, cannot be overemphasized.
6.5 Edge Matching
Often, a region of interest is not completely contained on a single map sheet or a single aerial photograph. Similar problems occur with digital datasets that are stored with arbitrary boundaries (as in the quarter-scene format used in some satellite data, or when dataset coverages correspond to municipal or other political boundaries). In these cases, we must be able to extract information from each of the relevant map sheets, and process the information so that the edges between map sheets disappear to form a seamless dataset. This is particularly important when we are working with digitizing tablets to create digital databases from maps.
Consider the example sketched in Figure 6.10, where a region we wish to digitize into a GIS dataset lies across the boundaries of two map sheets. In this example, we see a water body and roads, both of which cross the artificial boundary between the two map sheets. One option is to place both map sheets on the digitizing tablet or scanning system at the same time, after manually creating a mosaic of the two. This is often impractical, both because the resulting composite sheet is too large for the available digitizing equipment and storage facilities, and because this can destroy the maps for other uses. The more common procedure is to digitize or scan each sheet separately.
Frequently, when the two map sheets are digitized or scanned separately, features that cross the boundary do not align properly (Figure 6.10b). This distortion can come from several causes. Even when maps are printed with no discernible error, the physical size of the map can change with temperature and humidity -- as mentioned before, this can be a significant problem with maps printed on paper. Errors at the margins can also be caused by georeferencing errors during the digitizing process, extrapolations and numerical round-off errors in the georeferencing algorithms, accuracy errors in the digitizing tablet itself, and slivers and gaps caused by overlapping map coverage. In the figure, it is not possible to simply move one of the maps to make both the roadway and the water body match properly across the map boundary.
Under perfect circumstances, problems related to map shrinkage would be solved before entering the data into the GIS database. However, this is not always possible. There are two families of adjustments to correct these errors at the edges between map sheets or between different digital data files. In the first, the analyst manually adjusts the locations of points and vectors to maintain the continuity of the dataset. A graphics device is used to display the general area of the boundary between data sets, and the analyst uses “cartographic license” to manually adjust vectors that cross the boundary. This is the case in Figure 6.10c. Such a process implies that we have less information about the spatial data in question than we would prefer; be careful! In the second family of adjustments, automated means are derived to reduce the edge effects. Line attributes and the spatial distribution of the lines on either side of the boundary are first matched. Then, appropriate locations from the line on each side are modified slightly to match exactly at the boundary. The latter is much like the adjustment a surveyor applies to balance traverse data (for details, see Anderson and Mikhail, 1985).
6.6 Rectification and Registration
Rectification and registration are based on a family of mathematical tools that are used to modify the spatial arrangement of objects in a dataset into some other spatial arrangement. Their purpose is to modify these geometric relationships, without substantively changing the contents of the data itself. Rectification involves manipulating the raw dataset so that the spatial arrangement of objects in the data corresponds to a specific geocoding (or geodetic coordinate) system. As an example, a rough sketch of an undeveloped lot, made on-site by a landscape architect, typically does not show objects in their true spatial relationships. Once the sketch has been brought back into the office, it can be redrafted (perhaps as an overlay on a reproduction of an existing map that shows legal boundaries and easements) so that it conforms to a useful coordinate system.
Registration is similar to rectification, but not based on an absolute georeferencing scheme. Both sketches from a field team and uncontrolled aerial photography have distorted views of the Earth's surface. The registration process involves changing one of the views of surface spatial relationships to agree with the other, without concern about any particular geodetic referencing system.
6.6.1 Map Projections and Coordinate Systems
Before considering the details of rectification and registration, let us briefly review the problems of extracting spatial information from the Earth's surface. The Earth's shape is roughly spherical, deviating from a perfect sphere due to gravity, centrifugal force, the heterogeneity of the Earth's composition, tectonic forces, and the influence of man's activities. A spherical globe can be a reasonably accurate representation of the Earth, since the oblateness of the general form of the Earth is less than one part in 300 (Robinson et al., 1978). However, a globe is inconvenient to manipulate and store, and expensive. Therefore, we must be able to work with maps: flat representations of the Earth's surface.
There are an infinite number of different map projections, since there are an infinite number of ways to project the features of the Earth's surface onto a plane. A variety of projections are in common use, since no single projection can be perfect for all users (U.S. Department of the Interior, 1984). Furthermore, there are a variety of commonly used systems for specifying locations on the Earth's surface.
The most common way to specify locations on the Earth is the latitude/longitude system, in which locations are specified in terms of the angular deviation north or south of the equator (to derive latitude) and the angular deviation around the circumference of the Earth at the equator (measured most often from Greenwich, England, to derive longitude). Unfortunately, it is not always convenient to convert these angular specifications into such familiar things as distances and areas.
Another family of common reference systems uses distances from a specified reference point as the basis of all locations. Such systems of reference are called plane coordinate systems. The Universal Transverse Mercator system is the best-known plane coordinate system, and is based on a transverse Mercator projection. In this system, the projection is secant to the Earth's surface, to balance scale variations. Further, the UTM system divides the Earth's surface into zones that are 6 degrees of longitude wide. Each zone is numbered, and the quadrilaterals 8 degrees of latitude high within a zone are lettered. This scheme of lettering and numbering provides a simple mechanism for locating areas on a coarse grid. Precise locations on the Earth are described in terms of north-south and east-west distances, measured in meters from the origin of the appropriate UTM zone.
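The zone-and-band lettering scheme lends itself to a simple computation. The sketch below derives the coarse UTM grid designation from a longitude/latitude pair; it ignores the polar regions and the standard exceptions around Norway and Svalbard, and the function name is our own.

```python
def utm_zone(lon, lat):
    """Return the coarse UTM grid designation (zone number plus
    latitude band letter) for a longitude/latitude pair in degrees.
    Band letters run C..X northward from 80 deg S in 8-degree steps,
    skipping I and O (simplified: no polar or Norway exceptions)."""
    zone = min(int((lon + 180) // 6) + 1, 60)
    bands = "CDEFGHJKLMNPQRSTUVWX"
    band = bands[min(int((lat + 80) // 8), len(bands) - 1)]
    return f"{zone}{band}"
```

For example, a point at 105 degrees west, 40 degrees north falls in zone 13, band T.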
Map projections are categorized in several ways (Monmonier and Schnell, 1988). A primary differentiation of classes of projections is based on the geometrical model of the projection (see Figure 6.11). In the simplest azimuthal projections, the map is constructed by placing a plane tangent to a point on the surface of the Earth. Features on the map are generated by systematically transferring the locations of these features from the sphere to the plane. We can imagine a transparent Earth, with an azimuthal plane tangent to the north pole, and a point light source at the south pole. The rays of light from the source may be traced in a straight line through points on the Earth's surface, to the tangent plane. Distortions are minimal in the immediate vicinity of the tangent point. These maps are popular for air navigation and radio transmission, with the destination or origin placed at the tangent point, since great circles passing through the tangent will be straight lines on this map. A more sophisticated version of the azimuthal projection is based on an intersecting or secant plane. In this case, the intersection of the plane and sphere describes a small circle, and the patterns of deformation are different than those of a tangent plane.
Conic projections are based on a cone placed over the Earth, oriented so that the intersection of cone and the Earth forms one or two small circles. Polyconic projections involve the use of a series of cones, each used to map a small fraction of the surface of the globe. For cones oriented conventionally, the axis of the cone is the same as the axis of the Earth's rotation. In this case, the small circles at the intersection of the cone and the Earth correspond to latitude lines. This family of projections is commonly used for displaying areas that extend a greater distance in the east-west direction than in the north-south, to take advantage of the orientation of the areas on the map of minimum deformation.
Cylindrical projections are based on a cylinder placed over the Earth. Again, in the simplest case, the cylinder is tangent to the sphere, with the intersection forming a great circle. In the most common orientation, the intersection is the equator. By reducing the diameter of the cylinder, it becomes secant to the Earth, intersecting the surface along two parallel small circles.
The second set of classifications of map projections involves the patterns of deformations or distortions in the map. A projection is called conformal if there are no angular deformations. In order to maintain the accuracy of the angles, shapes of objects on the map are distorted. Thus, conformal projections are not good choices for the measurement of distance, since the scale changes across the map. Of course, the smaller the area covered by the map, the less important this problem becomes. The most well-known conformal map is Mercator's. A Mercator projection is based on the cylindrical projection, with mathematical adjustments applied so that all rhumb lines (lines of constant direction) appear as straight lines. Such a map has obvious applications in navigation.
Equivalent projections display relative areas correctly. Equivalence is an important consideration for data used as input to a geographic information system, so that inferences about areas are correct. By modifying the features on a map so that areas are correct, directions become unreliable. Common equivalent projections are the Albers conic, a frequent choice for maps that cover a large east-west distance, and the Lambert azimuthal equal-area projection.
Perspective projections represent the way a camera views its world. They are based on straight lines that pass from the object at a distance through a specified point of intersection at a finite distance from the plane of the projection (Thompson, 1966). Perspective projections present many problems to the user of a geographic information system. One of the simplest problems is that the scale in a perspective representation of the Earth's surface varies as a function of distance from the viewer: objects of the same size appear smaller when they are farther away. There are photogrammetric tools that can take this common but complicated view of the Earth and convert it to a planimetric representation (see, for example, Wolfe 1983). These are briefly introduced in section 6.8.
In practice, many geographic information system applications use plane coordinate systems such as the Universal Transverse Mercator system. These have the advantage that distances as recorded in the database correspond to distances one might measure on the (hypothetically flat) ground. When applications require the accurate measurement of area over very large distances, coordinates are often stored in the latitude-longitude system, with calculations based on an equal-area projection of the globe.
6.6.2 Exact Numerical Approaches to Rectification
There are instances when the input and desired output datasets are well-enough characterized that an exact numerical or algorithmic solution to the rectification problem exists. This is most often the case when the source data is a map of the Earth, in a common and specified projection and coordinate system, and the desired output is another map, in another coordinate system and projection. Another example is in the field of photogrammetry, where detailed calibrations are available for the camera system.
What is the INPUT Coordinate Type?
0 = Geographic (Lon/Lat)
1 = UTM 11 = Lambert Azimuthal Equal Area
2 = State Plane 12 = Azimuthal Equidistant
3 = Albers conical Equal Area 13 = Gnomonic
4 = Lambert Conformal conic 14 = Orthographic
5 = Mercator 15 = General Vertical Near-Side Persp
6 = Polar Stereographic 16 = Sinusoidal
7 = Polyconic 17 = Equirectangular
8 = Equidistant Conic 18 = Miller Cylindrical
9 = Transverse Mercator 19 = Van der Grinten
10 = Stereographic 20 = Oblique Mercator (Hotine)
? 4
Lambert Conformal Conic
Enter LATITUDE of FIRST STANDARD PARALLEL? 60
Enter LATITUDE of SECOND STANDARD PARALLEL? 40
Enter LONGITUDE of CENTRAL MERIDIAN? -105
Enter LATITUDE of ORIGIN of PROJECTION? 0
Enter FALSE EASTING at CENTRAL MERIDIAN? 2000000
Enter FALSE NORTHING at ORIGIN? 0
SPHEROID selection :
1 = Clarke 1866 11 = Modified Everest
2 = Clarke 1880 12 = Modified Airy
3 = Bessel 13 = Walbeck
4 = New International 1967 14 = Southeast Asia
5 = International 1909 15 = Australian National
6 = WGS 72 16 = Krasovsky
7 = Everest 17 = Hough
8 = WGS 66 18 = Mercury 1960
9 = GRS 1980 19 = Modified Mercury 1968
10 = Airy 20 = Sphere of Radius 6370977m
>>> Enter spheroid number ? [1] : 1
Figure 6.12 Rectify menu entries. (Courtesy ERDAS, Inc. Copyright 1986. Used by permission.) This shows a session on an ERDAS system, running the function that converts vector data from one map projection to another. We present only the portion of the session in which the user specifies the map projection of the original dataset; specifying the desired projection for the resulting dataset is essentially the same.
When dealing with maps, we must have explicit information about the projection. For example, let's examine the information required to understand a common equal-area projection, the Albers conical projection. The first consideration, of course, is to find out which map projection is used for the map base, usually from information in the map legend or accompanying text. Next, determine the map scale, either from a printed notation on the map, or from a scale bar, or by using known distances between points to measure the scale. Finally, we must find the characteristics of this particular projection: in this case, the principal latitudes and the central meridian. The U.S. Department of the Interior reference (1984) provides very useful information on 20 common map projections.
A portion of some commercial software, based on this document, is shown in Figure 6.12. In this example, one can see the details required when specifying a map projection. From the menu, the user has selected Lambert's Conformal Conic projection. Latitudes of two standard parallels are required, indicating the small circles of minimum deformation. A meridian of longitude is selected, indicating the center of the map projection in the east-west direction. False northings and eastings are entered, at a specified geodetic coordinate location, to indicate the local reference axes. And finally, a choice must be made between alternative spheroidal representations of the Earth's shape. All of these components must be specified to determine the one specific map projection of interest, from the infinite number of possibilities.
An example of rectification, using an exact numerical approach, is shown in Figure 6.13. This example is taken from recent research in the boreal forests of North America. As part of a background study, we were interested in comparing several different maps that were developed to show the spatial distribution of species groups. As a part of this work we needed to take the different maps and convert them all to a common projection and scale. The figure is a copy of a plotter output from a geographic information system, after a map depicting vegetation patterns in Canada was digitized (Rowe, 1959). The map is based on a conic projection; this is clear from the shape of the Canada-United States border, which corresponds over much of its length to the 49 degrees north latitude line. The research project required the rectification of this dataset to a latitude-longitude grid, so that it could be easily compared to several other datasets. The results of the rectification procedure, based on an algorithmic conversion between projections, are shown in Figure 6.13b.
The details of exact numerical rectification are based on equations of three-dimensional geometry, and are beyond the scope of this introductory discussion. For further details, see Snyder (1985). Modern geographic information systems usually include software to convert datasets among a set of specified map projections and geocoding systems.
6.6.3 Approximation Approaches to Rectification
Frequently, we do not have an exact solution to the problem of rectification, either because we do not know the details of the map projection in the datasets, or because the data do not conform to a standard projection or georeferencing system. An example of the latter situation may be found in an oblique aerial photograph, in which the changes in scale across the image are due to the particular configuration of platform altitude, camera system alignment, and topography. A common approach, based on statistical operations, is called a ground control point rectification. In this technique, locations of objects in the source data (often based on the arbitrary reference grid of raster row and column) are compared to their locations on a trustworthy map (Figure 6.14). A statistical model is then used to convert the data into a new set, with the desired geometrical characteristics (Ford and Zanelli, 1985).
This procedure is often called a rubber sheeting operation, based on the following argument. Imagine starting with an uncontrolled off-nadir air photograph. This image will have many kinds of distortion, due to aircraft motion, perspective problems, the curvature of the Earth, scale changes due to topography, the distortions in the camera lens, and so forth. Imagine further that the film in the camera can be stretched a great deal without tearing.
Next, place the film over a “correct” map of the Earth, selected based on the desired map projection and scale. Next, identify a location on the map that is easy to distinguish on the image, and run a pin vertically through the location on the image, down to the corresponding location on the map. This fixes the first location on the film to the corresponding location on the map. Next, identify another unambiguous location, place a pin through the photograph at this point, and run the pin down to the corresponding point on the map, stretching the film as required (Figure 6.14). This is repeated for each additional location as required. The identified image points which have been attached to the map are the ground control points, or tie points, and the number required depends on the geometric properties of both the desired map projection as well as the image. Note that the final desired spatial organization does not need to be a well-understood map projection -- we may well be registering one photograph to another, where neither is an orthographic representation of the Earth's surface.
The rectification process involves building a numerical coordinate transformation between the original image coordinates and the rectified (or “true”) coordinate. Frequently, these transformations are based on polynomials, whose coefficients are computed by regression on the coordinates (see, for example, Welch et al., 1985). For example, we might use the pairs of coordinates (X,Y) as measured on the photograph versus (longitude, latitude) for a dozen ground control points to compute:
longitude = a + b*Y + c*X + d*Y^2 + e*Y*X + f*X^2
latitude = g + h*Y + i*X + j*Y^2 + k*Y*X + l*X^2
In these equations, a and g are called intercepts, and b through f and h through l are the coefficients, of the regression equations. The calculated intercepts and coefficients of the polynomials could be generated from the geographic information system itself or from a separate statistical package, and applied to the input (X,Y) data to produce (longitude, latitude) data. Moik (1980), among others, goes through the details of the process. Another numerical model for rectification uses only the nearest few ground control points to calculate the interpolation functions. This kind of procedure can provide reasonable performance, balanced by a need for many ground control points.
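A least-squares fit of such a second-order polynomial model can be sketched with NumPy. The function name is our own, and the design-matrix columns follow the terms of the polynomial equations above; a real system would also report residuals at the control points as a quality check.

```python
import numpy as np

def fit_polynomial_transform(xy, lonlat):
    """Fit a second-order polynomial rectification model by least
    squares, from ground control points measured in image coordinates
    (xy) and map coordinates (lonlat). Returns a function mapping
    (x, y) to (longitude, latitude). Needs at least six points."""
    xy = np.asarray(xy, dtype=float)
    x, y = xy[:, 0], xy[:, 1]
    # Columns: 1, Y, X, Y^2, Y*X, X^2 -- one row per control point.
    A = np.column_stack([np.ones_like(x), y, x, y * y, y * x, x * x])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(lonlat, dtype=float),
                                 rcond=None)

    def transform(px, py):
        row = np.array([1.0, py, px, py * py, py * px, px * px])
        return tuple(row @ coeffs)
    return transform
```

Given a dozen control points, the returned function converts any measured (X,Y) photograph coordinate into estimated (longitude, latitude).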
Once the appropriate numerical model for converting the source data to a useful geometrical form has been developed, the model is applied to the source data. When the data are in a vector form, the procedure is straightforward. The geodetic coordinates of the points or nodes are converted from the original reference system to the desired one, and all links between points and the accompanying vectors, polygons, and ancillary data are left in place.
The closest raster equivalent of this vector rectification procedure is termed nearest-neighbor interpolation. In this operation on raster data, we convert the location of the center of a specified cell in the output image to a location in the input array, using the numerical model developed above. The attribute (or multispectral brightness) used for the output dataset is the precise value found in the cell nearest to the calculated input array location. In this way, the resulting data values in the derived dataset correspond exactly to values in the input data. This movement of raster element values from input to output arrays often causes a blocked appearance in the derived data (Figure 6.15), particularly when a rotation is involved.
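A minimal sketch of nearest-neighbor resampling follows. The inverse_map argument stands in for the numerical model developed above (a hypothetical function carrying an output cell center back to fractional input coordinates); cells that map outside the input array are marked with None here.

```python
def nearest_neighbor(grid, inverse_map, out_rows, out_cols):
    """Resample a raster by nearest-neighbor interpolation.
    inverse_map(r, c) returns fractional (row, col) coordinates in the
    input array for output cell (r, c); the attribute of the nearest
    input cell is copied unchanged, preserving the original values."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            ir, ic = inverse_map(r, c)
            ir, ic = int(round(ir)), int(round(ic))
            in_bounds = 0 <= ir < rows and 0 <= ic < cols
            row.append(grid[ir][ic] if in_bounds else None)
        out.append(row)
    return out
```

Because values are copied rather than averaged, this method is safe for nominal and ordinal data, at the cost of the blocked appearance noted above.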
A more complex means of rectifying raster data is to use a larger number of raster cells in the input dataset to derive the value of a single cell in the output data. In a bilinear interpolation, the values of the four cells in the input array whose locations most closely correspond to a cell in the output array are used. Linear weights are applied to the input cell values, based on their distances from the computed location that corresponds to the center of the output cell in the input coordinate system. This method requires substantially more effort than a nearest-neighbor algorithm, but the resulting dataset has a less blocked appearance. The linear interpolation performed in this algorithm does cause a loss in dataset resolution, however.
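The bilinear weighting scheme can be sketched directly (an illustrative function of our own, operating at one fractional location; a full rectification would call it once per output cell):

```python
def bilinear(grid, r, c):
    """Bilinearly interpolate a continuous raster at fractional
    (row, col) coordinates: the four surrounding cell values are
    weighted linearly by their distances from the requested point."""
    r0, c0 = int(r), int(c)
    r1 = min(r0 + 1, len(grid) - 1)
    c1 = min(c0 + 1, len(grid[0]) - 1)
    fr, fc = r - r0, c - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr
```

Note that the result at a point midway between four cells is their simple average, a value that may not occur anywhere in the input data.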
Cubic interpolation is similar to bilinear interpolation, in that multiple data points in the original dataset are used to derive a single cell. However, in a cubic interpolation, a cubic polynomial is used to model the local behavior of the source data, and this local model provides an estimate of the characteristic value of the derived cell. This third algorithm often avoids the blocked appearance of a nearest-neighbor approach, and may degrade the spatial resolution of the data less than that of a bilinear interpolation. In exchange, there are increased costs of the computations due to the more complex calculations as well as the need to examine 16 cells in the input raster for each output cell.
For both bilinear and cubic interpolations, there is one important additional caution. In each case, the interpolation procedure averages a number of input data cells to calculate an attribute value for an output cell. Thus, the digital values in the derived data array may be values that do not appear in the input dataset. If the data values do not behave reasonably after a spatial averaging function, the derived datasets can be very misleading. Consider an area at the border of a lake, where we may use class 1 to indicate the sand on the beach, class 2 to indicate the paved roads and parking lots, and class 3 for the water. At one place along the edge of the lake, we find class 1, the sand, adjacent to class 3, the water. If we use an interpolation function to rectify these data to a specified map projection, which averages some of the class 1 pixels and some of the class 3 pixels, we may infer from the calculations that there is a class 2 pixel at the lake border -- in this case, we have spontaneously created a road that does not exist! In other words, for nominal and ordinal data types (as defined in Chapter 3), these numerical procedures are not appropriate since the averaging process is not well behaved.
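The lake-shore example can be made concrete in a few lines (class numbers as in the text; the majority rule shown is one of the aggregation rules mentioned in section 6.2):

```python
# Nominal classes: 1 = sand, 2 = pavement, 3 = water.
row = [1, 1, 3, 3]  # a shoreline transect: sand meets water

# An averaging interpolation at the boundary invents class 2:
averaged = (row[1] + row[2]) / 2  # 2.0 -- a "road" that does not exist

# A class-preserving rule (nearest neighbor, or the majority class
# of a window) avoids the phantom class entirely:
window = row[0:3]
majority = max(set(window), key=window.count)  # stays class 1, sand
```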
A classic exercise in rectification and registration consists of taking a small dataset in one coordinate system, rectifying it to a second, and then rectifying back again to the original coordinate system. This can be very instructive to a new user, and can help the user develop an understanding of the capabilities and applications of the algorithms. For example, convert a small dataset in the UTM coordinate system to one in which distances are measured in miles and local magnetic north provides the reference direction. Such a conversion involves both rotation and scaling of the dataset. For vector data, the first (or forward) transformation should produce a reasonable representation of the initial data, and the second (or reverse) transformation back to the original coordinate system will be quite successful. For raster datasets, however, the choice of nearest-neighbor, bilinear, or cubic interpolation schemes will dramatically alter both the visual quality of the derived data as well as the success of restoring the data to the original georeferencing system.
6.6.4 Rotation, Translation, and Scaling of Coordinates
Rotation, translation, and scaling are mathematical components of the rectification and registration procedures discussed in the last sections. Rotation might be used to reorient data taken relative to magnetic north, so that the dataset is oriented relative to true north. Translation of the origin of a coordinate system is very useful, particularly when considering field measurements. A simple example of the use of translation of vector data involves a field survey, in which all recorded locations are determined relative to a local control point, such as the starting point of the survey. Thus, the survey coordinates are recorded as bearings and distances in meters from an arbitrary local origin. After the survey is completed, the locations can be biased by, for example, the Universal Transverse Mercator coordinates of the starting point, thereby changing all the measured locations from a convenient but arbitrary system of notation to a standard coordinate system. Scaling -- changing the values used to describe a location by a factor -- might be used to change from metric to English units of distance.
The computer graphics industry has developed a formalism for modifying coordinate locations to take rotation, translation, and scaling into account. While geographic data often requires higher-order adjustment as well, this matrix algebra development is useful. The following discussion is limited to two-dimensional data; it is taken mainly from Foley and Van Dam (1982), who also consider the three-dimensional case. Kemper and Wallrath (1987) also present a similar discussion in their article on geometric modeling in database systems.
Coordinate locations can be translated by simply adding the appropriate offsets to the coordinate pairs. Algebraically,
x' = x + xoffset
y' = y + yoffset
This can be conveniently rewritten in matrix notation as:
P = original point location = [x y]
P' = translated point location = [x' y']
O = offset vector = [xoffset yoffset]
and thus,
P' = P + O.
This operation is appropriate for either the nodes in a vector dataset, or the cell center coordinates in a raster dataset, subject to roundoff errors in the calculations.
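The survey example above can be sketched directly. In the fragment below (the local coordinates and the UTM origin are hypothetical, and numpy is assumed), every point recorded relative to the local control point is biased by the UTM coordinates of that point in a single vectorized addition:

```python
import numpy as np

# Survey points recorded relative to an arbitrary local origin (meters).
local = np.array([[0.0, 0.0],
                  [12.5, 40.2],
                  [-8.1, 55.0]])

# Hypothetical UTM (easting, northing) of the survey's starting point.
offset = np.array([523417.0, 4812650.0])

utm = local + offset      # P' = P + O, applied to every point at once
print(utm[0])             # the local origin maps to the UTM control point
```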
Scaling involves applying scale factors to the coordinate pairs. The simplest use of a scaling function would be to convert between lengths in miles and lengths in kilometers, by multiplying the coordinate locations in miles by the factor 1.609 (the number of kilometers in a mile). Thus, scaling or multiplying the (easting, northing) coordinate pairs by the constant 1.609 converts displacements in miles to displacements in kilometers. In matrix notation,
S = scaling matrix = [ xscale   0
                       0      yscale ]
and thus,
P' = P·S.
In this example, the scale factor is the same for the x and y axes, which is sometimes called a uniform or homogeneous scaling. Under homogeneous scaling, shapes are preserved: circles remain round, and the length and width of squares remain equal. When the scale factors are not the same, we have differential scaling, and the proportions of objects are changed: circles become ovals and squares become rectangles.
There are a number of situations in which coordinate locations must be rotated through a specified angle. These include maps based on aerial photographs taken from a flight line that is not oriented due north. When rotating coordinates through an angle a about the origin (where positive angles are measured clockwise), the following equations are used:
x' = x·cos(a) - y·sin(a)
y' = x·sin(a) + y·cos(a)
Again moving to a matrix formulation:
R = rotation matrix = [  cos(a)  sin(a)
                        -sin(a)  cos(a) ]
and thus,
P' = P·R.
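The rotation equations are easy to verify numerically. The sketch below (numpy assumed) implements P' = P·R for row-vector points using the matrix as written above, and rotates a unit point through 90 degrees:

```python
import numpy as np

def rotate(points, angle_deg):
    """Rotate row-vector points: P' = P . R, using the book's matrix."""
    a = np.radians(angle_deg)
    R = np.array([[np.cos(a), np.sin(a)],
                  [-np.sin(a), np.cos(a)]])
    return points @ R

p = np.array([[1.0, 0.0]])
print(rotate(p, 90.0))   # approximately [[0, 1]] under this sign convention
```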
To tie these three geometric operations together, we introduce the concept of homogeneous coordinates (Foley and Van Dam, 1982). In homogeneous coordinates, it may be useful to apply a single scale factor to both the x and y values, so that a point is represented by:
P(x, y) => P(W·x, W·y, W)
The scaled x and y values can be converted to absolute cartesian coordinates by dividing the values by W, the scale factor, to recover x and y. Thus, points are now triples: P(X,Y,W). To show the value of this formalism, consider the scaling example above, where we converted our vectors from miles to kilometers. In the new notation system, all that is necessary is to change the value of W.
Since the elemental points are now three-element vectors, we must modify the transformation arrays above:
Translation:
P' = [x' y' 1] = [x y 1] · [ 1       0       0
                             0       1       0
                             xoffset yoffset 1 ]
Scaling:
P' = [x' y' 1] = [x y 1] · [ xscale 0      0
                             0      yscale 0
                             0      0      1 ]
Rotation:
P' = [x' y' 1] = [x y 1] · [  cos(a) sin(a) 0
                             -sin(a) cos(a) 0
                              0      0      1 ]
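These three homogeneous-coordinate matrices can be written as small constructor functions. The sketch below (numpy assumed; the sample point is arbitrary) builds each matrix exactly as shown above and applies a translation to a point written as the triple [x, y, 1]:

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [tx,  ty,  1.0]])

def scaling(sx, sy):
    return np.array([[sx,  0.0, 0.0],
                     [0.0, sy,  0.0],
                     [0.0, 0.0, 1.0]])

def rotation(angle_deg):
    a = np.radians(angle_deg)
    return np.array([[np.cos(a),  np.sin(a), 0.0],
                     [-np.sin(a), np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

p = np.array([2.0, 3.0, 1.0])        # point as the triple [x, y, 1]
print(p @ translation(10.0, 20.0))   # -> [12. 23.  1.]
```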
With this new matrix formulation, an appropriate sequence of translation, rotation, and scaling functions can be merged into a single matrix for the application at hand. For example, let's examine conversion from displacements in miles to kilometers, followed by a rotation of 10 degrees counterclockwise, to simulate a conversion from magnetic to geodetic north. The scaling operation becomes:
P' = P · [ 1.609 0     0
           0     1.609 0
           0     0     1 ]
As we have observed, in this case of homogeneous scaling (since xscale = yscale), we could have absorbed the scale factor into W. Considering the rotation operation, we get:
P' = P · [  0.9848 0.1736 0
           -0.1736 0.9848 0
            0      0      1 ]
Putting these two operations together in sequence, scaling before rotation, we can develop a single routine to produce the combined effect:
P' = (P·S)·R
P' = [x y 1] · [  1.5845 0.2793 0
                 -0.2793 1.5845 0
                  0      0      1 ]
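The combined matrix above can be checked by simply multiplying the two component matrices. The sketch below (numpy assumed) composes the scaling and rotation steps and reproduces the combined matrix entries to within the rounding of the book's four-digit sine and cosine values:

```python
import numpy as np

a = np.radians(10.0)                  # cos 10 deg = 0.9848, sin 10 deg = 0.1736
S = np.diag([1.609, 1.609, 1.0])      # miles -> kilometers
R = np.array([[np.cos(a),  np.sin(a), 0.0],
              [-np.sin(a), np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])

# Row-vector convention: (P.S).R = P.(S.R), so one matrix does both steps.
M = S @ R
print(np.round(M, 4))   # close to [[1.5845, 0.2793, 0], [-0.2793, 1.5845, 0], ...]
```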
For vector datasets, it is a simple matter to use these operations on the points or nodes that are the fundamental units, and then construct vectors and polygons from the transformed points. For raster datasets, the operations may be substantially different. This is largely because with vector data, we are able to place objects at any arbitrary location in space, limited only by our measurements. In raster datasets, a given raster cell's attribute value is often a spatial average of the characteristics of the primitive objects in the cell. In a raster array of population density, for example, a cell which contains a large apartment building surrounded by open space may have the same overall population density as a cell which is completely covered by lower-density housing.
Weighting the contribution of each control point by a normalized inverse of the distance from the control point to the interpolated point.
This is an extremely common approach, in which we assume that locations farther away are less important. The results of such a process for groundwater elevation data are illustrated in Figure 6.16d.
Weighting the contribution of each control point by the square of the inverse of the distance from the control point to the interpolated point.
This is a similar approach to weighting by the inverse of the distance, but in this case, locations further away are even less important.
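Both weighting schemes can be expressed in one small function, with the exponent selecting between weighting by the inverse distance and by its square. The sketch below (control-point values are hypothetical; numpy assumed) computes a normalized inverse-distance-weighted estimate at a query location:

```python
import numpy as np

def idw(xy_known, z_known, xy_query, power=1.0):
    """Inverse-distance-weighted estimate at one query point.

    power=1 weights by 1/d; power=2 by 1/d^2 (distant points count
    for even less)."""
    d = np.linalg.norm(xy_known - xy_query, axis=1)
    if np.any(d == 0):                       # query sits on a control point
        return z_known[np.argmin(d)]
    w = 1.0 / d**power
    return np.sum(w * z_known) / np.sum(w)   # normalized weighted average

# Hypothetical groundwater-elevation control points.
xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
z = np.array([100.0, 110.0, 120.0])

est = idw(xy, z, np.array([2.0, 2.0]), power=2.0)
print(est)   # falls between the control values, pulled toward the nearest
```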
A common geometrical model for interpolation involves building triangular networks, which are based on groups of three nearest control points. This model works as follows. To determine the estimated elevation at any point on the surface, find the nearest three control points. Determine the plane that goes through the three points, and then calculate the elevation of the desired point from the elevation on the plane that corresponds to the desired x-y position (see Figure 6.16c). This is directly related to the one-dimensional model we examined in Figure 6.16a.
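The plane-through-three-points step can be sketched as a small linear solve. In the fragment below (control points are hypothetical; numpy assumed), the plane z = a·x + b·y + c is fitted through the three nearest control points and evaluated at the query location:

```python
import numpy as np

def plane_interpolate(p1, p2, p3, xy):
    """Elevation at (x, y) on the plane through three (x, y, z) control
    points, found by solving z = a*x + b*y + c for a, b, c."""
    A = np.array([[p[0], p[1], 1.0] for p in (p1, p2, p3)])
    z = np.array([p[2] for p in (p1, p2, p3)])
    a, b, c = np.linalg.solve(A, z)
    return a * xy[0] + b * xy[1] + c

# Three nearest control points (hypothetical) and a query location.
est = plane_interpolate((0, 0, 100.0), (10, 0, 110.0), (0, 10, 120.0), (5, 5))
print(est)   # -> 115.0, since the fitted plane is z = 100 + x + 2y
```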
Since these procedures involve search (to determine the appropriate points to consider for each desired location), distance calculations, and floating point arithmetic to perform the averaging functions, they can be computationally expensive.
The calculation of trend surfaces appears easy at first glance. An ordinary statistical program can be used to calibrate the model, where input data point values (such as elevation or population) are a polynomial function of x-y location values. A polynomial of an appropriate order is chosen to fit the points. Then, the function is used to estimate values at new points. In practice, it is important to understand that high-order polynomials tend to oscillate between fitted points, and this may produce unreasonable surfaces. One practical approach that avoids this kind of behavior is to fit low-order polynomials separately in each of a number of local neighborhoods throughout the data area. This kind of piecewise approximation is often called a finite element approach. In this way, a separate polynomial is estimated for each local area, and known values distant from the area do not affect the local computations.
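A first-order trend surface is just a least-squares fit of a plane to the control points. The sketch below (synthetic data; numpy assumed) calibrates such a surface and recovers the planar coefficients used to generate the data:

```python
import numpy as np

def fit_trend_surface(x, y, z):
    """Least-squares first-order trend surface: z ~ b0 + b1*x + b2*y."""
    A = np.column_stack([np.ones_like(x), x, y])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coef

# Synthetic control points lying near the plane z = 5 + 2x + 3y.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 30)
y = rng.uniform(0, 10, 30)
z = 5 + 2 * x + 3 * y + rng.normal(0, 0.1, 30)

b0, b1, b2 = fit_trend_surface(x, y, z)
print(round(b1, 2), round(b2, 2))   # slopes close to the true 2 and 3
```

A higher-order surface simply adds columns (x², x·y, y², ...) to the design matrix, which is exactly where the oscillation problem noted above begins.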
Statistically optimal interpolation methods, collectively known as kriging, have been developed, which are designed to minimize the errors in the estimated values. The method is based on estimating the strength of the correlations between known data points, as a function of the distance between the points (Burgess and Webster, 1980; Webster and Burgess, 1980). This information is then used to select an optimal set of weights for the interpolation. Variations have been developed to include a trend-surface model component, which permits estimating values outside the area of known points (Figure 6.16e); this is not possible with a simple distance-weighted model as shown in Figure 6.16d.
6.8 Photointerpretation
An important but sometimes neglected component of data preprocessing in a geographic information system is photointerpretation: the process of extracting useful information from photographs and other images. Note that when we speak of photographs, we imply a data-acquisition activity that involves a camera or camera-like system, with lens and film. There are many other non-photographic systems that produce image data (see Chapter 10 for a number of examples) and we often apply the techniques of photointerpretation to these images as well.
Photography provides a tremendous resource for spatial data processing. Aerial photographs have been made of large areas of the Earth’s surface, albeit principally in developed countries. Aircraft photography can be an inexpensive means to gather information about very large areas in a short amount of time. Also, photographs are efficient means of storing large amounts of data in relatively small volumes of space.
The development of new photographic systems has not stopped by any means. One of the most interesting recent developments is a device called the Large Format Camera (Doyle, 1985). This instrument has been flown on NASA’s space shuttle, and was designed for high-precision mapping applications. It has a focal length of 30.5 centimeters, with a film size of 23 by 26 centimeters, and was designed specifically for mapping large areas.
The analysis of imagery, even in its most basic form, is a highly complex process involving numerous components. These components include:
a basic understanding of the specific application for which the imagery is being used,
a general knowledge of the geographic area of the imagery,
experience and expertise in image analysis techniques, and
a sound understanding of image characteristics.
The essential tasks of photointerpretation are identification, measurement, and problem solving. Identification is recognizing the features of interest. Once these features are identified, it is possible to make measurements of such things as the distance between objects or the total number of features per unit area. Some of these measurements, of course, require that we understand the photograph's viewpoint, in terms of the variations in scale in the photograph due to the geometry of the camera/platform/Earth-surface system, the date and time of the image (to be able to calibrate shadows for sun angle, and to understand such things as seasonal effects), and so forth. The techniques of making accurate measurements from photographs are termed photogrammetry.
The process of photointerpretation is normally based on a systematic examination of the photograph, in conjunction with a wide range of ancillary data. The ancillary data are often very diverse, and may include maps, vegetation phenologies, and many kinds of information about human activities in the general area. The best photointerpreters have expertise not only in the delimited field of photointerpretation, but also in such related disciplines as physical geography, geology, and plant biology and ecology. In addition to these systematic components of image analysis, human interpretation also includes a significant perceptual or subjective component. Every analyst perceives features visible within imagery in a different manner. Also, the background of each interpreter is different in terms of knowledge, experience, and expertise in the various systems components and applications domains of image analysis. Consequently, human image analysis represents components of both an art form as well as a strict science.
The features found in a photograph are characterized by a list of image elements or primitives, which may be thought of as the basic characteristics of the image (Lillesand and Kiefer, 1987). These include:
Tone or Color:
The brightness (or color) at a single location on the image. The bright intensity of fresh snow differs from the darkness of deep fresh-water bodies (neglecting skylight reflected from the water’s surface).
Size:
Size of an object can be considered in two different contexts. Relative size is a description that compares some objects to others in an image. Absolute size makes explicit use of our knowledge of the scale of an image; it requires detailed knowledge about the image that is not always available. However, in many images, we may be able to recognize objects whose size we know, and thus, we may infer the absolute sizes of other objects.
Shape:
The general form of an object is an important interpretation element. It is sometimes difficult to develop a taxonomy of shape, to be able to describe different shapes in a consistent way. The Pentagon Building in Washington, D.C., which appears as concentric five-sided shapes from the air, is the classic example of an object that may be identified by shape alone.
Texture:
Texture is based on spatial changes in tone. Texture in an image may be discernible at some scales, while at other scales, the spatial variations cannot be resolved. Fine-grained texture implies large changes in brightness over small distances, such as the variations in brightness in a grassland. Coarse-grained texture implies relatively homogeneous regions with sharp discontinuities at the boundaries; areas of intensive agriculture often show a coarse texture from aircraft altitudes.
Shadow:
The shadows cast by objects can be important clues to interpretation. When data about a photograph include information for determining sun angle at the time of the image acquisition, shadows may provide excellent information about the height of objects. At the same time, shadows make it difficult to identify objects that are located in the shadows.
Pattern:
Pattern is discerned when there is predictability in spatial arrangements. The regular arrangement of trees in an orchard is an obvious example of pattern. Pattern is sometimes described as organized texture.
Site:
Site refers to geographic location. Site can be extremely valuable for understanding vegetation distributions in photographs, since vegetation species groups may only be found in certain kinds of sites (for example, swamps versus well-drained upland areas).
Association:
Association is based on the interrelationships of features at a location. For example, a fossil-fuel power plant is likely to be found associated with railroad facilities and power lines.
Tone can be considered the most basic of these elements, since without variations in tone, objects cannot be distinguished. Size, shape, and texture are more complex and are based on analyzing and interpreting individual features. Finally, shadow, pattern, site, and association may be considered the most complex, since they involve relationships between features.
As an example, we will briefly go through the image elements, based on the photograph in Plate 1. In this color infrared photography, the red tones indicate lush vegetation. Size and shape are diagnostic of several features, including the cars in the parking lot and the roofs of the houses. There is texture in the patterns in the sand on the beach, as well as in the vegetated median strip between the lanes of the road. Shadows are evident adjacent to the houses; if you look closely, you can also see shadows from the street lights, which (with information about the angle of the sun at the time of the photograph) can be used to estimate their height. The pattern of the lines in the image clearly tells us that there is a parking lot covering a large fraction of the image. The pattern of circular objects in the sand, combined with their size, suggests that these are fire rings (concrete rings in which it is permissible to light a fire). The association of the parking lot, fire rings, and beach suggests a recreational area.
The essential processes of human image analysis consist of the recognition of features and the formation of inferences (the latter through the synthesis of several elements of image interpretation). Features are recognized by the analyst’s taking the elements of an image and applying to them procedures, techniques, and tools of analysis; it is through detection and measurement that objects and phenomena in an image are identified and their significance judged. Inferences are drawn typically through deductive reasoning using knowledge of both the specific application and the geographic area being examined.
A methodical approach is of particular importance to effective image analysis. By approaching an interpretive project in a systematic manner, the analyst is able to examine comprehensively all of the evidence present in the given imagery, minimizing the probability of an incorrect interpretation. Such an approach involves proceeding from the general to the specific, and from the known to the unknown. Implicit in this methodology is the development of parallel lines of reasoning through hypothesis testing, and verification of conclusions through convergence of evidence. The process is thus a deductive one, involving the recognition of basic features and a critical examination of evidence present within the imagery. A significant component of manual interpretation consists of identifying the most probable alternative from numerous hypotheses.
Photointerpreters commonly work with interpretation keys, which help to organize the process of recognizing and characterizing features in images. One common form is called a dichotomous key, since at each step two alternatives are presented. A decision at a step in the key eliminates a number of possible interpretations, and the analyst is then guided to another decision. By stepping through these decisions, the analyst is led to a single (hopefully correct) interpretation of a feature. The interpretation keys of photointerpretation are closely related to those used in plant and animal biology to identify organisms to the species level.
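The mechanics of a dichotomous key are easy to express as a small decision tree. The sketch below is purely illustrative (the questions and feature classes are invented, not from any real key); each yes/no answer prunes half the remaining possibilities until a single interpretation is reached:

```python
# Each node is (question, yes_branch, no_branch); leaves are interpretations.
key = ("Is the feature linear?",
       ("Does it carry water?",
        "canal or river",
        "road or railway"),
       ("Is the tone bright (high reflectance)?",
        "bare sand or concrete",
        "vegetation or water"))

def interpret(node, answers):
    """Walk the key, consuming one yes/no answer per question asked."""
    while isinstance(node, tuple):
        question, yes_branch, no_branch = node
        node = yes_branch if answers.pop(0) else no_branch
    return node

print(interpret(key, [True, False]))   # -> road or railway
```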
In addition to evidence within the imagery itself, additional collateral or ancillary data are often highly useful, particularly when present in a geographic information system. Yet today this is seldom the case in manual image analysis; interpreters must search out or often create these data themselves. This can include ground truth data (such as field observations of species of vegetation), geological and geophysical information (including laboratory measurements of soil characteristics), and scientific literature (for example, a publication describing principal crops and forestry practices in a region).
Often these sources of ancillary data provide important clues that either substantiate or refute an interpretation. Also, the nature of the specific imagery used will have a large impact on the appearance of objects within an image. Factors such as sensor type (e.g., photographic, digital scanner, synthetic aperture radar), scale, spectral resolution, spatial resolution, sun elevation, time of year, and atmospheric conditions will have an influence on the utility of any given imagery for a specific application. Therefore, an image analyst must have a solid background in the characteristics of the type of imagery being used, and in the potential types, sources, and magnitudes of error that the imagery may contain or that can result from the variety of processing steps which it may undergo. These are briefly discussed in Chapter 10.
Image analysis is thus an iterative process in which various types of information, and the interrelationships among them, are examined. Evidence is collected, hypotheses are tested, and interpretations are made and iteratively refined in order to derive a correct interpretation. Often the derivation of new information requires the assimilation of lower-order image elements using convergence-of-evidence criteria. Consequently, human analysis tends to be highly unstructured in nature, due to the large degree of interdependency among the various low-order image elements employed in the analysis procedure.
The output of this overall interpretation-analysis process is typically an overlay that contains the features or phenomena that have been extracted from the data. Less often, the output of the analysis process may be produced in tabular or documentary form. An example of these types of products might be a forest stand map, a product that can be used by resource managers to make decisions concerning harvest scheduling; one aspect of such an analysis might be timber-volume estimates for the stands identified on the image. As previously stated, however, the product of the analysis process is most often a map-like product; because the geometry of the image format can produce distortions, these must be taken into account when the analyst's results are prepared for input to a GIS.
In some instances, interpretations from an image can be directly transferred to an overlay graphic of some kind. In other cases, the product is transferred to a map base using a direct entry process with local "eyeball" corrections for image distortions. The choice of a specific procedure is governed by the overall accuracy, cost, and time requirements of the project being accomplished. Generally speaking, the higher the required accuracy, or the larger the area covered, or the more complex the analysis problem, the higher the cost. It is important to point out that an image-based analysis process is often a more cost-effective method of obtaining environmental information than relying exclusively on field work. Indeed, some types of information cannot realistically be obtained by any means other than aircraft or satellite data and photointerpretation.
There are a number of kinds of photogrammetric instruments that can be of great value during the preprocessing phase. Wolfe (1983) discusses their operation and the mathematics behind their design. We will briefly mention only a few of these devices.
There are a variety of simple devices for making measurements on photographs. Many of these are related to the vernier caliper, a common tool in a machine shop. This type of tool is used to measure length with very high precision. The complex part of the operation is to determine the relationship between the measurement of distance on the image and the true distance on the ground. The same comments are true for measurements on maps, since, as we have discussed, scale is not constant for many families of map projections.
Another family of photogrammetric instruments is frequently used to either develop or update planimetric maps. These systems project the image in the photograph onto a map or sketch, using components that can minimize distortions in the photograph. The zoom transfer scope is one such instrument. With this instrument, the operator views the map or sketch with the photograph optically superimposed. Direct revisions to the map can be made with relative ease.
A great deal of the information extracted from photographs is done through the stereoscopic viewing of sequences of photographs. By understanding the geometrical details of the camera system and the Earth’s surface, it is possible to determine the horizontal and vertical positions of objects with very high accuracy and precision. The simplest devices for viewing pairs of photographs in stereo, called stereoscopes, effectively recreate the illusion of one’s eyes being the same position as the camera lenses when the photographs were taken. In this way, the analyst can view the Earth in three dimensions. This ability to discern differences in elevation is a valuable technique in photointerpretation. Systems are available to aid in determining absolute differences in elevation, to quantify the differences in vertical position.
A family of instruments known as stereoplotters are used when pairs of photographs are the basis for developing accurate topographic maps. These complex systems are based on a pair of optical projectors, which recreate a three-dimensional view from a pair of photographs. A viewing system makes portions of the three-dimensional model visible to the operator, and a measuring system then records horizontal and vertical positions. Stereoplotters are the classical tools for compiling topographic maps. When we speak of an analytic plotter, we imply that portions of the stereoplotter system have been automated.
When an image correlator is a part of a stereoplotter system, the stereoscopic measurements once performed by the operator are also automated. The correlation process involves selecting portions of each image in the stereo pair, matching features in each, and, based on their horizontal positions, solving the parallax equations to determine both vertical and horizontal coordinates. These kinds of systems have had great success in those agencies with large-volume map production requirements. At this time, these systems cannot operate completely without human operator intervention.
After the analysis is complete and the product derived, the information must be input into the GIS. This is accomplished through the kinds of digitizing processes discussed in more detail in section 6.1.2.