Integrating TM and Ancillary Geographical Data with Classification Trees for Land Cover Classification of Marsh Area

: The main objective of this research is to determine the capacity of land cover classification combining spectral and textural features of Landsat TM imagery with ancillary geographical data in wetlands of the Sanjiang Plain,


Introduction
Wetlands are considered an integral part of the global ecosystem as they prevent or reduce the severity of floods, feed ground water and provide a unique habitat for flora and fauna (Mitsch and Gosselink, 1993). However, the rapid population increase and fast economic development are placing increasing pressures on wetland ecosystems. To effectively conserve and manage wetland resources, it is essential to have up-to-date information on wetland distribution and their adjacent uplands. However, their dynamic hydrological characteristics, extensive areas and remote locations means that wetlands are often difficult to be monitored in situ.
Satellite remote sensing has many advantages over ground-based survey including providing a synoptic view, multi-spectral data, multi-temporal coverage and cost-effectiveness. Optical sensors such as Landsat′s MSS, TM and SPOT′s HRV have been investigated for wetland mapping at various scales with useful results (Jensen et al., 1986;Laine et al., 2002;Shalaby and Tateishi, 2007;Zhang et al., 2009), which can be analyzed by various landscape metrics, e.g. vector analysis metrics (Zhang et al., 2006). However, it has been proven that wetlands are difficult to be mapped using only optic-based remote sensing techniques because they occur rarely and their spectral and spatial characteristics are highly context-dependent (Stehman et al., 2003). Researches have found that using ancillary geographical data and multi-spectral satellite data can improve classification results (Ozesmi and Bauer, 2002). Ancillary data that have been used to assist in identification of wetlands include: soils data (Ernst and Hoffer, 1979;Bolstad and Lillesand, 1992;Sader et al., 1995), topographic or elevation data (Bolstad and Lillesand, 1992;Hinson et al., 1994;Sader et al., 1995), bedrock geology and landforms (Wright and Gallant, 2007) and climate data (Wright and Gallant, 2007;Sesnie et al., 2008). Our work was motivated by a need for reliable and relatively automated methods to combine TM imagery and ancillary geographical data for mapping typical freshwater marsh and their adjacent land cover types.
As a class of machine learning algorithms, classification trees automatically select variables and their hierarchical structure is capable of detecting non-additive interactions between variables without explicit specification (Clark and Pregibon, 1992). Advantages of classification trees include their versatility to integrate both numerical and categorical variables into classifications and they make no distributional assumptions (Breiman et al., 1984). Classification trees also require less training time compared to other machine learning techniques such as artificial neural networks and support vector machines, while attaining similar accuracy (Pal, 2005). Instability of trees to outliers or small changes in the training data are limitations of the method (Miller and Franklin, 2002). Classification trees have been successfully used to integrate remote sensing imagery with ancillary geographical information for land use/cover classification (Hansen et al., 1996;Joy et al., 2003;Sesnie et al., 2008), however, its application to detecting wetland was few, and was restricted within wetlands with small area in the Greater Yellowstone Region (Wright and Gallant, 2007).
The Sanjiang Plain is one of the most important marsh areas in China. The availability of high-quality ancillary geographical information, including terrain, soil and landform characteristics, presented a unique opportunity for integrating TM imagery with the ancillary information to improve the land cover classification. The objective of this paper is to determine which window sizes, texture measures and which wave band combinations enhance the separability of different land cover types, and to determine the extent to which spectral features, selected textural combinations and ancillary geographical information can be used to improve classification accuracy of marsh and their adjacent land cover types.

Study area
The Small Sanjiang Plain (46°48′06″-48°29′16″N, 132°26′26″-135°07′24″E), located in the northeastern part of the Sanjiang Plain in Heilongjiang Province of China, was selected as the study area in this paper. With a total area of 1,606,008ha, the Small Sanjiang Plain is the most important marsh area in China and encompasses two wetlands of international importance: Honghe National Nature Reserve (HNNR) and Sanjiang National Nature Reserve (SNNR) (Fig. 1). The climate of the study area is characterized as temperate humid and sub-humid continental monsoon climate. The average temperature is -18 in January, and 21 ℃ -22 in ℃ July. The frost-free period is 120-140d. Annual precipitation is 500-650mm, of which 80% occur from May to September. Most of the rivers in the area have riparian wetland characteristics. The types of vegetation belong to Changbai flora, mainly meadow and marsh vegetation.
Carex spp. are distributed widely and Phragmites spp. are scattered in some places in the study area (Liu, 1995).

Data sources
One scene cloud-free Landsat-5 TM image was obtained from the Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences. The image (WRS-2 path 113, row 27) was acquired on the 30 August 2006 for the Small Sanjiang Plain and comprised six multi-spectral bands of 30m spatial resolution (bands 1 to 5 and band 7). The thermal infrared band (band 6) was removed due to its unsuitability for classification. A digital administrative map and 36 topographic maps at the scale of 1 ‫׃‬ 50,000 in 1986 were obtained from the Heilongjiang Mapping and Surveying Bureau. A vegetation thematic map, a landform thematic map and a soil thematic map at the scale of 1 ‫׃‬ 200,000 of 1985 were obtained from the Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences.

Radiometric normalization and geometric rectification
The digital values of the TM imagery were converted to top of the atmosphere radiance units by using standard calibration coefficients and corrected for the atmospheric effects by the dark-pixel subtraction technique (Chavez, 1988). About 66 Ground Control Points (GCPs) were evenly distributed throughout the whole study area and most of them were laid on distinguishable objects, such as, the intersections of roads and fence lines. The registration procedure achieved an accuracy of less than 0.5 pixel (RMSE). All images were registered to the Gauss projection (identical to that of the digital district map) and re-sampled to pixel size 30m×30m by using ERDAS IMAGINE 8.7 software.

Land cover categories and training data
The land cover classification system was designed to consider three factors: 1) the moderate spatial and spectral resolution of multi-spectral instruments like the TM sensor and the spectral differences among diverse land cover categories; 2) the characteristics of the vegetation distribution in the study area; and 3) the national standard land cover classification system in China. Seven land use/cover categories (dry land, forestland, meadow, marsh, residential area, open water and rice field) were chosen for the classification. For the classification tree model running, 50,625 training pixels for seven land use/cover types were identified from: 1) field locations referenced on the ground with a global positioning system (GPS) and 2) spatially referenced wetland plots and inventory data with wetland vegetation species records. The number of training pixels in each category depended on its overall proportion in order to avoid the over prediction of rare classes. A total of 50,627 training locations were sampled, of which 21.78% were dry land pixels, 15.81% forestland pixels, 14.53% meadow pixels, 12.52% open water pixels, 4.45% residential area, 16.34% marsh pixels and 14.57% rice field pixels.

Spectral, textural and ancillary geospatial variables
Spectral predictive variables included bands 1 to 5 and band 7 from Landsat TM imagery. Two vegetation indices (Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI)) (Huete et al., 2002) and a data fusion transformation combining the six bands information from the Landsat TM image (first principal component (PC1)) (Lillesand et al., 2004) were added as additional predictors. Vegetation indices (NDVI and EVI) were selected as they were found useful for analysis of vegetation variability by Stow et al. (1998). Derivation of spatial information has also been successful from data fusion transformation bands such as PC1 (Johansen and Phinn, 2006a).
Image texture was a measure of the spatial frequencies of tonal change within a set distance on an image. We selected representative square subset areas larger than 5ha of each land cover type from the Landsat TM image. Firstly, for each of the representative subset, five grey-level texture co-occurrence measures (variance, homogeneity, contrast, dissimilarity and entropy) were calculated for the six spectral bands and the NDVI, EVI and PC1 based on the Global Spatial Statistics function in ENVI 4.4. These texture co-occurrence measures were selected, as they had provided useful information on vegetation structural parameters in other studies (Johansen and Phinn, 2006b). The semi-variogram range was used to determine the optimal window size for image grey-level co-occurrence texture analysis, based on the research which has shown that the window size used for texture analysis should not exceed the distance represented by the semi-variogram range (Franklin et al., 1996). Secondly, a Z-test was used to determine the texture co-occurrence measures and its derivative bands in order to maximize the separability of each land cover type. All pixel values within each subset were compared by using the Z-test to identify the combinations of texture measure and its derivative bands that provided the largest statistically significant difference. Window sizes for each image band were based on the findings of the semi-variogram analysis. Equation (1) was applied to calculating the z-statistics value (Zar, 1984). ( 1) where, µ 1 and µ 2 are mean pixel values of two different texture bands; n 1 and n 2 are number of samples; s 1 and s 2 are sample standard deviation of corresponding texture bands. A 30-m digital elevation model (DEM) was calculated based on the function of Kriging interpolation of Arcgis 9.1 software from the digitized topological map. Elevation data and percent slope associated with soil drainage, moisture and hydrological condition were anticipated to regulate differences in wetland vegetation species composition (Wright and Gallant, 2007). The classification of landform in the Small Sanjiang Plain was based on slope gradient distribution, slope curvature and shape of bedrock exposure as associated with wetland formation. We expected that the landform thematic map influences the probability of wetland presence. The soil classification of the Small Sanjiang Plain consisted of 22 soil subgroups. Each soil subgroup had particular physical and chemical properties, which influenced soil texture and hydraulic connectivity. Thus the soil thematic map could be related to the occurrence of wetland vegetation (Zhang, 1988).

Model building and wetland mapping
We constructed classification trees for Landsat satellite imagery and ancillary geographical data using three different combinations of predictors. 1) TM imagery alone (including bands 1 to 5, band 7, NDVI, EVI and PC1). 2) TM imagery plus image texture (TM+TXT model). The combinations of texture measures, window size and derivative spectral bands were determined by Semi-variograms analysis and Z-test. 3) All predictors including TM imagery, image texture measures and all additional ancillary GIS information including terrain, landform and soil information (TM+TXT+GIS model). We built classification trees using CART software (CART 6.0, Salford Systems Corporation, San Diego, California) based on 50,627 training samples constructed by predictive variables and objective variables. Those trees were pruned by the automated cost-complexity method of Breiman et al. (1984). The tree structure was equivalent to a set of "If -Then" production rules. The path, from the root of the tree to one of the leaf nodes, corresponded to one rule. The land cover classification map was produced by using those production rules. With respect to wetland mapping, a nine-pixel minimal patch size (about 0.8ha) was used to avoid the "salt and pepper" phenomena without losing much specificity.

Error assessment
We performed traditional Maximum Likelihood Classification (MLC) supervised classification for TM imagery alone using the same training samples in order to compare the classification accuracy with that of the classification trees. We evaluated the classification results of three classification trees and traditional supervised classification by applying them to testing data acquired in May and August 2005 respectively through field GPS position. Population error matrices were estimated based on 609 pixels of seven land use/cover types not previously used for training purposes. Stratified random sampling was used to calculate Kappa coefficients of agreement and their variances (Stehman, 1999). We evaluated the statistical significance of Kappa coefficients following the addition of more predictors by calculating z-statistics value (Congalton and Green, 1999).

Textural variable selection
Spatial autocorrelation of pixel values within each representative subset areas of different land cover types was shown in the semi-variograms for the bands 1 to 5, band 7, NDVI, EVI and PC1 (Fig. 2). The semi-variogram range values for each land cover types optimized the selection of image window sizes. The most common semi-variogram ranges for the five typical land cover types (excluding residential area and open water) were at three pixels (90m) or at 11 pixels (330m). In the marsh semi-variogram, there were some intermediate la-g distances. As is evident in Fig. 2, however, the semivariogram ranges could be interpreted as 11 and 3 pixels respectively without significant misinterpretation. Based on the largest two z-statistics value derived from pairwise land cover subsets, 20 textural variables combination were identified as optimal for significant discrimination of land cover types (Fig. 3). Marsh and dry land had the most significant discrimination in all land cover types using entropy computed from the NDVI. Other easily discriminated classes were forestland and rice field using the Homogeneity filter from Band 7. The land cover types that corresponded to the least statistically significant z-statistics value were marsh and meadow, dry land and rice field. From the result of semi-variograms analysis and Z-test, 17 combinations of window sizes, textural measures and derivative bands (ignoring the repeat combinations) were selected as the texture variables for building classification trees.

Classification tree models and accuracy assessment
There were a total of 30 predictive variables in the construction of the classification tree models, including six multi-spectral bands of TM image (bands 1 to 5, band 7), two vegetation indices (NDVI and EVI), one data fusion transformation band (PC1), 17 selected texture measures and four ancillary geographical variables (DEM, Slope, Soil and Landform). Among the 30 predictors, Soil and Landform were categorical variables, others were numerical variables. There were seven objective variables: ND＝NDVI, EV = EVI, B1＝TM1, B2= TM2, B5 = TM5, B7 = TM7, Ent = Entropy, Hom = Homogeneity, Dis = Dissimilarity  Table 1. Kappa coefficients and z-statistics value were calculated for MLC and three classification trees predictive models (Table 2).   Table 1 it can be seen that compared with the MLC supervised classification, three classification trees predictive models reduced the overall error rate significantly from 16.42% to 8.7%. Among the three predictive models, the overall error rates decreased incrementally as more predictors were added to the classification trees. Adding texture variables improved classification accuracy of marsh more than other lands cover types. The TM+TXT model depressed the omission error rate of marsh effectively (from 24.73% to 17.2%) while keeping the commission error rate relatively low at 18.09%. In the TM+TXT+GIS model where all predictors were available, the omission error rate of marsh dropped to 12.90% while commission error rate declined to 10.99% (Table 1). The Kappa coefficients increased significantly (p<0.01) from traditional MLC supervised classification to any of the three classification trees models. Among the three predictive models, the Kappa coefficients increased significantly (p<0.1) from TM-only model to TM+TXT+GIS model (Table 2).

Wetland mapping comparison
Classification trees structure for TM+TXT+GIS model is shown in Fig. 4. There were 13 rules in the structure of the TM+TXT+GIS model (Fig. 4). The land cover classification map was produced by using these production rules (Fig. 5a). To compare the classification trees with the traditional classification approach, MLC supervised classification was performed based on the same training samples. The classification result is shown in Fig. 5b.  NA Xiaodong,ZHANG Shuqing,ZHANG Huaiqing et al. 184 As we can see from Fig. 5b, the MLC supervised classification result contained serious speckle noise, and the spatial distribution of land use/cover types was fragmented. Some other land use/cover pixels were misclassified as residential area (red area in Fig. 5b). Adding homogeneity derived from Band 1 at 11×11 window size to the classification tree model separated residential area from other land cover types (branch 1 in Fig. 4). The classification tree model depressed the speckle noise effectively and reduced the inclusion of pixels of residential area by using the texture measures (Fig. 5a).
Downstream of Yalu River as shown in Fig. 4b, many marsh pixels were misclassified as open water. As well, there was much confusion between meadow and dry land in the southeast of Fig. 4b. This was partly due to the fact that the pixel-based MLC was limited by using only spectral information, when land cover types were spectrally similar. In the classification result of TM+TXT+GIS model (Fig. 4), the ancillary geographical data of soil types and DEM were included in the classification rules. Marsh pixels were separated by association with soil types such as meadow bog soil (14), humic bog soil (15), bog soil (19), gleyed meadow soil and humic bog soil (20) and orthic Meadow soil and Meadow bog soil (21) (Fig. 4). Ancillary soil information reduced both the marsh omission error rate (from 17.2% to 12.90%) and the marsh commission error rate (from 18.09% to 10.99%) ( Table 1). DEMs with values greater than 54.91 were associated with dry land, while lesser values were associated with meadow. This classification rule coincided with the geographic correlation laws. Integrating the ancillary geographical data, the TM+TXT+GIS classification tree model decreased the misclassification pixels between open water and marsh by associating marsh with specified soil types, and reduced the confusion between meadow and dry land based on the differentiation of elevation values.

Discussion and Conclusions
We found that image spectral, textural, terrain data and ancillary GIS information improved the land use/cover classification accuracy significantly in the Small Sanjiang Plain where marshes are concentrated. Kappa coefficients of classification trees significantly increase compared with traditional MLC supervised classification. Selected textural combinations included window sizes, texture measures and the derivative bands based on semi-variograms and Z-test. This reduced the speckle noise effectively and reduced the commission errors rate for residential areas. Ancillary geographical information soil types decreased the misclassification pixels between open water and marsh while the DEM reduced the classification confusion between meadow and dry land.
With the introduction of high-dimensional texture information consisting of various combinations of window size, co-occurrence measures and derivative bands, the temporal and spatial complexity of the learning procedure based on the CART algorithm increased. Excessive redundant attributes not only increased the complexity of the classification process but also led to some confusion and therefore reduced the classification accuracy. Semi-variograms analysis and Z-tests provided some significant new insights into the selection of optimal texture combinations to distinguish the land cover types. Window sizes of 3×3 pixels or 11×11 pixels provided the best results with intermediate-sized windows (9×9) providing little additional information. Co-occurrence texture measures entropy, homogeneity and dissimilarity were the most significant texture measures for discriminating land cover types while contrast and variance provided little differentiation.
The training locations were sampled according to the proportion of the land cover types in the landscape. We did not detect significant marsh over prediction in the wetland mapping study caused by over-sampling of wetland training data in areas of rare wetland occurrence. In our full-variable classification tree model, the omission and commission error rates of marsh were 12.9% and 10.99 respectively. Our results suggested that the classification tree approach was portable, relatively easy to implement, and suitable for large-area wetland mapping.