1,325 research outputs found

    Object-Based Supervised Machine Learning Regional-Scale Land-Cover Classification Using High Resolution Remotely Sensed Data

    High spatial resolution (HR) (1–5 m) remotely sensed data in conjunction with supervised machine learning classification are commonly used to construct land-cover classifications. Despite the increasing availability of HR data, most studies investigating HR remotely sensed data and associated classification methods employ relatively small study areas. This work therefore drew on a 2,609 km², regional-scale study in northeastern West Virginia, USA, to investigate a number of core aspects of HR land-cover supervised classification using machine learning. Issues explored include training sample selection, cross-validation parameter tuning, the choice of machine learning algorithm, training sample set size, and feature selection. A geographic object-based image analysis (GEOBIA) approach was used. The data comprised National Agricultural Imagery Program (NAIP) orthoimagery and LIDAR-derived rasters. Stratified-statistical-based training sampling methods were found to generate higher classification accuracies than deliberative-based sampling. Subset-based sampling, in which training data is collected from a small geographic subset area within the study site, did not notably decrease the classification accuracy. For the five machine learning algorithms investigated, support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), and learning vector quantization (LVQ), increasing the size of the training set typically improved the overall accuracy of the classification. However, RF was consistently more accurate than the other four machine learning algorithms, even when trained from a relatively small training sample set. Recursive feature elimination (RFE), which can be used to reduce the dimensionality of a training set, was found to increase the overall accuracy of both SVM and NEU classification; however, the improvement in overall accuracy diminished as sample size increased. RFE resulted in only a small improvement in the overall accuracy of RF classification, indicating that RF is generally insensitive to the Hughes Phenomenon. Nevertheless, as feature selection is an optional step in the classification process, and can be discarded if it has a negative effect on classification accuracy, it should be investigated as part of best practice for supervised machine learning land-cover classification using remotely sensed data.

    The Improvement of Land Cover Classification by Thermal Remote Sensing

    Land cover classification has been widely investigated in remote sensing for agricultural, ecological and hydrological applications. Landsat images with multispectral bands are commonly used to study the numerous classification methods in order to improve the classification accuracy. Thermal remote sensing provides valuable information to investigate the effectiveness of the thermal bands in extracting land cover patterns. k-NN and Random Forest algorithms were applied to both the single Landsat 8 image and the time series of Landsat 4/5 images for the Attert catchment in the Grand Duchy of Luxembourg, trained and validated by the ground-truth reference data considering the three-level classification scheme from COoRdination of INformation on the Environment (CORINE) using the 10-fold cross-validation method. The accuracy assessment showed that, compared to the visible and near infrared (VIS/NIR) bands, the time series of thermal images alone can produce comparatively reliable land cover maps, with the best overall accuracy of 98.7% to 99.1% for the Level 1 classification and 93.9% to 96.3% for the Level 2 classification. In addition, the combination with the thermal band improves the overall accuracy by 5% and 6% for the single Landsat 8 image in the Level 2 and Level 3 categories and provides the best classified results with all seven bands for the time series of Landsat TM images.
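
The 10-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `k_fold_indices`, the seed, and the toy sample count of 25 are assumptions made only for illustration. Each sample is used for validation exactly once and for training in the other folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper: shuffles the sample indices once, deals them into
    k folds, then holds out one fold at a time for validation.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Dealing indices round-robin keeps fold sizes within one of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Toy usage: 25 reference samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

In practice the per-fold accuracies would be averaged to give the reported overall accuracy; the classifier itself (k-NN or Random Forest in the abstract) is left out of this sketch.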

    Effects of Training Set Size on Supervised Machine-Learning Land-Cover Classification of Large-Area High-Resolution Remotely Sensed Data

    The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of the classification, offers the potential benefit of allowing multiple additional variables, such as measures of object geometry and texture, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when training sample size decreased from 10,000 to 315 samples. GBM provided similar overall accuracy to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU, however, required a longer processing time. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically less than RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sample sets, and minimal variations in overall accuracy between very large and small sample sets, as well as relatively short processing time, RF was a good classifier for large-area land-cover classifications of HR remotely sensed data, especially when training data are scarce. However, as performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.

    A Markov Chain Random Field Cosimulation-Based Approach for Land Cover Post-classification and Urban Growth Detection

    The recently proposed Markov chain random field (MCRF) approach has great potential to significantly improve land cover classification accuracy when used as a post-classification method by taking advantage of expert-interpreted data and pre-classified image data. This doctoral dissertation explores the effectiveness of the MCRF cosimulation (coMCRF) model in land cover post-classification and further improves it for land cover post-classification and urban growth detection. The intellectual merits of this research include the following aspects: First, by examining the coMCRF method in different conditions, this study provides land cover classification researchers with a solid reference regarding the performance of the coMCRF method for land cover post-classification. Second, this study provides a creative idea to reduce the smoothing effect in land cover post-classification by incorporating spectral similarity into the coMCRF method, which should also be applicable to other geostatistical models. Third, developing an integrated framework by integrating multisource data, spatial statistical models, and morphological operator reasoning for large area urban vertical and horizontal growth detection from medium resolution remotely sensed images enables us to detect and study the footprint of vertical and horizontal urbanization so that we can understand global urbanization from a new angle. Such a new technology can be transformative to urban growth study. The broader impacts of this research are concentrated on several points: The first point is that the coMCRF method and the integrated approach will be turned into open access user-friendly software with a graphical user interface (GUI) and an ArcGIS tool. Researchers and other users will be able to use them to produce high-quality land cover maps or improve the quality of existing land cover maps. The second point is that these research results will lead to better insight into urban growth in terms of horizontal and vertical dimensions, as well as the spatial and temporal relationships between urban horizontal and vertical growth and changes in socioeconomic variables. The third point is that all products will be archived and shared on the Internet.

    K-NN FOREST: a software for the non-parametric prediction and mapping of environmental variables by the k-Nearest Neighbors algorithm

    In recent decades, researchers have investigated the possibility of extending the information collected in sampling units during a field survey to wider geographical areas through the use of remotely sensed images. One of the most widely adopted approaches is based on the non-parametric k-Nearest Neighbors (k-NN) algorithm. This contribution describes the software K-NN FOREST, which we developed to provide a complete tool for implementing the k-NN technique to generate spatially explicit estimations (maps) of a response variable acquired in the field by sampling units through the use of remotely sensed data or other ancillary variables. K-NN FOREST is designed to guide the user through a graphic user interface in the different phases of the process. K-NN FOREST is freely available for download and is designed to run under the Windows environment in conjunction with the GIS software IDRISI.
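
The k-NN estimation idea described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not how K-NN FOREST is implemented: the function name `knn_predict`, the Euclidean distance, the unweighted mean, and the toy plot data are all hypothetical. Each pixel's response value is estimated as the mean of the field-measured values at the k sampling units closest to it in feature (for example, spectral) space.

```python
import math


def knn_predict(train_features, train_values, query, k=3):
    """Estimate a response value as the mean of the k nearest field samples.

    Hypothetical helper: distance is Euclidean in feature space and the
    neighbors are averaged without distance weighting.
    """
    distances = sorted(
        (math.dist(features, query), value)
        for features, value in zip(train_features, train_values)
    )
    nearest = distances[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Toy data (invented): two image features per field plot and a measured response.
plot_features = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
plot_values = [100.0, 120.0, 300.0, 320.0]

# Estimate the response for an unsampled pixel from its two nearest plots.
estimate = knn_predict(plot_features, plot_values, (0.15, 0.15), k=2)
```

Applying the same call to every pixel of an image would produce the kind of spatially explicit map the abstract refers to.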

    Land Use/Land Cover Mapping Using Multitemporal Sentinel-2 Imagery and Four Classification Methods-A Case Study from Dak Nong, Vietnam

    Information on land use and land cover (LULC) including forest cover is important for the development of strategies for land planning and management. Satellite remotely sensed data of varying resolutions have been an unmatched source of such information that can be used to produce estimates with a greater degree of confidence than traditional inventory estimates. However, use of these data has always been a challenge in tropical regions owing to the complexity of the biophysical environment, clouds, haze, and atmospheric moisture content, all of which impede accurate LULC classification. We tested a parametric classifier (logistic regression) and three non-parametric machine learning classifiers (improved k-nearest neighbors, random forests, and support vector machine) for classification of multi-temporal Sentinel 2 satellite imagery into LULC categories in Dak Nong province, Vietnam. A total of 446 images, 235 from the year 2017 and 211 from the year 2018, were pre-processed to gain high quality images for mapping LULC in the 6516 km² study area. The Sentinel 2 images were tested and classified separately for four temporal periods: (i) dry season, (ii) rainy season, (iii) the entirety of the year 2017, and (iv) the combination of dry and rainy seasons. Eleven different LULC classes were discriminated, of which five were forest classes. For each combination of temporal image set and classifier, a confusion matrix was constructed using independent reference data and pixel classifications, and the area on the ground of each class was estimated. Across all temporal periods and classifiers, overall accuracy ranged from 63.9% to 80.3%, and the Kappa coefficient ranged from 0.611 to 0.813. Area estimates for individual classes ranged from 70 km² (1% of the study area) to 2200 km² (34% of the study area), with greater uncertainties for smaller classes.
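
For readers unfamiliar with the two accuracy measures reported in this abstract, the following minimal sketch shows how overall accuracy and the Kappa coefficient are commonly derived from a confusion matrix. The 3-class matrix is invented for illustration and is not data from the study; the function names are likewise hypothetical.

```python
def overall_accuracy(matrix):
    """Fraction of reference samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's Kappa: agreement beyond what the class marginals predict by chance."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement is the sum over classes of (row share * column share).
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# Invented confusion matrix: rows are mapped classes, columns are reference classes.
example_matrix = [
    [50, 5, 5],
    [4, 30, 6],
    [6, 5, 39],
]

accuracy = overall_accuracy(example_matrix)  # 119 of 150 samples correct
kappa = cohen_kappa(example_matrix)
```

Because Kappa subtracts chance agreement, it is lower than the overall accuracy computed from the same matrix (here about 0.686 versus 0.793).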