592 research outputs found

    Exploring issues of balanced versus imbalanced samples in mapping grass community in the Telperion reserve using high resolution images and selected machine learning algorithms

    Accurate vegetation mapping is essential for a number of reasons, including conservation. The main objective of this research was to map different grass communities in the game reserve using RapidEye and Sentinel-2 MSI images and two machine learning classifiers, support vector machine (SVM) and random forest (RF). The study tested the impact of balanced and imbalanced training data on the performance and accuracy of SVM and RF in mapping the grass communities, and the sensitivity of pixel resolution to balanced and imbalanced training data in image classification. The imbalanced and balanced data sets were obtained through field data collection. The results show that RF and SVM produce a high overall accuracy for Sentinel-2 imagery with both the balanced and imbalanced data sets. The RF classifier yielded an overall accuracy of 79.45% and kappa of 74.38% using imbalanced training data, and an overall accuracy of 76.19% and kappa of 73.21% using balanced training data. The SVM classifier yielded an overall accuracy of 82.54% and kappa of 80.36% using imbalanced training data, and an overall accuracy of 82.21% and kappa of 78.33% using balanced training data. For the RapidEye imagery, the balanced data set reduced the overall accuracy of both algorithms: RF dropped by about 5 percentage points (from 63.24% to 57.94%) and SVM by about 7 percentage points (from 57.31% to 50.79%). The results therefore indicate that, for image classification of vegetation species, the imbalanced data set is a better option than the balanced data set. The study recommends implementing ways of handling misclassification among the different grass species to improve classification in future research. Further research can be carried out on other types of high resolution multispectral imagery using different advanced algorithms and different training sample sizes.
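The abstract above reports overall accuracy and kappa for each classifier. As a rough illustration of how these two figures are usually derived from a confusion matrix, here is a minimal Python sketch. The function name `accuracy_and_kappa` and the 3-class matrix are hypothetical and are not taken from the study; kappa is computed as Cohen's kappa, which is an assumption about the metric the authors used.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 3-class confusion matrix (not data from the study).
example = [
    [40, 5, 5],
    [4, 30, 6],
    [6, 4, 50],
]
overall_accuracy, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {overall_accuracy:.2%}, kappa = {kappa:.2%}")
```

Because kappa discounts agreement expected by chance, it is lower than overall accuracy whenever chance agreement is above zero, which matches the pattern in the figures reported above.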

    Exploring synergetic effects of dimensionality reduction and resampling tools on hyperspectral imagery data classification

    The present paper addresses the problem of classifying hyperspectral images with multiple imbalanced classes and very high dimensionality. Class imbalance is handled by resampling the data set, whereas PCA and a supervised filter are applied to reduce the number of spectral bands. This preliminary study aims to investigate the benefits of combining several techniques to tackle the imbalance and high dimensionality problems, and to evaluate which order of application leads to the best classification performance. Experimental results demonstrate the significance of using these two preprocessing tools together to improve the performance of hyperspectral imagery classification. Although the most effective order appears to be a resampling strategy first and then a feature selection (or extraction) algorithm, this question still needs much more thorough investigation in the future. This work has partially been supported by the Spanish Ministry of Education and Science under grants CSD2007–00018, AYA2008–05965–0596 and TIN2009–14205, the Fundació Caixa Castelló–Bancaixa under grant P1–1B2009–04, and the Generalitat Valenciana under grant PROMETEO/2010/02.

    Using LUCAS survey and Recurrent Neural Networks to produce LCLU classification based on a Satellite Image time series of Sentinel-2

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The need for timely and accurate information about the territory has increased over the years, making Land Cover Land Use (LCLU) mapping one of the most common applications of remote sensing. Recently, advances in satellite technology and open access policies for remote sensing data have increased interest in exploring satellite image time series. In addition, the attention of researchers has shifted from standard machine learning algorithms (e.g., Support Vector Machines and Random Forest) to Recurrent Neural Networks because of their ability to exploit sequential information. However, acquiring reference data to train these algorithms is still a hurdle. This study aims to evaluate the capability of a Gated Recurrent Unit to perform pixel-level LCLU classification of a satellite image time series, using Sentinel-2 imagery with the LUCAS survey as reference data. To assess the performance of our model, we compared it to state-of-the-art classifiers (SVM and RF). Because of the unbalanced nature of the LUCAS survey, we applied oversampling to this dataset to increase the performance of our models, testing three different oversampling techniques. The results showed that Recurrent Neural Networks did not outperform the other state-of-the-art algorithms when trained with a limited number of sampling units, and that oversampling the LUCAS survey increased the performance of all the classifiers. Finally, we demonstrated that it is possible to produce LCLU classification of satellite image time series using only open-source data, with Sentinel-2 imagery and the LUCAS survey as reference data.
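The abstract above does not name the three oversampling techniques it tested. Purely as an illustration of the general idea, the following Python sketch uses plain random oversampling, which is an assumed stand-in rather than the dissertation's method: samples from minority classes are duplicated at random until every class matches the size of the largest one. The function name `random_oversample` and the toy labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size.

    Returns new (samples, labels) lists; the inputs are left unchanged.
    """
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    if not by_class:
        return [], []

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical pixel feature vectors with an imbalanced label set.
pixels = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.5, 0.4], [0.6, 0.5]]
classes = ["cropland", "cropland", "cropland", "water", "forest", "forest"]

balanced_pixels, balanced_classes = random_oversample(pixels, classes)
print(Counter(balanced_classes))
```

In practice, oversampling is normally applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.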

    Large Area Land Cover Mapping Using Deep Neural Networks and Landsat Time-Series Observations

    This dissertation focuses on the analysis and implementation of deep learning methodologies in remote sensing to enhance land cover classification accuracy, which has important applications in many areas of environmental planning and natural resources management. The first manuscript conducted a land cover analysis on 26 Landsat scenes in the United States, considering six classifier variants. An extensive grid search was conducted to optimize classifier parameters using only the spectral components of each pixel. Results showed that deep networks using only spectral components offered no gain over conventional classifiers, possibly due to the small reference sample size and the richness of features. The effects of changing training data size, class distribution, and scene heterogeneity were also studied, and all were found to have a significant effect on classifier accuracy. The second manuscript reviewed 103 research papers on the application of deep learning methodologies in remote sensing, with emphasis on per-pixel classification of mono-temporal data using the spectral and spatial data dimensions. A meta-analysis quantified deep network architecture improvement over selected convolutional classifiers. The effects of network size, learning methodology, input data dimensionality, and training data size were also studied, with deep models providing enhanced performance over conventional ones when using spectral and spatial data. The analysis found that the input dataset was a major limitation and that available datasets have already been utilized to their maximum capacity. The third manuscript described the steps to build the full environment for dataset generation based on Landsat time-series data, using the spectral, spatial, and temporal information available for each pixel. A large dataset containing one sample block from each of 84 ecoregions in the conterminous United States (CONUS) was created and then processed by a hybrid convolutional+recurrent deep network, and the network structure was optimized with thousands of simulations. The developed model achieved an overall accuracy of 98% on the test dataset. The model was also evaluated for its overall and per-class performance under different conditions, including individual blocks, individual or combined Landsat sensors, and different sequence lengths. The analysis found that although the deep model outperforms the other candidates on each block, its performance still varies considerably from block to block. This suggests extending the work by fine-tuning the model for local areas. The analysis also found that including more time stamps or combining observations from different Landsat sensors in the model input significantly enhances model performance.

    Introducing artificial data generation in active learning for land use/land cover classification

    Fonseca, J., Douzas, G., & Bacao, F. (2021). Increasing the effectiveness of active learning: Introducing artificial data generation in active learning for land use/land cover classification. Remote Sensing, 13(13), 1-20. [2619]. https://doi.org/10.3390/rs13132619 In remote sensing, Active Learning (AL) has become an important technique to collect informative ground truth data “on-demand” for supervised classification tasks. Despite its effectiveness, it is still significantly reliant on user interaction, which makes it both expensive and time consuming to implement. Most of the current literature focuses on the optimization of AL by modifying the selection criteria and the classifiers used. Although improvements in these areas will result in more effective data collection, the use of artificial data sources to reduce human–computer interaction remains unexplored. In this paper, we introduce a new component to the typical AL framework, the data generator, a source of artificial data to reduce the amount of user-labeled data required in AL. The implementation of the proposed AL framework is done using Geometric SMOTE as the data generator. We compare the new AL framework to the original one using similar acquisition functions and classifiers over three AL-specific performance metrics in seven benchmark datasets. We show that this modification of the AL framework significantly reduces cost and time requirements for a successful AL implementation in all of the datasets used in the experiment.

    Mapping urban tree species in a tropical environment using airborne multispectral and LiDAR data

    Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. An accurate and up-to-date urban tree inventory is an essential resource for developing strategies towards sustainable urban planning, as well as for effective management and preservation of biodiversity. Trees contribute to thermal comfort within urban centers by lessening the heat island effect and have a direct impact on the reduction of air pollution. However, mapping individual tree species normally involves time-consuming field work over large areas or image interpretation performed by specialists. The integration of airborne LiDAR data with high-spatial-resolution multispectral aerial imagery is an alternative and effective approach to differentiate tree species at the individual crown level. This thesis aims to investigate the potential of such remotely sensed data to discriminate 5 common urban tree species using traditional Machine Learning classifiers (Random Forest, Support Vector Machine, and k-Nearest Neighbors) in the tropical environment of Salvador, Brazil. Vegetation indices and texture information extracted from multispectral imagery, and LiDAR-derived variables for tree crowns, were tested separately and combined to perform tree species classification with the three classifiers. Random Forest outperformed the other two classifiers, reaching an overall accuracy of 82.5% when using combined multispectral and LiDAR data. The results indicate that (1) given the similarity in spectral signature, multispectral data alone is not sufficient to distinguish tropical tree species (only the k-NN classifier could detect all species); (2) height values and intensity of crown return points were the most relevant LiDAR features, and the combination of both datasets improved accuracy by up to 20%; (3) generating a canopy height model from the LiDAR point cloud is an effective method to delineate individual tree crowns in a semi-automatic approach.

    Classification of Tree Species in a Diverse African Agroforestry Landscape Using Imaging Spectroscopy and Laser Scanning

    Airborne imaging spectroscopy (IS) and laser scanning (ALS) have been explored widely for tree species classification during the past decades. However, African agroforestry areas, where a few exotic tree species are dominant and many native species occur less frequently, have not yet been studied. Obtaining maps of tree species would provide useful information for the characterization of agroforestry systems and for detecting invasive species. Our objective was to study tree species classification in a diverse tropical landscape using IS and ALS data at the tree crown level, with primary interest in the exotic tree species. We performed multiple analyses based on different IS and ALS feature sets, identified important features using feature selection, and evaluated the impact of combining the two data sources. Given that a high number of tree species with limited sample size (499 samples for 31 species) was expected to limit the classification accuracy, we tested different approaches to group the species based on the frequency of their occurrence and Jeffries-Matusita (JM) distance. Surface reflectance at wavelengths between 400-450 nm and 750-800 nm, and height to crown width ratio, were identified as important features. Nonetheless, a selection of minimum noise fraction (MNF) transformed reflectance bands showed superior performance. The support vector machine classifier performed slightly better than the random forest classifier, but the improvement was not statistically significant for the best performing feature set. The highest F1-scores were achieved when each of the species was classified separately against a mixed group of all other species, which makes this approach suitable for invasive species detection. Our results are valuable for organizations working on biodiversity conservation and improving agroforestry practices, as we showed how the non-native Eucalyptus spp., Acacia mearnsii and Grevillea robusta (mean F1-scores 76%, 79% and 89%, respectively) trees can be mapped with good accuracy. We also found a group of six fruit-bearing trees using JM distance, which was classified with a mean F1-score of 65%. This was a useful finding, as these species could not be classified with acceptable accuracy individually, while they all share common economic and ecological importance. Peer reviewed.
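The abstract above groups species using Jeffries-Matusita (JM) distance but does not give the formula it used. As a hedged illustration, the Python sketch below computes JM distance for a single feature under the common assumption that each class is normally distributed, using the form JM = 2(1 - exp(-B)), where B is the Bhattacharyya distance. The study most likely worked with multivariate spectral features, so this one-feature version, the function name `jeffries_matusita`, and the example numbers are all simplifying assumptions.

```python
import math


def jeffries_matusita(mean_a, var_a, mean_b, var_b):
    """JM distance between two univariate Gaussian classes.

    Ranges from 0 (identical distributions) to 2 (fully separable).
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")

    # Bhattacharyya distance for two univariate normal distributions:
    # a term for the difference in means plus a term for the difference in variances.
    mean_term = (mean_a - mean_b) ** 2 / (4.0 * (var_a + var_b))
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    bhattacharyya = mean_term + var_term

    return 2.0 * (1.0 - math.exp(-bhattacharyya))


# Hypothetical reflectance statistics (mean, variance) for two species in one band.
close_pair = jeffries_matusita(0.30, 0.01, 0.32, 0.01)
distant_pair = jeffries_matusita(0.30, 0.01, 0.80, 0.01)
print(f"similar species: {close_pair:.3f}, distinct species: {distant_pair:.3f}")
```

Under this convention, pairs of species with a low JM distance are hard to separate spectrally, which is one plausible reason to merge them into a single group before classification.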