A contribution to land cover and land use mapping in Portugal with multi-temporal Sentinel-2 data and supervised classification

Abstract

Dissertation presented as the partial requirement for obtaining a Master's degree in Geographic Information Systems and Science

Remote sensing techniques have been widely employed to map and monitor land cover and land use, important elements for the description of the environment. The current land cover and land use mapping paradigm takes advantage of a variety of data options with proper spatial, spectral and temporal resolutions, along with advances in technology. This has enabled the creation of automated data processing workflows integrated with classification algorithms to accurately map large areas with multi-temporal data. In Portugal, the General Directorate for Territory (DGT) is developing an operational Land Cover Monitoring System (SMOS), which includes an annual land cover cartography product (COSsim) based on an automatic process using supervised classification of multi-temporal Sentinel-2 data. In this context, a range of experiments is being conducted to improve map accuracy and classification efficiency. This study provides a contribution to DGT's work. A classification of the biogeographic region of Trás-os-Montes in the North of Portugal was performed for the agricultural year of 2018 using Random Forest and an intra-annual multi-temporal Sentinel-2 dataset, with stratification of the study area and a combination of manually and automatically extracted training samples, the latter being based on existing reference datasets. This classification was compared to a benchmark classification, conducted without stratification and with automatically collected training data only. In addition, the influence of training sample size on classification accuracy was assessed. The main focus of this study was to investigate whether the use of classification uncertainty to create an improved training dataset could increase classification accuracy.
Additional training samples were extracted from areas of high classification uncertainty, a new classification was performed, and the results were compared. Classification accuracy for all proposed experiments was assessed using the overall accuracy, precision, recall and F1-score. The use of stratification and a combination of training strategies resulted in a classification accuracy of 66.7%, in contrast to 60.2% for the benchmark classification. Although the difference was considered not statistically significant, visual inspection of both maps indicated that stratification and the introduction of manual training helped map land cover more accurately in some areas. Regarding the influence of sample size on classification accuracy, the results indicated a small difference in accuracy, considered not statistically significant, even after a reduction of over 90% in the sample size. This supports the findings of other studies which suggested that Random Forest has low sensitivity to variations in training sample size. However, the results might have been influenced by the training strategy employed, which uses spectral subclasses, thus creating spectral diversity in the samples independently of their size. With respect to the use of classification uncertainty to improve the training sample, a slight increase in accuracy of approximately 1% was observed, which was considered not statistically significant. This result could have been affected by limitations in the process of collecting additional sampling units, which resulted in a lack of additional training for some classes (e.g. agriculture) and an overall imbalanced training dataset. Additionally, some classes had their additional training sampling units collected from a limited number of polygons, which could limit the spectral diversity of the new samples.
Nevertheless, visual inspection of the map suggested that the new training helped reduce confusion between some classes, improving map agreement with ground truth. Further investigation can be conducted to explore more deeply the potential of classification uncertainty, especially focusing on addressing problems related to the collection of the additional samples.
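
The uncertainty-driven sampling step and the accuracy measures named above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not state which uncertainty measure was used, so normalized Shannon entropy of per-pixel class probabilities (for example, Random Forest vote proportions) is assumed here, and all function names are hypothetical.

```python
from math import log


def entropy_uncertainty(probs):
    """Normalized Shannon entropy of one pixel's class probabilities.

    Assumed uncertainty measure: 0 means fully certain, 1 means all
    classes are equally likely.
    """
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * log(p) for p in probs if p > 0)
    return h / log(n)


def most_uncertain(prob_rows, k):
    """Indices of the k pixels with the highest uncertainty.

    These are candidate locations for collecting additional training samples.
    """
    scores = [entropy_uncertainty(row) for row in prob_rows]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def overall_accuracy(y_true, y_pred):
    """Fraction of reference samples whose predicted class matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1-score for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In a workflow of this kind, `most_uncertain` would be applied to the class probabilities of the first classification, the selected locations would be labelled and added to the training set, and the classifier would be retrained and re-evaluated with `overall_accuracy` and `per_class_scores`.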
