15 research outputs found

    Comparison of Data Fusion Methods Using Crowdsourced Data in Creating a Hybrid Forest Cover Map

    Data fusion represents a powerful way of integrating individual sources of information to produce a better output than could be achieved by any of the individual sources on their own. This paper focuses on the fusion of different land cover products derived from remote sensing. In the past, many different methods have been applied, without regard to their relative merit. In this study, we compared some of the most commonly used methods to develop a hybrid forest cover map by combining available land cover/forest products and crowdsourced data on forest cover obtained through the Geo-Wiki project. The methods include: nearest neighbour, naive Bayes, logistic regression and geographically weighted logistic regression (GWR), as well as classification and regression trees (CART). We ran the comparison experiments using two data types: presence/absence of forest in a grid cell, and percentage of forest cover in a grid cell. In general, there was little difference between the methods. However, GWR was found to perform better than the other tested methods in areas with high disagreement between the inputs.
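The simplest of the compared approaches, a nearest-neighbour fusion of presence/absence labels, can be sketched as follows. All data here are invented for illustration: each reference cell carries the labels assigned by three hypothetical input products plus a crowdsourced "forest present" answer, and a new cell is classified by the closest reference cells in label space.

```python
def hamming(a, b):
    """Number of positions where two label vectors disagree."""
    return sum(x != y for x, y in zip(a, b))

def knn_fuse(train_X, train_y, query, k=1):
    """Predict forest presence for `query` (one 0/1 label per input map)
    by majority vote among the k closest reference cells."""
    ranked = sorted(zip(train_X, train_y), key=lambda t: hamming(t[0], query))
    votes = [y for _, y in ranked[:k]]
    return max(set(votes), key=votes.count)

# Each row: labels (1 = forest) from three hypothetical input products.
train_X = [(1, 1, 1), (1, 0, 1), (0, 0, 0), (0, 1, 0)]
train_y = [1, 1, 0, 0]   # crowdsourced reference: forest present?

# Classify a cell where the input products disagree.
prediction = knn_fuse(train_X, train_y, (1, 1, 0))
```

The GWR variant studied in the paper would additionally weight reference cells by geographic distance; this sketch weights only by label agreement.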

    Geographically weighted correspondence matrices for local error reporting and change analyses: mapping the spatial distribution of errors and change

    This letter describes and applies generic methods for generating local measures from the correspondence table. These were developed by integrating the functionality of two existing R packages: gwxtab and diffeR. They demonstrate how spatially explicit accuracy and error measures can be generated from local geographically weighted correspondence matrices, for example to compare classified and reference data (predicted and observed) for error analyses, and classes at times t1 and t2 for change analyses. The approaches in this letter extend earlier work that considered the measures derived from correspondence matrices in the context of generalized linear models and probability. Here the methods compute local, geographically weighted correspondence matrices, from which local statistics are directly calculated: in this case, a selection of the overall and categorical difference measures proposed by Pontius and Millones (2011) and Pontius and Santacruz (2014), as well as spatially distributed estimates of kappa coefficients and User's and Producer's accuracies. The discussion reflects on the use of the correspondence matrix in remote sensing research, the philosophical underpinnings of local rather than global approaches for modelling landscape processes, and the potential policy and scientific benefits that local approaches support.
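The core idea, a correspondence matrix whose cell counts are replaced by kernel-weighted sums around a centre location, can be sketched with invented data (the actual work uses the R packages gwxtab and diffeR; this Python version is only an illustration of the principle):

```python
import math

def gw_correspondence(points, centre, bandwidth):
    """points: (x, y, predicted_class, observed_class) tuples.
    Returns {(pred, obs): summed Gaussian kernel weight}, i.e. a local,
    geographically weighted correspondence matrix around `centre`."""
    table = {}
    for x, y, pred, obs in points:
        d = math.hypot(x - centre[0], y - centre[1])
        w = math.exp(-0.5 * (d / bandwidth) ** 2)   # Gaussian kernel
        table[(pred, obs)] = table.get((pred, obs), 0.0) + w
    return table

def local_overall_accuracy(table):
    """Share of kernel weight on the matrix diagonal (pred == obs)."""
    total = sum(table.values())
    correct = sum(w for (p, o), w in table.items() if p == o)
    return correct / total

# Invented sample points: two near the centre, two far away.
pts = [(0, 0, "forest", "forest"), (1, 0, "forest", "crop"),
       (5, 5, "crop", "crop"), (6, 5, "crop", "forest")]
t = gw_correspondence(pts, centre=(0, 0), bandwidth=2.0)
acc = local_overall_accuracy(t)
```

Moving the centre across a grid and repeating the computation yields the spatially distributed accuracy surfaces the letter describes; the bandwidth controls how local the estimate is.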

    A global dataset of crowdsourced land cover and land use reference data

    Global land cover is an essential climate variable and a key biophysical driver for earth system models. While remote sensing technology, particularly satellite imagery, has played a key role in providing land cover datasets, large discrepancies have been noted among the available products. Global land use is typically more difficult to map and in many cases cannot be remotely sensed. In-situ or ground-based data and high-resolution imagery are thus an important requirement for producing accurate land cover and land use datasets, and it is precisely such data that are lacking. Here we describe global land cover and land use reference data derived from the Geo-Wiki crowdsourcing platform via four campaigns. These global datasets provide information on human impact, land cover disagreement, wilderness, and land cover and land use. Hence, they are relevant for the scientific community that requires reference data for global satellite-derived products, as well as for those interested in monitoring global terrestrial ecosystems in general.

    Evaluation of ESA CCI prototype land cover map at 20m

    In September 2017, the ESA CCI Land Cover Team released a prototype land cover (LC) map at 20 m resolution over Africa for the year 2016. This is the first LC map produced at such a high resolution covering an entire continent. To help improve the quality of this product, we have assessed its overall accuracy and identified regions where the map should be improved. We have compared the product against two independent datasets developed within the Copernicus Global Land Services (CGLS): a reference land cover dataset at a 10 m resolution, which has been used as training data to produce the LC map at 100 m over Africa for the year 2015 (http://land.copernicus.eu/global/products/lc); and an independent validation dataset at a 10 m resolution, which has been developed by CGLS for independent assessment of land cover maps at resolutions finer than 100 m. According to our estimates, the overall accuracy of the African CCI LC map at 20 m is approximately 65%. We have highlighted regions where the spatial distribution of classes such as shrubs, crops and trees should be improved before the 20 m map can be used as input for applications such as biodiversity conservation, crop monitoring and climate modelling.
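The overall-accuracy figure reported above is conventionally derived from an error (confusion) matrix built by cross-tabulating map and validation labels. A minimal sketch, with an entirely invented matrix:

```python
def overall_accuracy(matrix):
    """matrix[i][j]: count of samples mapped as class i, observed as class j.
    Overall accuracy is the share of counts on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Rows/columns: trees, shrubs, crops (hypothetical counts).
m = [[50, 10, 5],
     [12, 30, 8],
     [6,  9, 40]]
print(round(overall_accuracy(m), 2))  # prints 0.71
```

With real validation data the matrix rows would come from the 20 m product and the columns from the independent 10 m CGLS dataset; per-class (producer's/user's) accuracies come from the same matrix's row and column sums.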

    A spatial assessment of the forest carbon budget for Ukraine

    The spatial representation of forest cover and forest parameters is a prerequisite for undertaking a systems approach to the full and verified carbon accounting of forest ecosystems over large areas. This study focuses on Ukraine, which contains a diversity of bioclimatic conditions and natural landscapes found across Europe. Ukraine has a high potential to sequester carbon dioxide through afforestation and proper forest management. This paper presents a new 2010 forest map for Ukraine at a 60 m resolution with an accuracy of 91.6 ± 0.8% (CI 0.95), which is then applied to the calculation of the carbon budget. The forest cover map and spatially distributed forest parameters were developed through the integration of remote sensing data, forest statistics, and data collected using the Geo-Wiki application, which involves visual interpretation of very high-resolution satellite imagery. The use of this map in combination with the mapping of other forest parameters has led to a decrease in the uncertainty of the forest carbon budget for Ukraine. The application of both stock-based and flux-based methods shows that Ukrainian forests have served as a net carbon sink, absorbing 11.4 ± 1.7 Tg C year−1 in 2010, which is around 25% less than the official values reported to the United Nations Framework Convention on Climate Change.

    Geographically weighted evidence combination approaches for combining discordant and inconsistent volunteered geographical information

    There is much interest in being able to combine crowdsourced data. One of the critical issues in information sciences is how to combine data or information that are discordant or inconsistent in some way. Many previous approaches have taken a majority-rules approach under the assumption that most people are correct most of the time. This paper analyses crowdsourced land cover data generated by the Geo-Wiki initiative in order to infer the land cover present at locations on a 50 km grid. It compares four evidence combination approaches (Dempster-Shafer, Bayes, Fuzzy Sets and Possibility) applied under a geographically weighted kernel with the geographically weighted average approach applied in many current Geo-Wiki analyses. A geographically weighted approach uses a moving kernel under which local analyses are undertaken. The contribution (or salience) of each data point to the analysis is weighted by its distance to the kernel centre, reflecting Tobler's 1st law of geography. A series of analyses were undertaken using different kernel sizes (or bandwidths). Each of the geographically weighted evidence combination methods generated spatially distributed measures of belief in hypotheses associated with the presence of individual land cover classes at each location on the grid. These were compared with GlobCover, a global land cover product. The results from the geographically weighted average approach in general had higher correspondence with the reference data, and this increased with bandwidth. However, for some classes other evidence combination approaches had higher correspondences, possibly because of greater ambiguity over class conceptualisations and/or lower densities of crowdsourced data. The outputs also allowed the beliefs in each class to be mapped. The differences in the soft and the crisp maps are clearly associated with the logics of each evidence combination approach and, of course, the different questions that they ask of the data. The results show that discordant data can be combined (rather than being removed from analysis) and that data integrated in this way can be parameterised by different measures of belief uncertainty. The discussion highlights a number of critical areas for future research.
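The geographically weighted average baseline is the simplest of the compared methods and can be sketched directly; the volunteer votes and coordinates below are invented, and the more elaborate evidence combination rules (Dempster-Shafer, Bayes, Fuzzy Sets, Possibility) are not shown:

```python
import math

def gw_average(votes, centre, bandwidth):
    """Kernel-weighted mean of volunteer votes around `centre`.
    votes: (x, y, value) tuples with value in [0, 1], e.g. 1 = forest reported.
    The Gaussian kernel down-weights distant contributors (Tobler's 1st law)."""
    num = den = 0.0
    for x, y, v in votes:
        d2 = (x - centre[0]) ** 2 + (y - centre[1]) ** 2
        w = math.exp(-0.5 * d2 / bandwidth ** 2)
        num += w * v
        den += w
    return num / den

# Two nearby volunteers report forest; one distant volunteer reports non-forest.
votes = [(0, 0, 1), (1, 1, 1), (10, 10, 0)]
belief = gw_average(votes, centre=(0, 0), bandwidth=3.0)
```

Increasing the bandwidth admits more distant, possibly discordant votes into each local estimate, which is why the paper's results vary with kernel size.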

    Testing two data fusion methods for multiscale and multiclass land-use/land-cover maps to improve fractional information at medium resolution

    High uncertainty is found during inter-comparison of land-use/land-cover (LULC) maps derived from remote sensing imagery. Among the reasons for classification mismatch, especially in coarse maps and heterogeneous areas characterized by mixed pixels, is that landscape heterogeneity is ignored when only the LULC class covering the largest portion of a pixel is reported. Pixels are arbitrary spatial units determined mainly by the sensor's properties and can have little relation to natural units on the ground. In fact, the use of class proportions in ground-truth training data, which better depict reality, has been shown to decrease the thematic accuracy of traditional LULC maps characterized by one LULC class per pixel. High-resolution LULC maps upscaled to coarser resolutions provide higher accuracy than natively coarse maps, and integrating available maps, rather than creating new ones, can increase the final accuracy. This project therefore explored the potential of two data fusion methods for multi-scale (high to coarse resolution) and multi-class maps to derive more accurate maps with fractional information at medium resolution (100 m). Two data fusion models were tested in four study areas containing both mixed and pure pixels, using seven LULC maps as input and a ground-truth sub-pixel database as the response variable. The models' output was then validated and compared against each individual input map, in both mixed and pure pixels, using the sub-pixel thematic accuracy matrix. To make more robust predictions and better answer the research questions of the study, the goodness of fit of the data fusion models needs to be improved. Nevertheless, it was observed that multi-scale and multi-class data fusion improved the sub-pixel accuracy of some LULC classes compared to some of the input maps, especially in mixed pixels.
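The notion of fusing per-pixel class fractions from several input maps can be illustrated with a toy accuracy-weighted average (all numbers and weights invented; the study fits statistical models rather than using a fixed weighting):

```python
def fuse_fractions(estimates, weights):
    """estimates: one {class: fraction} dict per input map for a target pixel.
    Returns a single fused {class: fraction} dict, normalising by total weight."""
    total_w = sum(weights)
    classes = {c for est in estimates for c in est}
    return {c: sum(w * est.get(c, 0.0) for est, w in zip(estimates, weights)) / total_w
            for c in classes}

# Two hypothetical input maps' fraction estimates for one 100 m pixel;
# the second map is trusted three times as much.
maps = [{"forest": 0.6, "crop": 0.4}, {"forest": 0.8, "crop": 0.2}]
fused = fuse_fractions(maps, weights=[1.0, 3.0])
```

Fractions per pixel remain a valid composition (they still sum to 1 when the inputs do), which is what makes the fused output directly comparable to a sub-pixel ground-truth database.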

    Land cover harmonization using Latent Dirichlet Allocation

    Large-area land cover maps are produced to satisfy different information needs. Land cover maps with partial or complete spatial and/or temporal overlap, different legends, and varying accuracies for similar classes are increasingly common. To address these concerns and combine two 30-m resolution land cover products, we implemented a harmonization procedure using a Latent Dirichlet Allocation (LDA) model. The LDA model used regionalized class co-occurrences from multiple maps to generate a harmonized class label for each pixel by statistically characterizing land attributes from the class co-occurrences. We evaluated multiple harmonization approaches: using the LDA model alone and in combination with more commonly used information sources for harmonization (i.e. error matrices and semantic affinity scores). The results were compared with benchmark maps generated using simple legend crosswalks and showed that using LDA outputs with error matrices performed better, increasing harmonized map overall accuracy by 6–19% for areas of disagreement between the source maps. Our results revealed the importance of error matrices to harmonization, since excluding error matrices reduced overall accuracy by 4–20%. The LDA-based harmonization approach demonstrated in this paper is quantitative, transparent, portable, and efficient at leveraging the strengths of multiple land cover maps over large areas.
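The input the LDA step consumes can be sketched as follows: per-region counts of class-label co-occurrences from the two source maps, analogous to word counts per document in topic modelling. The labels and regions below are invented, and fitting the actual LDA model (e.g. with a library implementation such as scikit-learn's LatentDirichletAllocation) is not shown:

```python
from collections import Counter

def cooccurrence_counts(map_a, map_b, region_ids):
    """Build one Counter of (label_a, label_b) 'words' per region ('document').
    map_a, map_b: per-pixel class labels from the two source maps.
    region_ids: the region each pixel belongs to."""
    docs = {}
    for a, b, r in zip(map_a, map_b, region_ids):
        docs.setdefault(r, Counter())[(a, b)] += 1
    return docs

# Hypothetical per-pixel labels from two maps with different legends.
map_a   = ["forest", "forest", "crop", "crop"]
map_b   = ["trees",  "trees",  "crop", "trees"]
regions = [1, 1, 1, 2]
docs = cooccurrence_counts(map_a, map_b, regions)
```

In the paper's workflow, the fitted topics characterize regional land attributes, and each pixel's co-occurrence pair is then resolved to a harmonized label, optionally reweighted by error matrices and semantic affinity scores.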

    Understanding MapSwipe: Analysing Data Quality of Crowdsourced Classifications on Human Settlements

    Geodata needed to populate maps used by local communities is often missing. Efforts to fill these gaps, ideally automatically, by deriving data on human settlements from aerial or satellite imagery are of current concern (Esch et al., 2013; Pesaresi et al., 2013; Voigt et al., 2007). Alongside semi-automated methods and pre-processed data products, crowdsourcing is another tool that can help to collect information on human settlements and complement existing data, yet its accuracy is debated (Goodchild and Li, 2012; Haklay, 2010; Senaratne et al., 2016). Here the quality of data produced by volunteers using the MapSwipe app was investigated. Three different intrinsic parameters of crowdsourced data and their impact on data quality were examined: agreement, user characteristics and spatial characteristics. Additionally, a novel mechanism based on machine learning techniques was presented to aggregate data provided by multiple users. The results show that a random forest based aggregation of crowdsourced classifications from MapSwipe can produce high quality data in comparison to state-of-the-art products derived from satellite imagery. High agreement serves as an indicator for correct classifications. Intrinsic user characteristics can be utilized to identify consistently incorrect classifications. Classifications that are spatial outliers show a higher error rate. The findings indicate that the integration of machine learning techniques into existing crowdsourcing workflows can become a key element in the future development of crowdsourcing applications.
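The feature-construction step behind such an aggregation can be sketched with invented data: per-tile agreement and user reliability are exactly the kinds of intrinsic parameters a random forest (e.g. scikit-learn's RandomForestClassifier, omitted here) could consume. The user names and reliability scores below are hypothetical:

```python
def tile_features(votes, user_reliability):
    """Turn several volunteers' 0/1 classifications of one image tile into
    aggregation features: (share voting 'settlement', mean reliability of voters).
    votes: {user: 0/1}; user_reliability: {user: score in [0, 1]}."""
    n = len(votes)
    agreement = sum(votes.values()) / n
    mean_rel = sum(user_reliability[u] for u in votes) / n
    return agreement, mean_rel

# Hypothetical tile: two of three volunteers report a settlement.
votes = {"user_a": 1, "user_b": 1, "user_c": 0}
reliability = {"user_a": 0.9, "user_b": 0.7, "user_c": 0.4}
features = tile_features(votes, reliability)
```

A classifier trained on such features against reference labels can then learn, for instance, that high agreement among reliable users predicts a correct "settlement" classification, which is the intuition the abstract reports.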