
    An Exploration of Some Pitfalls of Thematic Map Assessment Using the New Map Tools Resource

    A variety of metrics are commonly employed by map producers and users to assess and compare the quality of thematic maps, but their use and interpretation are inconsistent. This problem is exacerbated by a shortage of tools that allow easy calculation and comparison of metrics from different maps or as a map's legend is changed. In this paper, we introduce a new website and a collection of R functions to facilitate map assessment. We apply these tools to illustrate some pitfalls of error metrics and point out existing and newly developed solutions to them. Some of these problems have been noted before, but all of them are under-appreciated and persist in the published literature. We show that binary and categorical metrics that include information about true-negative classifications are inflated for rare categories, and that more robust alternatives should be chosen. Most metrics are useful for comparing maps only if their legends are identical. We also demonstrate that combining land-cover classes has the often-neglected consequence of apparent improvement, particularly if the combined classes are easily confused (e.g., different forest types). However, we show that the average mutual information (AMI) of a map is relatively robust to combining classes and reflects the information that is lost in this process; we also introduce a modified AMI metric that credits only correct classifications. Finally, we introduce a method of evaluating statistical differences in the information content of competing maps, and show that this method is an improvement over other methods in more common use. We end with a series of recommendations for the meaningful use of accuracy metrics by map users and producers.
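
    The apparent-improvement effect described above can be sketched numerically. The confusion matrix below and the merge of two confusable classes are invented for illustration (they are not taken from the paper's website or R functions), and the plain mutual information of the confusion matrix is used as a stand-in for the paper's AMI:

```python
import numpy as np

def mutual_information(cm):
    """Mutual information (bits) between map and reference labels,
    computed from a confusion matrix of counts."""
    p = cm / cm.sum()                  # joint distribution
    pm = p.sum(axis=1, keepdims=True)  # map (row) marginals
    pr = p.sum(axis=0, keepdims=True)  # reference (column) marginals
    mask = p > 0
    return float((p[mask] * np.log2(p[mask] / (pm @ pr)[mask])).sum())

# Rows = map classes, columns = reference classes; classes 0 and 1
# (say, two forest types) are easily confused with each other.
cm = np.array([[40, 10,  0],
               [ 5, 35,  5],
               [ 0,  5, 50]])
print("accuracy:", np.trace(cm) / cm.sum())
print("MI (bits):", round(mutual_information(cm), 3))

# Combining the two confusable classes inflates accuracy, while the
# mutual information can only stay the same or drop (the merged map
# carries less information, and the metric reflects that).
merged = np.array([[cm[:2, :2].sum(), cm[:2, 2].sum()],
                   [cm[2, :2].sum(),  cm[2, 2]]])
print("merged accuracy:", np.trace(merged) / merged.sum())
print("merged MI (bits):", round(mutual_information(merged), 3))
```

    Merging raises the overall accuracy (here from roughly 0.83 to 0.93) while the information content does not improve, which is exactly the pitfall the abstract warns about.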

    Cropland Capture – A Game for Improving Global Cropland Maps

    Current satellite-derived global land-cover products, which are crucial for many modelling and monitoring applications, show large disagreements when compared with each other. To help improve global land cover (in particular the cropland class), we developed a game called Cropland Capture. This is a simple cross-platform game for collecting image classifications that will be used to develop and validate global cropland maps in the future. In this paper, we describe the design of Cropland Capture in detail, including aspects such as simplicity, efficiency of data collection, and the mechanisms implemented to ensure data quality. We also discuss the impact of incentives on attracting and sustaining players in the game.

    Cropland Capture: A gaming approach to improve global land cover

    Papers, communications, and posters presented at the 17th AGILE Conference on Geographic Information Science "Connecting a Digital Europe through Location and Place", held at the Universitat Jaume I, 3-6 June 2014. Accurate and reliable information on global cropland extent is needed for a number of applications, e.g. to estimate potential yield losses in the wake of a drought or to assess future scenarios of climate change on crop production. However, current global land-cover and cropland products are not accurate enough for many of these applications. One way forward is to increase the amount of data used to create these maps, as well as the data available for validation. One method for doing this is to involve citizens in the classification of satellite imagery, as undertaken with the Geo-Wiki tool. This paper outlines Cropland Capture, a simplified game version of Geo-Wiki in which players classify satellite imagery based on whether or not they can see evidence of cropland. An overview of the game is provided, along with some initial results from the first three months of game play. The paper concludes with a discussion of the future steps in this research.

    Fractal-like Distributions over the Rational Numbers in High-throughput Biological and Clinical Data

    Recent developments in extracting and processing biological and clinical data are enabling quantitative approaches to studying living systems. High-throughput sequencing, expression profiling, proteomics, and electronic health records are some examples of such technologies. Extracting meaningful information from these technologies requires careful analysis of the large volumes of data they produce. In this note, we present a set of distributions that commonly appear in the analysis of such data. These distributions have some interesting features: they are discontinuous at the rational numbers but continuous at the irrational numbers, and they possess a certain self-similar (fractal-like) structure. The first set of examples presented here is drawn from a high-throughput sequencing experiment, where the self-similar distributions appear in the evaluation of the error rate of the sequencing technology and in the identification of tumorigenic genomic alterations. The other examples are obtained from risk-factor evaluation and from the analysis of relative disease prevalence and co-morbidity as these appear in electronic clinical data. The distributions are also relevant to the identification of subclonal populations in tumors and to the study of the evolution of infectious diseases, more precisely the study of quasi-species and the intra-host diversity of viral populations.
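
    A minimal simulation (hypothetical parameters, not the paper's data) shows how such rational-valued, fractal-like distributions can arise: a rate estimated as k/n from small, variable counts takes only rational values, and its mass concentrates on fractions with small denominators, reminiscent of the structure of Thomae's function:

```python
import random
from collections import Counter
from fractions import Fraction

random.seed(0)

# Each "site" is covered by n reads (n small and variable) and a variant
# is seen k ~ Binomial(n, p) times; the estimated rate k/n is a rational
# whose reduced denominator divides n.
p_true = 0.3
rates = Counter()
for _ in range(50_000):
    n = random.randint(1, 12)                        # shallow coverage
    k = sum(random.random() < p_true for _ in range(n))
    rates[Fraction(k, n)] += 1                       # reduced fraction

# The empirical distribution piles up on low-denominator rationals
# (0, 1/2, 1/3, ...), producing a spiky, self-similar pattern.
for frac, cnt in sorted(rates.items(), key=lambda kv: -kv[1])[:8]:
    print(f"{str(frac):>6}  {cnt / 50_000:.3f}")
```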

    Improved Vote Aggregation Techniques for the Geo-Wiki Cropland Capture Crowdsourcing Game

    Crowdsourcing is a new approach to solving data-processing problems for which conventional methods appear to be inaccurate, expensive, or time-consuming. Nowadays, the development of new crowdsourcing techniques is mostly motivated by so-called Big Data problems, including problems of assessment and clustering for large datasets obtained in aerospace imaging, remote sensing, and even social-network analysis. By involving volunteers from all over the world, the Geo-Wiki project tackles problems of environmental monitoring with applications to flood resilience, biomass data analysis, and classification of land cover. For example, the Cropland Capture game, a gamified version of Geo-Wiki, was developed to aid in the mapping of cultivated land and was used to gather 4.5 million classifications of images of the Earth's surface. More recently, the Picture Pile game, a more generalized version of Cropland Capture, aims to identify tree loss over time from pairs of very-high-resolution satellite images. Despite recent progress in image analysis, the solution to these problems is hard to automate, since human experts still outperform the majority of machine-learning algorithms and artificial systems on certain image-recognition tasks in this field. The replacement of rare and expensive experts by a team of distributed volunteers seems promising, but this approach leads to challenging questions: how can individual opinions be aggregated optimally, how can confidence bounds be obtained, and how can the unreliability of volunteers be dealt with? In this paper, on the basis of several known machine-learning techniques, we propose a technical approach to improve the overall performance of the majority-voting decision rule used in the Cropland Capture game. The proposed approach increases the estimated consistency with expert opinion from 77% to 86%.
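
    One standard way to move beyond the plain majority rule, in the spirit of (but not identical to) the approach proposed in the paper, is to weight each volunteer's vote by the log-odds of an estimated reliability. The item names, volunteer names, and accuracies below are hypothetical:

```python
import math
from collections import defaultdict

def weighted_majority(votes, accuracy):
    """Aggregate binary votes per item, weighting each volunteer by the
    log-odds of their estimated probability of being correct.

    votes:    list of (item_id, volunteer_id, label in {0, 1})
    accuracy: dict volunteer_id -> estimated accuracy in (0, 1)
    """
    score = defaultdict(float)
    for item, vol, label in votes:
        w = math.log(accuracy[vol] / (1.0 - accuracy[vol]))
        score[item] += w if label == 1 else -w
    return {item: int(s > 0) for item, s in score.items()}

votes = [("img1", "a", 1), ("img1", "b", 1), ("img1", "c", 0),
         ("img2", "a", 0), ("img2", "b", 1), ("img2", "c", 1)]
accuracy = {"a": 0.95, "b": 0.60, "c": 0.55}  # hypothetical reliabilities
print(weighted_majority(votes, accuracy))     # {'img1': 1, 'img2': 0}
```

    On img2 the two unreliable volunteers are outvoted by the single reliable one, which the plain majority rule would decide the other way.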

    How to Increase the Accuracy of Crowdsourcing Campaigns?

    Crowdsourcing is a new approach to performing tasks with a group of volunteers rather than experts. For example, the Geo-Wiki project [1] aims to improve the global land-cover map through crowdsourced image recognition. Though crowdsourcing gives a simple way to perform tasks that are hard to automate, the analysis of data received from non-experts is a challenging problem that requires a holistic approach. Here we study in detail the dataset of the Cropland Capture game (part of the Geo-Wiki project) in order to increase the accuracy of the campaign's results. Using this analysis, we developed a methodology for a generic type of crowdsourcing campaign similar to the Cropland Capture game. The proposed methodology relies on computer-vision and machine-learning techniques. Using the Cropland Capture dataset, we show that our methodology increases the agreement between aggregated volunteers' votes and experts' decisions from 77% to 86%. [1] Fritz, Steffen, et al. "Geo-Wiki.Org: The use of crowdsourcing to improve global land cover." Remote Sensing 1.3 (2009): 345-354.
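
    When volunteer reliabilities are not known in advance, a classical option is to estimate the item labels and the per-volunteer accuracies jointly by expectation-maximization (Dawid and Skene, 1979). The sketch below, run on simulated votes, illustrates that idea as background; it is not necessarily the methodology developed in the paper:

```python
import numpy as np

def dawid_skene_binary(votes, n_items, n_workers, iters=50):
    """EM estimation of binary item labels and per-worker accuracies.

    votes: list of (item, worker, label in {0, 1})
    Returns (posterior P(label=1) per item, estimated accuracy per worker).
    """
    # Initialize posteriors from the plain majority vote.
    ones = np.zeros(n_items)
    total = np.zeros(n_items)
    for i, w, l in votes:
        ones[i] += l
        total[i] += 1
    post = ones / np.maximum(total, 1)
    for _ in range(iters):
        # M-step: accuracy = expected fraction of correct votes.
        correct = np.zeros(n_workers)
        count = np.zeros(n_workers)
        for i, w, l in votes:
            correct[w] += post[i] if l == 1 else 1.0 - post[i]
            count[w] += 1
        acc = np.clip(correct / np.maximum(count, 1), 1e-3, 1 - 1e-3)
        # E-step: per-item log-odds of label 1 under a uniform prior.
        logodds = np.zeros(n_items)
        for i, w, l in votes:
            r = np.log(acc[w] / (1.0 - acc[w]))
            logodds[i] += r if l == 1 else -r
        post = 1.0 / (1.0 + np.exp(-logodds))
    return post, acc

# Simulated campaign: 200 items, two reliable and three careless workers.
rng = np.random.default_rng(1)
truth = rng.integers(0, 2, 200)
true_acc = [0.90, 0.85, 0.55, 0.55, 0.55]
votes = [(i, w, int(truth[i]) if rng.random() < true_acc[w] else 1 - int(truth[i]))
         for i in range(200) for w in range(5)]
post, acc = dawid_skene_binary(votes, 200, 5)
labels = (post > 0.5).astype(int)
print("agreement with ground truth:", (labels == truth).mean())
```

    The EM loop typically recovers both which workers are reliable and labels closer to the truth than the unweighted majority vote.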

    Lanczos exact diagonalization study of field-induced phase transition for Ising and Heisenberg antiferromagnets

    Using an exact-diagonalization treatment of Ising and Heisenberg model Hamiltonians, we study the field-induced phase transition for two-dimensional antiferromagnets. For the Ising antiferromagnet the predicted field-induced phase transition is of first order, while for the Heisenberg antiferromagnet it is a second-order transition. We find from the exact-diagonalization calculations that the second-order phase transition (metamagnetism) occurs through a spin-flop process as an intermediate step. Comment: 4 pages, 4 figures.
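
    The approach can be illustrated on a toy far smaller than the two-dimensional lattices studied here: a four-site spin-1/2 Heisenberg antiferromagnetic ring in a longitudinal field, diagonalized densely (standing in for the Lanczos algorithm, which becomes necessary only at larger sizes). The conventions below are the standard spin-operator ones, and the code is a sketch, not the authors':

```python
import numpy as np

sz = np.array([[0.5, 0.0], [0.0, -0.5]])   # S^z for spin 1/2
sp = np.array([[0.0, 1.0], [0.0, 0.0]])    # S^+
sm = sp.T                                  # S^-
I2 = np.eye(2)

def site_op(op, i, n):
    """Embed a single-site operator at site i of an n-site system."""
    out = np.array([[1.0]])
    for j in range(n):
        out = np.kron(out, op if j == i else I2)
    return out

def heisenberg_ring(n, J=1.0, h=0.0):
    """H = J sum_i S_i . S_{i+1} - h sum_i S^z_i on a ring of n sites."""
    dim = 2 ** n
    H = np.zeros((dim, dim))
    for i in range(n):
        j = (i + 1) % n  # periodic boundary
        H += J * (site_op(sz, i, n) @ site_op(sz, j, n)
                  + 0.5 * (site_op(sp, i, n) @ site_op(sm, j, n)
                           + site_op(sm, i, n) @ site_op(sp, j, n)))
        H -= h * site_op(sz, i, n)
    return H

# Ground-state total magnetization rises in steps as the field closes
# the spin gap (level crossings between total-S^z sectors).
for h in (0.0, 1.5, 3.0):
    E, V = np.linalg.eigh(heisenberg_ring(4, h=h))
    mz = sum(V[:, 0] @ site_op(sz, i, 4) @ V[:, 0] for i in range(4))
    print(f"h = {h:3.1f}   E0 = {E[0]:+.3f}   <Sz_total> = {mz:+.2f}")
```

    At zero field the ground state is the singlet with E0 = -2J; at large field the ring is fully polarized, and the intermediate field picks out the Sz = 1 sector, so the magnetization steps through 0, 1, 2.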

    Comparing the quality of crowdsourced data contributed by expert and non-experts

    There is currently a lack of in-situ environmental data for the calibration and validation of remotely sensed products and for the development and verification of models. Crowdsourcing is increasingly seen as one potentially powerful way of increasing the supply of in-situ data, but there are a number of concerns over the subsequent use of the data, in particular over data quality. This paper examined crowdsourced data from the Geo-Wiki crowdsourcing tool for land-cover validation to determine whether there were significant differences in quality between the answers provided by experts and non-experts in the domain of remote sensing, and therefore the extent to which crowdsourced data describing human impact and land cover can be used in further scientific research. The results showed that there was little difference between experts and non-experts in identifying human impact, although results varied by land cover, while experts were better than non-experts at identifying the land-cover type. This suggests the need to create training materials with more examples in those areas where difficulties in identification were encountered, and to offer contributors some way to reflect on the information they contribute, perhaps by feeding back the evaluations of their contributed data or by making additional training materials available. Accuracies were also found to be higher when volunteers were more consistent in their responses at a given location and when they indicated higher confidence, which suggests that these additional pieces of information could be used to develop robust measures of quality in the future.