136 research outputs found

    Trust-based algorithms for fusing crowdsourced estimates of continuous quantities

    No full text
    Crowdsourcing has provided a viable way of gathering information at unprecedented volumes and speed by engaging individuals to perform simple microā€“tasks. In particular, the crowdsourcing paradigm has been successfully applied to participatory sensing, in which the users perform sensing tasks and provide data using their mobile devices. In this way, people can help solve complex environmental sensing tasks, such as weather monitoring, nuclear radiation monitoring and cell tower mapping, in a highly decentralised and parallelised fashion. Traditionally, crowdsourcing technologies were primarily used for gathering data for classifications and image labelling tasks. In contrast, such crowdā€“based participatory sensing poses new challenges that relate to (i) dealing with humanā€“reported sensor data that are available in the form of continuous estimates of an observed quantity such as a location, a temperature or a sound reading, (ii) dealing with possible spatial and temporal correlations within the data and (ii) issues of data trustworthiness due to the unknown capabilities and incentives of the participants and their devices. Solutions to these challenges need to be able to combine the data provided by multiple users to ensure the accuracy and the validity of the aggregated results. With this in mind, our goal is to provide methods to better aid the aggregation process of crowdā€“reported sensor estimates of continuous quantities when data are provided by individuals of varying trustworthiness. To achieve this, we develop a trustā€“based in- formation fusion framework that incorporates latent trustworthiness traits of the users within the data fusion process. Through this framework, we develop a set of four novel algorithms (MaxTrust, BACE, TrustGP and TrustLGCP) to compute reliable aggregations of the usersā€™ reports in both the settings of observing a stationary quantity (Max- Trust and BACE) and a spatially distributed phenomenon (TrustGP and TrustLGCP). The key feature of all these algorithm is the ability of (i) learning the trustworthiness of each individual who provide the data and (ii) exploit this latent userā€™s trustworthiness information to compute a more accurate fused estimate. In particular, this is achieved by using a probabilistic framework that allows our methods to simultaneously learn the fused estimate and the usersā€™ trustworthiness from the crowd reports. We validate our algorithms in four key application areas (cell tower mapping, WiFi networks mapping, nuclear radiation monitoring and disaster response) that demonstrate the practical impact of our framework to achieve substantially more accurate and informative predictions compared to the existing fusion methods. We expect that results of this thesis will allow to build more reliable data fusion algorithms for the broad class of humanā€“centred information systems (e.g., recommendation systems, peer reviewing systems, student grading tools) that are based on making decisions upon subjective opinions provided by their users

    Bayesian modelling of community-based multidimensional trust in participatory sensing under data sparsity

    No full text
    We propose a new Bayesian model for reliable aggregation of crowdsourced estimates of real-valued quantities in participatory sensing applications. Existing approaches focus on probabilistic modelling of userā€™s reliability as the key to accurate aggregation. However, these are either limited to estimating discrete quantities, or require a significant number of reports from each user to accurately model their reliability. To mitigate these issues, we adopt a community-based approach, which reduces the data required to reliably aggregate real-valued estimates, by leveraging correlations between the re- porting behaviour of users belonging to different communities. As a result, our method is up to 16.6% more accurate than existing state-of-the-art methods and is up to 49% more effective under data sparsity when used to estimate Wi-Fi hotspot locations in a real-world crowdsourcing application

    2018 ADRF Network Research Conference Program

    Get PDF

    Earth Observation Open Science and Innovation

    Get PDF
    geospatial analytics; social observatory; big earth data; open data; citizen science; open innovation; earth system science; crowdsourced geospatial data; citizen science; science in society; data scienc

    Intelligent Data Analytics using Deep Learning for Data Science

    Get PDF
    Nowadays, data science stimulates the interest of academics and practitioners because it can assist in the extraction of significant insights from massive amounts of data. From the years 2018 through 2025, the Global Datasphere is expected to rise from 33 Zettabytes to 175 Zettabytes, according to the International Data Corporation. This dissertation proposes an intelligent data analytics framework that uses deep learning to tackle several difficulties when implementing a data science application. These difficulties include dealing with high inter-class similarity, the availability and quality of hand-labeled data, and designing a feasible approach for modeling significant correlations in features gathered from various data sources. The proposed intelligent data analytics framework employs a novel strategy for improving data representation learning by incorporating supplemental data from various sources and structures. First, the research presents a multi-source fusion approach that utilizes confident learning techniques to improve the data quality from many noisy sources. Meta-learning methods based on advanced techniques such as the mixture of experts and differential evolution combine the predictive capacity of individual learners with a gating mechanism, ensuring that only the most trustworthy features or predictions are integrated to train the model. Then, a Multi-Level Convolutional Fusion is presented to train a model on the correspondence between local-global deep feature interactions to identify easily confused samples of different classes. The convolutional fusion is further enhanced with the power of Graph Transformers, aggregating the relevant neighboring features in graph-based input data structures and achieving state-of-the-art performance on a large-scale building damage dataset. Finally, weakly-supervised strategies, noise regularization, and label propagation are proposed to train a model on sparse input labeled data, ensuring the model\u27s robustness to errors and supporting the automatic expansion of the training set. The suggested approaches outperformed competing strategies in effectively training a model on a large-scale dataset of 500k photos, with just about 7% of the images annotated by a human. The proposed framework\u27s capabilities have benefited various data science applications, including fluid dynamics, geometric morphometrics, building damage classification from satellite pictures, disaster scene description, and storm-surge visualization

    Big Earth Data and Machine Learning for Sustainable and Resilient Agriculture

    Full text link
    Big streams of Earth images from satellites or other platforms (e.g., drones and mobile phones) are becoming increasingly available at low or no cost and with enhanced spatial and temporal resolution. This thesis recognizes the unprecedented opportunities offered by the high quality and open access Earth observation data of our times and introduces novel machine learning and big data methods to properly exploit them towards developing applications for sustainable and resilient agriculture. The thesis addresses three distinct thematic areas, i.e., the monitoring of the Common Agricultural Policy (CAP), the monitoring of food security and applications for smart and resilient agriculture. The methodological innovations of the developments related to the three thematic areas address the following issues: i) the processing of big Earth Observation (EO) data, ii) the scarcity of annotated data for machine learning model training and iii) the gap between machine learning outputs and actionable advice. This thesis demonstrated how big data technologies such as data cubes, distributed learning, linked open data and semantic enrichment can be used to exploit the data deluge and extract knowledge to address real user needs. Furthermore, this thesis argues for the importance of semi-supervised and unsupervised machine learning models that circumvent the ever-present challenge of scarce annotations and thus allow for model generalization in space and time. Specifically, it is shown how merely few ground truth data are needed to generate high quality crop type maps and crop phenology estimations. Finally, this thesis argues there is considerable distance in value between model inferences and decision making in real-world scenarios and thereby showcases the power of causal and interpretable machine learning in bridging this gap.Comment: Phd thesi

    Modeling User Transportation Patterns Using Mobile Devices

    Get PDF
    Participatory sensing frameworks use humans and their computing devices as a large mobile sensing network. Dramatic accessibility and affordability have turned mobile devices (smartphone and tablet computers) into the most popular computational machines in the world, exceeding laptops. By the end of 2013, more than 1.5 billion people on earth will have a smartphone. Increased coverage and higher speeds of cellular networks have given these devices the power to constantly stream large amounts of data. Most mobile devices are equipped with advanced sensors such as GPS, cameras, and microphones. This expansion of smartphone numbers and power has created a sensing system capable of achieving tasks practically impossible for conventional sensing platforms. One of the advantages of participatory sensing platforms is their mobility, since human users are often in motion. This dissertation presents a set of techniques for modeling and predicting user transportation patterns from cell-phone and social media check-ins. To study large-scale transportation patterns, I created a mobile phone app, Kpark, for estimating parking lot occupancy on the UCF campus. Kpark aggregates individual user reports on parking space availability to produce a global picture across all the campus lots using crowdsourcing. An issue with crowdsourcing is the possibility of receiving inaccurate information from users, either through error or malicious motivations. One method of combating this problem is to model the trustworthiness of individual participants to use that information to selectively include or discard data. This dissertation presents a comprehensive study of the performance of different worker quality and data fusion models with plausible simulated user populations, as well as an evaluation of their performance on the real data obtained from a full release of the Kpark app on the UCF Orlando campus. To evaluate individual trust prediction methods, an algorithm selection portfolio was introduced to take advantage of the strengths of each method and maximize the overall prediction performance. Like many other crowdsourced applications, user incentivization is an important aspect of creating a successful crowdsourcing workflow. For this project a form of non-monetized incentivization called gamification was used in order to create competition among users with the aim of increasing the quantity and quality of data submitted to the project. This dissertation reports on the performance of Kpark at predicting parking occupancy, increasing user app usage, and predicting worker quality
    • ā€¦
    corecore