683 research outputs found

Crowdsourced Data Mining for Urban Activity: A Review of Data Sources, Applications and Methods

Authors: Niu Haifeng; Silva Elisabete
Publication venue: Journal of Urban Planning and Development
Publication date: 01/01/2020

The penetration of devices integrated with location-based services and internet services has generated massive data about the everyday life of citizens and tracked their activities in cities. Crowdsourced data generated by the crowd, such as social media data, POI data and collaborative websites, has become a fine-grained proxy for urban activity and is widely used in urban studies research. However, because crowdsourced data types are heterogeneous and previous studies have mainly focused on a specific application, a systematic review of crowdsourced data mining for urban activity is still lacking. To fill this gap, this paper conducts a literature search in the Web of Science database, selecting 226 highly related papers published between 2013 and 2019. Based on those papers, the review first conducts a bibliometric analysis identifying underpinning domains, pivotal scholars and papers around this topic. The review also synthesises previous research into three parts: main applications of different data sources and data fusion; application of spatial analysis in mobility patterns, functional areas and event detection; application of socio-demographic and perception analysis in city attractiveness, demographic characteristics and sentiment analysis. The challenges of this type of data are also discussed at the end. This study provides a systematic and current review for both researchers and practitioners interested in the applications of crowdsourced data mining for urban activity. This research is funded by a scholarship from the China Scholarship Council.

Apollo (Cambridge)

Similarity-driven and Task-driven Models for Diversity of Opinion in Crowdsourcing Markets

Authors: Chen Lei; Hao Fei; Hui Pan; Liu Yunrui; Wu Ting; Zeng Pengcheng; Zhang Chen Jason
Publication date: 25/10/2023

The recent boom in crowdsourcing has opened up a new avenue for utilizing human intelligence in the realm of data analysis. This innovative approach provides a powerful means for connecting online workers to tasks that cannot effectively be done solely by machines or conducted by professional experts due to cost constraints. Within the field of social science, four elements are required to construct a sound crowd: Diversity of Opinion, Independence, Decentralization and Aggregation. However, while the other three components have already been investigated and implemented in existing crowdsourcing platforms, 'Diversity of Opinion' has not been functionally enabled yet. From a computational point of view, constructing a wise crowd necessitates quantitatively modeling diversity and taking it into account. There are usually two paradigms for worker selection in a crowdsourcing marketplace: building a crowd to wait for tasks to come, and selecting workers for a given task. We propose similarity-driven and task-driven models for both paradigms. We also develop efficient and effective algorithms for recruiting a limited number of workers with optimal diversity in both models. To validate our solutions, we conduct extensive experiments using both synthetic and real datasets. Comment: 32 pages, 10 figures

arXiv.org e-Print Archive

A survey of urban drive-by sensing: An optimization perspective

Authors: Han Ke; Ji Wen; Liu Tao
Publication date: 29/07/2023

Pervasive and mobile sensing is an integral part of smart transport and smart city applications. Vehicle-based mobile sensing, or drive-by sensing (DS), is gaining popularity in both academic research and field practice. The DS paradigm has an inherent transport component, as the spatial-temporal distribution of the sensors is closely related to the mobility patterns of their hosts, which may include third-party (e.g. taxis, buses) or for-hire (e.g. unmanned aerial vehicles and dedicated vehicles) vehicles. It is therefore essential to understand, assess and optimize the sensing power of vehicle fleets under a wide range of urban sensing scenarios. To this end, this paper offers an optimization-oriented summary of recent literature by presenting a four-step discussion, namely (1) quantifying the sensing quality (objective); (2) assessing the sensing power of various fleets (strategic); (3) sensor deployment (strategic/tactical); and (4) vehicle maneuvers (tactical/operational). By compiling research findings and practical insights in this way, this review article not only highlights the optimization aspect of drive-by sensing, but also serves as a practical guide for configuring and deploying vehicle-based urban sensing systems. Comment: 24 pages, 3 figures, 4 tables

arXiv.org e-Print Archive

An Inquiry into Supply Chain Strategy Implications of the Sharing Economy for Last Mile Logistics

Author: Castillo Vincent Emanuel
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 12/05/2018

As the prevalence of e-commerce and the subsequent importance of effective and efficient omnichannel logistics strategies continue to rise, retail firms are exploring the viability of sourcing logistics capabilities from the sharing economy. Questions arise such as, “how can crowdbased logistics solutions such as crowdsourced logistics (CSL), crowdshipping, and pickup point networks (PPN) be leveraged to increase performance?” In this dissertation, empirical and analytical research is conducted that increases understanding of how firms can leverage the sharing economy to increase logistics and supply chain performance. Essay 1 explores crowdsourced logistics (CSL) by employing a stochastic discrete event simulation set in New York City in which a retail firm sources drivers from the crowd to perform same day deliveries under dynamic market conditions. Essay 2 employs a design science paradigm to develop a typology of crowdbased logistics strategies using two qualitative methodologies: web content analysis and Delphi surveys. A service-dominant logic theoretical perspective guides this essay and explains how firms co-create value with the crowd and consumer markets while presenting a generic design for integrating crowdbased models into logistics strategy. In Essay 3, a crowdsourced logistics strategy for home delivery is modeled in an empirically grounded simulation optimization to explore the logistics cost and responsiveness implications of sharing economy solutions on omnichannel fulfillment strategies.

University of Tennessee, Knoxville: Trace

Searching and mining in enriched geo-spatial data

Author: Schmid Klaus Arthur
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 09/12/2016

The emergence of new data collection mechanisms in geo-spatial applications paired with a heightened tendency of users to volunteer information provides an ever-increasing flow of data of high volume, complex nature, and often associated with inherent uncertainty. Such mechanisms include crowdsourcing, automated knowledge inference, tracking, and social media data repositories. Such data bearing additional information from multiple sources like probability distributions, text or numerical attributes, social context, or multimedia content can be called multi-enriched. Searching and mining this abundance of information holds many challenges, if all of the data's potential is to be released. This thesis addresses several major issues arising in that field, namely path queries using multi-enriched data, trend mining in social media data, and handling uncertainty in geo-spatial data. In all cases, the developed methods have made significant contributions and have appeared in or were accepted into various renowned international peer-reviewed venues. A common use of geo-spatial data is path queries in road networks where traditional methods optimise results based on absolute and ofttimes singular metrics, i.e., finding the shortest paths based on distance or the best trade-off between distance and travel time. Integrating additional aspects like qualitative or social data by enriching the data model with knowledge derived from sources as mentioned above allows for queries that can be issued to fit a broader scope of needs or preferences. This thesis presents two implementations of incorporating multi-enriched data into road networks. In one case, a range of qualitative data sources is evaluated to gain knowledge about user preferences which is subsequently matched with locations represented in a road network and integrated into its components. Several methods are presented for highly customisable path queries that incorporate a wide spectrum of data. 
In a second case, a framework is described for resource distribution with reappearance in road networks to serve one or more clients, resulting in paths that provide maximum gain based on a probabilistic evaluation of available resources. Applications for this include finding parking spots. Social media trends are an emerging research area giving insight into user sentiment and important topics. Such trends consist of bursts of messages concerning a certain topic within a time frame, significantly deviating from the average appearance frequency of the same topic. By investigating the dissemination of such trends in space and time, this thesis presents methods to classify trend archetypes to predict future dissemination of a trend. Processing and querying uncertain data is particularly demanding given the additional knowledge required to yield results with probabilistic guarantees. Since such knowledge is not always available and queries are not easily scaled to larger datasets due to the #P-complete nature of the problem, many existing approaches reduce the data to a deterministic representation of its underlying model to eliminate uncertainty. However, data uncertainty can also provide valuable insight into the nature of the data that cannot be represented in a deterministic manner. This thesis presents techniques for clustering uncertain data as well as for query processing that take the additional information from uncertainty models into account while preserving scalability through a sampling-based approach; previous approaches could provide only one of the two.
The given solutions enable the application of various existing clustering techniques or query types to a framework that manages the uncertainty.

Happiness is greater in more scenic locations

Authors: MacKerron George; Moat Helen Susannah; Preis Tobias; Seresinhe Chanuki Illushka
Publication venue: Springer Science and Business Media LLC
Publication date: 01/03/2019

Does spending time in beautiful settings boost people’s happiness? The answer to this question has long remained elusive due to a paucity of large-scale data on environmental aesthetics and individual happiness. Here, we draw on two novel datasets: first, individual happiness data from the smartphone app, Mappiness, and second, crowdsourced ratings of the “scenicness” of photographs taken across England from the online game Scenic-Or-Not. We find that individuals are happier in more scenic locations, even when we account for a range of factors such as the activity the individual was engaged in at the time, weather conditions and the income of local inhabitants. Crucially, this relationship holds not only in natural environments, but in built-up areas too, even after controlling for the presence of green space. Our results provide evidence that the aesthetics of the environments that policymakers choose to build or demolish may have consequences for our everyday wellbeing.

Directory of Open Access Journals

Warwick Research Archives Portal Repository

Sussex Research Online

RL-OPRA: Reinforcement Learning for Online and Proactive Resource Allocation of crowdsourced live videos

Authors: Baccour Emna; Erbad Aiman; Guizani Mohsen; Hamdi Mounir; Haouari Fatima; Mohamed Amr
Publication venue: Elsevier BV
Publication date: 01/11/2020

© 2020 Elsevier B.V. With the advancement of rich media generating devices, the proliferation of live Content Providers (CP), and the availability of convenient internet access, crowdsourced live streaming services have witnessed unexpected growth. To ensure a better Quality of Experience (QoE), higher availability, and lower costs, large live streaming CPs are migrating their services to geo-distributed cloud infrastructure. However, because of the dynamics of live broadcasting and the wide geo-distribution of viewers and broadcasters, it is still challenging to satisfy all requests with reasonable resources. To overcome this challenge, we introduce in this paper a prediction-driven approach that estimates the potential number of viewers near different cloud sites at the instant of broadcasting. This online and instant prediction of distributed popularity distinguishes our work from previous efforts that provision constant resources or alter their allocation as the popularity of the content changes. Based on the derived predictions, we formulate an Integer-Linear Program (ILP) to proactively and dynamically choose the right data center to allocate exact resources and serve potential viewers, while minimizing the perceived delays. As the optimization is not adequate for online serving, we propose a real-time approach based on Reinforcement Learning (RL), namely RL-OPRA, which adaptively learns to optimize the allocation and serving decisions by interacting with the network environment. Extensive simulation and comparison with the ILP have shown that our RL-based approach is able to present optimal results compared to heuristic-based approaches. This work was supported by the Qatar Foundation.

Qatar University Institutional Repository

Multi-modal Spatial Crowdsourcing for Enriching Spatial Datasets

Author: Gummidi Srinivasa Raghavendra Bhuvan
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2021

VBN

Privacy-preserving crowdsourced site survey in WiFi fingerprint-based localization

Publication venue: Springer
Publication date: 04/05/2016

Springer - Publisher Connector

Data analytics 2016: proceedings of the fifth international conference on data analytics

Authors: Bhulai Sandjai; Semanjski Ivana
Publication venue: The International Academy, Research and Industry Association
Publication date: 01/01/2016

VU Research Portal

Ghent University Academic Bibliography