Search CORE

332 research outputs found

Location Privacy in Spatial Crowdsourcing

Author: A Liu
A Pham
Ashwin Machanavajjhala
Bo Zhang
C-Y Chow
Delphine Christin
Hien To
LATANYA SWEENEY
Lefeng Zhang
M Shin
Publication venue
Publication date: 22/04/2017
Field of study

Spatial crowdsourcing (SC) is a new platform that engages individuals in collecting and analyzing environmental, social and other spatiotemporal information. With SC, requesters outsource their spatiotemporal tasks to a set of workers, who will perform the tasks by physically traveling to the tasks' locations. This chapter identifies privacy threats toward both workers and requesters during the two main phases of spatial crowdsourcing, tasking and reporting. Tasking is the process of identifying which tasks should be assigned to which workers. This process is handled by a spatial crowdsourcing server (SC-server). The latter phase is reporting, in which workers travel to the tasks' locations, complete the tasks and upload their reports to the SC-server. The challenge is to enable effective and efficient tasking as well as reporting in SC without disclosing the actual locations of workers (at least until they agree to perform a task) and the tasks themselves (at least to workers who are not assigned to those tasks). This chapter aims to provide an overview of the state-of-the-art in protecting users' location privacy in spatial crowdsourcing. We provide a comparative study of a diverse set of solutions in terms of task publishing modes (push vs. pull), problem focuses (tasking and reporting), threats (server, requester and worker), and underlying technical approaches (from pseudonymity, cloaking, and perturbation to exchange-based and encryption-based techniques). The strengths and drawbacks of the techniques are highlighted, leading to a discussion of open problems and future work

arXiv.org e-Print Archive

Crossref

Combating Attacks and Abuse in Large Online Communities

Author: Wang Gang
Publication venue: eScholarship, University of California
Publication date: 01/01/2016
Field of study

Internet users today are connected more widely and ubiquitously than ever before. As a result, various online communities are formed, ranging from online social networks (Facebook, Twitter), to mobile communities (Foursquare, Waze), to content/interests based networks (Wikipedia, Yelp, Quora). While users are benefiting from the ease of access to information and social interactions, there is a growing concern for users' security and privacy against various attacks such as spam, phishing, malware infection and identity theft. Combating attacks and abuse in online communities is challenging. First, today’s online communities are increasingly dependent on users and user-generated content. Securing online systems demands a deep understanding of the complex and often unpredictable human behaviors. Second, online communities can easily have millions or even billions of users, which requires the corresponding security mechanisms to be highly scalable. Finally, cybercriminals are constantly evolving to launch new types of attacks. This further demands high robustness of security defenses. In this thesis, we take concrete steps towards measuring, understanding, and defending against attacks and abuse in online communities. We begin with a series of empirical measurements to understand user behaviors in different online services and the uniquesecurity and privacy challenges that users are facing with. This effort covers a broad set of popular online services including social networks for question and answering (Quora), anonymous social networks (Whisper), and crowdsourced mobile communities (Waze). Despite the differences of specific online communities, our study provides a first look at their user activity patterns based on empirical data, and reveals the need for reliable mechanisms to curate user content, protect privacy, and defend against emerging attacks. Next, we turn our attention to attacks targeting online communities, with focus on spam campaigns. While traditional spam is mostly generated by automated software, attackers today start to introduce "human intelligence" to implement attacks. This is maliciouscrowdsourcing (or crowdturfing) where a large group of real-users are organized to carry out malicious campaigns, such as writing fake reviews or spreading rumors on social media. Using collective human efforts, attackers can easily bypass many existing defenses (e.g.,CAPTCHA). To understand the ecosystem of crowdturfing, we first use measurements to examine their detailed campaign organization, workers and revenue. Based on insights from empirical data, we develop effective machine learning classifiers to detect crowdturfingactivities. In the meantime, considering the adversarial nature of crowdturfing, we also build practical adversarial models to simulate how attackers can evade or disrupt machine learning based defenses. To aid in this effort, we next explore using user behavior models to detect a wider range of attacks. Instead of making assumptions about attacker behavior, our idea is to model normal user behaviors and capture (malicious) behaviors that are deviated from norm. In this way, we can detect previously unknown attacks. Our behavior model is based on detailed clickstream data, which are sequences of click events generated by users when using the service. We build a similarity graph where each user is a node and the edges are weightedby clickstream similarity. By partitioning this graph, we obtain "clusters" of users with similar behaviors. We then use a small set of known good users to "color" these clusters to differentiate the malicious ones. This technique has been adopted by real-world social networks (Renren and LinkedIn), and already detected unexpected attacks. Finally, we extend clickstream model to understanding more-grained behaviors of attackers (and real users), and tracking how user behavior changes over time. In summary, this thesis illustrates a data-driven approach to understanding and defending against attacks and abuse in online communities. Our measurements have revealed new insights about how attackers are evolving to bypass existing security defenses today. Inaddition, our data-driven systems provide new solutions for online services to gain a deep understanding of their users, and defend them from emerging attacks and abuse

Ezid

eScholarship - University of California

A survey of spatial crowdsourcing

Author: Gummidi Srinivasa Raghavendra Bhuvan
Pedersen Torben Bach
Xie Xike
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2019
Field of study

VBN

Quality control and cost management in crowdsourcing

Author: Wang Yue
Publication venue
Publication date: 06/05/2021
Field of study

By harvesting online workers’ knowledge, crowdsourcing has become an efficient and cost-effective way to obtain a large amount of labeled data for solving human intelligent tasks (HITs), such as entity resolution and sentiment analysis. Due to the open nature of crowdsourcing, online workers with different knowledge backgrounds may provide conflicting labels to tasks. Therefore, it is a common practice to perform a pre-determined number of assignments, either per task or for all tasks, and aggregate collected labels to infer the true label of tasks. This model could suffer from poor accuracy in case of under-budget or a waste of resource in case of over-budget. In addition, as worker labels are usually aggregated in a voting manner, crowdsourcing systems are vulnerable to strategic Sybil attack, where the attacker may manipulate several robot Sybil workers to share randomized labels for outvoting independent workers and apply various strategies to evade Sybil detection. In this thesis, we are specifically interested in providing a guaranteed aggregation accuracy with minimum worker cost and defending against strategic Sybil attack. In our first work, we assume that workers are independent and honest. By enforcing a specified accuracy threshold on aggregated labels and minimizing the worker cost under this requirement, we formulate the dual requirements for quality and cost as a Guaranteed Accuracy Problem (GAP) and present an efficient task assignment algorithm for solving the problem. In our second work, we assume that strategic Sybil attackers may coordinate Sybil workers to obtain rewards without honestly labeling tasks and apply different strategies to evade detection. By camouflaging golden tasks (i.e., tasks with known true labels) from the attacker and suppressing the impact of Sybil workers and low-quality independent workers, we extend the principled truth discovery to defend against strategic Sybil attack in crowdsorucing. For both works, we conduct comprehensive empirical evaluations on real and synthetic datasets to demonstrate the effectiveness and efficiency of our methods

Simon Fraser University Institutional Repository

Considering Human Aspects on Strategies for Designing and Managing Distributed Human Computation

Author: Andrade Nazareno
Brasileiro Francisco
Ponciano Lesandro
Sampaio Lívia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/06/2015
Field of study

A human computation system can be viewed as a distributed system in which the processors are humans, called workers. Such systems harness the cognitive power of a group of workers connected to the Internet to execute relatively simple tasks, whose solutions, once grouped, solve a problem that systems equipped with only machines could not solve satisfactorily. Examples of such systems are Amazon Mechanical Turk and the Zooniverse platform. A human computation application comprises a group of tasks, each of them can be performed by one worker. Tasks might have dependencies among each other. In this study, we propose a theoretical framework to analyze such type of application from a distributed systems point of view. Our framework is established on three dimensions that represent different perspectives in which human computation applications can be approached: quality-of-service requirements, design and management strategies, and human aspects. By using this framework, we review human computation in the perspective of programmers seeking to improve the design of human computation applications and managers seeking to increase the effectiveness of human computation infrastructures in running such applications. In doing so, besides integrating and organizing what has been done in this direction, we also put into perspective the fact that the human aspects of the workers in such systems introduce new challenges in terms of, for example, task assignment, dependency management, and fault prevention and tolerance. We discuss how they are related to distributed systems and other areas of knowledge.Comment: 3 figures, 1 tabl

arXiv.org e-Print Archive

Springer - Publisher Connector

Quality Control in Crowdsourcing: A Survey of Quality Attributes, Assessment Techniques and Assurance Actions

Author: Allahbakhsh Mohammad
Benatallah Boualem
Cappiello Cinzia
Daniel Florian
Kucherbaev Pavel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

Crowdsourcing enables one to leverage on the intelligence and wisdom of potentially large groups of individuals toward solving problems. Common problems approached with crowdsourcing are labeling images, translating or transcribing text, providing opinions or ideas, and similar - all tasks that computers are not good at or where they may even fail altogether. The introduction of humans into computations and/or everyday work, however, also poses critical, novel challenges in terms of quality control, as the crowd is typically composed of people with unknown and very diverse abilities, skills, interests, personal objectives and technological resources. This survey studies quality in the context of crowdsourcing along several dimensions, so as to define and characterize it and to understand the current state of the art. Specifically, this survey derives a quality model for crowdsourcing tasks, identifies the methods and techniques that can be used to assess the attributes of the model, and the actions and strategies that help prevent and mitigate quality problems. An analysis of how these features are supported by the state of the art further identifies open issues and informs an outlook on hot future research directions.Comment: 40 pages main paper, 5 pages appendi

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano