35 research outputs found

    Adversarial attacks on crowdsourcing quality control

    Crowdsourcing is a popular methodology for collecting manual labels at scale. Such labels are often used to train AI models and, thus, quality control is a key aspect of the process. One of the most popular quality assurance mechanisms in paid micro-task crowdsourcing is based on gold questions: the use of a small set of tasks for which the requester knows the correct answer and, thus, is able to directly assess crowd work quality. In this paper, we show that such a mechanism is prone to an attack, carried out by a group of colluding crowd workers, that is easy to implement and deploy: the inherent size limit of the gold set can be exploited by building an inferential system to detect which parts of the job are more likely to be gold questions. The described attack is robust to various forms of randomisation and programmatic generation of gold questions. We present the architecture of the proposed system, composed of a browser plug-in and an external server used to share information, and briefly introduce its potential evolution to a decentralised implementation. We implement and experimentally validate the gold detection system using real-world data from a popular crowdsourcing platform. Our experimental results show that crowd workers using the proposed system spend more time on signalled gold questions but do not neglect the others, thus achieving increased overall work quality. Finally, we discuss the economic and sociological implications of this kind of attack.
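    The core of the attack described above can be sketched in a few lines. This is our own illustrative toy, not the paper's plug-in or server code, and all names (SharedGoldDetector, report, likely_gold, the threshold of 3) are assumptions: colluding workers report every task ID they encounter to a shared store, and because the gold set is small and reused, task IDs seen by many distinct workers are flagged as likely gold questions.

```python
# Hypothetical sketch of the shared gold-question detector described in the
# abstract. Colluding workers report the task IDs they see; IDs observed by
# many distinct workers are flagged as probable (reused) gold questions.
from collections import defaultdict

class SharedGoldDetector:
    """Toy stand-in for the external information-sharing server."""
    def __init__(self, flag_threshold=3):
        self.seen_by = defaultdict(set)   # task_id -> set of worker ids
        self.flag_threshold = flag_threshold

    def report(self, worker_id, task_ids):
        # Each colluding worker's browser plug-in would call this.
        for t in task_ids:
            self.seen_by[t].add(worker_id)

    def likely_gold(self):
        # A small gold set is reused across many workers' jobs, so tasks
        # seen by many distinct workers are probably gold questions.
        return {t for t, workers in self.seen_by.items()
                if len(workers) >= self.flag_threshold}

detector = SharedGoldDetector(flag_threshold=3)
detector.report("w1", ["t1", "t2", "t9"])
detector.report("w2", ["t3", "t4", "t9"])
detector.report("w3", ["t5", "t9", "t6"])
flagged = detector.likely_gold()  # only t9 was seen by three distinct workers
```

    This sketch also makes clear why the attack is robust to randomisation: shuffling or programmatically regenerating gold questions does not change the fact that the gold pool is small relative to the stream of ordinary tasks.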

    The Online Laboratory: Conducting Experiments in a Real Labor Market

    Online labor markets have great potential as platforms for conducting experiments, as they provide immediate access to a large and diverse subject pool and allow researchers to conduct randomized controlled trials. We argue that online experiments can be just as valid – both internally and externally – as laboratory and field experiments, while requiring far less money and time to design and to conduct. In this paper, we first describe the benefits of conducting experiments in online labor markets; we then use one such market to replicate three classic experiments and confirm their results. We confirm that subjects (1) reverse decisions in response to how a decision-problem is framed, (2) have pro-social preferences (value payoffs to others positively), and (3) respond to priming by altering their choices. We also conduct a labor supply field experiment in which we confirm that workers have upward sloping labor supply curves. In addition to reporting these results, we discuss the unique threats to validity in an online setting and propose methods for coping with these threats. We also discuss the external validity of results from online domains and explain why online results can have external validity equal to or even better than that of traditional methods, depending on the research question. We conclude with our views on the potential role that online experiments can play within the social sciences, and then recommend software development priorities and best practices.

    A survey of spatial crowdsourcing


    Spam elimination and bias correction: ensuring label quality in crowdsourced tasks

    Crowdsourcing has been proposed as a powerful mechanism for accomplishing large-scale tasks via anonymous workers online. It has been demonstrated as an effective and important approach for collecting labeled data in application domains which require human intelligence, such as image labeling, video annotation, and natural language processing. Despite its promise, one big challenge still exists in crowdsourcing systems: the difficulty of controlling the quality of crowds. Workers usually have diverse education levels, personal preferences, and motivations, leading to unknown work performance while completing a crowdsourced task. Some are reliable, and some might provide noisy feedback. Applying a worker filtering approach that recognizes and handles noisy workers is therefore essential for obtaining high-quality labels. The work presented in this dissertation discusses this area of research and proposes efficient probability-based worker filtering models to distinguish varied types of poor-quality workers. Most of the existing work in the worker filtering literature either concentrates only on binary labeling tasks, or fails to separate the low-quality workers whose label errors can be corrected from the other spam workers (whose label errors cannot be corrected). As such, we first propose a Spam Removing and De-biasing Framework (SRDF) to deal with the worker filtering procedure in labeling tasks with numerical label scales. The developed framework can detect spam workers and biased workers separately. Biased workers are defined as those who show tendencies of providing higher (or lower) labels than the truths, and their errors can be corrected. To tackle the biasing problem, an iterative bias detection approach is introduced to recognize biased workers.
The spam filtering algorithm eliminates three types of spam workers: random spammers, who provide random labels; uniform spammers, who give the same label for most of the items; and sloppy workers, who offer low-accuracy labels. Integrating the spam filtering and bias detection approaches into aggregating algorithms, which infer truths from labels obtained from crowds, can lead to high-quality consensus results. The common characteristic of random spammers and uniform spammers is that they provide useless feedback without making an effort on the labeling task, so it is not necessary to distinguish them separately. In addition, in the SRDF framework, the removal of sloppy workers has a great impact on the detection of biased workers. To combat these problems, a different way of classifying workers is presented in this dissertation: biased workers are classified as a subcategory of sloppy workers. Finally, an ITerative Self Correcting - Truth Discovery (ITSC-TD) framework is proposed, which can reliably recognize biased workers in ordinal labeling tasks, based on a probability-based bias detection model. ITSC-TD estimates true labels through an optimization-based truth discovery method, which minimizes overall label errors by assigning different weights to workers. The typical tasks posted on popular crowdsourcing platforms, such as MTurk, are simple tasks, which are low in complexity, independent, and require little time to complete. Complex tasks, however, in many cases require crowd workers to possess specialized skills in the task domain. As a result, this type of task is more inclined to suffer from poor-quality feedback from crowds, compared to simple tasks. As such, we propose a multiple-views approach for the purpose of obtaining high-quality consensus labels in complex labeling tasks.
In this approach, each view is defined as a labeling critique or rubric, which aims to guide workers to become aware of the desirable work characteristics or goals. Combining the view labels yields the overall estimated label for each item. The multiple-views approach is developed under the hypothesis that workers' performance might differ from one view to another; varied weights are then assigned to different views for each worker. Additionally, the ITSC-TD framework is integrated into the multiple-views model to achieve high-quality estimated truths for each view. Next, we propose a Semi-supervised Worker Filtering (SWF) model to eliminate spam workers who assign random labels to each item. The SWF approach conducts worker filtering with a limited set of gold truths available a priori. Each worker is associated with a spammer score, estimated via the developed semi-supervised model, and low-quality workers are efficiently detected by comparing the spammer score with a predefined threshold value. The efficiency of all the developed frameworks and models is demonstrated on simulated and real-world data sets. By comparing the proposed frameworks to a set of state-of-the-art methodologies in the crowdsourcing domain, such as the expectation-maximization-based aggregating algorithm, GLAD, and the optimization-based truth discovery approach, up to a 28.0% improvement in the accuracy of true label estimation can be obtained.
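    The optimization-based truth discovery idea the abstract relies on can be illustrated with a minimal sketch. This is not the dissertation's ITSC-TD code; the function name, the alternating scheme, and the inverse-squared-error weighting are our own simplifying assumptions. The sketch alternates between estimating truths as worker-weighted means and re-weighting each worker by the inverse of their total error, so spammers whose labels are uncorrelated with the consensus end up with low weight.

```python
# Illustrative sketch of weighted truth discovery for numerical labels
# (our own toy, not the dissertation's ITSC-TD implementation).
def truth_discovery(labels, n_iters=10, eps=1e-6):
    """labels: dict worker -> {item: numeric label}. Returns (truths, weights)."""
    workers = list(labels)
    items = {i for lab in labels.values() for i in lab}
    weights = {w: 1.0 for w in workers}
    truths = {}
    for _ in range(n_iters):
        # Step 1: estimate each item's truth as the worker-weighted mean.
        for i in items:
            num = sum(weights[w] * labels[w][i] for w in workers if i in labels[w])
            den = sum(weights[w] for w in workers if i in labels[w])
            truths[i] = num / den
        # Step 2: re-weight each worker inversely to their squared error,
        # so workers far from the consensus lose influence.
        for w in workers:
            err = sum((labels[w][i] - truths[i]) ** 2 for i in labels[w])
            weights[w] = 1.0 / (err + eps)
    return truths, weights

labels = {
    "reliable": {"a": 3, "b": 5, "c": 1},
    "biased":   {"a": 4, "b": 6, "c": 2},   # consistently +1: correctable
    "spammer":  {"a": 1, "b": 1, "c": 5},   # uncorrelated with the others
}
truths, weights = truth_discovery(labels)
# The spammer receives far less weight than the reliable worker.
```

    Note how the "biased" worker, whose errors are systematic and correctable, keeps more weight than the spammer; separating these two cases is exactly the distinction the SRDF and ITSC-TD frameworks are built around.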

    Optimization techniques for human computation-enabled data processing systems

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 119-124).
    Crowdsourced labor markets make it possible to recruit large numbers of people to complete small tasks that are difficult to automate on computers. These marketplaces are increasingly widely used, with projections of over $1 billion being transferred between crowd employers and crowd workers by the end of 2012. While crowdsourcing enables forms of computation that artificial intelligence has not yet achieved, it also presents crowd workflow designers with a series of challenges including describing tasks, pricing tasks, identifying and rewarding worker quality, dealing with incorrect responses, and integrating human computation into traditional programming frameworks. In this dissertation, we explore the systems-building, operator design, and optimization challenges involved in building a crowd-powered workflow management system. We describe a system called Qurk that utilizes techniques from databases such as declarative workflow definition, high-latency workflow execution, and query optimization to aid crowd-powered workflow developers. We study how crowdsourcing can enhance the capabilities of traditional databases by evaluating how to implement basic database operators such as sorts and joins on datasets that could not have been processed using traditional computation frameworks. Finally, we explore the symbiotic relationship between the crowd and query optimization, enlisting crowd workers to perform selectivity estimation, a key component in optimizing complex crowd-powered workflows.
    by Adam Marcus, Ph.D.
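    To make the idea of a crowd-powered database operator concrete, here is a minimal sketch of a sort built from pairwise-comparison microtasks. This is our own illustration, not Qurk's API or declarative syntax; the function names and the simple win-counting aggregation are assumptions. Each comparison stands in for a question posed to crowd workers, and items are ranked by how many comparisons they win.

```python
# Minimal sketch of a crowd-powered sort operator: every pairwise comparison
# is a microtask "asked to the crowd", and items are ranked by wins.
from itertools import combinations

def crowd_sort(items, ask_crowd):
    """ask_crowd(a, b) -> True if the crowd judges a to come before b."""
    wins = {x: 0 for x in items}
    for a, b in combinations(items, 2):     # O(n^2) comparison microtasks
        if ask_crowd(a, b):
            wins[b] += 1                    # b judged larger
        else:
            wins[a] += 1
    return sorted(items, key=lambda x: wins[x])

# Simulated crowd that answers comparisons perfectly.
result = crowd_sort([3, 1, 2], lambda a, b: a < b)
```

    The quadratic number of microtasks is exactly why query optimization matters in this setting: each comparison costs real money and latency, so reducing the number of tasks issued is a first-order concern.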

    Multi-modal Spatial Crowdsourcing for Enriching Spatial Datasets



    Mutually reinforcing systems

    Human computation can be described as outsourcing part of a computational process to humans. This technique might be used when a problem can be solved better by humans than by computers, or when it requires a level of adaptation that computers are not yet capable of handling. This can be particularly important in changeable settings which require a greater level of adaptation to the surrounding environment. In most cases, human computation has been used to gather data that computers struggle to create. Games with by-products can provide an incentive for people to carry out such tasks by rewarding them with entertainment. These are games designed to create a by-product during the course of regular play. However, such games have traditionally been unable to deal with requests for specific data, relying instead on a broad capture of data in the hope that it will cover specific needs. A new method is needed to focus the efforts of human computation and produce specifically requested results. This would make human computation a more valuable and versatile technique. Mutually reinforcing systems are a new approach to human computation that tries to attain this focus. Ordinary human computation systems tend to work in isolation and do not work directly with each other. Mutually reinforcing systems are an attempt to allow multiple human computation systems to work together so that each can benefit from the other's strengths. For example, a non-game system can request specific data from a game. The game can then tailor its game-play to deliver the required by-products from the players. This is also beneficial to the game because the requests become game content, creating variety in the game-play which helps to prevent players from getting bored with the game. Mobile systems provide a particularly good test of human computation because they allow users to react to their environment.
Real-world environments are changeable and require higher levels of adaptation from users. This means that, in addition to the human computation required by other systems, mobile systems can also take advantage of a user's ability to apply environmental context to the computational task. This research explores the effects of mutually reinforcing systems on mobile games with by-products. These effects are explored by building and testing mutually reinforcing systems, including mobile games. A review of existing literature, human computation systems, and games with by-products sets out problems which exist in outsourcing parts of a computational process to humans. Mutually reinforcing systems are presented as one approach to addressing some of these problems. Example systems have been created to demonstrate the successes and failures of this approach, and their evolving designs have been documented. The evaluation of these systems is presented along with a discussion of the outcomes and possible future work. A conclusion summarizes the findings of the work carried out. This dissertation shows that extending human computation techniques to allow the collection and classification of useful contextual information in mobile environments is possible, and that this can be extended to allow the by-products to match the specific needs of another system.

    Invisible Labor, Invisible Play: Online Gold Farming and the Boundary Between Jobs and Games

    When does work become play and play become work? Courts have considered the question in a variety of economic contexts, from student athletes seeking recognition as employees to professional blackjack players seeking to be treated by casinos just like casual players. Here, this question is applied to a relatively novel context: that of online gold farming, a gray-market industry in which wage-earning workers, largely based in China, are paid to play fantasy massively multiplayer online games (MMOs) that reward them with virtual items that their employers sell for profit to the same games' casual players. Gold farming is clearly a job (and, under the terms of service of most MMOs, clearly prohibited), yet as shown, US law itself provides no clear means of distinguishing the efforts of the gold farmer from those of the casual player. Viewed through the lens of US labor and employment law, the unpaid players of a typical MMO can arguably be classified as employees of the company that markets the game. Viewed through case law governing when the work of professional players does and does not constitute game play, gold farmers arguably are players in good standing. As a practical matter, these arguments suggest new ways of approaching the regulation of so-called virtual property and of online gaming in general. More broadly, the very viability of these arguments shows that the line between work and play is not so much an empirical fact as a social one, produced by negotiations in which the law has a leading role to play. This insight contributes to an ongoing debate about commodification and play that grows more urgent as digital technologies suffuse the world's economy with gaming and its logic.