6 research outputs found

    Efficient crowdsourcing of crowd-generated microtasks

    Allowing members of the crowd to propose novel microtasks for one another is an effective way to combine the efficiencies of traditional microtask work with the inventiveness and hypothesis-generation potential of human workers. However, microtask proposal leads to a growing set of tasks that may overwhelm limited crowdsourcer resources. Crowdsourcers can employ methods to use their resources efficiently, but algorithmic approaches to efficient crowdsourcing generally require a fixed task set of known size. In this paper, we introduce *cost forecasting* as a means for a crowdsourcer to use efficient crowdsourcing algorithms with a growing set of microtasks. Cost forecasting allows the crowdsourcer to decide between eliciting new tasks from the crowd and receiving responses to existing tasks, based on whether new tasks will cost less to complete than existing tasks, efficiently balancing resources as crowdsourcing occurs. Experiments with real and synthetic crowdsourcing data show that cost forecasting leads to improved accuracy. Accuracy and efficiency gains for crowd-generated microtasks promise to further leverage the creativity and wisdom of the crowd, with applications such as generating more informative and diverse training data for machine learning and improving the performance of user-generated content and question-answering platforms. Comment: 12 pages, 5 figures
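
The decision described in this abstract can be sketched as a one-line comparison. This is a minimal illustration only, not the paper's algorithm: the abstract does not say how costs are forecast or which existing task the forecast is compared against, so the function name `choose_action`, its parameters, and the choice to compare against the cheapest remaining existing task are all assumptions made here.

```python
def choose_action(forecast_new_task_cost, existing_task_costs):
    """Pick the crowdsourcer's next request (hypothetical helper).

    forecast_new_task_cost: forecast cost (e.g. expected number of
        responses) to complete a task the crowd has not proposed yet.
    existing_task_costs: estimated remaining costs of the unfinished
        tasks already in the task set.

    Assumption: the forecast is compared with the cheapest existing task.
    """
    if not existing_task_costs:
        # Nothing left to answer, so the only option is to elicit a task.
        return "elicit"
    cheapest_existing = min(existing_task_costs)
    if forecast_new_task_cost < cheapest_existing:
        return "elicit"   # a new task is forecast to be cheaper
    return "respond"      # keep collecting responses to existing tasks


print(choose_action(2.0, [3.0, 5.0]))
print(choose_action(4.0, [3.0, 5.0]))
```

The first call chooses to elicit a new task because its forecast cost is below every remaining task; the second chooses to collect another response instead.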

    Quality Control in Crowdsourcing: A Survey of Quality Attributes, Assessment Techniques and Assurance Actions

    Crowdsourcing enables one to leverage the intelligence and wisdom of potentially large groups of individuals toward solving problems. Common problems approached with crowdsourcing are labeling images, translating or transcribing text, providing opinions or ideas, and similar: all tasks that computers are not good at or where they may even fail altogether. The introduction of humans into computations and/or everyday work, however, also poses critical, novel challenges in terms of quality control, as the crowd is typically composed of people with unknown and very diverse abilities, skills, interests, personal objectives and technological resources. This survey studies quality in the context of crowdsourcing along several dimensions, so as to define and characterize it and to understand the current state of the art. Specifically, this survey derives a quality model for crowdsourcing tasks and identifies the methods and techniques that can be used to assess the attributes of the model, as well as the actions and strategies that help prevent and mitigate quality problems. An analysis of how these features are supported by the state of the art further identifies open issues and informs an outlook on hot future research directions. Comment: 40 pages main paper, 5 pages appendix

    Accurate inference about crowdsourcing problems when using efficient allocation strategies

    Crowdsourcing is a modern technique to solve complex and computationally challenging sets of problems using the abilities of human participants [1]. However, human participants are relatively expensive compared with computational methods, so considerable research has investigated algorithmic strategies for efficiently distributing problems to participants and determining when problems have been sufficiently completed [2, 3, 4]. Allocation strategies improve the efficiency of crowdsourcing by decreasing the work needed to complete individual problems. We show that allocation algorithms introduce bias by allocating workers to easy tasks at the expense of difficult tasks and by ceasing to obtain information about tasks once the algorithm has concluded. As a result, data gathered with allocation algorithms are biased and not representative of the true distribution of those data. This bias challenges inference of crowdsourcing features such as typical task difficulty or worker completion times. To study crowdsourcing algorithms and problem bias, we introduce a model for crowdsourcing a set of problems where we can tune the distribution of problem difficulty. We then apply an allocation algorithm, Requallo [2], to our model and find that the distribution of problem difficulty is biased: Requallo-completed tasks are more likely to be easy tasks and less likely to be hard tasks. Finally, we introduce an inference procedure, Decision-Explicit Probability Sampling (DEPS), to estimate the true problem difficulty distribution given only an allocation algorithm's responses, allowing us to reason about the larger problem space while leveraging the efficiency of the allocation method. Results on real and synthetic crowdsourcing classifications show that DEPS creates a more accurate representation of the underlying distribution than baseline methods. The ability to perform accurate inference when using non-representative data allows crowdsourcers to extract more knowledge out of a given budget.

    References
    [1] Brabham, D. C. (2008). Crowdsourcing as a model for problem solving: An introduction and cases. Convergence, 14(1), 75-90.
    [2] Li, Q., Ma, F., Gao, J., Su, L., & Quinn, C. J. (2016). Crowdsourcing high quality labels with a tight budget. In WSDM '16, ACM.
    [3] Chen, X., Lin, Q., & Zhou, D. (2013). Optimistic knowledge gradient policy for optimal budget allocation in crowdsourcing. In International Conference on Machine Learning.
    [4] McAndrew, T. C., Guseva, E. A., & Bagrow, J. P. (2017). Reply & Supply: Efficient crowdsourcing when workers do more than answer questions. PLoS ONE, 12(8), e0182662.
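
The bias described in this abstract can be reproduced with a small simulation. This is a sketch under stated assumptions, not the paper's model, and it does not implement Requallo or DEPS: the uniform difficulty distribution, the vote-margin stopping rule, and every name and parameter (`simulate_allocation`, `margin`, `max_responses`) are hypothetical stand-ins for a generic allocation strategy that stops asking about a task once it looks decided.

```python
import random


def simulate_allocation(num_tasks=2000, margin=4, max_responses=12, seed=0):
    """Simulate binary tasks under a simple stopping rule (assumed, not Requallo).

    Each task has a probability p_correct in [0.5, 1.0] that a worker answers
    correctly; difficulty is defined here as 1 - p_correct. A task counts as
    completed once correct and incorrect votes differ by `margin`; otherwise
    it is abandoned after `max_responses` answers.
    """
    rng = random.Random(seed)
    all_difficulties = []
    completed_difficulties = []
    for _ in range(num_tasks):
        p_correct = rng.uniform(0.5, 1.0)
        difficulty = 1.0 - p_correct
        all_difficulties.append(difficulty)
        lead = 0  # (# correct votes) - (# incorrect votes)
        for _ in range(max_responses):
            lead += 1 if rng.random() < p_correct else -1
            if abs(lead) >= margin:
                completed_difficulties.append(difficulty)
                break
    return all_difficulties, completed_difficulties


def mean(values):
    return sum(values) / len(values)


all_d, done_d = simulate_allocation()
print(f"mean difficulty, all tasks:       {mean(all_d):.3f}")
print(f"mean difficulty, completed tasks: {mean(done_d):.3f}")
```

Because hard tasks rarely reach the vote margin within the response budget, the completed set is skewed toward easy tasks, so an estimate of typical difficulty taken from completed tasks alone understates the true value. Correcting that kind of skew is the role the abstract assigns to DEPS.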