Efficient crowdsourcing of crowd-generated microtasks
Allowing members of the crowd to propose novel microtasks for one another is
an effective way to combine the efficiencies of traditional microtask work with
the inventiveness and hypothesis generation potential of human workers.
However, microtask proposal leads to a growing set of tasks that may overwhelm
limited crowdsourcer resources. Crowdsourcers can employ methods to utilize
their resources efficiently, but algorithmic approaches to efficient
crowdsourcing generally require a fixed task set of known size. In this paper,
we introduce *cost forecasting* as a means for a crowdsourcer to use efficient
crowdsourcing algorithms with a growing set of microtasks. Cost forecasting
allows the crowdsourcer to decide between eliciting new tasks from the crowd or
receiving responses to existing tasks based on whether or not new tasks will
cost less to complete than existing tasks, efficiently balancing resources as
crowdsourcing occurs. Experiments with real and synthetic crowdsourcing data
show that cost forecasting leads to improved accuracy. Accuracy and efficiency
gains for crowd-generated microtasks hold the promise to further leverage the
creativity and wisdom of the crowd, with applications such as generating more
informative and diverse training data for machine learning applications and
improving the performance of user-generated content and question-answering
platforms.
Comment: 12 pages, 5 figures
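The decision rule described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not say how costs are forecast, so the sketch assumes cost is measured in worker responses and forecasts the cost of a new task as the average cost of previously finished tasks. The function name `choose_action` and both of its parameters are hypothetical.

```python
from statistics import mean


def choose_action(finished_task_costs, open_task_remaining_costs):
    """Pick the next step: 'elicit' a new task or 'respond' to an existing one.

    Both arguments are hypothetical inputs, measured in worker responses:
    finished_task_costs       -- responses each completed task ended up needing
    open_task_remaining_costs -- estimated responses still needed per open task
    """
    if not open_task_remaining_costs:
        return "elicit"   # nothing left to answer, so ask the crowd for a new task
    if not finished_task_costs:
        return "respond"  # no history to forecast from yet
    # Assumption: a naive forecast, the mean cost of tasks finished so far.
    forecast_new_cost = mean(finished_task_costs)
    expected_existing_cost = mean(open_task_remaining_costs)
    # Elicit only when a new task is forecast to be cheaper than existing work.
    return "elicit" if forecast_new_cost < expected_existing_cost else "respond"
```

For example, `choose_action([2, 3], [6, 8])` elicits a new task, because new tasks are forecast to need fewer responses than the open ones, while `choose_action([9, 11], [2, 4])` keeps collecting responses to existing tasks.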
Quality Control in Crowdsourcing: A Survey of Quality Attributes, Assessment Techniques and Assurance Actions
Crowdsourcing enables one to leverage the intelligence and wisdom of
potentially large groups of individuals toward solving problems. Common
problems approached with crowdsourcing are labeling images, translating or
transcribing text, providing opinions or ideas, and similar - all tasks that
computers are not good at or where they may even fail altogether. The
introduction of humans into computations and/or everyday work, however, also
poses critical, novel challenges in terms of quality control, as the crowd is
typically composed of people with unknown and very diverse abilities, skills,
interests, personal objectives and technological resources. This survey studies
quality in the context of crowdsourcing along several dimensions, so as to
define and characterize it and to understand the current state of the art.
Specifically, this survey derives a quality model for crowdsourcing tasks,
identifies the methods and techniques that can be used to assess the attributes
of the model, and the actions and strategies that help prevent and mitigate
quality problems. An analysis of how these features are supported by the state
of the art further identifies open issues and informs an outlook on hot future
research directions.
Comment: 40 pages main paper, 5 pages appendix
Accurate inference about crowdsourcing problems when using efficient allocation strategies
Crowdsourcing is a modern technique to solve complex and computationally challenging sets of problems using the abilities of human participants [1]. However, human participants are relatively expensive compared with computational methods, so considerable research has investigated algorithmic strategies for efficiently distributing problems to participants and determining when problems have been sufficiently completed [2, 3, 4]. Allocation strategies improve the efficiency of crowdsourcing by decreasing the work needed to complete individual problems. We show that allocation algorithms introduce bias by allocating workers to easy tasks at the expense of difficult tasks and by ceasing to obtain information about tasks once the algorithm has concluded. As a result, data gathered with allocation algorithms are biased and not representative of the true distribution of those data. This bias challenges inference of crowdsourcing features such as typical task difficulty or worker completion times. To study crowdsourcing algorithms and problem bias, we introduce a model for crowdsourcing a set of problems where we can tune the distribution of problem difficulty. We then apply an allocation algorithm, Requallo [2], to our model and find that the distribution of problem difficulty is biased: Requallo-completed tasks are more likely to be easy tasks and less likely to be hard tasks. Finally, we introduce an inference procedure, Decision-Explicit Probability Sampling (DEPS), to estimate the true problem difficulty distribution given only an allocation algorithm’s responses, allowing us to reason about the larger problem space while leveraging the efficiency of the allocation method. Results on real and synthetic crowdsourcing classifications show that DEPS creates a more accurate representation of the underlying distribution than baseline methods.
The ability to perform accurate inference when using non-representative data allows crowdsourcers to extract more knowledge out of a given budget.
References
[1] Brabham, D. C. (2008). Crowdsourcing as a model for problem solving: An introduction and cases. Convergence, 14(1), 75-90.
[2] Li, Q., Ma, F., Gao, J., Su, L., & Quinn, C. J. (2016). Crowdsourcing high quality labels with a tight budget. In WSDM’16, ACM.
[3] Chen, X., Lin, Q., & Zhou, D. (2013). Optimistic knowledge gradient policy for optimal budget allocation in crowdsourcing. In International Conference on Machine Learning.
[4] McAndrew, T. C., Guseva, E. A., & Bagrow, J. P. (2017). Reply & Supply: Efficient crowdsourcing when workers do more than answer questions. PLoS ONE, 12(8), e0182662.
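The allocation bias described in the abstract above can be illustrated with a small simulation. This is a sketch under stated assumptions, not Requallo or DEPS: it uses a simple stand-in stopping rule (a task counts as completed once one answer leads by `margin` responses, within a cap of `max_responses`), and the difficulty measure, function name, and parameters are all hypothetical.

```python
import random


def simulate_allocation_bias(n_tasks=5000, margin=3, max_responses=5, seed=0):
    """Return (mean difficulty of all tasks, mean difficulty of completed tasks)."""
    rng = random.Random(seed)
    all_difficulty, completed_difficulty = [], []
    for _task in range(n_tasks):
        q = rng.random()                 # P(a worker answers "yes") for this task
        difficulty = 1 - abs(2 * q - 1)  # 0 = workers agree, 1 = coin flip
        all_difficulty.append(difficulty)
        lead = 0                         # "yes" count minus "no" count
        for _response in range(max_responses):
            lead += 1 if rng.random() < q else -1
            if abs(lead) >= margin:      # stand-in rule: confident enough, stop
                completed_difficulty.append(difficulty)
                break
    overall = sum(all_difficulty) / len(all_difficulty)
    completed = sum(completed_difficulty) / len(completed_difficulty)
    return overall, completed


overall_mean, completed_mean = simulate_allocation_bias()
print(f"mean difficulty, all tasks:       {overall_mean:.3f}")
print(f"mean difficulty, completed tasks: {completed_mean:.3f}")
```

Under these assumptions the completed tasks skew easy, because tasks on which workers disagree rarely reach the stopping margin before the response cap. An estimate of typical task difficulty drawn only from completed tasks would therefore be too low, which is the kind of bias the abstract says an inference procedure must correct for.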