Optimal Inference in Crowdsourced Classification via Belief Propagation
Crowdsourcing systems are popular for solving large-scale labelling tasks
with low-paid workers. We study the problem of recovering the true labels from
the possibly erroneous crowdsourced labels under the popular Dawid-Skene model.
To address this inference problem, several algorithms have recently been
proposed, but the best known guarantee is still significantly larger than the
fundamental limit. We close this gap by introducing a tighter lower bound on
the fundamental limit and proving that Belief Propagation (BP) exactly matches
this lower bound. The guaranteed optimality of BP is the strongest in the sense
that it is information-theoretically impossible for any other algorithm to
correctly label a larger fraction of the tasks. Experimental results suggest
that BP is close to optimal for all regimes considered and improves upon
competing state-of-the-art algorithms.
Comment: This article is partially based on preliminary results published in
the proceedings of the 33rd International Conference on Machine Learning (ICML
2016).
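To make the Dawid-Skene setting in the abstract above concrete, here is a minimal sketch of two simple aggregation baselines for binary tasks. It is not the Belief Propagation algorithm the paper analyzes. It assumes the "one-coin" special case, in which worker j answers correctly with probability p_j independently of the task, and it assumes those reliabilities are already known (in practice they must be inferred). The function names and the example numbers are illustrative only.

```python
import math

def majority_vote(labels):
    """Aggregate +1/-1 votes by simple majority; ties resolve to +1."""
    return 1 if sum(labels) >= 0 else -1

def weighted_vote(labels, reliabilities):
    """Log-odds weighted vote.

    Under the one-coin model with a uniform prior on the true label and
    KNOWN worker reliabilities p_j, weighting each vote by
    log(p_j / (1 - p_j)) gives the maximum a posteriori label.
    """
    score = sum(math.log(p / (1 - p)) * y
                for y, p in zip(labels, reliabilities))
    return 1 if score >= 0 else -1

# One reliable worker disagrees with two weak workers (hypothetical numbers).
votes = [+1, -1, -1]
reliabilities = [0.95, 0.6, 0.6]

print(majority_vote(votes))                 # prints -1
print(weighted_vote(votes, reliabilities))  # prints 1
```

The example shows why worker quality matters: the single reliable worker outweighs the two weak ones once votes are weighted by log-odds, whereas majority voting follows the weak workers.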
Learning from Crowds by Modeling Common Confusions
Crowdsourcing provides a practical way to obtain large amounts of labeled
data at a low cost. However, the annotation quality of annotators varies
considerably, which imposes new challenges in learning a high-quality model
from the crowdsourced annotations. In this work, we provide a new perspective
to decompose annotation noise into common noise and individual noise and
differentiate the source of confusion based on instance difficulty and
annotator expertise on a per-instance-annotator basis. We realize this new
crowdsourcing model by an end-to-end learning solution with two types of noise
adaptation layers: one is shared across annotators to capture their commonly
shared confusions, and the other one is pertaining to each annotator to realize
individual confusion. To recognize the source of noise in each annotation, we
use an auxiliary network to choose the two noise adaptation layers with respect
to both instances and annotators. Extensive experiments on both synthesized and
real-world benchmarks demonstrate the effectiveness of our proposed common
noise adaptation solution.
Comment: Accepted by AAAI 202
Iterative Bayesian Learning for Crowdsourced Regression
Crowdsourcing platforms have emerged as popular venues for purchasing human
intelligence at low cost for large volumes of tasks. As many low-paid workers
are prone to give noisy answers, a common practice is to add redundancy by
assigning multiple workers to each task and then simply average out these
answers. However, to fully harness the wisdom of the crowd, one needs to learn
the heterogeneous quality of each worker. We resolve this fundamental challenge
in crowdsourced regression tasks, i.e., tasks whose answers take continuous
values, where identifying good or bad workers is much harder than in
a classification setting with discrete labels. In particular, we introduce a
Bayesian iterative scheme and show that it provably achieves the optimal mean
squared error. Our evaluations on synthetic and real-world datasets support our
theoretical results and show the superiority of the proposed scheme.
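The contrast drawn in the abstract above, between plain averaging and learning each worker's quality, can be illustrated with a small sketch. This is not the Bayesian iterative scheme the paper proposes. It is a simpler, assumed stand-in that alternates between estimating each worker's noise variance from residuals and re-estimating each task by an inverse-variance weighted mean. The data layout, function names, and numbers are all hypothetical.

```python
def simple_average(answers):
    """answers: {task: {worker: value}}. Returns the plain mean per task."""
    return {t: sum(wa.values()) / len(wa) for t, wa in answers.items()}

def iterative_weighted_average(answers, n_iter=20, eps=1e-6):
    """Alternate between worker-variance and task-value estimates."""
    est = simple_average(answers)  # start from the plain mean
    workers = {w for wa in answers.values() for w in wa}
    for _ in range(n_iter):
        # Worker variance: mean squared residual against current estimates.
        sq = {w: 0.0 for w in workers}
        cnt = {w: 0 for w in workers}
        for t, wa in answers.items():
            for w, a in wa.items():
                sq[w] += (a - est[t]) ** 2
                cnt[w] += 1
        # eps guards against division by zero for a worker with no residual.
        var = {w: sq[w] / cnt[w] + eps for w in workers}
        # Task value: inverse-variance weighted mean of the answers.
        est = {
            t: sum(a / var[w] for w, a in wa.items())
               / sum(1.0 / var[w] for w in wa)
            for t, wa in answers.items()
        }
    return est

# Hypothetical data: workers "a" and "b" are accurate, "c" is noisy.
truth = {"t1": 10.0, "t2": 20.0, "t3": 30.0}
answers = {
    "t1": {"a": 10.1, "b": 9.9, "c": 16.0},
    "t2": {"a": 19.9, "b": 20.1, "c": 14.0},
    "t3": {"a": 30.1, "b": 29.9, "c": 36.0},
}
for name, est in [("simple", simple_average(answers)),
                  ("weighted", iterative_weighted_average(answers))]:
    worst = max(abs(est[t] - truth[t]) for t in truth)
    print(name, "worst-case error:", round(worst, 3))
```

On this toy data the weighted estimate ends up much closer to the truth than the plain mean, because the noisy worker is progressively down-weighted. A scheme this simple can over-trust a single worker, which is one reason a principled Bayesian treatment is of interest.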
Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification
Many real world problems can now be effectively solved using supervised
machine learning. A major roadblock is often the lack of an adequate quantity
of labeled data for training. A possible solution is to assign the task of
labeling data to a crowd, and then infer the true label using aggregation
methods. A well-known approach for aggregation is the Dawid-Skene (DS)
algorithm, which is based on the principle of Expectation-Maximization (EM). We
propose a new simple, yet effective, EM-based algorithm, which can be
interpreted as a 'hard' version of DS that allows much faster convergence
while maintaining similar accuracy in aggregation. We show the use of this
algorithm as a quick and effective technique for online, real-time sentiment
annotation. We also prove that our algorithm converges to the estimated labels
at a linear rate. Our experiments on standard datasets show a significant
speedup in the time taken for aggregation, up to 8x over Dawid-Skene and
6x over other fast EM methods, at competitive accuracy performance. The
code for the implementation of the algorithms can be found at
https://github.com/GoodDeeds/Fast-Dawid-Skene
Comment: 8 pages, 5 tables, 1 figure, KDD Workshop on Issues of Sentiment
Discovery and Opinion Mining (WISDOM) 201
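The idea of a "hard" EM variant of Dawid-Skene described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation (their code is at the URL given). It assumes that "hard" means each task is assigned a single most likely class in the E-step instead of a posterior distribution. The data layout, the `smoothing` constant, and the function name are hypothetical.

```python
import math
from collections import Counter

def hard_dawid_skene(votes, n_classes, max_iter=50, smoothing=0.01):
    """votes: {task: {worker: label}}, labels in range(n_classes).

    Returns {task: estimated label}.
    """
    # Initialise with majority vote (ties go to the smallest label).
    est = {}
    for t, wl in votes.items():
        counts = Counter(wl.values())
        est[t] = max(range(n_classes), key=lambda c: (counts[c], -c))
    workers = {w for wl in votes.values() for w in wl}

    for _ in range(max_iter):
        # M-step: class priors and per-worker confusion counts,
        # computed from the current hard labels (with additive smoothing).
        prior = [smoothing] * n_classes
        conf = {w: [[smoothing] * n_classes for _ in range(n_classes)]
                for w in workers}
        for t, wl in votes.items():
            prior[est[t]] += 1
            for w, label in wl.items():
                conf[w][est[t]][label] += 1
        total = sum(prior)

        # Hard E-step: pick the class with the highest log-posterior.
        new_est = {}
        for t, wl in votes.items():
            def score(c):
                s = math.log(prior[c] / total)
                for w, label in wl.items():
                    s += math.log(conf[w][c][label] / sum(conf[w][c]))
                return s
            new_est[t] = max(range(n_classes), key=score)

        if new_est == est:  # labels stopped changing: converged
            break
        est = new_est
    return est

# Hypothetical votes: workers "a" and "b" agree, "c" always disagrees.
votes = {
    "t1": {"a": 0, "b": 0, "c": 1},
    "t2": {"a": 1, "b": 1, "c": 0},
    "t3": {"a": 0, "b": 0, "c": 1},
    "t4": {"a": 1, "b": 1, "c": 0},
}
print(hard_dawid_skene(votes, n_classes=2))
# prints {'t1': 0, 't2': 1, 't3': 0, 't4': 1}
```

Because every task carries a single label, the M-step reduces to counting, which is one plausible source of the speedup the abstract reports. The stopping rule here (labels unchanged between iterations) is an assumption of this sketch.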
A Provably Improved Algorithm for Crowdsourcing with Hard and Easy Tasks
Crowdsourcing is a popular method used to estimate ground-truth labels by
collecting noisy labels from workers. In this work, we are motivated by
crowdsourcing applications where each worker can exhibit two levels of accuracy
depending on a task's type. Applying algorithms designed for the traditional
Dawid-Skene model to such a scenario results in performance which is limited by
the hard tasks. Therefore, we first extend the model to allow worker accuracy
to vary depending on a task's unknown type. Then we propose a spectral method
to partition tasks by type. After separating tasks by type, any Dawid-Skene
algorithm (i.e., any algorithm designed for the Dawid-Skene model) can be
applied independently to each type to infer the truth values. We theoretically
prove that when crowdsourced data contain tasks with varying levels of
difficulty, our algorithm infers the true labels with higher accuracy than any
Dawid-Skene algorithm. Experiments show that our method is effective in
practical applications.
A Task-Interdependency Model of Complex Collaboration Towards Human-Centered Crowd Work
Models of crowdsourcing and human computation often assume that individuals
independently carry out small, modular tasks. However, while these models have
successfully shown how crowds can accomplish significant objectives, they can
inadvertently advance a less than human view of crowd workers and fail to
capture the unique human capacity for complex collaborative work. We present a
model centered on interdependencies -- a phenomenon well understood to be at
the core of collaboration -- that allows one to formally reason about diverse
challenges to complex collaboration. Our model represents tasks as an
interdependent collection of subtasks, formalized as a task graph. We use it to
explain challenges to scaling complex collaborative work, underscore the
importance of expert workers, reveal critical factors for learning on the job,
and explore the relationship between coordination intensity and occupational
wages. Using data from O*NET and the Bureau of Labor Statistics, we introduce
an index of occupational coordination intensity to validate our theoretical
predictions. We present preliminary evidence that occupations with greater
coordination intensity are less exposed to displacement by AI, and discuss
opportunities for models that emphasize the collaborative capacities of human
workers, bridge models of crowd work and traditional work, and promote AI in
roles augmenting human collaboration.
Learning from the Crowd with Pairwise Comparison
Efficient learning of halfspaces is arguably one of the most important
problems in machine learning and statistics. With the unprecedented growth of
large-scale data sets, it has become ubiquitous to appeal to the crowd for data
annotation, and the central problem that has attracted a surge of recent interest
is how one can provably learn the underlying halfspace from the highly noisy
crowd feedback. On the other hand, a large body of recent work has been
dedicated to the problem of learning with not only labels, but also pairwise
comparisons, since in many cases it is easier to compare than to label. In this
paper we study the problem of learning halfspaces from the crowd under the
realizable PAC learning setting, and we assume that the crowd workers can
provide (noisy) labels or pairwise comparison tags upon request. We show that
with a powerful boosting framework, together with our novel design of a
filtering process, the overhead (to be defined) of the crowd acts as a
constant, whereas the natural extension of standard approaches to the crowd setting
leads to an overhead growing with the size of the data sets.