Deep learning from crowds
Over the last few years, deep learning has revolutionized the field of
machine learning by dramatically improving the state-of-the-art in various
domains. However, as the size of supervised artificial neural networks grows,
typically so does the need for larger labeled datasets. Recently, crowdsourcing
has established itself as an efficient and cost-effective solution for labeling
large sets of data in a scalable manner, but it often requires aggregating
labels from multiple noisy contributors with different levels of expertise. In
this paper, we address the problem of learning deep neural networks from
crowds. We begin by describing an EM algorithm for jointly learning the
parameters of the network and the reliabilities of the annotators. Then, a
novel general-purpose crowd layer is proposed, which allows us to train deep
neural networks end-to-end, directly from the noisy labels of multiple
annotators, using only backpropagation. We empirically show that the proposed
approach is able to internally capture the reliability and biases of different
annotators and achieve new state-of-the-art results for various crowdsourced
datasets across different settings, namely classification, regression and
sequence labeling.

Comment: 10 pages, The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018
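The crowd layer described above can be sketched numerically: one learned matrix per annotator maps the base network's logits to that annotator's predicted label distribution, and the loss is computed only on the (annotator, example) pairs that were actually labeled. This is a minimal NumPy sketch loosely following the matrix ("MW") variant with identity initialization; the function names and shapes are illustrative assumptions, not the authors' code.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def crowd_layer(base_logits, annotator_mats):
    """Per-annotator linear map on the base network's logits:
    annotator r's predicted distribution is softmax(W_r @ logits).
    base_logits: (batch, K); annotator_mats: (R, K, K) -> (R, batch, K)."""
    return softmax(np.einsum('rkc,bc->rbk', annotator_mats, base_logits))

def masked_nll(annotator_probs, labels):
    """Cross-entropy averaged only over labels each annotator actually
    provided; missing labels are marked with -1."""
    R, B, _ = annotator_probs.shape
    loss, n = 0.0, 0
    for r in range(R):
        for b in range(B):
            y = labels[r, b]
            if y >= 0:  # skip unlabeled (annotator, example) pairs
                loss -= np.log(annotator_probs[r, b, y] + 1e-12)
                n += 1
    return loss / max(n, 1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 3))          # base network outputs, 4 examples
mats = np.stack([np.eye(3)] * 2)          # 2 annotators, identity init
probs = crowd_layer(logits, mats)
```

With identity-initialized matrices each annotator's prediction starts equal to the base model's; training by backpropagation then lets each matrix absorb that annotator's systematic confusions.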
Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa
This paper presents a generic Bayesian framework that enables any deep
learning model to actively learn from targeted crowds. Our framework inherits
from recent advances in Bayesian deep learning, and extends existing work by
considering the targeted crowdsourcing approach, where multiple annotators with
unknown expertise contribute an uncontrolled (often limited) number of
annotations. Our framework leverages the low-rank structure in annotations to
learn individual annotator expertise, which then helps to infer the true labels
from noisy and sparse annotations. It provides a unified Bayesian model to
simultaneously infer the true labels and train the deep learning model in order
to reach an optimal learning efficacy. Finally, our framework exploits the
uncertainty of the deep learning model during prediction as well as the
annotators' estimated expertise to minimize the number of required annotations
and annotators for optimally training the deep learning model.
We evaluate the effectiveness of our framework for intent classification in
Alexa (Amazon's personal assistant), using both synthetic and real-world
datasets. Experiments show that our framework can accurately learn annotator
expertise, infer true labels, and effectively reduce the amount of annotations
in model training as compared to state-of-the-art approaches. We further
discuss the potential of our proposed framework in bridging machine learning
and crowdsourcing towards improved human-in-the-loop systems.
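A greatly simplified sketch of the acquisition step: rank unlabeled examples by predictive entropy as an uncertainty proxy, and route each query to the annotator with the highest estimated reliability. This greedy heuristic only stands in for the paper's full Bayesian treatment; `annotator_acc`, `select_queries`, and the routing rule are assumptions for illustration.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of the model's predictive distribution per example
    (higher entropy = more uncertain = more informative to label)."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

def select_queries(probs, annotator_acc, k):
    """Pick the k most uncertain examples and assign each to the
    annotator with the highest estimated accuracy (greedy stand-in
    for a Bayesian acquisition rule)."""
    idx = np.argsort(-predictive_entropy(probs))[:k]
    best = int(np.argmax(annotator_acc))
    return [(int(i), best) for i in idx]
```

In the paper's setting the annotator reliabilities themselves would be posterior estimates updated as annotations arrive, rather than fixed accuracies.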
Label Selection Approach to Learning from Crowds
Supervised learning, especially supervised deep learning, requires large
amounts of labeled data. One approach to collect large amounts of labeled data
is by using a crowdsourcing platform where numerous workers perform the
annotation tasks. However, the annotation results often contain label noise, as
the annotation skills vary depending on the crowd workers and their ability to
complete the task correctly. Learning from Crowds is a framework which directly
trains the models using noisy labeled data from crowd workers. In this study,
we propose a novel Learning from Crowds model, inspired by SelectiveNet
proposed for the selective prediction problem. The proposed method, called the
Label Selection Layer, trains a prediction model while automatically deciding,
via a selector network, whether to use each worker's label for training. A major
advantage of the proposed method is that it can be applied to almost all
variants of supervised learning problems by simply adding a selector network
and changing the objective function for existing models, without explicitly
assuming a model of the noise in crowd annotations. The experimental results
show that the performance of the proposed method is almost equivalent to or
better than the Crowd Layer, which is one of the state-of-the-art methods for
Deep Learning from Crowds, except for the regression problem case.

Comment: 15 pages, 1 figure
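The SelectiveNet-style objective behind the Label Selection Layer can be sketched as follows: a selector score g in (0, 1) weights each worker label's loss, the weighted loss is normalized by the mean score, and a penalty discourages selecting fewer than a target fraction of labels. The coverage/penalty form below mirrors SelectiveNet; the paper's exact objective and hyperparameters may differ, and the names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_loss(per_label_loss, selector_logits, coverage=0.8, lam=32.0):
    """SelectiveNet-style objective adapted to crowd labels:
    - g weights each worker label's contribution to the loss,
    - normalizing by sum(g) keeps the selected-label loss comparable,
    - the penalty keeps the mean selection rate above `coverage`,
      so the selector cannot trivially discard every noisy label."""
    g = sigmoid(selector_logits)
    weighted = np.sum(g * per_label_loss) / (np.sum(g) + 1e-12)
    penalty = lam * max(0.0, coverage - g.mean()) ** 2
    return weighted + penalty
```

Because the selector only reweights losses, this wraps around any supervised objective (classification, regression, sequence labeling) without modeling annotator noise explicitly, which matches the generality the abstract claims.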