121,555 research outputs found
Batch Reinforcement Learning from Crowds
A shortcoming of batch reinforcement learning is its requirement for rewards
in the data, which makes it inapplicable to tasks without reward functions.
Existing settings that cope with the lack of rewards, such as behavioral
cloning, rely on optimal demonstrations collected from humans. Unfortunately,
extensive expertise is required to ensure optimality, which hinders the
acquisition of large-scale data for complex tasks. This paper addresses the
lack of rewards in a batch reinforcement learning setting by learning a
reward function from preferences. Generating preferences requires only a
basic understanding of a task and, being a mental process, is faster than
performing demonstrations, so preferences can be collected at scale from
non-expert humans using crowdsourcing. This paper tackles a critical
challenge that arises when collecting data from non-expert humans: the noise
in preferences. A novel probabilistic model is proposed for modelling the
reliability of labels, which utilizes labels collaboratively. Moreover, the
proposed model smooths the estimation with a learned reward function.
Evaluation on Atari datasets demonstrates the effectiveness of the proposed
model, followed by an ablation study analyzing the relative importance of the
proposed ideas.
Comment: 16 pages. Accepted by ECML-PKDD 202
Label Selection Approach to Learning from Crowds
Supervised learning, especially supervised deep learning, requires large
amounts of labeled data. One approach to collecting large amounts of labeled
data is to use a crowdsourcing platform where numerous workers perform the
annotation tasks. However, the annotation results often contain label noise,
as annotation quality varies with each crowd worker's skill and ability to
complete the task correctly. Learning from Crowds is a framework that
directly trains models using noisy labeled data from crowd workers. In this
study, we propose a novel Learning from Crowds model inspired by
SelectiveNet, which was proposed for the selective prediction problem. The
proposed method, called Label Selection Layer, trains a prediction model by
using a selector network to automatically determine whether a worker's label
should be used for training. A major advantage of the proposed method is that
it can be applied to almost all variants of supervised learning problems by
simply adding a selector network and changing the objective function of
existing models, without explicitly assuming a model of the noise in crowd
annotations. The experimental results show that the performance of the
proposed method is almost equivalent to or better than that of the Crowd
Layer, one of the state-of-the-art methods for Deep Learning from Crowds,
except in the regression case.
Comment: 15 pages, 1 figure
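One plausible reading of the described design is a per-worker selection head
that gates each noisy label's contribution to the loss. The sketch below is a
hedged illustration under that assumption; the class and tensor names are
invented here, and the paper's exact Label Selection Layer may differ (for
instance, SelectiveNet also uses a coverage regularizer, which is omitted).

```python
# Hedged sketch: a selector network gating per-worker labels in the training loss.
import torch
import torch.nn as nn

class SelectorGatedLoss(nn.Module):
    def __init__(self, feat_dim: int, n_workers: int):
        super().__init__()
        # one selection score per worker, conditioned on the instance features
        self.selector = nn.Linear(feat_dim, n_workers)

    def forward(self, features, logits, worker_labels, worker_mask):
        """
        features:      (batch, feat_dim)   backbone features
        logits:        (batch, n_classes)  main model predictions
        worker_labels: (batch, n_workers)  class index per worker (any value if missing)
        worker_mask:   (batch, n_workers)  1.0 where the worker actually labeled
        """
        select = torch.sigmoid(self.selector(features)) * worker_mask
        n_classes = logits.size(1)
        expanded = logits.unsqueeze(1).expand(-1, worker_labels.size(1), -1)
        ce = nn.functional.cross_entropy(
            expanded.reshape(-1, n_classes),
            worker_labels.clamp(min=0).reshape(-1),
            reduction="none",
        ).view(worker_labels.shape)
        # weight each worker's loss by its selection score, normalized by selection mass
        return (select * ce).sum() / select.sum().clamp(min=1e-8)
```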
Learning from Crowds by Modeling Common Confusions
Crowdsourcing provides a practical way to obtain large amounts of labeled
data at a low cost. However, annotation quality varies considerably across
annotators, which imposes new challenges in learning a high-quality model
from crowdsourced annotations. In this work, we provide a new perspective
that decomposes annotation noise into common noise and individual noise and
differentiates the source of confusion based on instance difficulty and
annotator expertise on a per-instance, per-annotator basis. We realize this
new crowdsourcing model as an end-to-end learning solution with two types of
noise adaptation layers: one is shared across annotators to capture their
commonly shared confusions, and the other is specific to each annotator to
capture individual confusion. To recognize the source of noise in each
annotation, we use an auxiliary network to choose between the two noise
adaptation layers with respect to both instances and annotators. Extensive
experiments on both synthesized and real-world benchmarks demonstrate the
effectiveness of our proposed common noise adaptation solution.
Comment: Accepted by AAAI 202
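Read literally, the architecture amounts to a shared confusion (transition)
matrix, a per-annotator confusion matrix, and an auxiliary network that mixes
the two for each instance-annotator pair. The sketch below is an assumed
reconstruction along those lines, not the authors' released code; all names
are illustrative.

```python
# Illustrative sketch: common + individual noise adaptation layers with an
# auxiliary gating network (assumed details, not the paper's implementation).
import torch
import torch.nn as nn

class CommonIndividualNoise(nn.Module):
    def __init__(self, feat_dim: int, n_classes: int, n_annotators: int):
        super().__init__()
        eye = torch.eye(n_classes)
        self.common = nn.Parameter(eye.clone())                         # shared confusion
        self.individual = nn.Parameter(eye.repeat(n_annotators, 1, 1))  # one per annotator
        self.gate = nn.Linear(feat_dim + n_annotators, 1)               # auxiliary chooser

    def forward(self, features, clean_probs, annotator_onehot):
        """
        features:         (batch, feat_dim)
        clean_probs:      (batch, n_classes)     softmax output of the base classifier
        annotator_onehot: (batch, n_annotators)
        Returns the predicted distribution over the annotator's observed (noisy) label.
        """
        T_common = torch.softmax(self.common, dim=-1)                   # rows sum to 1
        T_indiv = torch.softmax(self.individual, dim=-1)
        T_a = torch.einsum("ba,acd->bcd", annotator_onehot, T_indiv)    # annotator's matrix
        w = torch.sigmoid(self.gate(torch.cat([features, annotator_onehot], dim=-1)))
        noisy_common = clean_probs @ T_common
        noisy_indiv = torch.einsum("bc,bcd->bd", clean_probs, T_a)
        return w * noisy_common + (1.0 - w) * noisy_indiv
```

Training would then apply a standard cross-entropy between this output and
the annotator's observed label, back-propagating through both layers and the
gate.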
Identifying fake Amazon reviews as learning from crowds
Customers who buy products such as books online often rely on other
customers' reviews more than on reviews found in specialist magazines.
Unfortunately, the confidence placed in such reviews is often misplaced due
to the explosion of so-called sock puppetry: authors writing glowing reviews
of their own books. Identifying such deceptive reviews is not easy. The first
contribution of our work is the creation of a collection including a number
of genuinely deceptive Amazon book reviews, built in collaboration with crime
writer Jeremy Duns, who has devoted a great deal of effort to unmasking sock
puppetry among his colleagues. But there can be no certainty concerning the
other reviews in the collection: all we have is a number of cues, also
developed in collaboration with Duns, suggesting that a review may be genuine
or deceptive. This corpus is thus an example of a collection where the true
label cannot be acquired for all instances, and where the cues of deception
are treated as annotators that assign heuristic labels. A number of
approaches have been proposed for such cases; here we adopt the 'learning
from crowds' approach proposed by Raykar et al. (2010). Thanks to the reviews
Duns identified as certainly fake, the second contribution of this work is an
evaluation of the effectiveness of different annotation methods, measured by
the performance of models trained to detect deceptive reviews.
© 2014 Association for Computational Linguistics
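Since the deception cues are treated as crowd annotators, the Raykar et al.
(2010) model applies directly: it jointly estimates each "annotator's"
sensitivity and specificity together with a posterior over the true label.
Below is a compact, simplified EM sketch of the binary case, with the
classifier's probability treated as fixed within the loop; the variable names
and the simplification are mine, not the paper's.

```python
# Simplified sketch of the binary learning-from-crowds EM of Raykar et al. (2010).
import numpy as np

def raykar_em(labels, p_clf, n_iter=50):
    """
    labels: (n_items, n_annotators) array in {0, 1}; here each deception cue
            would play the role of one annotator column.
    p_clf:  (n_items,) classifier estimate of P(y = 1 | x), held fixed here.
    Returns the posterior P(y = 1) per item, plus per-annotator sensitivity
    alpha and specificity beta.
    """
    mu = labels.mean(axis=1)  # initialize soft labels from the vote rate
    for _ in range(n_iter):
        # M-step: sensitivity/specificity of each annotator given soft labels mu
        alpha = (mu[:, None] * labels).sum(axis=0) / np.clip(mu.sum(), 1e-8, None)
        beta = ((1 - mu)[:, None] * (1 - labels)).sum(axis=0) / np.clip((1 - mu).sum(), 1e-8, None)
        # E-step: posterior over the true label, combining classifier and annotators
        a = np.prod(alpha ** labels * (1 - alpha) ** (1 - labels), axis=1)
        b = np.prod(beta ** (1 - labels) * (1 - beta) ** labels, axis=1)
        mu = a * p_clf / np.clip(a * p_clf + b * (1 - p_clf), 1e-8, None)
    return mu, alpha, beta
```

In the full method the classifier is re-fit on the soft labels mu inside each
iteration rather than held fixed.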