This paper addresses the repeated acquisition of labels for data items
when the labeling is imperfect. We examine the improvement (or lack
thereof) in data quality via repeated labeling, and focus especially on
the improvement of training labels for supervised induction. With the
outsourcing of small tasks becoming easier, for example via Amazon's
Mechanical Turk, it is often possible to obtain less-than-expert
labeling at low cost. With low-cost labeling, preparing the unlabeled
part of the data can become considerably more expensive than labeling.
We present repeated-labeling strategies of increasing complexity, and
show several main results. (i) Repeated labeling can improve label
quality and model quality, but not always. (ii) When labels are noisy,
repeated labeling can be preferable to single labeling even in the
traditional setting where labels are not particularly cheap. (iii) As
soon as processing the unlabeled data is not free, even the
simple strategy of labeling everything multiple times can give
considerable advantage. (iv) Repeatedly labeling a carefully chosen set
of points is generally preferable, and we present a set of robust
techniques that combine different notions of uncertainty to select data
points for which quality should be improved. The bottom line: the
results show clearly that when labeling is not perfect, selective
acquisition of multiple labels is a strategy that data miners should
have in their repertoire. For certain label-quality/cost regimes, the
benefit is substantial.

This work was supported by the National Science Foundation under Grant
No. IIS-0643846, by an NSERC Postdoctoral Fellowship, and by an NEC
Faculty Fellowship.