We propose a learning setting in which unlabeled data is free, and the cost
of a label depends on its value, which is not known in advance. We study binary
classification in an extreme case, where the algorithm only pays for negative
labels. Our motivation comes from applications such as fraud detection, in which
investigating an honest transaction should be avoided if possible. We term the
setting auditing, and consider the auditing complexity of an algorithm: the
number of negative labels the algorithm requires in order to learn a hypothesis
with low relative error. We design auditing algorithms for simple hypothesis
classes (thresholds and rectangles), and show that with these algorithms, the
auditing complexity can be significantly lower than the active label
complexity. We also discuss a general competitive approach for auditing and
possible modifications to the framework.
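
To make the cost model concrete for thresholds, the following is a minimal sketch under stated assumptions, not necessarily the algorithm analyzed in the paper. It assumes the realizable case, in which a point is positive exactly when it lies at or above an unknown threshold, and it scans the sample from the largest point downward, stopping at the first negative label. The names `audit_threshold` and `label_of` are hypothetical and introduced only for illustration.

```python
def audit_threshold(points, label_of):
    """Scan points from largest to smallest and stop at the first negative.

    Assumption (illustrative): labels are realizable by a threshold, i.e.
    label_of(x) is True exactly when x >= t for some unknown t.
    Returns (estimated_threshold, negative_labels_paid, total_queries).
    """
    negatives_paid = 0
    queries = 0
    # Until a positive is observed, predict negative everywhere.
    threshold = float("inf")
    for x in sorted(points, reverse=True):
        queries += 1
        if label_of(x):
            # Positive labels are free in the auditing cost model.
            threshold = x
        else:
            # Only negative labels are paid for; under the threshold
            # assumption every smaller point is also negative, so stop.
            negatives_paid += 1
            break
    return threshold, negatives_paid, queries


# Usage: six sample points, hypothetical true threshold 0.5.
points = [0.1, 0.4, 0.35, 0.8, 0.6, 0.95]
result = audit_threshold(points, lambda x: x >= 0.5)
print(result)
```

In this sketch the scan pays for at most one negative label regardless of the sample size, whereas the number of free positive queries can grow with the sample. That asymmetry is what the auditing complexity is meant to capture.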