Agnostic Active Learning Without Constraints
We present and analyze an agnostic active learning algorithm that works
without keeping a version space. This is unlike all previous approaches where a
restricted set of candidate hypotheses is maintained throughout learning, and
only hypotheses from this set are ever returned. By avoiding this version space
approach, our algorithm sheds the computational burden and brittleness
associated with maintaining version spaces, yet still allows for substantial
improvements over supervised learning for classification.
Active classification with comparison queries
We study an extension of active learning in which the learning algorithm may
ask the annotator to compare the distances of two examples from the boundary of
their label-class. For example, in a recommendation system application (say for
restaurants), the annotator may be asked whether she liked or disliked a
specific restaurant (a label query), or which of two restaurants she
liked more (a comparison query).
We focus on the class of half spaces, and show that under natural
assumptions, such as large margin or bounded bit-description of the input
examples, it is possible to reveal all the labels of a sample of size $n$ using
approximately $O(\log n)$ queries. This implies an exponential improvement over
classical active learning, where only label queries are allowed. We complement
these results by showing that if any of these assumptions is removed then, in
the worst case, $\Omega(n)$ queries are required.
Our results follow from a new general framework of active learning with
additional queries. We identify a combinatorial dimension, called the
\emph{inference dimension}, that captures the query complexity when each
additional query is determined by $O(1)$ examples (such as comparison queries,
each of which is determined by the two compared examples). Our results for half
spaces follow by bounding the inference dimension in the cases discussed above.
Comment: 23 pages (not including references), 1 figure. The new version
contains a minor fix in the proof of Lemma 4.
Online Importance Weight Aware Updates
An importance weight quantifies the relative importance of one example over
another, coming up in applications of boosting, asymmetric classification
costs, reductions, and active learning. The standard approach for dealing with
importance weights in gradient descent is via multiplication of the gradient.
We first demonstrate the problems of this approach when importance weights are
large, and argue in favor of more sophisticated ways for dealing with them. We
then develop an approach which enjoys an invariance property: updating
twice with importance weight $h$ is equivalent to updating once with importance
weight $2h$. For many important losses this has a closed form update which
satisfies standard regret guarantees when all examples have $h = 1$. We also
briefly discuss two other reasonable approaches for handling large importance
weights. Empirically, these approaches yield substantially superior prediction
with similar computational performance while reducing the sensitivity of the
algorithm to the exact setting of the learning rate. We apply these to online
active learning yielding an extraordinarily fast active learning algorithm that
works even in the presence of adversarial noise.
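
The invariance property in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a linear predictor with squared loss, (1/2)(p - y)^2, and uses the closed form obtained by treating an update with importance weight h as the limit of many infinitesimal gradient steps on the same example. The function names, the learning rate `eta`, and the sample values are hypothetical and chosen only for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(ai * bi for ai, bi in zip(a, b))


def naive_update(w, x, y, h, eta):
    """Standard approach: scale the squared-loss gradient by the weight h."""
    residual = dot(w, x) - y
    return [wi - eta * h * residual * xi for wi, xi in zip(w, x)]


def importance_aware_update(w, x, y, h, eta):
    """Closed-form limit of infinitesimal gradient steps (squared loss only).

    The residual (prediction minus label) decays by exp(-eta * h * ||x||^2),
    so the prediction approaches the label y without crossing it.
    """
    xx = dot(x, x)
    if xx == 0.0:
        return list(w)  # a zero feature vector carries no update
    residual = dot(w, x) - y
    scale = residual * (1.0 - math.exp(-eta * h * xx)) / xx
    return [wi - scale * xi for wi, xi in zip(w, x)]


if __name__ == "__main__":
    w0, x, y, eta, h = [0.5, -1.0], [1.0, 2.0], 1.0, 0.1, 3.0

    # Invariance: two updates with weight h versus one update with weight 2h.
    twice = importance_aware_update(
        importance_aware_update(w0, x, y, h, eta), x, y, h, eta
    )
    once = importance_aware_update(w0, x, y, 2 * h, eta)
    print("twice with h :", twice)
    print("once with 2h :", once)

    # A large weight makes the naive update overshoot the label.
    big_h = 100.0
    print("naive prediction :", dot(naive_update([0.0, 0.0], x, y, big_h, eta), x))
    print("aware prediction :", dot(importance_aware_update([0.0, 0.0], x, y, big_h, eta), x))
```

In this sketch the two importance-aware results agree up to floating-point error, because the residual shrinks by the same exponential factor either way. The naive update has no such property: with a large weight its prediction moves far past the label, which is the problem the abstract attributes to multiplying the gradient by the importance weight.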