23 research outputs found
Active classification with comparison queries
We study an extension of active learning in which the learning algorithm may
ask the annotator to compare the distances of two examples from the boundary of
their label-class. For example, in a recommendation system application (say for
restaurants), the annotator may be asked whether she liked or disliked a
specific restaurant (a label query), or which of two restaurants she liked
more (a comparison query).
We focus on the class of half spaces, and show that under natural
assumptions, such as large margin or bounded bit-description of the input
examples, it is possible to reveal all the labels of a sample of size $n$ using
approximately $O(\log n)$ queries. This implies an exponential improvement over
classical active learning, where only label queries are allowed. We complement
these results by showing that if any of these assumptions is removed then, in
the worst case, $\Omega(n)$ queries are required.
Our results follow from a new general framework of active learning with
additional queries. We identify a combinatorial dimension, called the
\emph{inference dimension}, that captures the query complexity when each
additional query is determined by $O(1)$ examples (such as comparison queries,
each of which is determined by the two compared examples). Our results for half
spaces follow by bounding the inference dimension in the cases discussed above.
Comment: 23 pages (not including references), 1 figure. The new version
contains a minor fix in the proof of Lemma 4.
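The role of the two query types can be illustrated with a small Python sketch. It is not the algorithm from the paper: it sorts the sample with comparison queries and then binary-searches for the decision boundary with label queries, so it uses only about log2(n) label queries but on the order of n log n comparison queries. The oracle below, its names (`make_oracles`, `infer_labels`), and the modelling of a comparison query as a comparison of signed scores w.x + b are all assumptions made for illustration.

```python
from functools import cmp_to_key


def make_oracles(w, b):
    """Hypothetical annotator for a hidden half space sign(w.x + b)."""
    counts = {"label": 0, "comparison": 0}

    def score(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    def label(x):
        # Label query: which side of the boundary is x on?
        counts["label"] += 1
        return 1 if score(x) >= 0 else -1

    def compare(x, y):
        # Comparison query (assumed form): -1, 0 or 1 depending on whether
        # x has a lower, equal or higher signed score than y.
        counts["comparison"] += 1
        sx, sy = score(x), score(y)
        return (sx > sy) - (sx < sy)

    return label, compare, counts


def infer_labels(points, label, compare):
    """Recover every label using comparisons plus a few label queries."""
    # Order sample indices from most negative to most positive score.
    order = sorted(
        range(len(points)),
        key=cmp_to_key(lambda i, j: compare(points[i], points[j])),
    )
    # Labels are monotone along this order, so binary search finds the
    # first positive example.
    lo, hi = 0, len(order)
    while lo < hi:
        mid = (lo + hi) // 2
        if label(points[order[mid]]) == 1:
            hi = mid
        else:
            lo = mid + 1
    labels = [0] * len(points)
    for rank, idx in enumerate(order):
        labels[idx] = 1 if rank >= lo else -1
    return labels
```

Once the examples are ordered by signed score, a single positive label implies that every example ranked above it is positive as well. The sketch uses this kind of inference in its simplest form.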
Beyond Disagreement-based Agnostic Active Learning
We study agnostic active learning, where the goal is to learn a classifier in
a pre-specified hypothesis class interactively with as few label queries as
possible, while making no assumptions on the true function generating the
labels. The main algorithms for this problem are \emph{disagreement-based active
learning}, which has a high label requirement, and \emph{margin-based active
learning}, which only applies to fairly restricted settings. A major challenge
is to find an algorithm which achieves better label complexity, is consistent
in an agnostic setting, and applies to general classification problems.
In this paper, we provide such an algorithm. Our solution is based on two
novel contributions -- a reduction from consistent active learning to
confidence-rated prediction with guaranteed error, and a novel confidence-rated
predictor
Online Active Learning of Reject Option Classifiers
Active learning is an important technique to reduce the number of labeled
examples in supervised learning. Active learning for binary classification has
been well addressed in machine learning. However, active learning of the reject
option classifier remains unaddressed. In this paper, we propose novel
algorithms for active learning of reject option classifiers. We develop an
active learning algorithm using the double ramp loss function. We provide mistake
bounds for this algorithm. We also propose a new loss function for the reject
option, called the double sigmoid loss function, and a corresponding active
learning algorithm. We offer a convergence guarantee for this algorithm. We provide
extensive experimental results to show the effectiveness of the proposed
algorithms. The proposed algorithms efficiently reduce the number of labeled
examples required
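
For readers unfamiliar with the setting, the following Python sketch shows a common formulation of a reject option classifier. It is not taken from the abstract above and does not implement the double ramp or double sigmoid losses. The rejection band width `rho`, the rejection cost `d`, and the function names are assumptions made for illustration. The classifier abstains when the score falls inside the band [-rho, rho], and abstaining costs d, typically between 0 and 0.5.

```python
REJECT = 0  # assumed encoding of the "abstain" decision


def predict_with_reject(score, rho):
    """Return +1, -1, or REJECT for a real-valued score f(x)."""
    if score > rho:
        return 1
    if score < -rho:
        return -1
    return REJECT


def zero_d_one_loss(y, score, rho, d):
    """Loss of 0 if correct, d if rejected, 1 if misclassified."""
    pred = predict_with_reject(score, rho)
    if pred == REJECT:
        return d
    return 0.0 if pred == y else 1.0
```

Surrogate losses for this setting are usually designed as continuous stand-ins for this discontinuous 0-d-1 loss.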