242 research outputs found
The Power of Localization for Efficiently Learning Linear Separators with Noise
We introduce a new approach for designing computationally efficient learning
algorithms that are tolerant to noise, and demonstrate its effectiveness by
designing algorithms with improved noise tolerance guarantees for learning
linear separators.
We consider both the malicious noise model and the adversarial label noise
model. For malicious noise, where the adversary can corrupt both the label and
the features, we provide a polynomial-time algorithm for learning linear
separators in under isotropic log-concave distributions that can
tolerate a nearly information-theoretically optimal noise rate of . For the adversarial label noise model, where the
distribution over the feature vectors is unchanged, and the overall probability
of a noisy label is constrained to be at most , we also give a
polynomial-time algorithm for learning linear separators in under
isotropic log-concave distributions that can handle a noise rate of .
We show that, in the active learning model, our algorithms achieve a label
complexity whose dependence on the error parameter is
polylogarithmic. This provides the first polynomial-time active learning
algorithm for learning linear separators in the presence of malicious noise or
adversarial label noise.Comment: Contains improved label complexity analysis communicated to us by
Steve Hannek
Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy
We describe a framework for designing efficient active learning algorithms
that are tolerant to random classification noise and are
differentially-private. The framework is based on active learning algorithms
that are statistical in the sense that they rely on estimates of expectations
of functions of filtered random examples. It builds on the powerful statistical
query framework of Kearns (1993).
We show that any efficient active statistical learning algorithm can be
automatically converted to an efficient active learning algorithm which is
tolerant to random classification noise as well as other forms of
"uncorrelated" noise. The complexity of the resulting algorithms has
information-theoretically optimal quadratic dependence on , where
is the noise rate.
We show that commonly studied concept classes including thresholds,
rectangles, and linear separators can be efficiently actively learned in our
framework. These results combined with our generic conversion lead to the first
computationally-efficient algorithms for actively learning some of these
concept classes in the presence of random classification noise that provide
exponential improvement in the dependence on the error over their
passive counterparts. In addition, we show that our algorithms can be
automatically converted to efficient active differentially-private algorithms.
This leads to the first differentially-private active learning algorithms with
exponential label savings over the passive case.Comment: Extended abstract appears in NIPS 201
Noise-adaptive Margin-based Active Learning and Lower Bounds under Tsybakov Noise Condition
We present a simple noise-robust margin-based active learning algorithm to
find homogeneous (passing the origin) linear separators and analyze its error
convergence when labels are corrupted by noise. We show that when the imposed
noise satisfies the Tsybakov low noise condition (Mammen, Tsybakov, and others
1999; Tsybakov 2004) the algorithm is able to adapt to unknown level of noise
and achieves optimal statistical rate up to poly-logarithmic factors. We also
derive lower bounds for margin based active learning algorithms under Tsybakov
noise conditions (TNC) for the membership query synthesis scenario (Angluin
1988). Our result implies lower bounds for the stream based selective sampling
scenario (Cohn 1990) under TNC for some fairly simple data distributions. Quite
surprisingly, we show that the sample complexity cannot be improved even if the
underlying data distribution is as simple as the uniform distribution on the
unit ball. Our proof involves the construction of a well separated hypothesis
set on the d-dimensional unit ball along with carefully designed label
distributions for the Tsybakov noise condition. Our analysis might provide
insights for other forms of lower bounds as well.Comment: 16 pages, 2 figures. An abridged version to appear in Thirtieth AAAI
Conference on Artificial Intelligence (AAAI), which is held in Phoenix, AZ
USA in 201
Efficient Learning of Linear Separators under Bounded Noise
We study the learnability of linear separators in in the presence of
bounded (a.k.a Massart) noise. This is a realistic generalization of the random
classification noise model, where the adversary can flip each example with
probability . We provide the first polynomial time algorithm
that can learn linear separators to arbitrarily small excess error in this
noise model under the uniform distribution over the unit ball in , for
some constant value of . While widely studied in the statistical learning
theory community in the context of getting faster convergence rates,
computationally efficient algorithms in this model had remained elusive. Our
work provides the first evidence that one can indeed design algorithms
achieving arbitrarily small excess error in polynomial time under this
realistic noise model and thus opens up a new and exciting line of research.
We additionally provide lower bounds showing that popular algorithms such as
hinge loss minimization and averaging cannot lead to arbitrarily small excess
error under Massart noise, even under the uniform distribution. Our work
instead, makes use of a margin based technique developed in the context of
active learning. As a result, our algorithm is also an active learning
algorithm with label complexity that is only a logarithmic the desired excess
error
- …