611 research outputs found
The Power of Localization for Efficiently Learning Linear Separators with Noise
We introduce a new approach for designing computationally efficient learning
algorithms that are tolerant to noise, and demonstrate its effectiveness by
designing algorithms with improved noise tolerance guarantees for learning
linear separators.
We consider both the malicious noise model and the adversarial label noise
model. For malicious noise, where the adversary can corrupt both the label and
the features, we provide a polynomial-time algorithm for learning linear
separators in under isotropic log-concave distributions that can
tolerate a nearly information-theoretically optimal noise rate of . For the adversarial label noise model, where the
distribution over the feature vectors is unchanged, and the overall probability
of a noisy label is constrained to be at most , we also give a
polynomial-time algorithm for learning linear separators in under
isotropic log-concave distributions that can handle a noise rate of .
We show that, in the active learning model, our algorithms achieve a label
complexity whose dependence on the error parameter is
polylogarithmic. This provides the first polynomial-time active learning
algorithm for learning linear separators in the presence of malicious noise or
adversarial label noise.Comment: Contains improved label complexity analysis communicated to us by
Steve Hannek
Active Learning and Best-Response Dynamics
We examine an important setting for engineered systems in which low-power
distributed sensors are each making highly noisy measurements of some unknown
target function. A center wants to accurately learn this function by querying a
small number of sensors, which ordinarily would be impossible due to the high
noise rate. The question we address is whether local communication among
sensors, together with natural best-response dynamics in an
appropriately-defined game, can denoise the system without destroying the true
signal and allow the center to succeed from only a small number of active
queries. By using techniques from game theory and empirical processes, we prove
positive (and negative) results on the denoising power of several natural
dynamics. We then show experimentally that when combined with recent agnostic
active learning algorithms, this process can achieve low error from very few
queries, performing substantially better than active or passive learning
without these denoising dynamics as well as passive learning with denoising
Efficient Learning of Linear Separators under Bounded Noise
We study the learnability of linear separators in in the presence of
bounded (a.k.a Massart) noise. This is a realistic generalization of the random
classification noise model, where the adversary can flip each example with
probability . We provide the first polynomial time algorithm
that can learn linear separators to arbitrarily small excess error in this
noise model under the uniform distribution over the unit ball in , for
some constant value of . While widely studied in the statistical learning
theory community in the context of getting faster convergence rates,
computationally efficient algorithms in this model had remained elusive. Our
work provides the first evidence that one can indeed design algorithms
achieving arbitrarily small excess error in polynomial time under this
realistic noise model and thus opens up a new and exciting line of research.
We additionally provide lower bounds showing that popular algorithms such as
hinge loss minimization and averaging cannot lead to arbitrarily small excess
error under Massart noise, even under the uniform distribution. Our work
instead, makes use of a margin based technique developed in the context of
active learning. As a result, our algorithm is also an active learning
algorithm with label complexity that is only a logarithmic the desired excess
error
Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy
We describe a framework for designing efficient active learning algorithms
that are tolerant to random classification noise and are
differentially-private. The framework is based on active learning algorithms
that are statistical in the sense that they rely on estimates of expectations
of functions of filtered random examples. It builds on the powerful statistical
query framework of Kearns (1993).
We show that any efficient active statistical learning algorithm can be
automatically converted to an efficient active learning algorithm which is
tolerant to random classification noise as well as other forms of
"uncorrelated" noise. The complexity of the resulting algorithms has
information-theoretically optimal quadratic dependence on , where
is the noise rate.
We show that commonly studied concept classes including thresholds,
rectangles, and linear separators can be efficiently actively learned in our
framework. These results combined with our generic conversion lead to the first
computationally-efficient algorithms for actively learning some of these
concept classes in the presence of random classification noise that provide
exponential improvement in the dependence on the error over their
passive counterparts. In addition, we show that our algorithms can be
automatically converted to efficient active differentially-private algorithms.
This leads to the first differentially-private active learning algorithms with
exponential label savings over the passive case.Comment: Extended abstract appears in NIPS 201
Near-Optimal Active Learning of Halfspaces via Query Synthesis in the Noisy Setting
In this paper, we consider the problem of actively learning a linear
classifier through query synthesis where the learner can construct artificial
queries in order to estimate the true decision boundaries. This problem has
recently gained a lot of interest in automated science and adversarial reverse
engineering for which only heuristic algorithms are known. In such
applications, queries can be constructed de novo to elicit information (e.g.,
automated science) or to evade detection with minimal cost (e.g., adversarial
reverse engineering). We develop a general framework, called dimension coupling
(DC), that 1) reduces a d-dimensional learning problem to d-1 low dimensional
sub-problems, 2) solves each sub-problem efficiently, 3) appropriately
aggregates the results and outputs a linear classifier, and 4) provides a
theoretical guarantee for all possible schemes of aggregation. The proposed
method is proved resilient to noise. We show that the DC framework avoids the
curse of dimensionality: its computational complexity scales linearly with the
dimension. Moreover, we show that the query complexity of DC is near optimal
(within a constant factor of the optimum algorithm). To further support our
theoretical analysis, we compare the performance of DC with the existing work.
We observe that DC consistently outperforms the prior arts in terms of query
complexity while often running orders of magnitude faster.Comment: Accepted by AAAI 201
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM)consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues, that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved
Noise-adaptive Margin-based Active Learning and Lower Bounds under Tsybakov Noise Condition
We present a simple noise-robust margin-based active learning algorithm to
find homogeneous (passing the origin) linear separators and analyze its error
convergence when labels are corrupted by noise. We show that when the imposed
noise satisfies the Tsybakov low noise condition (Mammen, Tsybakov, and others
1999; Tsybakov 2004) the algorithm is able to adapt to unknown level of noise
and achieves optimal statistical rate up to poly-logarithmic factors. We also
derive lower bounds for margin based active learning algorithms under Tsybakov
noise conditions (TNC) for the membership query synthesis scenario (Angluin
1988). Our result implies lower bounds for the stream based selective sampling
scenario (Cohn 1990) under TNC for some fairly simple data distributions. Quite
surprisingly, we show that the sample complexity cannot be improved even if the
underlying data distribution is as simple as the uniform distribution on the
unit ball. Our proof involves the construction of a well separated hypothesis
set on the d-dimensional unit ball along with carefully designed label
distributions for the Tsybakov noise condition. Our analysis might provide
insights for other forms of lower bounds as well.Comment: 16 pages, 2 figures. An abridged version to appear in Thirtieth AAAI
Conference on Artificial Intelligence (AAAI), which is held in Phoenix, AZ
USA in 201
Learning with a Drifting Target Concept
We study the problem of learning in the presence of a drifting target
concept. Specifically, we provide bounds on the error rate at a given time,
given a learner with access to a history of independent samples labeled
according to a target concept that can change on each round. One of our main
contributions is a refinement of the best previous results for polynomial-time
algorithms for the space of linear separators under a uniform distribution. We
also provide general results for an algorithm capable of adapting to a variable
rate of drift of the target concept. Some of the results also describe an
active learning variant of this setting, and provide bounds on the number of
queries for the labels of points in the sequence sufficient to obtain the
stated bounds on the error rates
- …