A Survey of Quantum Learning Theory
This paper surveys quantum learning theory: the theoretical aspects of
machine learning using quantum computers. We describe the main results known
for three models of learning: exact learning from membership queries, and
Probably Approximately Correct (PAC) and agnostic learning from classical or
quantum examples.
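Of the three models, exact learning from membership queries is the easiest to make concrete: the learner may ask for the target concept's label on any instance of its choosing and must identify the concept exactly. A minimal classical sketch over a finite instance space (hypothetical names; the survey's subject is the quantum analogues of such oracles):

```python
from typing import Callable, Dict, Sequence

def exactly_learn(
    membership_oracle: Callable[[object], bool],
    instance_space: Sequence[object],
) -> Dict[object, bool]:
    """Trivial exact learner: query the membership oracle on every
    instance, recovering the target concept as a truth table. Real
    algorithms aim to use far fewer queries than |instance_space|."""
    return {x: membership_oracle(x) for x in instance_space}

# Usage: exactly learn a threshold concept on a small domain.
domain = range(8)
table = exactly_learn(lambda x: x >= 5, domain)
```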
Robust Interactive Learning
In this paper we propose and study a generalization of the standard
active-learning model where a more general type of query, class conditional
query, is allowed. Such queries have proved quite useful in applications but
have lacked theoretical understanding. In this work, we characterize the
power of such queries under two well-known noise models. We give nearly tight
upper and lower bounds on the number of queries needed to learn in both the
general agnostic setting and the bounded noise model. We further show that
our methods can be made adaptive to the (unknown) noise rate, with only
negligible loss in query complexity.
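A class conditional query asks the labeler to point out an example of a given class among a set of unlabeled examples, or to report that none exists. A minimal sketch of such an oracle interface, assuming a noiseless labeler and hypothetical names:

```python
from typing import Callable, Hashable, Optional, Sequence

def class_conditional_query(
    examples: Sequence[Hashable],
    target_class: int,
    true_label: Callable[[Hashable], int],
) -> Optional[Hashable]:
    """Hypothetical noiseless oracle: return some example from `examples`
    whose true label is `target_class`, or None if there is none."""
    for x in examples:
        if true_label(x) == target_class:
            return x
    return None

# Usage: ask for a positive example among candidates the learner is
# unsure about, rather than asking for the label of one fixed point.
candidates = [0.1, 0.4, 0.7, 0.9]
positive = class_conditional_query(candidates, 1, lambda x: int(x > 0.5))
```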
Storage capacity of a constructive learning algorithm
Upper and lower bounds for the typical storage capacity of a constructive
algorithm, the Tiling-like Learning Algorithm for the Parity Machine [M. Biehl
and M. Opper, Phys. Rev. A {\bf 44}, 6888 (1991)], are determined in the
asymptotic limit of large training set sizes. The properties of a perceptron
with threshold, learning a training set of patterns having a biased
distribution of targets, needed as an intermediate step in the capacity
calculation, are determined analytically. The lower bound for the capacity,
determined with a cavity method, is proportional to the number of hidden units.
The upper bound, obtained with the hypothesis of replica symmetry, is close to
the one predicted by Mitchison and Durbin [Biol. Cybern. {\bf 60}, 345 (1989)].
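For context, a parity machine outputs the product of the signs of its hidden perceptrons, so patterns can be stored by distributing errors among hidden units. A minimal sketch of the architecture only (hypothetical names; this is not the tiling-like construction itself):

```python
import numpy as np

def parity_machine(x: np.ndarray, W: np.ndarray) -> int:
    """Parity machine with K hidden units: the output is the product of
    the sign outputs of K perceptrons with weight rows W (shape K x N)."""
    hidden = np.where(W @ x >= 0, 1, -1)  # K hidden sign units
    return int(np.prod(hidden))           # parity of the hidden layer

# Usage: K = 3 hidden units on N = 5 binary inputs.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))
x = rng.choice([-1.0, 1.0], size=5)
print(parity_machine(x, W))
```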
Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy
We describe a framework for designing efficient active learning algorithms
that are tolerant to random classification noise and are
differentially-private. The framework is based on active learning algorithms
that are statistical in the sense that they rely on estimates of expectations
of functions of filtered random examples. It builds on the powerful statistical
query framework of Kearns (1993).
We show that any efficient active statistical learning algorithm can be
automatically converted to an efficient active learning algorithm which is
tolerant to random classification noise as well as other forms of
"uncorrelated" noise. The complexity of the resulting algorithms has
information-theoretically optimal quadratic dependence on , where
is the noise rate.
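The noise-tolerant conversion rests on a standard identity: under random classification noise of rate $\eta < 1/2$, the noisy expectation of any function of a labeled example is a known mixture of its clean expectations, so the clean value can be recovered. A minimal sketch of this debiasing step (hypothetical names; not the paper's full construction):

```python
import numpy as np

def debias_statistical_query(xs, noisy_ys, f, eta: float) -> float:
    """Estimate E[f(x, y)] over clean labels from examples whose labels
    were flipped independently with probability eta < 1/2.

    Under random classification noise,
        E_noisy[f(x, y)] = (1 - eta) * E[f(x, y)] + eta * E[f(x, -y)],
    and likewise with y and -y swapped. Solving the 2x2 system recovers
    the clean expectation; the 1/(1 - 2*eta) factor below is the source
    of the quadratic dependence on 1/(1 - 2*eta) in the sample complexity.
    """
    est_same = np.mean([f(x, y) for x, y in zip(xs, noisy_ys)])
    est_flip = np.mean([f(x, -y) for x, y in zip(xs, noisy_ys)])
    return ((1 - eta) * est_same - eta * est_flip) / (1 - 2 * eta)

# Usage: estimate the correlation E[x * y] at noise rate 0.2.
rng = np.random.default_rng(1)
xs = rng.standard_normal(10000)
ys = np.sign(xs)                      # clean labels y = sign(x)
flips = rng.random(10000) < 0.2
noisy = np.where(flips, -ys, ys)
est = debias_statistical_query(xs, noisy, lambda x, y: x * y, 0.2)
# est approximates E[|x|] = sqrt(2/pi) ~ 0.798 despite the noisy labels
```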
We show that commonly studied concept classes including thresholds,
rectangles, and linear separators can be efficiently actively learned in our
framework. These results combined with our generic conversion lead to the first
computationally-efficient algorithms for actively learning some of these
concept classes in the presence of random classification noise that provide
exponential improvement in the dependence on the error $\epsilon$ over their
passive counterparts. In addition, we show that our algorithms can be
automatically converted to efficient active differentially-private algorithms.
This leads to the first differentially-private active learning algorithms with
exponential label savings over the passive case.
Optimal Quantum Sample Complexity of Learning Algorithms
In learning theory, the VC dimension $d$ of a
concept class $C$ is the most common way to measure its "richness." In the
PAC model, $\Theta\Big(\frac{d}{\epsilon} + \frac{\log(1/\delta)}{\epsilon}\Big)$
examples are necessary and sufficient for a learner to output, with
probability $1-\delta$, a hypothesis $h$ that is $\epsilon$-close to the
target concept $c$. In the related agnostic model, where the samples need
not come from a $c \in C$, we know that
$\Theta\Big(\frac{d}{\epsilon^2} + \frac{\log(1/\delta)}{\epsilon^2}\Big)$
examples are necessary and sufficient to output a hypothesis $h \in C$ whose
error is at most $\epsilon$ worse than the best concept in $C$.
Here we analyze quantum sample complexity, where each example is a coherent
quantum state. This model was introduced by Bshouty and Jackson, who showed
that quantum examples are more powerful than classical examples in some
fixed-distribution settings. However, Atici and Servedio, improved by Zhang,
showed that in the PAC setting, quantum examples cannot be much more powerful:
the required number of quantum examples is
$\Omega\Big(\frac{d^{1-\eta}}{\epsilon} + d + \frac{\log(1/\delta)}{\epsilon}\Big)$
for all $\eta > 0$. Our main result is that quantum and classical sample
complexity are in fact equal up to constant factors in both the PAC and
agnostic models. We give two approaches. The first is a fairly simple
information-theoretic argument that yields the above two classical bounds and
yields the same bounds for quantum sample complexity up to a $\log(d/\epsilon)$
factor. We then give a second approach that avoids the log-factor loss, based
on analyzing the behavior of the "Pretty Good Measurement" on the quantum state
identification problems that correspond to learning. This shows classical and
quantum sample complexity are equal up to constant factors.
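As a rough numerical illustration of the classical bounds above, here is a hedged sketch that evaluates the $\Theta(\cdot)$ expressions with the hidden constants set to 1 (for illustration only; the true constants are not 1):

```python
from math import ceil, log

def pac_samples(d: int, eps: float, delta: float) -> int:
    """Order of the PAC sample complexity Theta(d/eps + log(1/delta)/eps),
    with the hidden constant set to 1 for illustration only."""
    return ceil(d / eps + log(1 / delta) / eps)

def agnostic_samples(d: int, eps: float, delta: float) -> int:
    """Order of the agnostic bound Theta(d/eps^2 + log(1/delta)/eps^2)."""
    return ceil(d / eps**2 + log(1 / delta) / eps**2)

# Example: VC dimension 10, error 0.05, confidence 0.95 -- the agnostic
# setting pays an extra 1/eps factor over the PAC setting.
print(pac_samples(10, 0.05, 0.05))       # ~260
print(agnostic_samples(10, 0.05, 0.05))  # ~5199
```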
The Consistency dimension and distribution-dependent learning from queries
We prove a new combinatorial characterization of polynomial
learnability from equivalence queries, and state some of its
consequences relating the learnability of a class with the
learnability via equivalence and membership queries of its
subclasses obtained by restricting the instance space.
Then we propose and study two models of query learning in which there
is a probability distribution on the instance space, both as an
application of the tools developed from the combinatorial
characterization and as models of independent interest.
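For readers unfamiliar with the query models involved, a minimal sketch of membership and equivalence query oracles over a finite instance space (hypothetical names; the paper's characterization is combinatorial, not algorithmic):

```python
from typing import Callable, Optional, Sequence

Concept = Callable[[object], bool]

def membership_query(target: Concept, x: object) -> bool:
    """Membership query: the label of a single chosen instance."""
    return target(x)

def equivalence_query(
    target: Concept, hypothesis: Concept, instances: Sequence[object]
) -> Optional[object]:
    """Equivalence query over a finite instance space: return a
    counterexample on which hypothesis and target disagree, or None
    if the hypothesis is exactly correct."""
    for x in instances:
        if hypothesis(x) != target(x):
            return x
    return None
```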
The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List
We are interested in supervised ranking algorithms that perform especially well near the top of the
ranked list, and are only required to perform sufficiently well on the rest of the list. In this work,
we provide a general form of convex objective that gives high-scoring examples more importance.
This “push” near the top of the list can be chosen arbitrarily large or small, based on the preference
of the user. We choose ℓp-norms to provide a specific type of push; if the user sets p larger, the
objective concentrates harder on the top of the list. We derive a generalization bound based on
the p-norm objective, working around the natural asymmetry of the problem. We then derive a
boosting-style algorithm for the problem of ranking with a push at the top. The usefulness of the
algorithm is illustrated through experiments on repository data. We prove that the minimizer of the
algorithm’s objective is unique in a specific sense. Furthermore, we illustrate how our objective is
related to quality measurements for information retrieval.
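The objective has the flavor of an ℓp-norm over per-negative losses: each negative example accumulates penalties against the positives it outranks, and raising that sum to the power p makes the worst-offending negatives, those near the top of the list, dominate. A minimal sketch, assuming an exponential base loss as in boosting (hypothetical names, not the paper's exact formulation):

```python
import numpy as np

def p_norm_push_objective(pos_scores, neg_scores, p: float) -> float:
    """Sketch of a p-norm push objective: for each negative example,
    accumulate exponential penalties exp(-(f(x_pos) - f(x_neg))) over
    all positives, then raise that sum to the power p. Larger p shifts
    the penalty toward the highest-scoring negatives, i.e. toward the
    top of the ranked list."""
    pos = np.asarray(pos_scores, dtype=float)
    total = 0.0
    for s_neg in neg_scores:
        per_negative = np.exp(-(pos - s_neg)).sum()
        total += per_negative ** p
    return total

# Usage: a negative that outranks the positives is penalized far more
# heavily under p = 4 than under p = 1.
print(p_norm_push_objective([2.0, 1.5], [0.5], p=1))   # small loss
print(p_norm_push_objective([2.0, 1.5], [2.5], p=4))   # large loss
```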