
    A Survey of Quantum Learning Theory

    This paper surveys quantum learning theory: the theoretical aspects of machine learning using quantum computers. We describe the main results known for three models of learning: exact learning from membership queries, and Probably Approximately Correct (PAC) and agnostic learning from classical or quantum examples. Comment: 26 pages LaTeX. v2: many small changes to improve the presentation. This version will appear as Complexity Theory Column in SIGACT News in June 2017. v3: fixed a small ambiguity in the definition of gamma(C) and updated a reference.

    Robust Interactive Learning

    In this paper we propose and study a generalization of the standard active-learning model in which a more general type of query, the class-conditional query, is allowed. Such queries have been quite useful in applications, but have lacked theoretical understanding. In this work, we characterize the power of such queries under two well-known noise models. We give nearly tight upper and lower bounds on the number of queries needed to learn, both for the general agnostic setting and for the bounded noise model. We further show that our methods can be made adaptive to the (unknown) noise rate, with only negligible loss in query complexity.

    Storage capacity of a constructive learning algorithm

    Upper and lower bounds for the typical storage capacity of a constructive algorithm, the Tilinglike Learning Algorithm for the Parity Machine [M. Biehl and M. Opper, Phys. Rev. A 44, 6888 (1991)], are determined in the asymptotic limit of large training set sizes. The properties of a perceptron with threshold, learning a training set of patterns having a biased distribution of targets, needed as an intermediate step in the capacity calculation, are determined analytically. The lower bound for the capacity, determined with a cavity method, is proportional to the number of hidden units. The upper bound, obtained with the hypothesis of replica symmetry, is close to the one predicted by Mitchinson and Durbin [Biol. Cyber. 60, 345 (1989)]. Comment: 13 pages, 1 figure

    Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy

    We describe a framework for designing efficient active learning algorithms that are tolerant to random classification noise and are differentially-private. The framework is based on active learning algorithms that are statistical in the sense that they rely on estimates of expectations of functions of filtered random examples. It builds on the powerful statistical query framework of Kearns (1993). We show that any efficient active statistical learning algorithm can be automatically converted to an efficient active learning algorithm which is tolerant to random classification noise as well as other forms of "uncorrelated" noise. The complexity of the resulting algorithms has information-theoretically optimal quadratic dependence on $1/(1-2\eta)$, where $\eta$ is the noise rate. We show that commonly studied concept classes including thresholds, rectangles, and linear separators can be efficiently actively learned in our framework. These results combined with our generic conversion lead to the first computationally-efficient algorithms for actively learning some of these concept classes in the presence of random classification noise that provide exponential improvement in the dependence on the error $\epsilon$ over their passive counterparts. In addition, we show that our algorithms can be automatically converted to efficient active differentially-private algorithms. This leads to the first differentially-private active learning algorithms with exponential label savings over the passive case. Comment: Extended abstract appears in NIPS 201
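
    To make the role of the factor 1/(1-2η) concrete, here is a minimal sketch of the textbook observation behind statistical queries under random classification noise. It is not the paper's algorithm. It assumes labels in {-1, +1} that are flipped independently with probability η, and a query function g with values in [-1, 1]; the function names and the Hoeffding-based sample count are illustrative choices, not taken from the abstract.

```python
import math


def corrected_correlation(noisy_correlation: float, eta: float) -> float:
    """Recover E[g(x) * y] from its noisy counterpart.

    With labels flipped independently with probability eta,
    E[g(x) * y_noisy] = (1 - 2 * eta) * E[g(x) * y], so dividing by
    (1 - 2 * eta) undoes the shrinkage.
    """
    if not 0.0 <= eta < 0.5:
        raise ValueError("eta must lie in [0, 0.5)")
    return noisy_correlation / (1.0 - 2.0 * eta)


def samples_for_tolerance(tau: float, delta: float, eta: float) -> int:
    """Hoeffding-style number of noisy examples for tolerance tau.

    Assumption: g(x) * y_noisy lies in [-1, 1]. Because the correction
    divides by (1 - 2 * eta), the noisy mean must be estimated to within
    tau * (1 - 2 * eta), which gives the 1 / (1 - 2 * eta)^2 scaling.
    """
    noisy_tau = tau * (1.0 - 2.0 * eta)
    return math.ceil(2.0 * math.log(2.0 / delta) / noisy_tau ** 2)
```

    Under these assumptions, moving from η = 0 to η = 0.25 halves 1-2η and therefore roughly quadruples the number of examples needed for the same tolerance.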

    Optimal Quantum Sample Complexity of Learning Algorithms

    In learning theory, the VC dimension of a concept class $C$ is the most common way to measure its "richness." In the PAC model, $\Theta\Big(\frac{d}{\varepsilon} + \frac{\log(1/\delta)}{\varepsilon}\Big)$ examples are necessary and sufficient for a learner to output, with probability $1-\delta$, a hypothesis $h$ that is $\varepsilon$-close to the target concept $c$. In the related agnostic model, where the samples need not come from a $c\in C$, we know that $\Theta\Big(\frac{d}{\varepsilon^2} + \frac{\log(1/\delta)}{\varepsilon^2}\Big)$ examples are necessary and sufficient to output a hypothesis $h\in C$ whose error is at most $\varepsilon$ worse than the best concept in $C$. Here we analyze quantum sample complexity, where each example is a coherent quantum state. This model was introduced by Bshouty and Jackson, who showed that quantum examples are more powerful than classical examples in some fixed-distribution settings. However, Atici and Servedio, improved by Zhang, showed that in the PAC setting, quantum examples cannot be much more powerful: the required number of quantum examples is $\Omega\Big(\frac{d^{1-\eta}}{\varepsilon} + d + \frac{\log(1/\delta)}{\varepsilon}\Big)$ for all $\eta > 0$. Our main result is that quantum and classical sample complexity are in fact equal up to constant factors in both the PAC and agnostic models. We give two approaches. The first is a fairly simple information-theoretic argument that yields the above two classical bounds and yields the same bounds for quantum sample complexity up to a $\log(d/\varepsilon)$ factor. We then give a second approach that avoids the log-factor loss, based on analyzing the behavior of the "Pretty Good Measurement" on the quantum state identification problems that correspond to learning. This shows classical and quantum sample complexity are equal up to constant factors. Comment: 31 pages LaTeX. Arxiv abstract shortened to fit in their 1920-character limit. Version 3: many small changes, no change in result
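
    As a rough illustration of how the two classical bounds in this abstract scale, the sketch below evaluates the expressions inside the Θ(·) for given values of d, ε, and δ. The leading constant `c` is a hypothetical placeholder, since the Θ-notation hides the true constants, and the function names are not from the paper.

```python
import math


def pac_sample_bound(d: int, eps: float, delta: float, c: float = 1.0) -> int:
    """Evaluate c * (d/eps + log(1/delta)/eps), the PAC-model expression.

    c is an assumed placeholder for the constant hidden by Theta(.).
    """
    return math.ceil(c * (d / eps + math.log(1.0 / delta) / eps))


def agnostic_sample_bound(d: int, eps: float, delta: float, c: float = 1.0) -> int:
    """Evaluate c * (d/eps^2 + log(1/delta)/eps^2), the agnostic expression."""
    return math.ceil(c * (d / eps ** 2 + math.log(1.0 / delta) / eps ** 2))
```

    With d = 10, ε = 0.1, and δ = 0.05, the agnostic expression is larger than the PAC one by a factor of 1/ε = 10, which is the only difference between the two formulas.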

    The Consistency dimension and distribution-dependent learning from queries

    We prove a new combinatorial characterization of polynomial learnability from equivalence queries, and state some of its consequences relating the learnability of a class with the learnability via equivalence and membership queries of its subclasses obtained by restricting the instance space. Then we propose and study two models of query learning in which there is a probability distribution on the instance space, both as an application of the tools developed from the combinatorial characterization and as models of independent interest. Postprint (published version)

    The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List

    We are interested in supervised ranking algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. In this work, we provide a general form of convex objective that gives high-scoring examples more importance. This “push” near the top of the list can be chosen arbitrarily large or small, based on the preference of the user. We choose ℓp-norms to provide a specific type of push; if the user sets p larger, the objective concentrates harder on the top of the list. We derive a generalization bound based on the p-norm objective, working around the natural asymmetry of the problem. We then derive a boosting-style algorithm for the problem of ranking with a push at the top. The usefulness of the algorithm is illustrated through experiments on repository data. We prove that the minimizer of the algorithm’s objective is unique in a specific sense. Furthermore, we illustrate how our objective is related to quality measurements for information retrieval.
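
    The abstract does not spell out the objective, so the following is only a sketch of one way a p-norm "push" can be written. It assumes a bipartite ranking setup with an exponential pairwise loss: for each negative example, sum the loss over all positives, then raise that per-negative sum to the power p. The function name and the choice of loss are assumptions made for illustration.

```python
import math
from typing import Sequence


def p_norm_push_objective(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    p: float,
) -> float:
    """Sum over negatives of (sum over positives of exp(-(s_pos - s_neg)))^p.

    A negative scored above many positives has a large inner sum, and
    raising it to the power p makes such top-of-list mistakes dominate
    the total as p grows. The exponential loss is an assumed choice.
    """
    total = 0.0
    for s_neg in neg_scores:
        inner = sum(math.exp(-(s_pos - s_neg)) for s_pos in pos_scores)
        total += inner ** p
    return total
```

    With p = 1 this reduces to a plain sum of pairwise exponential losses; larger p weights negatives near the top of the list more heavily, matching the behavior the abstract describes.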