178 research outputs found
Tight Bounds on Proper Equivalence Query Learning of DNF
We prove a new structural lemma for partial Boolean functions , which we
call the seed lemma for DNF. Using the lemma, we give the first subexponential
algorithm for proper learning of DNF in Angluin's Equivalence Query (EQ) model.
The algorithm has time and query complexity , which
is optimal. We also give a new result on certificates for DNF-size, a simple
algorithm for properly PAC-learning DNF, and new results on EQ-learning -term DNF and decision trees
A Survey of Quantum Learning Theory
This paper surveys quantum learning theory: the theoretical aspects of
machine learning using quantum computers. We describe the main results known
for three models of learning: exact learning from membership queries, and
Probably Approximately Correct (PAC) and agnostic learning from classical or
quantum examples. Comment: 26 pages LaTeX. v2: many small changes to improve the presentation.
This version will appear as Complexity Theory Column in SIGACT News in June
2017. v3: fixed a small ambiguity in the definition of gamma(C) and updated a
referenc
Queries revisited
We begin with a brief tutorial on the problem of learning a finite concept class over a finite domain using membership queries and/or equivalence queries. We then sketch general results on the number of queries needed to learn a class of concepts, focusing on the various notions of combinatorial dimension that have been employed, including the teaching dimension, the exclusion dimension, the extended teaching dimension, the fingerprint dimension, the sample exclusion dimension, the Vapnik–Chervonenkis dimension, the abstract identification dimension, and the general dimension
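As an illustration of the setting in the abstract above (a finite concept class over a finite domain, learned with equivalence queries), the following is a minimal sketch of the classical halving strategy. It is not taken from the paper. The names `halving_learner` and `make_oracle`, and the toy class of monotone conjunctions on 3 bits, are assumptions chosen for the example. The hypotheses are majority votes, so the learner is not proper.

```python
from itertools import product


def halving_learner(concepts, equivalence_oracle):
    """Learn a target from a finite concept class using equivalence queries only.

    Each hypothesis is the majority vote of the concepts still consistent with
    all counterexamples seen so far, so every counterexample removes at least
    half of the remaining concepts.
    """
    version_space = list(concepts)
    queries = 0
    while True:
        # Freeze the current version space inside the hypothesis.
        def hypothesis(x, vs=tuple(version_space)):
            votes = sum(bool(c(x)) for c in vs)
            return 2 * votes >= len(vs)

        queries += 1
        counterexample = equivalence_oracle(hypothesis)
        if counterexample is None:
            return hypothesis, queries
        # The hypothesis is wrong on the counterexample, so the true label
        # is the opposite of its prediction.
        label = not hypothesis(counterexample)
        version_space = [c for c in version_space
                         if bool(c(counterexample)) == label]


def make_oracle(target, domain):
    """Hypothetical equivalence oracle: returns a counterexample or None."""
    def oracle(h):
        for x in domain:
            if bool(h(x)) != bool(target(x)):
                return x
        return None
    return oracle


# Toy instance (an assumption): monotone conjunctions over 3 Boolean variables.
n = 3
domain = list(product([0, 1], repeat=n))
subsets = [tuple(i for i in range(n) if (mask >> i) & 1) for mask in range(2 ** n)]
concepts = [lambda x, s=s: all(x[i] for i in s) for s in subsets]
target = concepts[5]  # conjunction of variables 0 and 2

learned, num_queries = halving_learner(concepts, make_oracle(target, domain))
```

With 8 concepts in the class, the learner receives at most 3 counterexamples before its hypothesis agrees with the target everywhere, so it asks at most 4 equivalence queries in total.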
Sample complexity of robust learning against evasion attacks
It is becoming increasingly important to understand the vulnerability of machine learning models to adversarial attacks. One of the fundamental problems in adversarial machine learning is to quantify how much training data is needed in the presence of so-called evasion attacks, where data is corrupted at test time. In this thesis, we work with the exact-in-the-ball notion of robustness and study the feasibility of adversarially robust learning from the perspective of learning theory, considering sample complexity.
We start with two negative results. We show that no non-trivial concept class can be robustly learned in the distribution-free setting against an adversary who can perturb just a single input bit. We then exhibit a sample-complexity lower bound: the class of monotone conjunctions and any superclass on the boolean hypercube has sample complexity at least exponential in the adversary's budget (that is, the maximum number of bits it can perturb on each input). This implies, in particular, that these classes cannot be robustly learned under the uniform distribution against an adversary who can perturb bits of the input.
As a first route to obtaining robust learning guarantees, we consider restricting the class of distributions over which training and testing data are drawn. We focus on learning problems with probability distributions on the input data that satisfy a Lipschitz condition: nearby points have similar probability. We show that, if the adversary is restricted to perturbing bits, then one can robustly learn the class of monotone conjunctions with respect to the class of log-Lipschitz distributions. We then extend this result to show the learnability of 1-decision lists, 2-decision lists and monotone k-decision lists in the same distributional and adversarial setting. We finish by showing that for every fixed k the class of k-decision lists has polynomial sample complexity against a log(n)-bounded adversary. The advantage of considering intermediate subclasses of k-decision lists is that we are able to obtain improved sample complexity bounds for these cases.
As a second route, we study learning models where the learner is given more power through the use of local queries. The first learning model we consider uses local membership queries (LMQ), where the learner can query the label of points near the training sample. We show that, under the uniform distribution, the exponential dependence on the adversary's budget to robustly learn conjunctions and any superclass remains inevitable even when the learner is given access to LMQs in addition to random examples. Faced with this negative result, we introduce a local equivalence query oracle, which returns whether the hypothesis and target concept agree in a given region around a point in the training sample, as well as a counterexample if it exists. We show a separation result: on the one hand, if the query radius λ is strictly smaller than the adversary's perturbation budget ρ, then distribution-free robust learning is impossible for a wide variety of concept classes; on the other hand, the setting λ = ρ allows us to develop robust empirical risk minimization algorithms in the distribution-free setting. We then bound the query complexity of these algorithms based on online learning guarantees and further improve these bounds for the special case of conjunctions. We follow by giving a robust learning algorithm for halfspaces on {0,1}^n. Finally, since the query complexity for halfspaces on R^n is unbounded, we instead consider adversaries with bounded precision and give query complexity upper bounds in this setting as well
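The local equivalence query oracle described above can be sketched on the Boolean hypercube as follows. This is only an illustrative sketch, not the thesis's construction: it assumes the "region around a point" is a Hamming ball of the given query radius, it checks the ball by brute force, and the names `hamming_ball` and `local_equivalence_query` are hypothetical.

```python
from itertools import combinations


def hamming_ball(x, radius):
    """Yield every point of {0,1}^n within Hamming distance `radius` of x."""
    n = len(x)
    for k in range(radius + 1):
        for flipped in combinations(range(n), k):
            y = list(x)
            for i in flipped:
                y[i] ^= 1  # flip bit i
            yield tuple(y)


def local_equivalence_query(hypothesis, target, x, radius):
    """Check agreement of hypothesis and target on the ball around x.

    Returns (True, None) if they agree on every point of the ball, and
    (False, y) for some disagreeing point y otherwise.
    """
    for y in hamming_ball(x, radius):
        if bool(hypothesis(y)) != bool(target(y)):
            return False, y
    return True, None


# Toy usage (assumed example): the target is the conjunction x0 AND x1,
# the hypothesis is the single literal x0.
target = lambda x: x[0] and x[1]
hypothesis = lambda x: x[0]

agree_far, cex_far = local_equivalence_query(hypothesis, target, (0, 1, 1), 1)
agree_near, cex_near = local_equivalence_query(hypothesis, target, (0, 0, 0), 1)
```

Around (0, 1, 1) the two functions agree on the whole radius-1 ball. Around (0, 0, 0) the oracle reports a counterexample, because flipping the first bit makes the hypothesis true while the target stays false.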
Learning Boolean Halfspaces with Small Weights from Membership Queries
We consider the problem of proper learning a Boolean Halfspace with integer
weights from membership queries only. The best known
algorithm for this problem is an adaptive algorithm that asks
membership queries where the best lower bound for the number of membership
queries is [Learning Threshold Functions with Small Weights Using
Membership Queries. COLT 1999]
In this paper we close this gap and give an adaptive proper learning
algorithm with two rounds that asks membership queries. We also give
a non-adaptive proper learning algorithm that asks membership
queries