1,085 research outputs found
A Complete Characterization of Statistical Query Learning with Applications to Evolvability
Statistical query (SQ) learning model of Kearns (1993) is a natural
restriction of the PAC learning model in which a learning algorithm is allowed
to obtain estimates of statistical properties of the examples but cannot see
the examples themselves. We describe a new and simple characterization of the
query complexity of learning in the SQ learning model. Unlike the previously
known bounds on SQ learning our characterization preserves the accuracy and the
efficiency of learning. The preservation of accuracy implies that that our
characterization gives the first characterization of SQ learning in the
agnostic learning framework. The preservation of efficiency is achieved using a
new boosting technique and allows us to derive a new approach to the design of
evolutionary algorithms in Valiant's (2006) model of evolvability. We use this
approach to demonstrate the existence of a large class of monotone evolutionary
learning algorithms based on square loss performance estimation. These results
differ significantly from the few known evolutionary algorithms and give
evidence that evolvability in Valiant's model is a more versatile phenomenon
than there had been previous reason to suspect.Comment: Simplified Lemma 3.8 and it's application
Agnostic Learning of Disjunctions on Symmetric Distributions
We consider the problem of approximating and learning disjunctions (or
equivalently, conjunctions) on symmetric distributions over .
Symmetric distributions are distributions whose PDF is invariant under any
permutation of the variables. We give a simple proof that for every symmetric
distribution , there exists a set of
functions , such that for every disjunction , there is function
, expressible as a linear combination of functions in , such
that -approximates in distance on or
. This directly
gives an agnostic learning algorithm for disjunctions on symmetric
distributions that runs in time . The best known
previous bound is and follows from approximation of the
more general class of halfspaces (Wimmer, 2010). We also show that there exists
a symmetric distribution , such that the minimum degree of a
polynomial that -approximates the disjunction of all variables is
distance on is . Therefore the
learning result above cannot be achieved via -regression with a
polynomial basis used in most other agnostic learning algorithms.
Our technique also gives a simple proof that for any product distribution
and every disjunction , there exists a polynomial of
degree such that -approximates in
distance on . This was first proved by Blais et al.
(2008) via a more involved argument
Learning using Local Membership Queries
We introduce a new model of membership query (MQ) learning, where the
learning algorithm is restricted to query points that are \emph{close} to
random examples drawn from the underlying distribution. The learning model is
intermediate between the PAC model (Valiant, 1984) and the PAC+MQ model (where
the queries are allowed to be arbitrary points).
Membership query algorithms are not popular among machine learning
practitioners. Apart from the obvious difficulty of adaptively querying
labelers, it has also been observed that querying \emph{unnatural} points leads
to increased noise from human labelers (Lang and Baum, 1992). This motivates
our study of learning algorithms that make queries that are close to examples
generated from the data distribution.
We restrict our attention to functions defined on the -dimensional Boolean
hypercube and say that a membership query is local if its Hamming distance from
some example in the (random) training data is at most . We show the
following results in this model:
(i) The class of sparse polynomials (with coefficients in R) over
is polynomial time learnable under a large class of \emph{locally smooth}
distributions using -local queries. This class also includes the
class of -depth decision trees.
(ii) The class of polynomial-sized decision trees is polynomial time
learnable under product distributions using -local queries.
(iii) The class of polynomial size DNF formulas is learnable under the
uniform distribution using -local queries in time
.
(iv) In addition we prove a number of results relating the proposed model to
the traditional PAC model and the PAC+MQ model
- …