A Complete Characterization of Statistical Query Learning with Applications to Evolvability
The statistical query (SQ) learning model of Kearns (1993) is a natural
restriction of the PAC learning model in which a learning algorithm is allowed
to obtain estimates of statistical properties of the examples but cannot see
the examples themselves. We describe a new and simple characterization of the
query complexity of learning in the SQ learning model. Unlike the previously
known bounds on SQ learning, our characterization preserves the accuracy and the
efficiency of learning. The preservation of accuracy implies that our
characterization gives the first characterization of SQ learning in the
agnostic learning framework. The preservation of efficiency is achieved using a
new boosting technique and allows us to derive a new approach to the design of
evolutionary algorithms in Valiant's (2006) model of evolvability. We use this
approach to demonstrate the existence of a large class of monotone evolutionary
learning algorithms based on square loss performance estimation. These results
differ significantly from the few known evolutionary algorithms and give
evidence that evolvability in Valiant's model is a more versatile phenomenon
than there had been previous reason to suspect.
Comment: Simplified Lemma 3.8 and its application
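The access model described in this abstract, where a learner sees estimates of statistical properties but never the examples themselves, can be illustrated with a small sketch. This is a toy under stated assumptions, not the paper's construction: the names `make_sq_oracle`, `tolerance`, and the four-point example set are hypothetical, and the uniform perturbation merely stands in for "any answer within the tolerance".

```python
import random


def make_sq_oracle(examples, tolerance, rng=None):
    """Return an oracle that answers statistical queries.

    `examples` is a list of (x, label) pairs standing in for the
    underlying distribution (an assumption of this sketch). The learner
    only ever receives the oracle, never the examples.
    """
    rng = rng or random.Random(0)

    def oracle(query):
        # Exact expectation of the query over the example set.
        exact = sum(query(x, y) for x, y in examples) / len(examples)
        # Any value within `tolerance` of the truth is a legal answer;
        # here we perturb uniformly at random for illustration.
        return exact + rng.uniform(-tolerance, tolerance)

    return oracle


# Toy target: the label equals the first bit of x.
examples = [((a, b), a) for a in (0, 1) for b in (0, 1)]
oracle = make_sq_oracle(examples, tolerance=0.05)

# "How often does bit i agree with the label?"
agree_first = oracle(lambda x, y: 1.0 if x[0] == y else 0.0)
agree_second = oracle(lambda x, y: 1.0 if x[1] == y else 0.0)
```

With the toy target above, the first query returns a value near 1 and the second a value near 1/2, which is enough to single out the relevant bit without inspecting any individual example.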
On Statistical Query Sampling and NMR Quantum Computing
We introduce a "Statistical Query Sampling" model, in which the goal of an
algorithm is to produce an element in a hidden set $S \subseteq \{0,1\}^n$ with
reasonable probability. The algorithm gains information about $S$ through
oracle calls (statistical queries), where the algorithm submits a query
function $g(\cdot)$ and receives an approximation to $\Pr_{x \in S}[g(x) = 1]$. We
show how this model is related to NMR quantum computing, in which only
statistical properties of an ensemble of quantum systems can be measured, and
in particular to the question of whether one can translate standard quantum
algorithms to the NMR setting without putting all of their classical
post-processing into the quantum system. Using Fourier analysis techniques
developed in the related context of statistical query learning, we prove a
number of lower bounds (both information-theoretic and cryptographic) on the
ability of algorithms to produce an element of the hidden set, even when the set is fairly
simple. These lower bounds point out a difficulty in efficiently applying NMR
quantum computing to algorithms such as Shor's and Simon's algorithm that
involve significant classical post-processing. We also explicitly relate the
notion of statistical query sampling to that of statistical query learning.
An extended abstract appeared in the 18th Annual IEEE Conference on
Computational Complexity (CCC 2003), 2003.
Keywords: statistical query, NMR quantum computing, lower bound
Comment: 17 pages, no figures. Appeared in the 18th Annual IEEE Conference on
Computational Complexity (CCC 2003).
General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Boosting
We derive general bounds on the complexity of learning in the statistical query (SQ) model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the SQ model. This new model was introduced by Kearns to provide a general framework for efficient PAC learning in the presence of classification noise. We first show a general scheme for boosting the accuracy of weak SQ learning algorithms, proving that weak SQ learning is equivalent to strong SQ learning. The boosting is efficient and is used to show our main result of the first general upper bounds on the complexity of strong SQ learning. Since all SQ algorithms can be simulated in the PAC model with classification noise, we also obtain general upper bounds on learning in the presence of classification noise for classes which can be learned in the SQ model.
Learning from Positive and Unlabeled Examples
In many machine learning settings, labeled examples are difficult to collect while unlabeled data are abundant. Also, for some binary classification problems, positive examples, that is, examples of the target class, are available. Can these additional data be used to improve the accuracy of supervised learning algorithms? We investigate in this paper the design of learning algorithms from positive and unlabeled data only. Many machine learning and data mining algorithms, such as decision tree induction algorithms and naive Bayes algorithms, only use examples in order to evaluate statistical queries (SQ-like algorithms). Kearns designed the Statistical Query learning model in order to describe these algorithms. Here, we design an algorithm scheme which transforms any SQ-like algorithm into an algorithm based on positive statistical queries (estimates for probabilities over the set of positive instances) and instance statistical queries (estimates for probabilities over the instance space). We prove that any class learnable in the Statistical Query learning model is learnable from positive statistical queries and instance statistical queries only if a lower bound on the weight of any target concept can be estimated in polynomial time. Then, we design a decision tree induction algorithm POSC4.5, based on C4.5, that uses only positive and unlabeled examples, and we give experimental results for this algorithm. The case of imbalanced classes, in the sense that one of the two classes (say the positive class) is heavily underrepresented compared to the other class, remains open. This problem is challenging because it is encountered in many real-world applications.
Noise-Tolerant Learning, the Parity Problem, and the Statistical Query Model
We describe a slightly sub-exponential time algorithm for learning parity
functions in the presence of random classification noise. This results in a
polynomial-time algorithm for the case of parity functions that depend on only
the first O(log n log log n) bits of input. This is the first known instance of
an efficient noise-tolerant algorithm for a concept class that is provably not
learnable in the Statistical Query model of Kearns. Thus, we demonstrate that
the set of problems learnable in the statistical query model is a strict subset
of those problems learnable in the presence of noise in the PAC model.
In coding-theory terms, what we give is a poly(n)-time algorithm for decoding
linear k by n codes in the presence of random noise for the case of k = c log n
log log n for some c > 0. (The case of k = O(log n) is trivial since one can
just individually check each of the 2^k possible messages and choose the one
that yields the closest codeword.)
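The trivial k = O(log n) case mentioned in the parenthetical, checking each of the 2^k messages and keeping the one with the closest codeword, can be sketched as follows. This illustrates only that brute-force baseline, not the paper's sub-exponential algorithm; the function names and the 2 by 6 generator matrix are hypothetical and chosen for illustration.

```python
from itertools import product


def encode(message, generator):
    """Encode a k-bit message with a k x n generator matrix over GF(2)."""
    n = len(generator[0])
    return tuple(
        sum(m * row[j] for m, row in zip(message, generator)) % 2
        for j in range(n)
    )


def brute_force_decode(received, generator):
    """Try all 2^k messages; return the one whose codeword is nearest
    to `received` in Hamming distance."""
    k = len(generator)

    def distance(message):
        codeword = encode(message, generator)
        return sum(c != r for c, r in zip(codeword, received))

    return min(product((0, 1), repeat=k), key=distance)


# Hypothetical generator matrix (k = 2, n = 6), for illustration only.
G = [
    (1, 1, 1, 0, 0, 0),
    (0, 0, 0, 1, 1, 1),
]
sent = encode((1, 0), G)
noisy = (1, 0, 1, 0, 0, 0)  # `sent` with its second bit flipped
decoded = brute_force_decode(noisy, G)
```

The loop visits 2^k candidates, so the running time is polynomial in n exactly when k = O(log n), which is why that regime is called trivial in the abstract.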
A natural extension of the statistical query model is to allow queries about
statistical properties that involve t-tuples of examples (as opposed to single
examples). The second result of this paper is to show that any class of
functions learnable (strongly or weakly) with t-wise queries for t = O(log n)
is also weakly learnable with standard unary queries. Hence this natural
extension to the statistical query model does not increase the set of weakly
learnable functions.
Approximate resilience, monotonicity, and the complexity of agnostic learning
A function $f$ is $d$-resilient if all its Fourier coefficients of degree at
most $d$ are zero, i.e., $f$ is uncorrelated with all low-degree parities. We
study the notion of approximate resilience of Boolean
functions, where we say that $f$ is $\alpha$-approximately $d$-resilient if
$f$ is $\alpha$-close to a $[-1,1]$-valued $d$-resilient function in $\ell_1$
distance. We show that approximate resilience essentially characterizes the
complexity of agnostic learning of a concept class $C$ over the uniform
distribution. Roughly speaking, if all functions in a class $C$ are far from
being $d$-resilient then $C$ can be learned agnostically in time $n^{O(d)}$ and
conversely, if $C$ contains a function close to being $d$-resilient then
agnostic learning of $C$ in the statistical query (SQ) framework of Kearns has
complexity of at least $n^{\Omega(d)}$. This characterization is based on the
duality between $\ell_1$ approximation by degree-$d$ polynomials and
approximate $d$-resilience that we establish. In particular, it implies that
$\ell_1$ approximation by low-degree polynomials, known to be sufficient for
agnostic learning over product distributions, is in fact necessary.
Focusing on monotone Boolean functions, we exhibit the existence of
near-optimal $\alpha$-approximately
$\widetilde{\Omega}(\alpha\sqrt{n})$-resilient monotone functions for all
$\alpha > 0$. Prior to our work, it was conceivable even that every monotone
function is $\Omega(1)$-far from any $1$-resilient function. Furthermore, we
construct simple, explicit monotone functions based on $\mathsf{Tribes}$ and $\mathsf{CycleRun}$ that are close to highly resilient functions. Our constructions are
based on a fairly general resilience analysis and amplification. These
structural results, together with the characterization, imply nearly optimal
lower bounds for agnostic learning of monotone juntas.
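The definition of resilience at the start of this abstract (all Fourier coefficients up to some degree vanish) can be checked by brute force on small functions. The sketch below assumes the standard convention of functions on $\{-1,1\}^n$ with $\hat{f}(S) = \mathbb{E}_x[f(x)\prod_{i \in S} x_i]$, and writes `d` for the degree bound; the helper names and the two example functions are hypothetical and not taken from the paper.

```python
from itertools import combinations, product
from math import prod


def fourier_coefficient(f, n, subset):
    """Return E_x[f(x) * prod_{i in subset} x_i] over uniform x in {-1,1}^n."""
    points = list(product((-1, 1), repeat=n))
    return sum(f(x) * prod(x[i] for i in subset) for x in points) / len(points)


def is_resilient(f, n, d, eps=1e-12):
    """True if every Fourier coefficient of degree at most d is zero
    (up to the floating-point slack eps)."""
    return all(
        abs(fourier_coefficient(f, n, subset)) <= eps
        for size in range(d + 1)
        for subset in combinations(range(n), size)
    )


def parity3(x):
    # Parity of three bits: its only nonzero coefficient has degree 3.
    return x[0] * x[1] * x[2]


def majority3(x):
    # Majority of three bits: balanced, but correlated with each bit.
    return 1 if sum(x) > 0 else -1


parity_is_2_resilient = is_resilient(parity3, 3, 2)
majority_is_1_resilient = is_resilient(majority3, 3, 1)
```

Parity on three bits passes the check for `d = 2`, while majority fails already at `d = 1` because each single-bit coefficient equals 1/2. The enumeration costs time exponential in n, so this is only a sanity check for small examples.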
Replicable Reinforcement Learning
The replicability crisis in the social, behavioral, and data sciences has led
to the formulation of algorithm frameworks for replicability -- i.e., a
requirement that an algorithm produce identical outputs (with high probability)
when run on two different samples from the same underlying distribution. While
still in its infancy, provably replicable algorithms have been developed for
many fundamental tasks in machine learning and statistics, including
statistical query learning, the heavy hitters problem, and distribution
testing. In this work we initiate the study of replicable reinforcement
learning, providing a provably replicable algorithm for parallel value
iteration, and a provably replicable version of R-max in the episodic setting.
These are the first formal replicability results for control problems, which
present different challenges for replication than batch learning settings.
- …