Statistical Learning of Arbitrary Computable Classifiers
Statistical learning theory chiefly studies restricted hypothesis classes,
particularly those with finite Vapnik-Chervonenkis (VC) dimension. The
fundamental quantity of interest is the sample complexity: the number of
samples required to learn to a specified level of accuracy. Here we consider
learning over the set of all computable labeling functions. Since the
VC-dimension is infinite and a priori (uniform) bounds on the number of samples
are impossible, we let the learning algorithm decide when it has seen
sufficient samples to have learned. We first show that learning in this setting
is indeed possible, and develop a learning algorithm. We then show, however,
that bounding sample complexity independently of the distribution is
impossible. Notably, this impossibility is entirely due to the requirement that
the learning algorithm be computable, and not due to the statistical nature of
the problem.
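The positive result can be illustrated with learning by enumeration over a countable hypothesis class. The sketch below is not the paper's algorithm: it uses threshold classifiers on the integers as a hypothetical stand-in for the class of all computable labeling functions, and simply returns the first enumerated hypothesis consistent with the data, so the learner itself decides when it is done rather than relying on an a-priori sample bound.

```python
# Learning by enumeration: a hypothetical stand-in for a countable class of
# computable classifiers, here threshold functions h_t(x) = [x >= t].
def hypotheses():
    t = 0
    while True:
        yield t, (lambda x, t=t: int(x >= t))
        t += 1

def first_consistent(samples, limit=1000):
    """Return the first enumerated hypothesis consistent with every sample.

    The learner halts as soon as a consistent hypothesis appears; `limit`
    is only an illustrative cap to keep the sketch terminating.
    """
    for i, (threshold, h) in enumerate(hypotheses()):
        if i >= limit:
            return None
        if all(h(x) == y for x, y in samples):
            return threshold, h

threshold, h = first_consistent([(1, 0), (5, 1), (3, 1)])
print(threshold)  # → 2, the first threshold consistent with the samples
```

Note the `t=t` default argument, which freezes each threshold at generation time; for genuinely computable classes the same pattern works with an enumeration of programs in place of thresholds.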
Absolutely No Free Lunches!
This paper is concerned with learners who aim to learn patterns in infinite
binary sequences: shown longer and longer initial segments of a binary
sequence, they either attempt to predict whether the next bit will be a 0 or
will be a 1 or they issue forecast probabilities for these events. Several
variants of this problem are considered. In each case, a no-free-lunch result
of the following form is established: the problem of learning is a formidably
difficult one, in that no matter what method is pursued, failure is
incomparably more common than success; and difficult choices must be faced in
choosing a method of learning, since no approach dominates all others in its
range of success. In the simplest case, the comparison of the set of situations
in which a method fails and the set of situations in which it succeeds is a
matter of cardinality (countable vs. uncountable); in other cases, it is a
topological matter (meagre vs. co-meagre) or a hybrid computational-topological
matter (effectively meagre vs. effectively co-meagre).
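The cardinality version of such a result can be made concrete by diagonalization: against any fixed prediction method, a single sequence can be built on which the method errs at every bit. A minimal sketch (the majority predictor is only an illustrative choice; the construction works against any predictor):

```python
def majority_predictor(prefix):
    """Predict the next bit as the majority bit seen so far (ties -> 0)."""
    return int(2 * sum(prefix) > len(prefix))

def diagonal_sequence(predictor, n):
    """Diagonalize: emit the opposite of the predictor's guess at each step."""
    seq = []
    for _ in range(n):
        seq.append(1 - predictor(seq))
    return seq

seq = diagonal_sequence(majority_predictor, 20)
errors = sum(majority_predictor(seq[:i]) != seq[i] for i in range(len(seq)))
print(errors)  # → 20: the predictor is wrong on every single bit
```

Since this works uniformly for every predictor, each method fails on uncountably many sequences while succeeding on at most countably many, which is the flavour of the simplest comparison above.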
Structure emerges faster during cultural transmission in children than in adults
How does children’s limited processing capacity affect cultural transmission of complex information? We show that over the course of iterated reproduction of two-dimensional random dot patterns, transmission accuracy increased to a similar extent in 5- to 8-year-old children and adults, whereas algorithmic complexity decreased faster in children. Thus, children require more structure to render complex inputs learnable. In line with the Less-Is-More hypothesis, we interpret this as evidence that children’s processing limitations affecting working memory capacity and executive control constrain the ability to represent and generate complexity, which, in turn, facilitates the emergence of structure. This underscores the importance of investigating the role of children in the transmission of complex cultural traits.
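Algorithmic (Kolmogorov) complexity is uncomputable, so studies of this kind work with an estimator; compressed length is one standard upper-bound proxy. A minimal sketch of that idea, assuming patterns have already been serialized to strings (the encoding of dot patterns here is hypothetical):

```python
import zlib

def complexity_proxy(pattern):
    """Upper-bound proxy for algorithmic complexity: compressed byte length."""
    return len(zlib.compress(pattern.encode()))

# A highly structured pattern vs. an irregular one of the same length (100).
structured = "10" * 50
irregular = "1101000110110001011011110001001101010111" * 2 + "10" * 10
print(complexity_proxy(structured) < complexity_proxy(irregular))  # → True
```

A decrease in this proxy across transmission chains is how "algorithmic complexity decreased" can be operationalized.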
Logic and Learning
The theory of first-order logic - or Model Theory - appears in few studies of learning and scientific discovery. We speculate about the reasons for this omission, and then argue for the utility of Model Theory in the analysis and design of automated systems of scientific discovery. One scientific task is treated from this perspective in detail, namely, concept discovery. Two formal paradigms bearing on this problem are presented and investigated using the tools of logical theory. One paradigm bears on PAC learning, the other on identification in the limit.
Learning probability distributions generated by finite-state machines
We review methods for inference of probability distributions generated by probabilistic automata and related models for sequence generation. We focus on methods that can be proved to learn in the inference-in-the-limit and PAC formal models. The methods we review are state merging and state splitting methods for probabilistic deterministic automata and the recently developed spectral method for nondeterministic probabilistic automata. In both cases, we derive them from a high-level algorithm described in terms of the Hankel matrix of the distribution to be learned, given as an oracle, and then describe how to adapt that algorithm to account for the error introduced by a finite sample.
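The central role of the Hankel matrix can be illustrated directly: for a function computed by a minimal weighted automaton with n states, the Hankel matrix H(u, v) = f(uv) has rank n, which is the structure the spectral method exploits. A minimal sketch with illustrative weights (a generic weighted automaton, not a normalized probabilistic one):

```python
# A tiny 2-state weighted automaton over the alphabet {a, b}; all weights
# are illustrative.  f(w) = alpha * A_{w_1} ... A_{w_n} * beta.
alpha = [1.0, 0.0]                       # initial weight vector
A = {"a": [[0.5, 0.5], [0.0, 0.3]],      # per-symbol transition matrices
     "b": [[0.2, 0.0], [0.4, 0.1]]}
beta = [1.0, 1.0]                        # final weight vector

def weight(word):
    v = alpha[:]
    for sym in word:
        M = A[sym]
        v = [sum(v[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(v[j] * beta[j] for j in range(2))

# Hankel block over short prefixes u and suffixes v: H[u][v] = f(uv).
words = ["", "a", "b", "aa", "ab", "ba", "bb"]
H = [[weight(u + v) for v in words] for u in words]

def rank(M, eps=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if abs(M[i][col]) > eps), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and abs(M[i][col]) > eps:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

print(rank(H))  # → 2: the Hankel rank recovers the number of states
```

The spectral method replaces the oracle `weight` with empirical estimates from a sample and uses an SVD of the resulting noisy Hankel block, which is where the finite-sample error analysis enters.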
Computational Problems in Metric Fixed Point Theory and their Weihrauch Degrees
We study the computational difficulty of the problem of finding fixed points
of nonexpansive mappings in uniformly convex Banach spaces. We show that the
fixed point sets of computable nonexpansive self-maps of a nonempty, computably
weakly closed, convex and bounded subset of a computable real Hilbert space are
precisely the nonempty, co-r.e. weakly closed, convex subsets of the domain. A
uniform version of this result allows us to determine the Weihrauch degree of
the Browder-Goehde-Kirk theorem in computable real Hilbert space: it is
equivalent to a closed choice principle, which receives as input a closed,
convex and bounded set via negative information in the weak topology and
outputs a point in the set, represented in the strong topology. While in finite
dimensional uniformly convex Banach spaces, computable nonexpansive mappings
always have computable fixed points, on the unit ball in infinite-dimensional
separable Hilbert space the Browder-Goehde-Kirk theorem becomes
Weihrauch-equivalent to the limit operator, and on the Hilbert cube it is
equivalent to Weak Koenig's Lemma. In particular, computable nonexpansive
mappings may not have any computable fixed points in infinite dimension. We
also study the computational difficulty of the problem of finding rates of
convergence for a large class of fixed point iterations, which generalise both
Halpern- and Mann-iterations, and prove that the problem of finding rates of
convergence already on the unit interval is equivalent to the limit operator.
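On the unit interval the distinction driving these iterations is concrete: a nonexpansive map need not be a contraction, so plain Picard iteration can fail to converge, while a Krasnoselskii-Mann average still does. A minimal sketch, with T(x) = 1 - x as the illustrative nonexpansive map (fixed point 1/2); the abstract's rate-of-convergence result concerns such iterations in general:

```python
def T(x):
    """A nonexpansive (indeed isometric) self-map of [0, 1], fixed point 1/2."""
    return 1.0 - x

# Picard iteration x_{n+1} = T(x_n) just oscillates between 0.1 and 0.9.
x = 0.1
for _ in range(3):
    x = T(x)

# Krasnoselskii-Mann iteration x_{n+1} = (1 - lam) x_n + lam T(x_n)
# averages each step with the current point and converges to 1/2.
lam = 0.3
x = 0.1
for _ in range(50):
    x = (1 - lam) * x + lam * T(x)
print(abs(x - 0.5) < 1e-6)  # → True: the averaged iteration converges
```

In infinite dimensions such a fixed point still exists by the Browder-Goehde-Kirk theorem, but, as the abstract states, it need not be computable, and extracting a convergence rate is itself as hard as the limit operator.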