Learning probability distributions generated by finite-state machines
We review methods for inference of probability distributions generated by probabilistic automata and related models for sequence generation. We focus on methods with provable guarantees in the inference-in-the-limit and PAC formal learning models. The methods we review are state-merging and state-splitting methods for probabilistic deterministic automata and the recently developed spectral method for nondeterministic probabilistic automata. In both cases, we derive them from a high-level algorithm described in terms of the Hankel matrix of the distribution to be learned, given as an oracle, and then describe how to adapt that algorithm to account for the error introduced by a finite sample.
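To make the Hankel-matrix view concrete, the following is a minimal Python sketch of the spectral method, under stated assumptions: `freq` returns (empirical) string probabilities, `prefixes` and `suffixes` are user-chosen basis sets containing the empty tuple, and `rank` is the target number of states. All names are illustrative; a practical implementation would also handle basis selection and rank estimation.

```python
import numpy as np

def spectral_wfa(freq, prefixes, suffixes, alphabet, rank):
    """Sketch: recover a weighted automaton from a (possibly empirical)
    Hankel matrix.  freq(x) estimates the probability of the string x,
    where strings are tuples of symbols; prefixes and suffixes should
    both contain the empty tuple ()."""
    # Hankel block H[p, s] = freq(ps) and one symbol-shifted block per symbol.
    H = np.array([[freq(p + s) for s in suffixes] for p in prefixes])
    H_sig = {a: np.array([[freq(p + (a,) + s) for s in suffixes]
                          for p in prefixes]) for a in alphabet}
    # Rank-k truncated SVD of H; V spans the top right-singular subspace.
    _, _, Vt = np.linalg.svd(H)
    V = Vt[:rank].T                       # |suffixes| x rank
    Pinv = np.linalg.pinv(H @ V)          # rank x |prefixes|
    # Transition operators and boundary vectors of the learned automaton.
    A = {a: Pinv @ H_sig[a] @ V for a in alphabet}
    alpha0 = np.array([freq(s) for s in suffixes]) @ V        # empty-prefix row
    alpha_inf = Pinv @ np.array([freq(p) for p in prefixes])  # empty-suffix col

    def f(x):
        """Estimated probability of the string x under the learned model."""
        v = alpha0
        for a in x:
            v = v @ A[a]
        return float(v @ alpha_inf)
    return f
```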
On the Learnability of Shuffle Ideals
PAC learning of unrestricted regular languages is long known to be a difficult problem. The class of shuffle ideals is a very restricted subclass of regular languages, where the shuffle ideal generated by a string u is the collection of all strings containing u as a subsequence. This fundamental language family is of theoretical interest in its own right and provides the building blocks for other important language families. Despite its apparent simplicity, the class of shuffle ideals appears quite difficult to learn. In particular, just as for unrestricted regular languages, the class is not properly PAC learnable in polynomial time if RP ≠ NP, and PAC learning the class improperly in polynomial time would imply polynomial-time algorithms for certain fundamental problems in cryptography. In the positive direction, we give an efficient algorithm for properly learning shuffle ideals in the statistical query (and therefore also PAC) model under the uniform distribution.
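Membership in a shuffle ideal, by contrast, is trivial to decide: a string x belongs to the ideal generated by u exactly when u is a subsequence of x, which a greedy left-to-right scan checks in linear time. A minimal sketch (the function name is illustrative):

```python
def in_shuffle_ideal(u, x):
    """True iff x belongs to the shuffle ideal of u, i.e. u is a
    subsequence of x.  Greedy left-to-right scan, O(len(x)) time:
    each `c in it` advances the shared iterator past the next match."""
    it = iter(x)
    return all(c in it for c in u)

assert in_shuffle_ideal("ab", "xaxxbx") and not in_shuffle_ideal("ab", "bxa")
```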
Pac-learning Recursive Logic Programs: Negative Results
In a companion paper it was shown that the class of constant-depth determinate k-ary recursive clauses is efficiently learnable. In this paper we present negative results showing that any natural generalization of this class is hard to learn in Valiant's model of PAC-learnability. In particular, we show that the following program classes are cryptographically hard to learn: programs with an unbounded number of constant-depth linear recursive clauses; programs with one constant-depth determinate clause containing an unbounded number of recursive calls; and programs with one linear recursive clause of constant locality. These results immediately imply the non-learnability of any more general class of programs. We also show that learning a constant-depth determinate program with either two linear recursive clauses or one linear recursive clause and one non-recursive clause is as hard as learning Boolean DNF. Together with positive results from the companion paper, these negative results establish a boundary of efficient learnability for recursive function-free clauses.
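To fix the terminology, a linear recursive clause is one whose body contains exactly one recursive literal. A hypothetical Python rendering of the classic two-clause path program (not taken from the paper; the cycle guard is an artifact of the Python translation):

```python
def path(x, y, edge, seen=frozenset()):
    """Analogue of a two-clause linear recursive program:
         path(X, Y) :- edge(X, Y).              (non-recursive base clause)
         path(X, Y) :- edge(X, Z), path(Z, Y).  (linear recursive clause:
                                                 one recursive literal)"""
    if (x, y) in edge:
        return True
    return any(path(z, y, edge, seen | {x})
               for (a, z) in edge if a == x and z not in seen)
```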
Benchmarking Compositionality with Formal Languages
Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability. Whether large neural models in NLP can acquire this ability while learning from data is an open question. In this paper, we investigate this problem from the perspective of formal languages. We use deterministic finite-state transducers to make an unbounded number of datasets with controllable properties governing compositionality. By randomly sampling over many transducers, we explore which of their properties contribute to learnability of a compositional relation by a neural network. We find that the models either learn the relations completely or not at all. The key is transition coverage, setting a soft learnability limit at 400 examples per transition.
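A minimal sketch of the dataset-generation recipe described above, with all names and parameters illustrative: sample a random deterministic finite-state transducer, then emit input/output string pairs; transition coverage is then the number of training examples exercising each (state, input symbol) transition.

```python
import random

def random_dfst(num_states, in_alpha, out_alpha, rng):
    """Sample a random deterministic finite-state transducer: every
    (state, input symbol) pair maps to one (next state, output symbol)."""
    return {(q, a): (rng.randrange(num_states), rng.choice(out_alpha))
            for q in range(num_states) for a in in_alpha}

def transduce(dfst, x, start=0):
    """Run the transducer on input string x, concatenating its outputs."""
    q, out = start, []
    for a in x:
        q, b = dfst[(q, a)]
        out.append(b)
    return "".join(out)

rng = random.Random(0)
T = random_dfst(num_states=4, in_alpha="ab", out_alpha="xyz", rng=rng)
inputs = ["".join(rng.choice("ab") for _ in range(8)) for _ in range(1000)]
dataset = [(x, transduce(T, x)) for x in inputs]  # supervised pairs
```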
The Consistency dimension and distribution-dependent learning from queries
We prove a new combinatorial characterization of polynomial learnability from equivalence queries, and state some of its consequences relating the learnability of a class with the learnability via equivalence and membership queries of its subclasses obtained by restricting the instance space. Then we propose and study two models of query learning in which there is a probability distribution on the instance space, both as an application of the tools developed from the combinatorial characterization and as models of independent interest.
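For readers unfamiliar with the equivalence-query model, here is a minimal illustrative sketch (not the paper's construction) of exact learning over a finite concept class via the classic halving argument; `eq_oracle` is an assumed callback that returns a counterexample to the hypothesis, or None when the hypothesis is correct.

```python
def learn_from_equivalence_queries(concepts, instances, eq_oracle):
    """Halving-style learner: hypothesize the majority vote of the current
    version space; every counterexample eliminates at least half of it,
    so at most log2(len(concepts)) equivalence queries are needed.
    Each concept is a dict mapping instances to booleans."""
    version_space = list(concepts)
    while True:
        # Majority-vote hypothesis over the remaining consistent concepts.
        hyp = {x: 2 * sum(c[x] for c in version_space) > len(version_space)
               for x in instances}
        x = eq_oracle(hyp)          # counterexample, or None if hyp is correct
        if x is None:
            return hyp
        # The target disagrees with hyp on x, so keep only concepts that do too.
        version_space = [c for c in version_space if c[x] != hyp[x]]
```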