11,466 research outputs found
Using Contextual Representations to Efficiently Learn Context-Free Languages
International audienceWe present a polynomial update time algorithm for the inductive inference of a large class of context-free languages using the paradigm of positive data and a membership oracle. We achieve this result by moving to a novel representation, called Contextual Binary Feature Grammars (CBFGs), which are capable of representing richly structured context-free languages as well as some context sensitive languages. These representations explicitly model the lattice structure of the distribution of a set of substrings and can be inferred using a generalisation of distributional learning. This formalism is an attempt to bridge the gap between simple learnable classes and the sorts of highly expressive representations necessary for linguistic representation: it allows the learnability of a large class of context-free languages, that includes all regular languages and those context-free languages that satisfy two simple constraints. The formalism and the algorithm seem well suited to natural language and in particular to the modeling of first language acquisition. Preliminary experimental results confirm the effectiveness of this approach
Tabular Parsing
This is a tutorial on tabular parsing, on the basis of tabulation of
nondeterministic push-down automata. Discussed are Earley's algorithm, the
Cocke-Kasami-Younger algorithm, tabular LR parsing, the construction of parse
trees, and further issues.Comment: 21 pages, 14 figure
On Decidable Growth-Rate Properties of Imperative Programs
In 2008, Ben-Amram, Jones and Kristiansen showed that for a simple "core"
programming language - an imperative language with bounded loops, and
arithmetics limited to addition and multiplication - it was possible to decide
precisely whether a program had certain growth-rate properties, namely
polynomial (or linear) bounds on computed values, or on the running time.
This work emphasized the role of the core language in mitigating the
notorious undecidability of program properties, so that one deals with
decidable problems.
A natural and intriguing problem was whether more elements can be added to
the core language, improving its utility, while keeping the growth-rate
properties decidable. In particular, the method presented could not handle a
command that resets a variable to zero. This paper shows how to handle resets.
The analysis is given in a logical style (proof rules), and its complexity is
shown to be PSPACE-complete (in contrast, without resets, the problem was
PTIME). The analysis algorithm evolved from the previous solution in an
interesting way: focus was shifted from proving a bound to disproving it, and
the algorithm works top-down rather than bottom-up
Really Natural Linear Indexed Type Checking
Recent works have shown the power of linear indexed type systems for
enforcing complex program properties. These systems combine linear types with a
language of type-level indices, allowing more fine-grained analyses. Such
systems have been fruitfully applied in diverse domains, including implicit
complexity and differential privacy. A natural way to enhance the
expressiveness of this approach is by allowing the indices to depend on runtime
information, in the spirit of dependent types. This approach is used in DFuzz,
a language for differential privacy. The DFuzz type system relies on an index
language supporting real and natural number arithmetic over constants and
variables. Moreover, DFuzz uses a subtyping mechanism to make types more
flexible. By themselves, linearity, dependency, and subtyping each require
delicate handling when performing type checking or type inference; their
combination increases this challenge substantially, as the features can
interact in non-trivial ways. In this paper, we study the type-checking problem
for DFuzz. We show how we can reduce type checking for (a simple extension of)
DFuzz to constraint solving over a first-order theory of naturals and real
numbers which, although undecidable, can often be handled in practice by
standard numeric solvers
- …