Graph-Based Shape Analysis Beyond Context-Freeness
We develop a shape analysis for reasoning about relational properties of data
structures. Both the concrete and the abstract domain are represented by
hypergraphs. The analysis is parameterized by user-supplied indexed graph
grammars to guide concretization and abstraction. This novel extension of
context-free graph grammars is powerful enough to model complex data structures
such as balanced binary trees with parent pointers, while preserving most
desirable properties of context-free graph grammars. One strength of our
analysis is that no artifacts apart from grammars are required from the user;
it thus offers a high degree of automation. We implemented our analysis and
successfully applied it to various programs manipulating AVL trees,
(doubly-linked) lists, and combinations of both.
An approach to computing downward closures
The downward closure of a word language is the set of all (not necessarily
contiguous) subwords of its members. It is well-known that the downward closure
of any language is regular. While the downward closure appears to be a powerful
abstraction, algorithms for computing a finite automaton for the downward
closure of a given language have been established only for a few language
classes.
This work presents a simple general method for computing downward closures.
For language classes that are closed under rational transductions, it is shown
that the computation of downward closures can be reduced to checking a certain
unboundedness property.
This result is used to prove that downward closures are computable for (i)
every language class with effectively semilinear Parikh images that are closed
under rational transductions, (ii) matrix languages, and (iii) indexed
languages (equivalently, languages accepted by higher-order pushdown automata
of order 2).
Comment: Full version of contribution to ICALP 2015. Comments welcome.
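The definition of the downward closure given in this abstract can be illustrated with a minimal Python sketch. It is not the method of the paper: it enumerates scattered subwords by brute force and therefore only works for finite languages, whereas the paper addresses infinite language classes. The function names `is_subword` and `downward_closure` are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from typing import Iterable, Set


def is_subword(u: str, w: str) -> bool:
    """Return True if u is a (not necessarily contiguous) subword of w."""
    remaining = iter(w)
    # Each membership test consumes the iterator up to the matched letter,
    # so the letters of u must occur in w in the same order.
    return all(ch in remaining for ch in u)


def downward_closure(language: Iterable[str]) -> Set[str]:
    """Return all scattered subwords of the words of a FINITE language.

    Brute force: exponential in the word length. Illustration only.
    """
    closure = set()
    for w in language:
        for k in range(len(w) + 1):
            # Choose k positions of w and keep the letters at those positions.
            for positions in combinations(range(len(w)), k):
                closure.add("".join(w[i] for i in positions))
    return closure


if __name__ == "__main__":
    print(sorted(downward_closure({"aba"})))
```

For the single word "aba", the closure contains the empty word, "a", "b", "aa", "ab", "ba", and "aba"; the word "bb" is not a subword because "aba" contains only one "b".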
Calibrating Generative Models: The Probabilistic Chomsky-Schützenberger Hierarchy
A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case where the "semi-linear" languages all collapse into the regular languages, using analytic tools adapted from the classical setting we show there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.
On Measuring Non-Recursive Trade-Offs
We investigate the phenomenon of non-recursive trade-offs between
descriptional systems in an abstract fashion. We aim at categorizing
non-recursive trade-offs by bounds on their growth rate, and show how to deduce
such bounds in general. We also identify criteria which, in the spirit of
abstract language theory, allow us to deduce non-recursive trade-offs from
effective closure properties of language families on the one hand, and
differences in the decidability status of basic decision problems on the other.
We develop a qualitative classification of non-recursive trade-offs in order to
obtain a better understanding of this very fundamental behaviour of
descriptional systems.
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free
grammars that computes the following quantities given a stochastic context-free
grammar and an input string: a) probabilities of successive prefixes being
generated by the grammar; b) probabilities of substrings being generated by the
nonterminals, including the entire string being generated by the grammar; c)
most likely (Viterbi) parse of the string; d) posterior expected number of
applications of each grammar production, as required for reestimating rule
probabilities. (a) and (b) are computed incrementally in a single left-to-right
pass over the input. Our algorithm compares favorably to standard bottom-up
parsing methods for SCFGs in that it works efficiently on sparse grammars by
making use of Earley's top-down control structure. It can process any
context-free rule format without conversion to some normal form, and combines
computations for (a) through (d) in a single algorithm. Finally, the algorithm
has simple extensions for processing partially bracketed inputs, and for
finding partial parses and their likelihoods on ungrammatical inputs.
Comment: 45 pages. Slightly shortened version to appear in Computational
Linguistics 2
A Tractable Extension of Linear Indexed Grammars
It has been shown that Linear Indexed Grammars can be processed in polynomial
time by exploiting constraints which make possible the extensive use of
structure-sharing. This paper describes a formalism that is more powerful than
Linear Indexed Grammar, but which can also be processed in polynomial time
using similar techniques. The formalism, which we refer to as Partially Linear
PATR, manipulates feature structures rather than stacks.
Comment: 8 pages LaTeX, uses eaclap.sty, to appear in EACL-9
TuLiPA: towards a multi-formalism parsing environment for grammar engineering
In this paper, we present an open-source parsing environment (Tübingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German.