A Tractable Extension of Linear Indexed Grammars
It has been shown that Linear Indexed Grammars can be processed in polynomial
time by exploiting constraints which make possible the extensive use of
structure-sharing. This paper describes a formalism that is more powerful than
Linear Indexed Grammar, but which can also be processed in polynomial time
using similar techniques. The formalism, which we refer to as Partially Linear
PATR, manipulates feature structures rather than stacks.
Comment: 8 pages LaTeX, uses eaclap.sty, to appear in EACL-9
Polynomial Charts for Totally Unordered Languages
Proceedings of the 16th Nordic Conference
of Computational Linguistics NODALIDA-2007.
Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit.
University of Tartu, Tartu, 2007.
ISBN 978-9985-4-0513-0 (online)
ISBN 978-9985-4-0514-7 (CD-ROM)
pp. 183-190
Calibrating Generative Models: The Probabilistic Chomsky-Schützenberger Hierarchy
A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case where the "semi-linear" languages all collapse into the regular languages, using analytic tools adapted from the classical setting we show there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning
On Infinite Words Determined by Indexed Languages
We characterize the infinite words determined by indexed languages. An
infinite language $L$ determines an infinite word $\alpha$ if every string in
$L$ is a prefix of $\alpha$. If $L$ is regular or context-free, it is known
that $\alpha$ must be ultimately periodic. We show that if $L$ is an indexed
language, then $\alpha$ is a morphic word, i.e., $\alpha$ can be generated by
iterating a morphism under a coding. Since the other direction, that every
morphic word is determined by some indexed language, also holds, this implies
that the infinite words determined by indexed languages are exactly the morphic
words. To obtain this result, we prove a new pumping lemma for the indexed
languages, which may be of independent interest.
Comment: Full version of paper accepted for publication at MFCS 201
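As an illustration of the term "morphic word" used in the abstract above, the following is a minimal sketch, not taken from the paper. It iterates a morphism from a start letter and then applies a coding (a letter-to-letter map) to obtain a finite prefix of the resulting infinite word. The function name `morphic_prefix` and the choice of the Thue-Morse word as the example are assumptions made here for illustration only.

```python
def morphic_prefix(morphism, coding, start, n):
    """Return the first n letters of coding(h^omega(start)).

    morphism: dict mapping each letter to a string (the morphism h)
    coding:   dict mapping each letter to a single output letter
    start:    a letter on which h is prolongable (h(start) begins with start)
    """
    if not morphism[start].startswith(start) or len(morphism[start]) < 2:
        raise ValueError("morphism must be prolongable on the start letter")
    word = start
    # Because h(start) begins with start, each iterate is a prefix of the next,
    # so iterating until the word is long enough yields a prefix of the limit.
    while len(word) < n:
        word = "".join(morphism[c] for c in word)
    return "".join(coding[c] for c in word[:n])


# Hypothetical example: the Thue-Morse word via a -> ab, b -> ba,
# followed by the coding a -> 0, b -> 1.
h = {"a": "ab", "b": "ba"}
tau = {"a": "0", "b": "1"}
thue_morse_16 = morphic_prefix(h, tau, "a", 16)

# A finite sample of a language whose strings are all prefixes of the word.
prefix_language = [morphic_prefix(h, tau, "a", k) for k in range(1, 9)]
```

Every string in `prefix_language` is a prefix of the same infinite word, which is the sense in which a language "determines" an infinite word in the abstract.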
Model counting for CNF formulas of bounded modular treewidth.
The modular treewidth of a graph is its treewidth after the contraction of modules. Modular treewidth properly generalizes treewidth and is itself properly generalized by clique-width. We show that the number of satisfying assignments of a CNF formula whose incidence graph has bounded modular treewidth can be computed in polynomial time. This provides new tractable classes of formulas for which #SAT is polynomial. In particular, our result generalizes known results for the treewidth of incidence graphs and is incomparable with known results for clique-width (or rank-width) of signed incidence graphs. The contraction of modules is an effective data reduction procedure. Our algorithm is the first one to harness this technique for #SAT. The order of the polynomial time bound of our algorithm depends on the modular treewidth. We show that this dependency cannot be avoided subject to an assumption from Parameterized Complexity
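To make the problem statement concrete, here is a minimal sketch of what #SAT computes: the number of satisfying assignments of a CNF formula. This is a brute-force enumeration that takes exponential time in the number of variables. It is not the algorithm of the paper above and does not use modular treewidth. The function name `count_models` and the signed-integer clause encoding are assumptions chosen for illustration.

```python
from itertools import product


def count_models(clauses, num_vars):
    """Count satisfying assignments of a CNF formula by exhaustive enumeration.

    clauses:  list of clauses; each clause is a list of non-zero integers,
              where k means variable k and -k means its negation
    num_vars: number of variables, numbered 1..num_vars
    """
    count = 0
    for bits in product([False, True], repeat=num_vars):
        # A clause is satisfied if at least one of its literals is true.
        if all(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            count += 1
    return count


# Hypothetical example formula: (x1 or x2) and (not x1 or x3)
example = [[1, 2], [-1, 3]]
example_count = count_models(example, 3)
```

Tractability results of the kind described in the abstract replace this exhaustive loop with an algorithm whose running time is polynomial when a structural parameter of the formula is bounded.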
An Abstract Machine for Unification Grammars
This work describes the design and implementation of an abstract machine,
Amalia, for the linguistic formalism ALE, which is based on typed feature
structures. This formalism is one of the most widely accepted in computational
linguistics and has been used for designing grammars in various linguistic
theories, most notably HPSG. Amalia is composed of data structures and a set of
instructions, augmented by a compiler from the grammatical formalism to the
abstract instructions, and a (portable) interpreter of the abstract
instructions. The effect of each instruction is defined using a low-level
language that can be executed on ordinary hardware.
The advantages of the abstract machine approach are twofold. From a
theoretical point of view, the abstract machine gives a well-defined
operational semantics to the grammatical formalism. This ensures that grammars
specified using our system are endowed with well-defined meaning. It makes it
possible, for example, to formally verify the correctness of a compiler for HPSG, given
an independent definition. From a practical point of view, Amalia is the first
system that employs a direct compilation scheme for unification grammars that
are based on typed feature structures. The use of Amalia results in much
improved performance over existing systems.
In order to test the machine on a realistic application, we have developed a
small-scale, HPSG-based grammar for a fragment of the Hebrew language, using
Amalia as the development platform. This is the first application of HPSG to a
Semitic language.
Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks,
pst-node, psfig, fullname and a macros fil