Taking Primitive Optimality Theory Beyond the Finite State
Primitive Optimality Theory (OTP) (Eisner, 1997a; Albro, 1998), a
computational model of Optimality Theory (Prince and Smolensky, 1993), employs
a finite state machine to represent the set of active candidates at each stage
of an Optimality Theoretic derivation, as well as weighted finite state
machines to represent the constraints themselves. For some purposes, however,
it would be convenient if the set of candidates were limited by some set of
criteria capable of being described only in a higher-level grammar formalism,
such as a Context Free Grammar, a Context Sensitive Grammar, or a Multiple
Context Free Grammar (Seki et al., 1991). Examples include reduplication and
phrasal stress models. Here we introduce a mechanism for OTP-like Optimality
Theory in which the constraints remain weighted finite state machines, but sets
of candidates are represented by higher-level grammars. In particular, we use
multiple context-free grammars to model reduplication in the manner of
Correspondence Theory (McCarthy and Prince, 1995), and develop an extended
version of the Earley Algorithm (Earley, 1970) to apply the constraints to a
reduplicating candidate set.
Comment: 11 pages, 5 figures, worksho
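The idea of scoring candidates with a weighted finite state constraint can be illustrated with a minimal sketch. This is not the OTP implementation: the candidate set here is a plain finite list rather than a finite state machine or a higher-level grammar, and the constraint `NO_CLUSTER`, the `C`/`V` alphabet, and the helper names `violations` and `prune` are all hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# A weighted finite state constraint, written as a transition table:
# (state, symbol) -> (next_state, violation_weight).
Transitions = Dict[Tuple[str, str], Tuple[str, int]]

# Hypothetical constraint over the alphabet {C, V}: one violation for every
# consonant that immediately follows another consonant.
NO_CLUSTER: Transitions = {
    ("V", "V"): ("V", 0),
    ("V", "C"): ("C", 0),
    ("C", "V"): ("V", 0),
    ("C", "C"): ("C", 1),
}


def violations(trans: Transitions, start: str, word: str) -> int:
    """Run the machine over `word` and sum the weights of the arcs taken."""
    state, total = start, 0
    for symbol in word:
        state, weight = trans[(state, symbol)]
        total += weight
    return total


def prune(candidates: Iterable[str], trans: Transitions, start: str) -> List[str]:
    """Keep only the candidates with the fewest violations of one constraint."""
    scored = [(violations(trans, start, c), c) for c in candidates]
    best = min(score for score, _ in scored)
    return [c for score, c in scored if score == best]


survivors = prune(["CVCC", "CVCV", "CCV"], NO_CLUSTER, start="V")
print(survivors)
```

In a ranked constraint hierarchy, `prune` would be applied once per constraint, from highest ranked to lowest, with each stage receiving the survivors of the previous one.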
On the Complexity and Performance of Parsing with Derivatives
Current algorithms for context-free parsing inflict a trade-off between ease
of understanding, ease of implementation, theoretical complexity, and practical
performance. No algorithm achieves all of these properties simultaneously.
Might et al. (2011) introduced parsing with derivatives, which handles
arbitrary context-free grammars while being both easy to understand and simple
to implement. Despite much initial enthusiasm and a multitude of independent
implementations, its worst-case complexity has never been proven to be better
than exponential. In fact, high-level arguments claiming it is fundamentally
exponential have been advanced and even accepted as part of the folklore.
Performance ended up being sluggish in practice, and this sluggishness was
taken as informal evidence of exponentiality.
In this paper, we reexamine the performance of parsing with derivatives. We
have discovered that it is not exponential but, in fact, cubic. Moreover,
simple (though perhaps not obvious) modifications to the implementation by
Might et al. (2011) lead to an implementation that is not only easy to
understand but also highly performant in practice.
Comment: 13 pages; 12 figures; implementation at
http://bitbucket.org/ucombinator/parsing-with-derivatives/ ; published in
PLDI '16, Proceedings of the 37th ACM SIGPLAN Conference on Programming
Language Design and Implementation, June 13 - 17, 2016, Santa Barbara, CA,
US
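The core idea behind derivative-based recognition can be sketched for the simpler case of regular expressions (Brzozowski derivatives). This is a simplification and not the algorithm of Might et al. (2011): handling arbitrary context-free grammars additionally requires recursive language definitions together with laziness, memoization, and a fixed-point computation for nullability, none of which appear here. The class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


class Lang:
    """Base class for language expressions."""


@dataclass(frozen=True)
class Empty(Lang):      # the empty language: matches nothing
    pass


@dataclass(frozen=True)
class Eps(Lang):        # matches only the empty string
    pass


@dataclass(frozen=True)
class Char(Lang):       # matches exactly one character
    c: str


@dataclass(frozen=True)
class Alt(Lang):        # union of two languages
    left: Lang
    right: Lang


@dataclass(frozen=True)
class Seq(Lang):        # concatenation of two languages
    first: Lang
    second: Lang


@dataclass(frozen=True)
class Star(Lang):       # zero or more repetitions
    inner: Lang


def nullable(l: Lang) -> bool:
    """Does the language contain the empty string?"""
    if isinstance(l, (Eps, Star)):
        return True
    if isinstance(l, (Empty, Char)):
        return False
    if isinstance(l, Alt):
        return nullable(l.left) or nullable(l.right)
    if isinstance(l, Seq):
        return nullable(l.first) and nullable(l.second)
    raise TypeError(f"unknown language node: {l!r}")


def derive(l: Lang, c: str) -> Lang:
    """The derivative of l with respect to c: the suffixes of words in l that start with c."""
    if isinstance(l, (Empty, Eps)):
        return Empty()
    if isinstance(l, Char):
        return Eps() if l.c == c else Empty()
    if isinstance(l, Alt):
        return Alt(derive(l.left, c), derive(l.right, c))
    if isinstance(l, Seq):
        d = Seq(derive(l.first, c), l.second)
        # If the first part can be empty, c may also start the second part.
        return Alt(d, derive(l.second, c)) if nullable(l.first) else d
    if isinstance(l, Star):
        return Seq(derive(l.inner, c), l)
    raise TypeError(f"unknown language node: {l!r}")


def matches(l: Lang, s: str) -> bool:
    """Derive by each input character in turn, then test for the empty string."""
    for c in s:
        l = derive(l, c)
    return nullable(l)


ab_star = Star(Seq(Char("a"), Char("b")))   # the language (ab)*
print(matches(ab_star, "abab"), matches(ab_star, "aba"))
```

The sketch never simplifies the derived expressions, so they grow with every input character; uncontrolled growth of this kind is one plausible source of the sluggishness the abstract mentions.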
Practical experiments with regular approximation of context-free languages
Several methods are discussed that construct a finite automaton given a
context-free grammar, including both methods that lead to subsets and those
that lead to supersets of the original context-free language. Some of these
methods of regular approximation are new, and some others are presented here in
a more refined form with respect to existing literature. Practical experiments
with the different methods of regular approximation are performed for
spoken-language input: hypotheses from a speech recognizer are filtered through
a finite automaton.
Comment: 28 pages. To appear in Computational Linguistics 26(1), March 200
Synchronous Context-Free Grammars and Optimal Linear Parsing Strategies
Synchronous Context-Free Grammars (SCFGs), also known as syntax-directed
translation schemata, are unlike context-free grammars in that they do not have
a binary normal form. In general, parsing with SCFGs takes space and time
polynomial in the length of the input strings, but with the degree of the
polynomial depending on the permutations of the SCFG rules. We consider linear
parsing strategies, which add one nonterminal at a time. We show that for a
given input permutation, the problems of finding the linear parsing strategy
with the minimum space and time complexity are both NP-hard.