12,751 research outputs found
Graph Interpolation Grammars: a Rule-based Approach to the Incremental Parsing of Natural Languages
Graph Interpolation Grammars are a declarative formalism with an operational
semantics. Their goal is to emulate salient features of the human parser, and
notably incrementality. The parsing process defined by GIGs incrementally
builds a syntactic representation of a sentence as each successive lexeme is
read. A GIG rule specifies a set of parse configurations that trigger its
application and an operation to perform on a matching configuration. Rules are
partly context-sensitive; furthermore, they are reversible, meaning that their
operations can be undone, which allows the parsing process to be
nondeterministic. These two factors confer enough expressive power to the
formalism for parsing natural languages.Comment: 41 pages, Postscript onl
Evaluating two methods for Treebank grammar compaction
Treebanks, such as the Penn Treebank, provide a basis for the automatic creation of broad coverage grammars. In the simplest case, rules can simply be ‘read off’ the parse-annotations of the corpus, producing either a simple or probabilistic context-free grammar. Such grammars, however, can be very large, presenting problems for the subsequent computational costs of parsing under the grammar.
In this paper, we explore ways by which a treebank grammar can be reduced in size or ‘compacted’, which involve the use of two kinds of technique: (i) thresholding of rules by their number of occurrences; and (ii) a method of rule-parsing, which has both probabilistic and non-probabilistic variants. Our results show that by a combined use of these two techniques, a probabilistic context-free grammar can be reduced in size by 62% without any loss in parsing performance, and by 71% to give a gain in recall, but some loss in precision
An Alternative Conception of Tree-Adjoining Derivation
The precise formulation of derivation for tree-adjoining grammars has
important ramifications for a wide variety of uses of the formalism, from
syntactic analysis to semantic interpretation and statistical language
modeling. We argue that the definition of tree-adjoining derivation must be
reformulated in order to manifest the proper linguistic dependencies in
derivations. The particular proposal is both precisely characterizable through
a definition of TAG derivations as equivalence classes of ordered derivation
trees, and computationally operational, by virtue of a compilation to linear
indexed grammars together with an efficient algorithm for recognition and
parsing according to the compiled grammar.Comment: 33 page
- …