    Multiple Context-Free Tree Grammars: Lexicalization and Characterization

    Multiple (simple) context-free tree grammars are investigated, where "simple" means "linear and nondeleting". Every multiple context-free tree grammar that is finitely ambiguous can be lexicalized; i.e., it can be transformed into an equivalent one (generating the same tree language) in which each rule of the grammar contains a lexical symbol. Due to this transformation, the rank of the nonterminals increases at most by 1, and the multiplicity (or fan-out) of the grammar increases at most by the maximal rank of the lexical symbols; in particular, the multiplicity does not increase when all lexical symbols have rank 0. Multiple context-free tree grammars have the same tree generating power as multi-component tree adjoining grammars (provided the latter can use a root-marker). Moreover, every multi-component tree adjoining grammar that is finitely ambiguous can be lexicalized. Multiple context-free tree grammars have the same string generating power as multiple context-free (string) grammars and have polynomial-time parsing algorithms. A tree language can be generated by a multiple context-free tree grammar if and only if it is the image of a regular tree language under a deterministic finite-copying macro tree transducer. Multiple context-free tree grammars can be used as a synchronous translation device. Comment: 78 pages, 13 figures
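To make the "linear and nondeleting" condition concrete, here is a minimal sketch that checks it for a single right-hand-side tree. The encoding is an assumption made for illustration and is not taken from the paper: an inner node is a tuple `(label, child, ...)`, a leaf is a string, and the names `variable_counts` and `is_simple` are hypothetical. A real multiple context-free tree grammar rule has a tuple of trees on each side; this sketch simplifies that to one tree.

```python
from collections import Counter


def variable_counts(tree, params):
    """Count how often each parameter occurs in a tree of nested tuples."""
    counts = Counter()
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):
            # node[0] is the symbol label; the rest are subtrees.
            stack.extend(node[1:])
        elif node in params:
            counts[node] += 1
    return counts


def is_simple(params, rhs):
    """True if every parameter occurs exactly once in rhs.

    Occurring at most once is linearity (no copying);
    occurring at least once is the nondeleting condition.
    """
    counts = variable_counts(rhs, set(params))
    return all(counts[p] == 1 for p in params)
```

Under this encoding, `is_simple(["x1", "x2"], ("f", "x1", ("g", "x2")))` holds, while `("f", "x1", "x1")` fails because it copies `x1` and deletes `x2`.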

    Colored operads, series on colored operads, and combinatorial generating systems

    We introduce bud generating systems, which are used for combinatorial generation. They specify sets of various kinds of combinatorial objects, called languages. They can emulate context-free grammars, regular tree grammars, and synchronous grammars, allowing us to work with all these generating systems in a unified way. The theory of bud generating systems uses colored operads. Indeed, an object is generated by a bud generating system if it satisfies a certain equation in a colored operad. To compute the generating series of the languages of bud generating systems, we introduce formal power series on colored operads and several operations on these. Series on colored operads are crucial to express the languages specified by bud generating systems and allow us to enumerate combinatorial objects with respect to some statistics. Some examples of bud generating systems are constructed; in particular to specify some sorts of balanced trees and to obtain recursive formulas enumerating these. Comment: 48 pages

    An Alternative Conception of Tree-Adjoining Derivation

    The precise formulation of derivation for tree-adjoining grammars has important ramifications for a wide variety of uses of the formalism, from syntactic analysis to semantic interpretation and statistical language modeling. We argue that the definition of tree-adjoining derivation must be reformulated in order to manifest the proper linguistic dependencies in derivations. The particular proposal is both precisely characterizable through a definition of TAG derivations as equivalence classes of ordered derivation trees, and computationally operational, by virtue of a compilation to linear indexed grammars together with an efficient algorithm for recognition and parsing according to the compiled grammar. Comment: 33 pages

    Lexicalization and Grammar Development

    In this paper we present a fully lexicalized grammar formalism as a particularly attractive framework for the specification of natural language grammars. We discuss in detail Feature-based, Lexicalized Tree Adjoining Grammars (FB-LTAGs), a representative of the class of lexicalized grammars. We illustrate the advantages of lexicalized grammars in various contexts of natural language processing, ranging from wide-coverage grammar development to parsing and machine translation. We also present a method for compact and efficient representation of lexicalized trees. Comment: ps file. English w/ German abstract. 10 pages

    Separating Dependency from Constituency in a Tree Rewriting System

    In this paper we present a new tree-rewriting formalism called Link-Sharing Tree Adjoining Grammar (LSTAG) which is a variant of synchronous TAGs. Using LSTAG we define an approach towards coordination where linguistic dependency is distinguished from the notion of constituency. Such an approach towards coordination that explicitly distinguishes dependencies from constituency gives a better formal understanding of its representation when compared to previous approaches that use tree-rewriting systems which conflate the two issues. Comment: 7 pages, 6 Postscript figures, uses fullname.sty

    Restricting the Weak-Generative Capacity of Synchronous Tree-Adjoining Grammars

    The formalism of synchronous tree-adjoining grammars, a variant of standard tree-adjoining grammars (TAG), was intended to allow the use of TAGs for language transduction in addition to language specification. In previous work, the definition of the transduction relation defined by a synchronous TAG was given by appeal to an iterative rewriting process. The rewriting definition of derivation is problematic in that it greatly extends the expressivity of the formalism and makes the design of parsing algorithms difficult if not impossible. We introduce a simple, natural definition of synchronous tree-adjoining derivation, based on isomorphisms between standard tree-adjoining derivations, that avoids the expressivity and implementability problems of the original rewriting definition. The decrease in expressivity, which would otherwise make the method unusable, is offset by the incorporation of an alternative definition of standard tree-adjoining derivation, previously proposed for completely separate reasons, thereby making it practical to entertain using the natural definition of synchronous derivation. Nonetheless, some remaining problematic cases call for yet more flexibility in the definition; the isomorphism requirement may have to be relaxed. It remains for future research to tune the exact requirements on the allowable mappings. Comment: 21 pages, uses lingmacros.sty, psfig.sty, fullname.sty; minor typographical changes only

    Synchronous Context-Free Grammars and Optimal Linear Parsing Strategies

    Synchronous Context-Free Grammars (SCFGs), also known as syntax-directed translation schemata, are unlike context-free grammars in that they do not have a binary normal form. In general, parsing with SCFGs takes space and time polynomial in the length of the input strings, but with the degree of the polynomial depending on the permutations of the SCFG rules. We consider linear parsing strategies, which add one nonterminal at a time. We show that for a given input permutation, the problems of finding the linear parsing strategy with the minimum space and time complexity are both NP-hard.
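The remark that SCFGs lack a binary normal form can be illustrated on a single rule. The sketch below is not the paper's linear parsing strategy. It is a standard shift-reduce style check of whether the permutation relating a rule's two right-hand sides can be built from binary steps. The function name `is_binarizable` and the encoding of a rule as a permutation of `1..n` are assumptions made for illustration.

```python
def is_binarizable(perm):
    """Check whether a permutation of 1..n can be reduced to one span
    by repeatedly merging two neighbouring spans whose values form a
    contiguous range (in either order)."""
    stack = []  # each entry is a (lowest value, highest value) span
    for x in perm:
        stack.append((x, x))
        # Merge the top two spans for as long as they are contiguous.
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1
```

For example, `is_binarizable([3, 1, 2])` holds, while `[2, 4, 1, 3]` and `[3, 1, 4, 2]` leave four unmerged spans on the stack. Rules with such permutations are the ones that force a higher parsing degree.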