
    Rule-restricted Automaton-grammar transducers: Power and Linguistic Applications

    This paper introduces a new transducer as a two-component system consisting of a finite automaton and a context-free grammar. In essence, while the automaton reads its input string, the grammar produces its output string, and their cooperation is controlled by a set that restricts the usage of their rules. From a theoretical viewpoint, the paper discusses the power of this system working in an ordinary way as well as in a leftmost way. In addition, it introduces appearance checking, which makes it possible to check whether certain symbols are present in the rewritten string, and studies its effect on the power. The paper achieves three main results. First, the system generates and accepts languages defined by matrix grammars and partially blind multi-counter automata, respectively. Second, if a leftmost restriction is placed on derivations in the context-free grammar, both the accepting and the generating power of the system equal the generative power of context-free grammars. Third, the system with appearance checking can accept and generate all recursively enumerable languages. From a more pragmatic viewpoint, the paper describes several linguistic applications, paying special attention to Japanese-Czech translation.
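The cooperation described in this abstract can be illustrated with a minimal sketch. It is not the paper's formal definition: the rule encoding, the names `translate` and `CONTROL`, the bounded breadth-first search, and the toy rules are all assumptions made for illustration. Each step applies one automaton rule and one grammar rule together, and only if that pair belongs to the control set.

```python
from collections import deque

# Hypothetical toy system: the automaton reads a^n while the grammar emits b^n c^n.
# Automaton rules are (state, input symbol or "", next state);
# grammar rules are (nonterminal, right-hand side).
READ_A = ("s", "a", "s")
FINISH = ("s", "", "f")
GROW = ("S", "bSc")
ERASE = ("S", "")

# The control set lists the rule pairs that may be applied together.
CONTROL = {(READ_A, GROW), (FINISH, ERASE)}


def translate(input_str, control, start_state="s", final_states=("f",),
              start_symbol="S", nonterminals=("S",), max_steps=20):
    """Return all outputs found for input_str within max_steps joint steps."""
    start = (start_state, 0, start_symbol)
    seen = {start}
    queue = deque([(start, 0)])
    results = set()
    while queue:
        (state, pos, form), steps = queue.popleft()
        done_reading = pos == len(input_str)
        all_terminals = not any(sym in nonterminals for sym in form)
        if state in final_states and done_reading and all_terminals:
            results.add(form)
        if steps == max_steps:
            continue
        for (p, a, q), (lhs, rhs) in control:
            # The automaton rule must match the current state and next input.
            if p != state or not input_str.startswith(a, pos):
                continue
            # The grammar rule may rewrite any occurrence of its left-hand side.
            for i, sym in enumerate(form):
                if sym == lhs:
                    nxt = (q, pos + len(a), form[:i] + rhs + form[i + 1:])
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, steps + 1))
    return results


print(translate("aa", CONTROL))
```

In this toy setting, restricting the inner loop to the first nonterminal occurrence would correspond to the leftmost mode mentioned in the abstract.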

    D-Tree Grammars

    DTG are designed to share some of the advantages of TAG while overcoming some of its limitations. DTG involve two composition operations called subsertion and sister-adjunction. The most distinctive feature of DTG is that, unlike TAG, there is complete uniformity in the way that the two DTG operations relate lexical items: subsertion always corresponds to complementation and sister-adjunction to modification. Furthermore, DTG, unlike TAG, can provide a uniform analysis for wh-movement in English and Kashmiri, despite the fact that the wh element in Kashmiri appears in sentence-second position, and not sentence-initial position as in English. Comment: LaTeX source, needs aclap.sty, 8 pages, to appear in ACL-9

    Canonical Derivations in Programmed Grammars

    This work studies canonical derivations (with a focus on leftmost derivations) in programmed grammars and the range of the left restriction. It is shown that if n-limited derivations are introduced in programmed grammars as they were defined for state grammars, we obtain an infinite hierarchy of language families resulting from n-limited programmed grammars, so the range of the left restriction affects the generative power of n-limited programmed grammars. This result is significant for syntactic analysis based on programmed grammars.

    Structure preserving transformations on non-left-recursive grammars

    We will be concerned with grammar covers. The first part of this paper presents a general framework for covers. The second part introduces a transformation from non-left-recursive grammars to grammars in Greibach normal form. An investigation of the structure-preserving properties of this transformation, which also serves as an illustration of our framework for covers, is presented.

    Data-Oriented Language Processing. An Overview

    During the last few years, a new approach to language processing has started to emerge, which has become known under various labels such as "data-oriented parsing", "corpus-based interpretation", and "tree-bank grammar" (cf. van den Berg et al. 1994; Bod 1992-96; Bod et al. 1996a/b; Bonnema 1996; Charniak 1996a/b; Goodman 1996; Kaplan 1996; Rajman 1995a/b; Scha 1990-92; Sekine & Grishman 1995; Sima'an et al. 1994; Sima'an 1995-96; Tugwell 1995). This approach, which we will call "data-oriented processing" or "DOP", embodies the assumption that human language perception and production work with representations of concrete past language experiences, rather than with abstract linguistic rules. The models that instantiate this approach therefore maintain large corpora of linguistic representations of previously occurring utterances. When processing a new input utterance, analyses of this utterance are constructed by combining fragments from the corpus; the occurrence frequencies of the fragments are used to estimate which analysis is the most probable one. In this paper we give an in-depth discussion of a data-oriented processing model which employs a corpus of labelled phrase-structure trees. Then we review some other models that instantiate the DOP approach. Many of these models also employ labelled phrase-structure trees, but use different criteria for extracting fragments from the corpus or employ different disambiguation strategies (Bod 1996b; Charniak 1996a/b; Goodman 1996; Rajman 1995a/b; Sekine & Grishman 1995; Sima'an 1995-96); other models use richer formalisms for their corpus annotations (van den Berg et al. 1994; Bod et al. 1996a/b; Bonnema 1996; Kaplan 1996; Tugwell 1995). Comment: 34 pages, Postscrip
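The abstract does not spell out how occurrence frequencies are turned into probabilities, so the following is only a sketch of one common choice, in the style of the DOP1 model: a fragment's probability is its count divided by the total count of fragments with the same root label, a derivation's probability is the product over its fragments, and an analysis sums over its derivations. The fragment encoding as `(root_label, identifier)` pairs, the function names, and the counts in `COUNTS` are hypothetical.

```python
from math import prod

# Hypothetical fragment counts; each fragment is (root label, identifier).
COUNTS = {
    ("S", "s1"): 1,
    ("S", "s2"): 3,
    ("NP", "np1"): 2,
    ("NP", "np2"): 2,
}


def fragment_probability(fragment, counts):
    """Relative frequency of a fragment among fragments sharing its root label."""
    root = fragment[0]
    same_root_total = sum(n for f, n in counts.items() if f[0] == root)
    return counts[fragment] / same_root_total


def derivation_probability(derivation, counts):
    """Product of the probabilities of the fragments combined in one derivation."""
    return prod(fragment_probability(f, counts) for f in derivation)


def analysis_probability(derivations, counts):
    """Sum over all derivations that yield the same analysis (parse tree)."""
    return sum(derivation_probability(d, counts) for d in derivations)


# Two assumed derivations of the same analysis: one large fragment,
# or a smaller S fragment combined with an NP fragment.
derivations = [
    [("S", "s1")],
    [("S", "s2"), ("NP", "np1")],
]
print(analysis_probability(derivations, COUNTS))
```

Disambiguation would then compare this quantity across the candidate analyses of an input and pick the largest.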