1,515 research outputs found

    Efficient Tabular LR Parsing

    Get PDF
    We give a new treatment of tabular LR parsing, which is an alternative to Tomita's generalized LR algorithm. The advantage is twofold. Firstly, our treatment is conceptually more attractive because it uses simpler concepts, such as grammar transformations and standard tabulation techniques also know as chart parsing. Secondly, the static and dynamic complexity of parsing, both in space and time, is significantly reduced.Comment: 8 pages, uses aclap.st

    Tabular Parsing

    Full text link
    This is a tutorial on tabular parsing, on the basis of tabulation of nondeterministic push-down automata. Discussed are Earley's algorithm, the Cocke-Kasami-Younger algorithm, tabular LR parsing, the construction of parse trees, and further issues.Comment: 21 pages, 14 figure

    Probabilistic Parsing Strategies

    Full text link
    We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counter-parts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results we also derive negative results on so-called generalized LR parsing.Comment: 36 pages, 1 figur

    Practical experiments with regular approximation of context-free languages

    Get PDF
    Several methods are discussed that construct a finite automaton given a context-free grammar, including both methods that lead to subsets and those that lead to supersets of the original context-free language. Some of these methods of regular approximation are new, and some others are presented here in a more refined form with respect to existing literature. Practical experiments with the different methods of regular approximation are performed for spoken-language input: hypotheses from a speech recognizer are filtered through a finite automaton.Comment: 28 pages. To appear in Computational Linguistics 26(1), March 200

    Simple chain grammars and languages

    Get PDF
    A subclass of the LR(0)-grammars, the class of simple chain grammars is introduced. Although there exist simple chain grammars which are not LL(k) for any k>0, this new class of grammars is very closely related to the LL(1) and simple LL(1) grammars. In fact it can be shown that every simple chain grammar has an equivalent simple LL(1) grammar. Cover properties for simple chain grammars are investigated and a deterministic pushdown transducer which acts as a right parser for simple chain grammars is presented

    Generalizing input-driven languages: theoretical and practical benefits

    Get PDF
    Regular languages (RL) are the simplest family in Chomsky's hierarchy. Thanks to their simplicity they enjoy various nice algebraic and logic properties that have been successfully exploited in many application fields. Practically all of their related problems are decidable, so that they support automatic verification algorithms. Also, they can be recognized in real-time. Context-free languages (CFL) are another major family well-suited to formalize programming, natural, and many other classes of languages; their increased generative power w.r.t. RL, however, causes the loss of several closure properties and of the decidability of important problems; furthermore they need complex parsing algorithms. Thus, various subclasses thereof have been defined with different goals, spanning from efficient, deterministic parsing to closure properties, logic characterization and automatic verification techniques. Among CFL subclasses, so-called structured ones, i.e., those where the typical tree-structure is visible in the sentences, exhibit many of the algebraic and logic properties of RL, whereas deterministic CFL have been thoroughly exploited in compiler construction and other application fields. After surveying and comparing the main properties of those various language families, we go back to operator precedence languages (OPL), an old family through which R. Floyd pioneered deterministic parsing, and we show that they offer unexpected properties in two fields so far investigated in totally independent ways: they enable parsing parallelization in a more effective way than traditional sequential parsers, and exhibit the same algebraic and logic properties so far obtained only for less expressive language families

    Probabilistic parsing

    Get PDF
    Postprin

    Precedence Automata and Languages

    Full text link
    Operator precedence grammars define a classical Boolean and deterministic context-free family (called Floyd languages or FLs). FLs have been shown to strictly include the well-known visibly pushdown languages, and enjoy the same nice closure properties. We introduce here Floyd automata, an equivalent operational formalism for defining FLs. This also permits to extend the class to deal with infinite strings to perform for instance model checking.Comment: Extended version of the paper which appeared in Proceedings of CSR 2011, Lecture Notes in Computer Science, vol. 6651, pp. 291-304, 2011. Theorem 1 has been corrected and a complete proof is given in Appendi
    corecore