630 research outputs found

    Probabilistic parsing

    Get PDF
    Postprin

    Extended macro grammars and stack controlled machines

    Get PDF
    K-extended basic macro grammars are introduced, where K is any class of languages. The class B(K) of languages generated by such grammars is investigated, together with the class LB(K) of languages generated by the corresponding linear basic grammars. For any full semi-AFL K, B(K) is a full AFL closed under iterated LB(K)-substitution, but not necessarily under substitution. For any machine type D, the stack controlled machine type corresponding to D is introduced, denoted S(D), and the checking-stack controlled machine type CS(D). The data structure of this machine is a stack which controls a pushdown of data structures from D. If D accepts K, then S(D) accepts B(K) and CS(D) accepts LB(K). Thus the classes B(K) are characterized by stack controlled machines and the classes LB(K), i.e., the full hyper-AFLs, by checking-stack controlled machines. A full basic-AFL is a full AFL K such that B(K)C K. Every full basic-AFL is a full hyper-AFL, but not vice versa. The class of OI macro languages (i.e., indexed languages, i.e., nested stack automaton languages) is a full basic-AFL, properly containing the smallest full basic-AFL. The latter is generated by the ultrabasic macro grammars and accepted by the nested stack automata with bounded depth of nesting (and properly contains the stack languages, the ETOL languages, i.e., the smallest full hyper-AFL, and the basic macro languages). The full basic-AFLs are characterized by bounded nested stack controlled machines

    A Note on the Complexity of Restricted Attribute-Value Grammars

    Full text link
    The recognition problem for attribute-value grammars (AVGs) was shown to be undecidable by Johnson in 1988. Therefore, the general form of AVGs is of no practical use. In this paper we study a very restricted form of AVG, for which the recognition problem is decidable (though still NP-complete), the R-AVG. We show that the R-AVG formalism captures all of the context free languages and more, and introduce a variation on the so-called `off-line parsability constraint', the `honest parsability constraint', which lets different types of R-AVG coincide precisely with well-known time complexity classes.Comment: 18 pages, also available by (1) anonymous ftp at ftp://ftp.fwi.uva.nl/pub/theory/illc/researchReports/CT-95-02.ps.gz ; (2) WWW from http://www.fwi.uva.nl/~mtrautwe

    Cooperating Distributed Grammar Systems of Finite Index Working in Hybrid Modes

    Full text link
    We study cooperating distributed grammar systems working in hybrid modes in connection with the finite index restriction in two different ways: firstly, we investigate cooperating distributed grammar systems working in hybrid modes which characterize programmed grammars with the finite index restriction; looking at the number of components of such systems, we obtain surprisingly rich lattice structures for the inclusion relations between the corresponding language families. Secondly, we impose the finite index restriction on cooperating distributed grammar systems working in hybrid modes themselves, which leads us to new characterizations of programmed grammars of finite index.Comment: In Proceedings AFL 2014, arXiv:1405.527

    Computation of distances for regular and context-free probabilistic languages

    Get PDF
    Several mathematical distances between probabilistic languages have been investigated in the literature, motivated by applications in language modeling, computational biology, syntactic pattern matching and machine learning. In most cases, only pairs of probabilistic regular languages were considered. In this paper we extend the previous results to pairs of languages generated by a probabilistic context-free grammar and a probabilistic finite automaton.PostprintPeer reviewe

    Generalizing input-driven languages: theoretical and practical benefits

    Get PDF
    Regular languages (RL) are the simplest family in Chomsky's hierarchy. Thanks to their simplicity they enjoy various nice algebraic and logic properties that have been successfully exploited in many application fields. Practically all of their related problems are decidable, so that they support automatic verification algorithms. Also, they can be recognized in real-time. Context-free languages (CFL) are another major family well-suited to formalize programming, natural, and many other classes of languages; their increased generative power w.r.t. RL, however, causes the loss of several closure properties and of the decidability of important problems; furthermore they need complex parsing algorithms. Thus, various subclasses thereof have been defined with different goals, spanning from efficient, deterministic parsing to closure properties, logic characterization and automatic verification techniques. Among CFL subclasses, so-called structured ones, i.e., those where the typical tree-structure is visible in the sentences, exhibit many of the algebraic and logic properties of RL, whereas deterministic CFL have been thoroughly exploited in compiler construction and other application fields. After surveying and comparing the main properties of those various language families, we go back to operator precedence languages (OPL), an old family through which R. Floyd pioneered deterministic parsing, and we show that they offer unexpected properties in two fields so far investigated in totally independent ways: they enable parsing parallelization in a more effective way than traditional sequential parsers, and exhibit the same algebraic and logic properties so far obtained only for less expressive language families

    Multiple Context-Free Tree Grammars: Lexicalization and Characterization

    Get PDF
    Multiple (simple) context-free tree grammars are investigated, where "simple" means "linear and nondeleting". Every multiple context-free tree grammar that is finitely ambiguous can be lexicalized; i.e., it can be transformed into an equivalent one (generating the same tree language) in which each rule of the grammar contains a lexical symbol. Due to this transformation, the rank of the nonterminals increases at most by 1, and the multiplicity (or fan-out) of the grammar increases at most by the maximal rank of the lexical symbols; in particular, the multiplicity does not increase when all lexical symbols have rank 0. Multiple context-free tree grammars have the same tree generating power as multi-component tree adjoining grammars (provided the latter can use a root-marker). Moreover, every multi-component tree adjoining grammar that is finitely ambiguous can be lexicalized. Multiple context-free tree grammars have the same string generating power as multiple context-free (string) grammars and polynomial time parsing algorithms. A tree language can be generated by a multiple context-free tree grammar if and only if it is the image of a regular tree language under a deterministic finite-copying macro tree transducer. Multiple context-free tree grammars can be used as a synchronous translation device.Comment: 78 pages, 13 figure
    corecore