
    Analysis-oriented two-level grammars


    Computational Analyses of Arabic Morphology

    This paper demonstrates how a (multi-tape) two-level formalism can be used to write two-level grammars for Arabic non-linear morphology using a high-level, but computationally tractable, notation. Three illustrative grammars are provided, based on CV-, moraic- and affixational analyses. These are complemented by a proposal for handling the hitherto computationally untreated problem of the broken plural. It is shown that the best grammars for describing Arabic non-linear morphology are moraic in the case of templatic stems and affixational in the case of a-templatic stems. The paper also demonstrates how the broken plural can be derived under two-level theory via the 'implicit' derivation of the singular. (To appear in Narayanan A., Ditters E. (eds), The Linguistic Computation of Arabic.)
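    The templatic (root-and-pattern) analysis mentioned in the abstract can be caricatured with a toy interdigitation routine: root consonants, a vocalism and a CV template are kept apart, as on separate tapes, and combined to form a surface stem. The function name `interdigitate`, the templates and the example vocalisms below are illustrative assumptions; this is only a sketch of the general idea, not the paper's multi-tape two-level notation.

    ```python
    # Toy sketch of templatic (root-and-pattern) stem formation.
    # Illustrative only; not the paper's two-level rule formalism.

    def interdigitate(template, root, vocalism):
        """Fill a CV template from a consonantal root and a vowel melody."""
        root_it = iter(root)
        vowel_it = iter(vocalism)
        surface = []
        for slot in template:
            if slot == "C":
                surface.append(next(root_it))
            elif slot == "V":
                surface.append(next(vowel_it))
            else:
                raise ValueError(f"unknown template slot: {slot}")
        return "".join(surface)

    if __name__ == "__main__":
        # root ktb 'write' + vocalism a-a over CVCVC -> 'katab'
        print(interdigitate("CVCVC", "ktb", "aa"))    # katab
        # root ktb + vocalism i-a-a over CVCVVC -> 'kitaab' ('book')
        print(interdigitate("CVCVVC", "ktb", "iaa"))  # kitaab
    ```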

    Two-level grammars: Some interesting properties of van Wijngaarden grammars.

    The van Wijngaarden grammars are two-level grammars that exhibit many interesting properties. In the present article I elaborate on six of these properties, to wit, (i) their being constituted by two grammars, (ii) their ability to generate (possibly infinitely many) strict languages and their own metalanguage, (iii) their context-sensitivity, (iv) their high descriptive power, (v) their productivity, or the ability to generate an infinite number of production rules, and (vi) their equivalence with the unrestricted, or Type-0, Chomsky grammars.
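    The productivity property (v) can be made concrete with a minimal sketch: a metanotion N with metaproductions N -> "n" | N "n" stands for infinitely many tallies, and consistently substituting one tally into a single hyperrule yields an unbounded family of ordinary production rules, here generating the context-sensitive language a^n b^n c^n. The helper names (`tallies`, `instantiate`, `realize`) and the simplified rule notation are assumptions for illustration, not a faithful W-grammar engine.

    ```python
    # Minimal sketch of the two-level idea: one hyperrule plus a metanotion N
    # stands for infinitely many production rules. Illustrative only.

    def tallies(limit):
        """Metalevel: enumerate terminal metaproductions of N ('n', 'nn', ...)."""
        for k in range(1, limit + 1):
            yield "n" * k

    HYPERRULE = ("s", ["N as", "N bs", "N cs"])  # hyperrule: s : N as, N bs, N cs.

    def instantiate(hyperrule, tally):
        """Consistent substitution: replace every occurrence of N by the same tally."""
        lhs, rhs = hyperrule
        return lhs, [part.replace("N", tally) for part in rhs]

    def realize(nonterminal):
        """Interpret 'nnn as' as 'aaa', collapsing the tally into repetitions."""
        tally, letters = nonterminal.split()
        return letters[0] * len(tally)

    if __name__ == "__main__":
        for tally in tallies(3):
            lhs, rhs = instantiate(HYPERRULE, tally)
            print(f"{lhs} : {', '.join(rhs)}.  ->  {''.join(realize(p) for p in rhs)}")
        # s : n as, n bs, n cs.        ->  abc
        # s : nn as, nn bs, nn cs.     ->  aabbcc
        # s : nnn as, nnn bs, nnn cs.  ->  aaabbbccc
    ```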


    Toric grammars: a new statistical approach to natural language modeling

    We propose a new statistical model for computational linguistics. Rather than trying to estimate directly the probability distribution of a random sentence of the language, we define a Markov chain on finite sets of sentences with many finite recurrent communicating classes, and define our language model as the invariant probability measures of the chain on each recurrent communicating class. This Markov chain, which we call a communication model, randomly recombines at each step the set of sentences forming its current state, using some grammar rules. When the grammar rules are fixed and known in advance instead of being estimated on the fly, we can prove supplementary mathematical properties. In particular, we can prove in this case that all states are recurrent, so that the chain defines a partition of its state space into finite recurrent communicating classes. We show that our approach is a decisive departure from Markov models at the sentence level and discuss its relationship with Context Free Grammars. Although the toric grammars we use are closely related to Context Free Grammars, the way we generate the language from the grammar is qualitatively different. Our communication model has two purposes. On the one hand, it is used to define indirectly the probability distribution of a random sentence of the language. On the other hand, it can serve as a (crude) model of language transmission from one speaker to another through the communication of a (large) set of sentences.
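    The recombination step of such a communication model can be caricatured in a few lines: treat the state as a finite set of sentences and, at each step, exchange the suffixes of two sentences after a word they share. This is only a hedged illustration of "recombining sentences using grammar rules"; the function `recombine_step` and the shared-word heuristic are invented for the sketch and do not implement the paper's toric-grammar rules or its recurrence analysis.

    ```python
    import random

    # Toy caricature of one 'communication model' step: the state is a finite
    # set of sentences, and a step recombines two of them by swapping the
    # material following a word they have in common. Illustrative only.

    def recombine_step(sentences, rng=random):
        """Pick two sentences sharing a word and exchange their suffixes after it."""
        s1, s2 = rng.sample(sentences, 2)
        w1, w2 = s1.split(), s2.split()
        common = [w for w in w1 if w in w2]
        if not common:
            return sentences  # nothing to recombine; state unchanged
        pivot = rng.choice(common)
        i, j = w1.index(pivot), w2.index(pivot)
        new1 = " ".join(w1[: i + 1] + w2[j + 1 :])
        new2 = " ".join(w2[: j + 1] + w1[i + 1 :])
        return [s for s in sentences if s not in (s1, s2)] + [new1, new2]

    if __name__ == "__main__":
        rng = random.Random(0)
        state = [
            "the cat chased the mouse",
            "the dog chased the ball",
            "a child watched the dog",
        ]
        for _ in range(3):
            state = recombine_step(state, rng)
        print(state)
    ```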