4,081 research outputs found

    A Bibliography on Fuzzy Automata, Grammars and Languages

    This bibliography contains references to papers on fuzzy formal languages, the generation of fuzzy languages by means of fuzzy grammars, the recognition of fuzzy languages by fuzzy automata and machines, as well as some applications of fuzzy set theory to syntactic pattern recognition, linguistics and natural language processing.

    Probabilistic regular graphs

    Deterministic graph grammars generate regular graphs, which form a structural extension of the configuration graphs of pushdown systems. In this paper, we study a probabilistic extension of regular graphs obtained by labelling the terminal arcs of the graph grammars with probabilities. Stochastic properties of these graphs are expressed using PCTL, a probabilistic extension of computation tree logic. We present an algorithm to perform approximate verification of PCTL formulae. Moreover, we prove that the exact model-checking problem for PCTL on probabilistic regular graphs is undecidable, unless restricted to qualitative properties. Our results generalise those of [EKM06] on probabilistic pushdown automata, using similar methods combined with graph-grammar techniques.
    Comment: In Proceedings INFINITY 2010, arXiv:1010.611
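    A toy illustration of the core subroutine behind such verification (not the paper's algorithm, which handles the infinite graphs generated by graph grammars): checking a bounded-reachability PCTL property on a finite Markov chain by value iteration. The chain, the `goal` state, and the bound are invented for the example.

```python
# Hypothetical finite Markov chain; state -> list of (successor, probability).
chain = {
    "s0": [("s1", 0.5), ("s2", 0.5)],
    "s1": [("goal", 0.8), ("s0", 0.2)],
    "s2": [("s2", 1.0)],          # sink that never reaches goal
    "goal": [("goal", 1.0)],
}

def bounded_reach(chain, target, k):
    """P(reach `target` within k steps), for every start state."""
    prob = {s: 1.0 if s == target else 0.0 for s in chain}
    for _ in range(k):
        prob = {s: 1.0 if s == target
                else sum(p * prob[t] for t, p in chain[s])
                for s in chain}
    return prob

# Approximate check of P_{>=0.9}[F<=20 goal] from state s0.
print(bounded_reach(chain, "goal", 20)["s0"] >= 0.9)
```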

    Calibrating Generative Models: The Probabilistic Chomsky-Schützenberger Hierarchy

    A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case where the "semi-linear" languages all collapse into the regular languages, using analytic tools adapted from the classical setting we show there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.
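    For a concrete feel for "distributions on unary notations," here is a minimal sketch (my example, not from the paper) of a probabilistic regular grammar S -> "1" S | "1" whose two rules carry probability 0.5 each; it assigns the geometric distribution P(n) = 0.5**n to the integer n written in unary.

```python
import random

P_CONTINUE = 0.5  # probability of the rule S -> "1" S (assumed value)

def sample_unary():
    """Sample a unary numeral by expanding S left to right."""
    s = "1"
    while random.random() < P_CONTINUE:  # chose S -> "1" S again
        s += "1"
    return s                             # ended with S -> "1"

def mass(n):
    """Exact probability the grammar assigns to the numeral '1' * n."""
    return P_CONTINUE ** (n - 1) * (1 - P_CONTINUE)

print(sample_unary())   # e.g. '111'
print(mass(3))          # 0.125, i.e. 0.5**3
```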

    Computation of distances for regular and context-free probabilistic languages

    Several mathematical distances between probabilistic languages have been investigated in the literature, motivated by applications in language modeling, computational biology, syntactic pattern matching and machine learning. In most cases, only pairs of probabilistic regular languages were considered. In this paper we extend the previous results to pairs of languages generated by a probabilistic context-free grammar and a probabilistic finite automaton.
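    As a hedged illustration of what such a distance means (brute force, not the paper's exact algorithms): for two probabilistic languages one can lower-bound the L1 distance by enumerating short strings and comparing their masses. The one-state automata below, with invented stop probabilities, assign p(a^n) = (1-q)^n * q.

```python
def p(n, q):
    """Mass a one-state probabilistic automaton with stop prob q gives a^n."""
    return (1 - q) ** n * q

def l1_lower_bound(q1, q2, max_len=50):
    """Truncated L1 distance between the two induced string distributions."""
    return sum(abs(p(n, q1) - p(n, q2)) for n in range(max_len + 1))

print(l1_lower_bound(0.5, 0.3))  # approaches the true L1 distance as max_len grows
```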

    Probabilistic Parsing Strategies

    We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counterparts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results we also derive negative results on so-called generalized LR parsing.
    Comment: 36 pages, 1 figure
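    To make "preservation of probability distribution" concrete, here is a small sketch (my construction, not taken from the paper) of a top-down pushdown simulation of a PCFG: every predict step multiplies in the rule's probability, so the device assigns each derivation exactly the probability the grammar does. The grammar itself is invented.

```python
RULES = {  # nonterminal -> [(probability, right-hand side)]
    "S": [(0.6, ["a", "S"]), (0.4, ["b"])],
}

def string_prob(stack, rest):
    """Total probability that the pushdown stack derives exactly `rest`."""
    if not stack:
        return 1.0 if not rest else 0.0
    top, below = stack[0], stack[1:]
    if top in RULES:                       # predict: expand the nonterminal
        return sum(p * string_prob(rhs + below, rest)
                   for p, rhs in RULES[top])
    if rest and rest[0] == top:            # match: consume one terminal
        return string_prob(below, rest[1:])
    return 0.0

print(string_prob(["S"], list("aab")))     # 0.6 * 0.6 * 0.4 -> 0.144
```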

    Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning

    Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply to both the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk. Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotated data).
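    A minimal sketch of the supervised case only (the tiny treebank below is invented): the empirical-risk-minimizing / maximum-likelihood rule probabilities of a PCFG are just relative frequencies among rules sharing a left-hand side. The unsupervised case, as the abstract notes, is NP-hard and calls for an EM-like approximation instead.

```python
from collections import Counter, defaultdict

treebank_rules = [               # rules read off observed parse trees
    ("S", ("NP", "VP")),
    ("S", ("NP", "VP")),
    ("S", ("VP",)),
    ("NP", ("det", "noun")),
    ("NP", ("noun",)),
]

counts = Counter(treebank_rules)
lhs_totals = defaultdict(int)    # total rule count per left-hand side
for (lhs, _), c in counts.items():
    lhs_totals[lhs] += c

# Relative-frequency (maximum-likelihood) estimates per rule.
probs = {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}
for rule, prob in sorted(probs.items()):
    print(rule, round(prob, 3))  # e.g. ('S', ('NP', 'VP')) -> 0.667
```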

    Probabilistic parsing
