2,165 research outputs found

    Calibrating Generative Models: The Probabilistic Chomsky-SchĂŒtzenberger Hierarchy

    Get PDF
    A probabilistic Chomsky–SchĂŒtzenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case where the "semi-linear" languages all collapse into the regular languages, using analytic tools adapted from the classical setting we show there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning

    Probabilistic Parsing Strategies

    Full text link
    We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counter-parts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results we also derive negative results on so-called generalized LR parsing.Comment: 36 pages, 1 figur

    Computation of distances for regular and context-free probabilistic languages

    Get PDF
    Several mathematical distances between probabilistic languages have been investigated in the literature, motivated by applications in language modeling, computational biology, syntactic pattern matching and machine learning. In most cases, only pairs of probabilistic regular languages were considered. In this paper we extend the previous results to pairs of languages generated by a probabilistic context-free grammar and a probabilistic finite automaton.PostprintPeer reviewe

    Probabilistic parsing

    Get PDF
    Postprin

    DNA ANALYSIS USING GRAMMATICAL INFERENCE

    Get PDF
    An accurate language definition capable of distinguishing between coding and non-coding DNA has important applications and analytical significance to the field of computational biology. The method proposed here uses positive sample grammatical inference and statistical information to infer languages for coding DNA. An algorithm is proposed for the searching of an optimal subset of input sequences for the inference of regular grammars by optimizing a relevant accuracy metric. The algorithm does not guarantee the finding of the optimal subset; however, testing shows improvement in accuracy and performance over the basis algorithm. Testing shows that the accuracy of inferred languages for components of DNA are consistently accurate. By using the proposed algorithm languages are inferred for coding DNA with average conditional probability over 80%. This reveals that languages for components of DNA can be inferred and are useful independent of the process that created them. These languages can then be analyzed or used for other tasks in computational biology. To illustrate potential applications of regular grammars for DNA components, an inferred language for exon sequences is applied as post processing to Hidden Markov exon prediction to reduce the number of wrong exons detected and improve the specificity of the model significantly

    Decidable and undecidable problems about quantum automata

    Get PDF
    We study the following decision problem: is the language recognized by a quantum finite automaton empty or non-empty? We prove that this problem is decidable or undecidable depending on whether recognition is defined by strict or non-strict thresholds. This result is in contrast with the corresponding situation for probabilistic finite automata for which it is known that strict and non-strict thresholds both lead to undecidable problems.Comment: 10 page
    • 

    corecore