1,514 research outputs found

    Calibrating Generative Models: The Probabilistic Chomsky–Schützenberger Hierarchy

    A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case where the "semi-linear" languages all collapse into the regular languages, using analytic tools adapted from the classical setting we show there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.

    Unsupervised Statistical Learning of Context-free Grammar

    In this paper, we address the problem of inducing a weighted context-free grammar (WCFG) from given data. The induction is performed using a new model of grammatical inference, the weighted Grammar-based Classifier System (wGCS). wGCS derives from learning classifier systems and searches for grammar structure using a genetic algorithm and covering. Rule weights are estimated using a novel Inside-Outside Contrastive Estimation algorithm. The proposed method employs direct negative evidence and learns a WCFG from both positive and negative samples. Experimental results on three synthetic context-free languages show that wGCS is competitive with other statistical methods for unsupervised CFG learning.

    07441 Abstracts Collection -- Algorithmic-Logical Theory of Infinite Structures

    From 28.10. to 02.11.2007, the Dagstuhl Seminar 07441 "Algorithmic-Logical Theory of Infinite Structures" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Series, Weighted Automata, Probabilistic Automata and Probability Distributions for Unranked Trees.

    We study tree series and weighted tree automata over unranked trees. The message is that recognizable tree series for unranked trees can be defined and studied from recognizable tree series for binary representations of unranked trees. For this we prove results of Denis et al. (2007) as follows. We extend hedge automata -- a class of tree automata for unranked trees -- to weighted hedge automata. We define weighted stepwise automata as weighted tree automata for binary representations of unranked trees. We show that recognizable tree series can be equivalently defined by weighted hedge automata or weighted stepwise automata. Then we consider real-valued tree series and weighted tree automata over the field of real numbers. We show that the result also holds for probabilistic automata -- weighted automata with normalisation conditions for rules. We also define convergent tree series and show that convergence properties for recognizable tree series are preserved via binary encoding. From Etessami and Yannakakis (2009), we present decidability results on probabilistic tree automata and algorithms for computing sums of convergent series. Finally, we show that streaming algorithms for unranked trees can be seen as slight transformations of algorithms on the binary representations.

    A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars

    The paper gives a brief review of the expectation-maximization algorithm (Dempster 1977) in the comprehensible framework of discrete mathematics. In Section 2, two prominent estimation methods, the relative-frequency estimation and the maximum-likelihood estimation, are presented. Section 3 is dedicated to the expectation-maximization algorithm and a simpler variant, the generalized expectation-maximization algorithm. In Section 4, two loaded dice are rolled. A more interesting example is presented in Section 5: the estimation of probabilistic context-free grammars. Comment: Presented at the 15th European Summer School in Logic, Language and Information (ESSLLI 2003). Example 5 extended (and partially corrected).
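
    As a rough illustration of the relative-frequency estimation mentioned in the abstract above, the sketch below normalises rule counts per left-hand side to obtain rule probabilities for a probabilistic context-free grammar. It is not taken from the tutorial: the function name `relative_frequency_estimate`, the tuple encoding of rules, and the tiny list of observed rules are all made up for the example. It assumes fully observed parse trees, the setting in which relative-frequency estimation coincides with maximum-likelihood estimation; with unobserved trees, EM would be needed instead.

```python
from collections import Counter, defaultdict


def relative_frequency_estimate(rule_counts):
    """Turn rule counts into probabilities by normalising per left-hand side.

    rule_counts maps (lhs, rhs) pairs to non-negative counts, where rhs is a
    tuple of symbols. This encoding is an assumption made for the sketch.
    """
    lhs_totals = defaultdict(int)
    for (lhs, _rhs), count in rule_counts.items():
        lhs_totals[lhs] += count
    return {
        (lhs, rhs): count / lhs_totals[lhs]
        for (lhs, rhs), count in rule_counts.items()
    }


# Hypothetical rule occurrences read off a small, invented set of parse trees.
observed_rules = [
    ("S", ("NP", "VP")),
    ("S", ("NP", "VP")),
    ("S", ("NP", "VP")),
    ("S", ("NP", "VP")),
    ("NP", ("she",)),
    ("NP", ("she",)),
    ("NP", ("she",)),
    ("NP", ("they",)),
    ("VP", ("runs",)),
    ("VP", ("runs",)),
    ("VP", ("sleeps",)),
    ("VP", ("sleeps",)),
]

rule_counts = Counter(observed_rules)
probs = relative_frequency_estimate(rule_counts)

for (lhs, rhs), p in sorted(probs.items()):
    print(f"{lhs} -> {' '.join(rhs)}: {p}")
```

    Each left-hand side's rule probabilities sum to one by construction, which is the normalisation condition a probabilistic context-free grammar requires.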