1,299 research outputs found

    Probabilistic parsing

    Get PDF
    Postprin

    Probabilistic Parsing Strategies

    Full text link
    We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counter-parts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results we also derive negative results on so-called generalized LR parsing.Comment: 36 pages, 1 figur

    Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning

    Get PDF
    Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk. Learning from data is central to contemporary computational linguistics. It is in common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotate

    Inducing Probabilistic Grammars by Bayesian Model Merging

    Full text link
    We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are {\em incorporated} by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are {\em merged} to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models (`Occam's Razor'). The general scheme is illustrated using three types of probabilistic grammars: Hidden Markov models, class-based nn-grams, and stochastic context-free grammars.Comment: To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 page

    Computation in Finitary Stochastic and Quantum Processes

    Full text link
    We introduce stochastic and quantum finite-state transducers as computation-theoretic models of classical stochastic and quantum finitary processes. Formal process languages, representing the distribution over a process's behaviors, are recognized and generated by suitable specializations. We characterize and compare deterministic and nondeterministic versions, summarizing their relative computational power in a hierarchy of finitary process languages. Quantum finite-state transducers and generators are a first step toward a computation-theoretic analysis of individual, repeatedly measured quantum dynamical systems. They are explored via several physical systems, including an iterated beam splitter, an atom in a magnetic field, and atoms in an ion trap--a special case of which implements the Deutsch quantum algorithm. We show that these systems' behaviors, and so their information processing capacity, depends sensitively on the measurement protocol.Comment: 25 pages, 16 figures, 1 table; http://cse.ucdavis.edu/~cmg; numerous corrections and update
    • ā€¦
    corecore