Weakly Restricted Stochastic Grammars
A new type of stochastic grammar is introduced for investigation: weakly restricted stochastic grammars. In this paper we will concentrate on the consistency problem. To find conditions for stochastic grammars to be consistent, the theory of multitype Galton-Watson branching processes and generating functions is of central importance.
The unrestricted stochastic grammar formalism generates the same class of languages as the weakly restricted formalism. The inside-outside algorithm is adapted for use with weakly restricted grammars.
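As a rough illustration of how branching-process theory enters the consistency question, the sketch below (a toy, not taken from the paper) builds the first-moment matrix of the multitype Galton-Watson process associated with a small stochastic context-free grammar and checks its spectral radius; a sub-critical radius (< 1) guarantees that derivations terminate with probability one. The example grammar and its probabilities are hypothetical.

```python
# A toy sketch (not from the paper) of the classical branching-process test for
# consistency of a stochastic context-free grammar. M[i][j] is the expected
# number of occurrences of nonterminal j produced by one rewrite of nonterminal i.
import numpy as np

# rules[A] = list of (probability, right-hand side as a list of symbols)
rules = {
    "S": [(0.6, ["S", "S"]), (0.4, ["a"])],      # hypothetical example grammar
}
nonterminals = list(rules)
idx = {nt: k for k, nt in enumerate(nonterminals)}

M = np.zeros((len(nonterminals), len(nonterminals)))
for A, productions in rules.items():
    for prob, rhs in productions:
        for sym in rhs:
            if sym in idx:                       # terminals do not branch further
                M[idx[A], idx[sym]] += prob

radius = max(abs(np.linalg.eigvals(M)))          # spectral radius of the moment matrix
print(f"spectral radius = {radius:.2f}:",
      "consistent (sub-critical)" if radius < 1 else "not guaranteed consistent")
```

For this particular toy grammar the radius is 1.2, so the process is super-critical and a derivation started from S fails to terminate with probability 1/3.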
Toric grammars: a new statistical approach to natural language modeling
We propose a new statistical model for computational linguistics. Rather than
trying to estimate directly the probability distribution of a random sentence
of the language, we define a Markov chain on finite sets of sentences with many
finite recurrent communicating classes and define our language model as the
invariant probability measures of the chain on each recurrent communicating
class. This Markov chain, which we call a communication model, randomly
recombines at each step the set of sentences forming its current state, using some
grammar rules. When the grammar rules are fixed and known in advance instead of
being estimated on the fly, we can prove supplementary mathematical properties.
In particular, we can prove in this case that all states are recurrent states,
so that the chain defines a partition of its state space into finite recurrent
communicating classes. We show that our approach is a decisive departure from
Markov models at the sentence level and discuss its relationships with Context
Free Grammars. Although the toric grammars we use are closely related to
Context Free Grammars, the way we generate the language from the grammar is
qualitatively different. Our communication model has two purposes. On the one
hand, it is used to define indirectly the probability distribution of a random
sentence of the language. On the other hand it can serve as a (crude) model of
language transmission from one speaker to another speaker through the
communication of a (large) set of sentences.
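To give a concrete, if crude, feel for a Markov chain whose states are finite sets of sentences, here is a toy sketch; it recombines two sentences around a shared word rather than applying the paper's toric grammar rules, and the sentences and helper names are invented for illustration.

```python
# A toy illustration (not the authors' construction) of a Markov chain on finite
# sets of sentences: each step picks two sentences that share a word and replaces
# them by their crossovers at that word.
import random

def crossover(s, t, w):
    """Exchange the suffixes of two sentences after their first occurrence of w."""
    s_words, t_words = s.split(), t.split()
    i, j = s_words.index(w), t_words.index(w)
    return (" ".join(s_words[:i + 1] + t_words[j + 1:]),
            " ".join(t_words[:j + 1] + s_words[i + 1:]))

def step(state, rng):
    """One transition: pick two sentences sharing a word and recombine them."""
    sentences = sorted(state)
    rng.shuffle(sentences)
    for i, s in enumerate(sentences):
        for t in sentences[i + 1:]:
            shared = sorted(set(s.split()) & set(t.split()))
            if shared:
                u, v = crossover(s, t, rng.choice(shared))
                return (state - {s, t}) | {u, v}
    return state                                  # no shared word: state unchanged

rng = random.Random(0)
state = frozenset({"the cat sleeps on the mat", "a dog runs in the park"})
for _ in range(5):
    state = step(state, rng)
    print(sorted(state))
```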
Inducing Probabilistic Grammars by Bayesian Model Merging
We describe a framework for inducing probabilistic grammars from corpora of
positive samples. First, samples are 'incorporated' by adding ad-hoc rules
to a working grammar; subsequently, elements of the model (such as states or
nonterminals) are 'merged' to achieve generalization and a more compact
representation. The choice of what to merge and when to stop is governed by the
Bayesian posterior probability of the grammar given the data, which formalizes
a trade-off between a close fit to the data and a default preference for
simpler models (`Occam's Razor'). The general scheme is illustrated using three
types of probabilistic grammars: Hidden Markov models, class-based n-grams,
and stochastic context-free grammars.
Comment: To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 pages.
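The merging loop can be made concrete on a tiny class-based bigram example; the sketch below is an assumed, simplified rendering of the general scheme (one class per word to start, then greedy merges while a posterior combining data fit with a per-class penalty improves), not the paper's implementation. The corpus and the prior penalty are invented.

```python
# A self-contained toy sketch of Bayesian model merging for a class-based bigram
# model: start with one class per word, then greedily merge the pair of classes
# that most improves a posterior = bigram log-likelihood - per-class penalty.
import math
from collections import Counter
from itertools import combinations

corpus = "the cat sleeps the dog sleeps a cat runs a dog runs".split()  # toy data
PRIOR_PENALTY = 2.0          # hypothetical log-prior cost per class (Occam's razor)

def log_posterior(classes):
    cls = {w: i for i, ws in enumerate(classes) for w in ws}
    bigrams = Counter((cls[a], cls[b]) for a, b in zip(corpus, corpus[1:]))
    unigrams = Counter(cls[w] for w in corpus[:-1])
    loglik = sum(n * math.log(n / unigrams[a]) for (a, _), n in bigrams.items())
    return loglik - PRIOR_PENALTY * len(classes)

classes = [frozenset([w]) for w in set(corpus)]      # "incorporate": 1 class per word
best = log_posterior(classes)
while len(classes) > 1:
    merges = [[c for c in classes if c not in (a, b)] + [a | b]
              for a, b in combinations(classes, 2)]
    candidate = max(merges, key=log_posterior)
    score = log_posterior(candidate)
    if score <= best:
        break                                        # no merge improves the posterior
    classes, best = candidate, score

print(sorted(sorted(c) for c in classes))
```

On this toy corpus the loop stops with the three classes {a, the}, {cat, dog}, {runs, sleeps}: any further merge loses more likelihood than the prior penalty rewards.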
Calibrating Generative Models: The Probabilistic Chomsky-Schützenberger Hierarchy
A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case, where the "semi-linear" languages all collapse into the regular languages, we show, using analytic tools adapted from the classical setting, that there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.
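A standard illustrative example of the extra expressive power gained at the context-free level (not drawn from the paper): the critical PCFG S -> S S | a, each with probability 1/2, puts a Catalan-weighted distribution on string lengths, i.e. on unary notations for the positive integers. The sketch below computes it by dynamic programming and checks the closed form.

```python
# Length distribution of the critical PCFG S -> S S (1/2) | a (1/2), compared
# against the Catalan closed form C_{n-1} / 2^(2n-1).
from math import comb

def pcfg_length_dist(max_n):
    """p[n] = probability that S yields a string of exactly n terminals."""
    p = [0.0] * (max_n + 1)
    p[1] = 0.5                                   # the rule S -> a
    for n in range(2, max_n + 1):                # split the length over S -> S S
        p[n] = 0.5 * sum(p[k] * p[n - k] for k in range(1, n))
    return p

p = pcfg_length_dist(10)
for n in range(1, 11):
    catalan = comb(2 * (n - 1), n - 1) // n      # Catalan number C_{n-1}
    assert abs(p[n] - catalan / 2 ** (2 * n - 1)) < 1e-12
    print(n, round(p[n], 6))
```

The tail of this distribution decays like n^(-3/2), not geometrically, so no probabilistic regular grammar defines the same distribution; this is roughly the flavour of the separations established in the hierarchy.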
Equilibria, Fixed Points, and Complexity Classes
Many models from a variety of areas involve the computation of an equilibrium
or fixed point of some kind. Examples include Nash equilibria in games; market
equilibria; computing optimal strategies and the values of competitive games
(stochastic and other games); stable configurations of neural networks;
analysing basic stochastic models for evolution like branching processes and
for language like stochastic context-free grammars; and models that incorporate
the basic primitives of probability and recursion like recursive Markov chains.
It is not known whether these problems can be solved in polynomial time. There
are certain common computational principles underlying different types of
equilibria, which are captured by the complexity classes PLS, PPAD, and FIXP.
Representative complete problems for these classes are, respectively, pure Nash
equilibria in games where they are guaranteed to exist, (mixed) Nash equilibria
in 2-player normal form games, and (mixed) Nash equilibria in normal form games
with 3 (or more) players. This paper reviews the underlying computational
principles and the corresponding classes.
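For the stochastic context-free grammar example mentioned above, the relevant fixed-point problem is concrete: the termination (consistency) probabilities are the least non-negative solution of a monotone polynomial system x = P(x), and computing such fixed points exactly is the kind of task FIXP captures. The sketch below (with a hypothetical toy grammar) approximates that least fixed point by Kleene/value iteration from zero.

```python
# A minimal sketch of the fixed-point viewpoint: approximate the least solution
# of x = P(x), where P collects the production probabilities of an SCFG.
from math import prod

def termination_probs(rules, iterations=200):
    """rules[A] = list of (prob, rhs list); approximate the least fixed point."""
    x = {A: 0.0 for A in rules}                        # Kleene iteration from zero
    for _ in range(iterations):
        x = {A: sum(p * prod(x.get(s, 1.0) for s in rhs)   # terminals contribute 1
                    for p, rhs in prods)
             for A, prods in rules.items()}
    return x

rules = {"S": [(0.6, ["S", "S"]), (0.4, ["a"])]}       # super-critical toy grammar
print(termination_probs(rules))                        # converges to {'S': 2/3}
```

Here the iteration converges to 2/3: with probability 1/3 a derivation from S never terminates, which is exactly the quantity a consistency analysis needs.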
Evaluating two methods for Treebank grammar compaction
Treebanks, such as the Penn Treebank, provide a basis for the automatic creation of broad coverage grammars. In the simplest case, rules can simply be ‘read off’ the parse-annotations of the corpus, producing either a simple or probabilistic context-free grammar. Such grammars, however, can be very large, presenting problems for the subsequent computational costs of parsing under the grammar.
In this paper, we explore ways by which a treebank grammar can be reduced in size or ‘compacted’, which involve the use of two kinds of technique: (i) thresholding of rules by their number of occurrences; and (ii) a method of rule-parsing, which has both probabilistic and non-probabilistic variants. Our results show that by a combined use of these two techniques, a probabilistic context-free grammar can be reduced in size by 62% without any loss in parsing performance, and by 71% to give a gain in recall, but some loss in precision.
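As a schematic sketch of the first compaction technique (an assumed simplified form, not the authors' system), the code below reads rules off a few toy parse trees, discards rules occurring fewer than a threshold number of times, and renormalizes what remains into a PCFG. The trees and the threshold are invented, not Penn Treebank data.

```python
# Read rules off parse annotations, threshold them by count, renormalize to a PCFG.
from collections import Counter

# Trees as nested tuples: (label, child, child, ...); leaves are plain strings.
treebank = [
    ("S", ("NP", "the", "dog"), ("VP", "barks")),
    ("S", ("NP", "the", "cat"), ("VP", "sleeps")),
    ("S", ("NP", "dogs"), ("VP", "bark")),
]

def read_off(tree, counts):
    """Count one rule per internal node: label -> labels/words of its children."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            read_off(child, counts)

counts = Counter()
for tree in treebank:
    read_off(tree, counts)

THRESHOLD = 2                                   # keep rules occurring at least twice
kept = {rule: n for rule, n in counts.items() if n >= THRESHOLD}
lhs_totals = Counter()
for (lhs, _), n in kept.items():
    lhs_totals[lhs] += n
pcfg = {rule: n / lhs_totals[rule[0]] for rule, n in kept.items()}
print(pcfg)                                     # only S -> NP VP survives thresholding
```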