Controlled non uniform random generation of decomposable structures
Consider a class of decomposable combinatorial structures, using different
types of atoms \Atoms = \{\At_1,\ldots ,\At_{|{\Atoms}|}\}. We address the
random generation of such structures with respect to a size n and a targeted
distribution in k of its \emph{distinguished} atoms. We consider two
variations on this problem. In the first alternative, the targeted distribution
is given by k real numbers \TargFreq_1, \ldots, \TargFreq_k such that 0 <
\TargFreq_i < 1 for all i and \TargFreq_1+\cdots+\TargFreq_k \leq 1. We
aim to generate random structures among the whole set of structures of a given
size n, in such a way that the {\em expected} frequency of any distinguished
atom \At_i equals \TargFreq_i. We address this problem by weighting the
atoms with a k-tuple \Weights of real-valued weights, inducing a weighted
distribution over the set of structures of size n. We first adapt the
classical recursive random generation scheme into an algorithm taking
\bigO{n^{1+o(1)}+mn\log{n}} arithmetic operations to draw m structures from
the \Weights-weighted distribution. Secondly, we address the analytical
computation of weights such that the targeted frequencies are achieved
asymptotically, {\it i.e.} for large values of n. We derive systems of functional
equations whose resolution gives an explicit relationship between \Weights
and \TargFreq_1, \ldots, \TargFreq_k. Lastly, we give an algorithm in
\bigO{k n^4} for the inverse problem, {\it i.e.} computing the frequencies
associated with a given k-tuple \Weights of weights, and an optimized
version in \bigO{k n^2} in the case of context-free languages. This allows
for a heuristic resolution of the weights/frequencies relationship suitable for
complex specifications. In the second alternative, the targeted distribution is
given by k natural numbers n_1, \ldots, n_k such that n_1+\cdots+n_k+r=n,
where r \geq 0 is the number of undistinguished atoms.
The structures must be generated uniformly among the set of structures of size
n that contain {\em exactly} n_i atoms \At_i (1 \leq i \leq k). We give
a \bigO{r^2\prod_{i=1}^k n_i^2 +m n k \log n} algorithm for generating m
structures, which simplifies into a \bigO{r\prod_{i=1}^k n_i +m n} algorithm for
regular specifications.
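As an illustration of the weighted recursive scheme described in this abstract, the sketch below draws structures from a weighted distribution for one small, hypothetical specification: S -> eps | u S | ( S ) S, where the atom u is distinguished and carries a weight w. The grammar, the function names and the floating-point arithmetic are assumptions made for this example; it is the textbook recursive method with weights folded into the counting step, not the optimized algorithm of the paper.

```python
import random

def weighted_counts(n, w):
    """W[s] = total weight of the structures of size s generated by
    S -> eps | u S | ( S ) S, where each 'u' has weight w and each
    parenthesis has weight 1 (hypothetical example grammar)."""
    W = [0.0] * (n + 1)
    W[0] = 1.0
    for s in range(1, n + 1):
        total = w * W[s - 1]                 # rule S -> u S
        for i in range(s - 1):               # rule S -> ( S ) S, inner size i
            total += W[i] * W[s - 2 - i]
        W[s] = total
    return W

def draw(n, w, W, rng):
    """Draw one structure of size n with probability proportional to its weight."""
    if n == 0:
        return ""
    x = rng.random() * W[n]
    x -= w * W[n - 1]
    if x < 0:
        return "u" + draw(n - 1, w, W, rng)
    for i in range(n - 1):
        x -= W[i] * W[n - 2 - i]
        if x < 0:
            return "(" + draw(i, w, W, rng) + ")" + draw(n - 2 - i, w, W, rng)
    # Only reachable through floating-point rounding; fall back to the first rule.
    return "u" + draw(n - 1, w, W, rng)

def observed_frequency(n, w, samples, seed=0):
    """Empirical frequency of the distinguished atom 'u' over several draws."""
    rng = random.Random(seed)
    W = weighted_counts(n, w)
    hits = sum(draw(n, w, W, rng).count("u") for _ in range(samples))
    return hits / (samples * n)
```

With w = 1 the weighted counts reduce to plain counts (the Motzkin numbers for this grammar). Raising w shifts the expected frequency of u upward, which is the lever the abstract uses to reach a targeted frequency.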
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free
grammars that computes the following quantities given a stochastic context-free
grammar and an input string: a) probabilities of successive prefixes being
generated by the grammar; b) probabilities of substrings being generated by the
nonterminals, including the entire string being generated by the grammar; c)
most likely (Viterbi) parse of the string; d) posterior expected number of
applications of each grammar production, as required for reestimating rule
probabilities. (a) and (b) are computed incrementally in a single left-to-right
pass over the input. Our algorithm compares favorably to standard bottom-up
parsing methods for SCFGs in that it works efficiently on sparse grammars by
making use of Earley's top-down control structure. It can process any
context-free rule format without conversion to some normal form, and combines
computations for (a) through (d) in a single algorithm. Finally, the algorithm
has simple extensions for processing partially bracketed inputs, and for
finding partial parses and their likelihoods on ungrammatical inputs.
Comment: 45 pages. Slightly shortened version to appear in Computational
Linguistics 2
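For orientation, quantity (b) above, the probability that a nonterminal generates a given substring, can also be computed with the standard bottom-up inside (CYK-style) recursion that the abstract compares against. The sketch below is that baseline, not the Earley-based algorithm of the paper; it assumes a grammar in Chomsky normal form, and the dictionaries `lexical` and `binary` are a hypothetical encoding chosen for this example.

```python
from collections import defaultdict

def inside_probability(words, lexical, binary, start="S"):
    """Probability that `start` generates `words` under an SCFG in Chomsky
    normal form.
    lexical: {(A, terminal): prob} for rules A -> terminal
    binary:  {(A, B, C): prob}     for rules A -> B C
    """
    n = len(words)
    # chart[i][j][A] = P(A =>* words[i:j])
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for (A, terminal), p in lexical.items():
            if terminal == word:
                chart[i][i + 1][A] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):            # split point
                for (A, B, C), p in binary.items():
                    left = chart[i][k].get(B, 0.0)
                    right = chart[k][j].get(C, 0.0)
                    if left and right:
                        chart[i][j][A] += p * left * right
    return chart[0][n].get(start, 0.0)

# Toy grammar (an assumption for this example): S -> S S (0.3) | a (0.7)
lexical = {("S", "a"): 0.7}
binary = {("S", "S", "S"): 0.3}
p_aa = inside_probability(["a", "a"], lexical, binary)
```

This recursion visits every rule for every span and split point, which is the cost on sparse grammars that the abstract says Earley's top-down control avoids.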
Precise n-gram Probabilities from Stochastic Context-free Grammars
We present an algorithm for computing n-gram probabilities from stochastic
context-free grammars, a procedure that can alleviate some of the standard
problems associated with n-grams (estimation from sparse data, lack of
linguistic structure, among others). The method operates via the computation of
substring expectations, which in turn is accomplished by solving systems of
linear equations derived from the grammar. We discuss efficient implementation
of the algorithm and report our practical experience with it.
Comment: 12 pages, to appear in ACL-9
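The idea of obtaining expectations from a linear system can be sketched in the simplest case, n = 1: the expected number of occurrences of each terminal in a string generated by the grammar. The code below is a simplification under stated assumptions (a consistent grammar, a hypothetical dictionary encoding, a small hand-written Gaussian elimination); it does not reproduce the paper's algorithm for general n-grams.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def expected_terminal_counts(grammar, start):
    """grammar: {A: [(prob, [symbols...]), ...]}; symbols that are not keys
    of `grammar` are treated as terminals.
    Returns the expected number of occurrences of each terminal."""
    nts = list(grammar)
    idx = {A: i for i, A in enumerate(nts)}
    n = len(nts)
    # Expected nonterminal counts c satisfy
    #   c[B] = [B == start] + sum_A c[A] * E[number of B in an expansion of A],
    # i.e. (I - E^T) c = e_start.
    mat = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for A, rules in grammar.items():
        for p, rhs in rules:
            for sym in rhs:
                if sym in idx:
                    mat[idx[sym]][idx[A]] -= p
    rhs_vec = [1.0 if A == start else 0.0 for A in nts]
    c = solve(mat, rhs_vec)
    counts = {}
    for A, rules in grammar.items():
        for p, rhs in rules:
            for sym in rhs:
                if sym not in idx:
                    counts[sym] = counts.get(sym, 0.0) + c[idx[A]] * p
    return counts

# Toy grammar (an assumption for this example): S -> S S (0.3) | a (0.7)
toy = {"S": [(0.3, ["S", "S"]), (0.7, ["a"])]}
unigram = expected_terminal_counts(toy, "S")
```

For the toy grammar the system has a single unknown, c(S) = 1 + 0.6 c(S), so c(S) = 2.5 and the expected number of a's is 0.7 × 2.5 = 1.75.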
Calibrating Generative Models: The Probabilistic Chomsky-Schützenberger Hierarchy
A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case where the "semi-linear" languages all collapse into the regular languages, using analytic tools adapted from the classical setting we show there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.
On Buffon Machines and Numbers
The well-known needle experiment of Buffon can be regarded as an analog (i.e.,
continuous) device that stochastically "computes" the number 2/pi ~ 0.63661,
which is the experiment's probability of success. Generalizing the experiment
and simplifying the computational framework, we consider probability
distributions that can be produced perfectly from a discrete source of
unbiased coin flips. We describe and analyse a few simple Buffon machines that
generate geometric, Poisson, and logarithmic-series distributions. We provide
human-accessible Buffon machines, which require a dozen coin flips or fewer, on
average, and produce experiments whose probabilities of success are expressible
in terms of numbers such as exp(-1), log 2, sqrt(3), cos(1/4), zeta(5).
Generally, we develop a collection of constructions based on simple
probabilistic mechanisms that enable one to design Buffon experiments involving
compositions of exponentials and logarithms, polylogarithms, direct and inverse
trigonometric functions, algebraic and hypergeometric functions, as well as
functions defined by integrals, such as the Gaussian error function.
Comment: Largely revised version with references and figures added. 12 pages.
In ACM-SIAM Symposium on Discrete Algorithms (SODA'2011
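To make "produced perfectly from unbiased coin flips" concrete, here are two standard constructions in that spirit: a geometric variable of parameter 1/2, and a Bernoulli(p) trial obtained by comparing fair bits against the binary expansion of p. Both are textbook techniques and are not claimed to be the specific machines of the paper; the function names and the `flip` callback are assumptions for this example, and p is represented as a float, so the trial is exact only for the binary value that the float stores.

```python
import random

def geometric_half(flip):
    """Number of 0-flips before the first 1-flip: P(k) = 2 ** -(k + 1)."""
    k = 0
    while flip() == 0:
        k += 1
    return k

def bernoulli(p, flip):
    """Return 1 with probability p (0 <= p < 1) using only fair coin flips.

    The flips are read as the bits of a uniform number U in [0, 1); they are
    compared with the bits of p, most significant first, and the first
    differing bit decides whether U < p. Two flips are needed on average.
    """
    while True:
        p *= 2                       # shift the next bit of p into the unit place
        bit_p = 1 if p >= 1 else 0
        p -= bit_p
        bit_u = flip()
        if bit_u != bit_p:
            return 1 if bit_u < bit_p else 0

rng = random.Random(1)
flip = lambda: rng.getrandbits(1)    # the source of unbiased coin flips
trials = 20000
freq = sum(bernoulli(0.3, flip) for _ in range(trials)) / trials
mean_geo = sum(geometric_half(flip) for _ in range(trials)) / trials
```

The empirical frequency `freq` should be close to 0.3 and `mean_geo` close to 1, the mean of a geometric variable of parameter 1/2 counted from zero.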
More ties than we thought
We extend the existing enumeration of neck tie-knots to include tie-knots
with a textured front, tied with the narrow end of a tie. These tie-knots have
gained popularity in recent years, based on reconstructions of a costume detail
from The Matrix Reloaded, and are explicitly ruled out in the enumeration by
Fink and Mao (2000).
We show that the relaxed tie-knot description language that comprehensively
describes these extended tie-knot classes is context free. It has a regular
sub-language that covers all the knots that originally inspired the work.
From the full language, we enumerate 266 682 distinct tie-knots that seem
tie-able with a normal neck-tie. Out of these 266 682, we also enumerate 24 882
tie-knots that belong to the regular sub-language.
Comment: Accepted at PeerJ Computer Science 12 pages, 6 color photograph