Analyzing Boltzmann Samplers for Bose-Einstein Condensates with Dirichlet Generating Functions
Boltzmann sampling is commonly used to uniformly sample objects of a
particular size from large combinatorial sets. For this technique to be
effective, one needs to prove that (1) the sampling procedure is efficient and
(2) objects of the desired size are generated with sufficiently high
probability. We use this approach to give a provably efficient sampling
algorithm for a class of weighted integer partitions related to Bose-Einstein
condensation from statistical physics. Our sampling algorithm is a
probabilistic interpretation of the ordinary generating function for these
objects, derived from the symbolic method of analytic combinatorics. Using the
Khintchine-Meinardus probabilistic method to bound the rejection rate of our
Boltzmann sampler through singularity analysis of Dirichlet generating
functions, we offer an alternative approach to analyze Boltzmann samplers for
objects with multiplicative structure.

Comment: 20 pages, 1 figure
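As a rough illustration of the technique the abstract above describes, a Boltzmann sampler for plain (unweighted) integer partitions can be read directly off the product form prod_k 1/(1 - x^k) of their ordinary generating function: each part size contributes an independent geometric multiplicity, and exact size is enforced by rejection. This is a minimal sketch, not the weighted Bose-Einstein class analyzed in the paper; the saddle-point tuning constant is the classical one for partitions, and the `max_part` cutoff and rejection budget are illustrative assumptions.

```python
import math
import random

def boltzmann_partition(x, max_part):
    """One draw from the Boltzmann model for integer partitions: the OGF
    factors as prod_k 1/(1 - x^k), so each part size k independently gets a
    geometric multiplicity with P(m = j) = (1 - x^k) * x^(k*j)."""
    parts = []
    for k in range(1, max_part + 1):
        while random.random() < x ** k:
            parts.append(k)
    return parts

def sample_partition_of_size(n, max_tries=200000):
    """Rejection sampling: redraw until the total size is exactly n."""
    # classical saddle-point tuning for partitions: x = exp(-pi / sqrt(6n))
    x = math.exp(-math.pi / math.sqrt(6.0 * n))
    max_part = 20 * n  # illustrative cutoff; larger parts are very unlikely
    for _ in range(max_tries):
        parts = boltzmann_partition(x, max_part)
        if sum(parts) == n:
            return sorted(parts, reverse=True)
    raise RuntimeError("rejection budget exhausted")
```

The rejection rate of exactly this kind of sampler is what the singularity analysis of Dirichlet generating functions is used to bound.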
Infinite Boltzmann Samplers and Applications to Branching Processes
In this short note, we extend the Boltzmann model for combinatorial random sampling [8] to allow for infinite-size objects; in particular, this extension now fully includes Galton-Watson processes. We then illustrate our idea with two examples, one of which is the generation of prefixes of infinite Cayley trees.
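The prefix-generation idea can be sketched as lazy truncation of a Galton-Watson tree. The Poisson offspring law below is the one that arises for Cayley trees (labelled rooted trees) under the Boltzmann model; the depth cutoff and the "..." placeholder marking unexplored subtrees are illustrative choices, not the note's construction.

```python
import math
import random

def poisson(lam):
    """Poisson(lam) sample via Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > threshold:
        prod *= random.random()
        k += 1
    return k

def gw_prefix(depth, lam=1.0):
    """Prefix of a Galton-Watson tree with Poisson(lam) offspring, truncated
    at `depth`: a node is the list of its child subtrees, and the string
    '...' stands in for a subtree left unexplored at the cutoff."""
    if depth == 0:
        return "..."
    return [gw_prefix(depth - 1, lam) for _ in range(poisson(lam))]
```

With lam >= 1 the underlying tree can be infinite, which is exactly the situation the extended Boltzmann model is meant to handle; the prefix is always finite.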
Polynomial tuning of multiparametric combinatorial samplers
Boltzmann samplers and the recursive method are prominent algorithmic
frameworks for the approximate-size and exact-size random generation of large
combinatorial structures, such as maps, tilings, RNA sequences or various
tree-like structures. In their multiparametric variants, these samplers allow one to control the profile of expected values corresponding to multiple combinatorial parameters: for instance, the number of leaves, the profile of node degrees in trees, or the number of certain subpatterns in strings. However, such flexible control requires an additional non-trivial tuning procedure. In this paper, we propose an efficient tuning algorithm, polynomial-time with respect to the number of tuned parameters, based on convex optimisation techniques. Finally, we illustrate the efficiency of our approach
using several applications of rational, algebraic and P\'olya structures
including polyomino tilings with prescribed tile frequencies, planar trees with
a given node degree distribution, and weighted partitions.

Comment: Extended abstract, accepted to ANALCO 2018. 20 pages, 6 figures, colours. Implementation and examples are available at [1] https://github.com/maciej-bendkowski/boltzmann-brain and [2] https://github.com/maciej-bendkowski/multiparametric-combinatorial-sampler
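The tuning problem has a transparent one-parameter instance: for binary (Catalan) trees with generating function B(z) = (1 - sqrt(1 - 4z)) / (2z), the expected size under the Boltzmann model is z B'(z)/B(z) = (1 - s)/(2s) with s = sqrt(1 - 4z), which increases monotonically in z on (0, 1/4). The sketch below tunes z to a target expected size by bisection; this exploits one-dimensional monotonicity only, whereas the paper's algorithm handles many parameters simultaneously via a convex program.

```python
import math

def expected_size(z):
    # Boltzmann expectation z * B'(z) / B(z) for the Catalan GF,
    # which simplifies to (1 - s) / (2 * s) with s = sqrt(1 - 4z)
    s = math.sqrt(1.0 - 4.0 * z)
    return (1.0 - s) / (2.0 * s)

def tune(target, lo=1e-12, hi=0.25 - 1e-12, iters=200):
    # expected_size is increasing on (0, 1/4), so bisection suffices here;
    # the multiparametric case requires convex optimisation instead
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_size(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, tune(1000.0) returns a z just below 1/4 at which Boltzmann-sampled binary trees have expected size 1000.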
Beta-Negative Binomial Process and Exchangeable Random Partitions for Mixed-Membership Modeling
The beta-negative binomial process (BNBP), an integer-valued stochastic
process, is employed to partition a count vector into a latent random count
matrix. As the marginal probability distribution of the BNBP that governs the
exchangeable random partitions of grouped data has not yet been developed,
current inference for the BNBP has to truncate the number of atoms of the beta
process. This paper introduces an exchangeable partition probability function
to explicitly describe how the BNBP clusters the data points of each group into
a random number of exchangeable partitions, which are shared across all the
groups. A fully collapsed Gibbs sampler is developed for the BNBP, leading to a
novel nonparametric Bayesian topic model that is distinct from existing ones,
with simple implementation, fast convergence, good mixing, and state-of-the-art
predictive performance.

Comment: In Neural Information Processing Systems (NIPS) 2014. 9 pages + 3-page appendix
Bayesian Poisson process partition calculus with an application to Bayesian L\'evy moving averages
This article develops, and describes how to use, results concerning
disintegrations of Poisson random measures. These results are fashioned as
simple tools that can be tailor-made to address inferential questions arising
in a wide range of Bayesian nonparametric and spatial statistical models. The
Poisson disintegration method is based on the formal statement of two results
concerning a Laplace functional change of measure and a Poisson Palm/Fubini
calculus in terms of random partitions of the integers {1,...,n}. The
techniques are analogous to, but much more general than, techniques for the
Dirichlet process and weighted gamma process developed in [Ann. Statist. 12
(1984) 351-357] and [Ann. Inst. Statist. Math. 41 (1989) 227-245]. In order to
illustrate the flexibility of the approach, large classes of random probability
measures and random hazards or intensities which can be expressed as
functionals of Poisson random measures are described. We describe a unified
posterior analysis of classes of discrete random probability measures which identifies
and exploits features common to all these models. The analysis circumvents many
of the difficult issues involved in Bayesian nonparametric calculus, including
a combinatorial component. This allows one to focus on the unique features of
each process, which are characterized via real-valued functions h. The
applicability of the technique is further illustrated by obtaining explicit
posterior expressions for L\'evy-Cox moving average processes within the
general setting of multiplicative intensity models.

Comment: Published at http://dx.doi.org/10.1214/009053605000000336 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
Gamma Processes, Stick-Breaking, and Variational Inference
While most Bayesian nonparametric models in machine learning have focused on
the Dirichlet process, the beta process, or their variants, the gamma process
has recently emerged as a useful nonparametric prior in its own right. Current
inference schemes for models involving the gamma process are restricted to
MCMC-based methods, which limits their scalability. In this paper, we present a
variational inference framework for models involving gamma process priors. Our
approach is based on a novel stick-breaking constructive definition of the
gamma process. We prove correctness of this stick-breaking process by using the
characterization of the gamma process as a completely random measure (CRM), and
we explicitly derive the rate measure of our construction using Poisson process
machinery. We also derive error bounds on the truncation of the infinite
process required for variational inference, similar to the truncation analyses
for other nonparametric models based on the Dirichlet and beta processes. Our
representation is then used to derive a variational inference algorithm for a
particular Bayesian nonparametric latent structure formulation known as the
infinite Gamma-Poisson model, where the latent variables are drawn from a gamma
process prior with Poisson likelihoods. Finally, we present results for our
algorithms on nonnegative matrix factorization tasks on document corpora, and
show that we compare favorably to both sampling-based techniques and
variational approaches based on beta-Bernoulli priors.
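For context, the classical construction that this paper generalizes is Sethuraman's stick-breaking representation of the Dirichlet process, which fits in a few lines. Note that this sketch shows the Dirichlet-process case only; the gamma-process construction derived in the paper uses a different, CRM-derived law for the breaks.

```python
import random

def dp_stick_breaking(alpha, trunc):
    """Sethuraman's stick-breaking for a Dirichlet process with concentration
    alpha: beta_k ~ Beta(1, alpha), w_k = beta_k * prod_{j<k} (1 - beta_j).
    Truncated after `trunc` sticks; the unbroken remainder quantifies the
    truncation error, in the spirit of the error analyses mentioned above."""
    weights, remaining = [], 1.0
    for _ in range(trunc):
        b = random.betavariate(1.0, alpha)
        weights.append(remaining * b)
        remaining *= 1.0 - b
    return weights, remaining
```

The weights sum to 1 - remaining, and `remaining` shrinks geometrically in expectation at rate alpha / (1 + alpha), which is what makes truncated variational approximations viable.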
Rapid Mixing of Gibbs Sampling on Graphs that are Sparse on Average
In this work we show that for every $d < \infty$ and the Ising model defined on $G(n, d/n)$, there exists a $\beta_d > 0$, such that for all $\beta < \beta_d$, with probability going to 1 as $n \to \infty$, the mixing time of the dynamics on $G(n, d/n)$ is polynomial in $n$. Our results are the first polynomial-time mixing results proven for a natural model on $G(n, d/n)$ for $d > 1$ where the parameters of the model do not depend on $n$. They also provide a rare example where one can prove polynomial-time mixing of the Gibbs sampler in a situation where the actual mixing time is slower than $n \, \mathrm{polylog}(n)$. Our proof exploits in novel ways the local treelike structure of Erd\H{o}s-R\'enyi random graphs, comparison and block dynamics arguments, and a recent result of Weitz.

Our results extend to much more general families of graphs which are sparse in some average sense and to much more general interactions. In particular, they apply to any graph for which every vertex $v$ has a neighborhood $N(v)$ of radius $O(\log n)$ in which the induced sub-graph is a tree union at most $O(\log n)$ edges and where for each simple path in $N(v)$ the sum of the vertex degrees along the path is $O(\log n)$. Moreover, our results apply also in the case of arbitrary external fields and provide the first FPRAS for sampling the Ising distribution in this case. We finally present a non-Markov-chain algorithm for sampling the distribution which is effective for a wider range of parameters. In particular, for $G(n, d/n)$ it applies for all external fields and $\beta < \beta_d$, where $d \tanh(\beta_d) = 1$ is the critical point for decay of correlation for the Ising model on $G(n, d/n)$.

Comment: Corrected proof of Lemma 2.
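The dynamics in question is single-site Glauber (heat-bath) dynamics, which resamples one uniformly random spin from its Ising conditional law at each step. Below is a minimal sketch on an Erdős–Rényi graph for the zero-field model; the concrete values of n, d, beta and the step count are illustrative, not the regimes or bounds established in the paper.

```python
import math
import random

def erdos_renyi(n, d):
    """Adjacency lists for G(n, d/n): each edge appears independently w.p. d/n."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < d / n:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def glauber_step(spins, adj, beta):
    """One heat-bath update: pick a uniform site and resample its spin from
    the Ising conditional P(spin = +1) = 1 / (1 + exp(-2 * beta * field))."""
    i = random.randrange(len(spins))
    local_field = sum(spins[j] for j in adj[i])
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * local_field))
    spins[i] = 1 if random.random() < p_plus else -1

# illustrative run at small d and beta, i.e. in a high-temperature regime
n, d, beta = 100, 2.0, 0.2
adj = erdos_renyi(n, d)
spins = [random.choice([-1, 1]) for _ in range(n)]
for _ in range(20 * n):
    glauber_step(spins, adj, beta)
```

Each step touches only one vertex and its neighborhood, so the cost per step is proportional to the (typically constant) degree; the content of the paper is how many such steps are needed before the chain is close to the Ising distribution.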
Support Size Estimation: The Power of Conditioning
We consider the problem of estimating the support size of a distribution .
Our investigations are pursued through the lens of distribution testing and
seek to understand the power of conditional sampling (denoted as COND), wherein
one is allowed to query the given distribution conditioned on an arbitrary
subset . The primary contribution of this work is to introduce a new
approach to lower bounds for the COND model that relies on using powerful tools
from information theory and communication complexity.
Our approach allows us to obtain surprisingly strong lower bounds for the
COND model and its extensions.
1) We bridge the longstanding gap between the upper () and the lower bound for the COND model by providing a nearly matching lower bound. Surprisingly, we show
that even if we get to know the actual probabilities along with COND samples,
still queries
are necessary.
2) We obtain the first non-trivial lower bound for COND equipped with an
additional oracle that reveals the conditional probabilities of the samples (to
the best of our knowledge, this subsumes all of the models previously studied):
in particular, we demonstrate that queries are necessary
- ā¦