The two-sided infinite extension of the Mallows model for random permutations
We introduce a probability distribution Q on the group of permutations of the
set Z of integers. Distribution Q is a natural extension of the Mallows
distribution on the finite symmetric group. A one-sided infinite counterpart of
Q, supported by the group of permutations of the set N of natural numbers, was
studied previously in our paper [Gnedin and Olshanski, Ann. Prob. 38 (2010),
2103-2135; arXiv:0907.3275]. We analyze various features of Q such as its
symmetries, the support, and the marginal distributions.
Comment: 29 pages, Late
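For orientation, the finite Mallows distribution that Q extends assigns each permutation of {1, ..., n} a probability proportional to q^inv, where inv is its number of inversions. The following is a minimal sketch of one standard way to sample from it (sequential insertion with truncated geometric weights). It is not taken from the paper, the function names are illustrative, and it covers only the finite case, not the two-sided infinite model.

```python
import random


def count_inversions(perm):
    """Number of pairs (i, j) with i < j and perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def sample_mallows(n, q, rng=random):
    """Sample a permutation of 1..n with probability proportional to q**inversions.

    Assumption: q >= 0. Element i is inserted so that it creates j new
    inversions with probability proportional to q**j, for j = 0..i-1.
    """
    perm = []
    for i in range(1, n + 1):
        weights = [q ** j for j in range(i)]
        j = rng.choices(range(i), weights=weights)[0]
        # i is larger than everything placed so far, so leaving j elements
        # to its right creates exactly j new inversions.
        perm.insert(len(perm) - j, i)
    return perm
```

With q = 1 this reduces to a uniform random permutation, and with q = 0 it always returns the identity.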
A New Approach to Speeding Up Topic Modeling
Latent Dirichlet allocation (LDA) is a widely used probabilistic topic
modeling paradigm that has recently found many applications in computer vision and
computational biology. In this paper, we propose a fast and accurate batch
algorithm, active belief propagation (ABP), for training LDA. Usually batch LDA
algorithms require repeated scanning of the entire corpus and searching the
complete topic space. For massive corpora with a large number of
topics, each training iteration of batch LDA algorithms is often inefficient and
time-consuming. To accelerate training, ABP actively scans a subset
of the corpus and searches a subset of the topic space, thereby
saving considerable training time in each iteration. To ensure accuracy, ABP selects
only those documents and topics that contribute to the largest residuals within
the residual belief propagation (RBP) framework. On four real-world corpora,
ABP performs around to times faster than state-of-the-art batch LDA
algorithms with a comparable topic modeling accuracy.
Comment: 14 pages, 12 figure
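The selection step described in the abstract (keeping only the documents or topics with the largest residuals) can be illustrated with a plain top-k selection. This is a hedged sketch, not the authors' ABP implementation: the function name, the `fraction` parameter, and the representation of residuals as a flat list are all assumptions made for illustration.

```python
import heapq


def select_active(residuals, fraction):
    """Return indices of the items with the largest residuals.

    Hypothetical helper: `residuals` holds one non-negative score per
    document (or topic), and `fraction` is the share of items to keep.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, int(round(fraction * len(residuals))))
    # nlargest returns the k indices ordered from largest residual down.
    return heapq.nlargest(k, range(len(residuals)), key=residuals.__getitem__)
```

In an iterative scheme, only the selected indices would be updated in the next pass, and the residuals would then be recomputed before selecting again.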
Brownian Web and Oriented Percolation: Density Bounds
In a recent work, we proved that under diffusive scaling, the collection of
rightmost infinite open paths in a supercritical oriented percolation
configuration on the space-time lattice Z^2 converges in distribution to the
Brownian web. In that proof, the FKG inequality played an important role in
establishing a density bound, which is a part of the convergence criterion for
the Brownian web formulated by Fontes et al (2004). In this note, we illustrate
how an alternative convergence criterion formulated by Newman et al (2005) can
be verified in this case, which involves a dual density bound that can be
established without using the FKG inequality. This alternative approach is in
some sense more robust. We will also show that the spatial density of the
collection of rightmost infinite open paths starting at time 0 decays
asymptotically in time as c/\sqrt{t} for some c>0.
Comment: 12 pages. This is a proceedings article for the RIMS workshop
"Applications of Renormalization Group Methods in Mathematical Sciences",
held at Kyoto University from September 12th to 14th, 2011. Submitted to the
RIMS Kokyuroku serie
Weighted dependency graphs
The theory of dependency graphs is a powerful toolbox to prove asymptotic
normality of sums of random variables. In this article, we introduce a more
general notion of weighted dependency graphs and give normality criteria in
this context. We also provide generic tools to prove that some weighted graph
is a weighted dependency graph for a given family of random variables.
To illustrate the power of the theory, we give applications to the following
objects: uniform random pair partitions, the random graph model ,
uniform random permutations, the symmetric simple exclusion process and
multilinear statistics on Markov chains. The application to random permutations
gives a bivariate extension of a functional central limit theorem of Janson and
Barbour. On Markov chains, we answer positively an open question of Bourdon and
Vall\'ee on the asymptotic normality of subword counts in random texts
generated by a Markovian source.
Comment: 57 pages. Third version: minor modifications, after review proces