Efficient Sampling of Random Permutations
We show how to uniformly distribute data at random (not to be confused with permutation routing) in two settings that can handle massive data: coarse-grained parallelism and external memory. In contrast to previously known work for parallel setups, our method fulfills the three criteria of uniformity, work-optimality, and balance among the processors simultaneously. To guarantee uniformity, we investigate the matrix of communication requests between the processors. We show that its distribution is a generalization of the multivariate hypergeometric distribution, and we give algorithms to sample it efficiently in the two settings.
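The communication matrix described in this abstract can be illustrated with a small sketch. It assumes processor i holds `source_sizes[i]` items and processor j must receive `target_sizes[j]` items; both names and the function name are hypothetical. Each row is drawn as a multivariate hypergeometric sample from the slots still free. The urn is built naively, so this shows the distribution only, not the efficient samplers the paper announces.

```python
import random
from collections import Counter


def sample_communication_matrix(source_sizes, target_sizes, rng=random):
    """Return a[i][j] = number of items sent from processor i to processor j.

    The result has the distribution induced by a uniform random permutation
    of all items (illustrative, naive implementation).
    """
    if sum(source_sizes) != sum(target_sizes):
        raise ValueError("source and target sizes must have the same total")
    remaining = list(target_sizes)  # free slots left on each target processor
    matrix = []
    for m in source_sizes:
        # One ball per free slot, labelled with its target processor.
        urn = [j for j, cap in enumerate(remaining) for _ in range(cap)]
        # Drawing m balls without replacement is a multivariate
        # hypergeometric sample over the target processors.
        drawn = Counter(rng.sample(urn, m))
        row = [drawn.get(j, 0) for j in range(len(remaining))]
        remaining = [cap - r for cap, r in zip(remaining, row)]
        matrix.append(row)
    return matrix
```

Row sums reproduce the source sizes and column sums the target sizes, so the matrix can drive a balanced all-to-all exchange followed by local shuffles.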
An R package for permutations, Mallows and Generalized Mallows models
Probability models on permutations associate a probability value to each of the permutations on n items. This paper considers two popular probability models, the Mallows model and the Generalized Mallows model. We describe methods for inference, sampling, and learning such distributions, some of which are novel in the literature. This paper also describes operations for permutations, with special attention to those related to the Kendall and Cayley distances and the random generation of permutations. These operations are of key importance for the efficient computation of the operations on distributions. These algorithms are implemented in the associated R package. Moreover, the internal code is written in C++.
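As a rough illustration of the Mallows model under the Kendall distance, the sketch below counts discordant pairs and samples by repeated insertion. It is written in Python for brevity and is not the API of the R package mentioned in the abstract; the function names and the `theta` parameterization (probability proportional to exp(-theta * distance)) are assumptions.

```python
import math
import random


def kendall_distance(a, b):
    """Number of item pairs that permutations a and b order differently."""
    pos = {item: i for i, item in enumerate(b)}
    seq = [pos[item] for item in a]
    # Count inversions of seq; quadratic time, kept simple for clarity.
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def sample_mallows(center, theta, rng=random):
    """Draw s with P(s) proportional to exp(-theta * kendall_distance(s, center))."""
    q = math.exp(-theta)
    out = []
    for i, item in enumerate(center):
        # Inserting at position j leaves i - j earlier items behind the new
        # one, which adds exactly i - j discordant pairs.
        weights = [q ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        out.insert(j, item)
    return out
```

With `theta = 0` every position is equally likely and the sample is uniform; large `theta` concentrates the samples on `center`.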
Approximately Sampling Elements with Fixed Rank in Graded Posets
Graded posets frequently arise throughout combinatorics, where it is natural
to try to count the number of elements of a fixed rank. These counting problems
are often #P-complete, so we consider approximation algorithms for
counting and uniform sampling. We show that for certain classes of posets,
biased Markov chains that walk along edges of their Hasse diagrams allow us to
approximately generate samples with any fixed rank in expected polynomial time.
Our arguments do not rely on the typical proofs of log-concavity, which are
used to construct a stationary distribution with a specific mode in order to
give a lower bound on the probability of outputting an element of the desired
rank. Instead, we infer this directly from bounds on the mixing time of the
chains through a method we call .
A noteworthy application of our method is sampling restricted classes of
integer partitions of n. We give the first provably efficient Markov chain
algorithm to uniformly sample integer partitions of n from general restricted
classes. Several observations allow us to improve the efficiency of this chain
to require space, and for unrestricted integer partitions,
expected time. Related applications include sampling permutations
with a fixed number of inversions and lozenge tilings on the triangular lattice
with a fixed average height. Comment: 23 pages, 12 figures
Permutation sampling in Path Integral Monte Carlo
A simple algorithm is described to sample permutations of identical particles
in Path Integral Monte Carlo (PIMC) simulations of continuum many-body systems.
The sampling strategy illustrated here is fairly general, and can be easily
incorporated in any PIMC implementation based on the staging algorithm.
Although it is similar in spirit to an existing prescription, it differs from
it in some key aspects. It allows one to sample permutations efficiently, even
if long paths (e.g., hundreds, or thousands of slices) are needed. We
illustrate its effectiveness by presenting results of a PIMC calculation of
thermodynamic properties of superfluid helium-4, in which a very simple
approximation for the high-temperature density matrix was utilized.
Efficient generation of random derangements with the expected distribution of cycle lengths
We show how to generate random derangements efficiently by two different
techniques: random restricted transpositions and sequential importance
sampling. The algorithm employing restricted transpositions can also be used to
generate random fixed-point-free involutions only, a.k.a. random perfect
matchings on the complete graph. Our data indicate that the algorithms generate
random samples with the expected distribution of cycle lengths, which we
derive, and for relatively small samples, which can actually be very large in
absolute numbers, we argue that they generate samples indistinguishable from
the uniform distribution. Both algorithms are simple to understand and
implement and possess a performance comparable to or better than those of
currently known methods. Simulations suggest that the mixing time of the
algorithm based on random restricted transpositions (in the total variation
distance with respect to the distribution of cycle lengths) is
with and the length of the
derangement. We prove that the sequential importance sampling algorithm
generates random derangements in time with probability of
failing. Comment: This version corrected and updated; 14 pages, 2 algorithms, 2 tables,
4 figures
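For comparison with the two techniques named in this abstract, here is a baseline that is neither of them: plain rejection sampling, which shuffles uniformly and retries until no fixed point remains. It is exactly uniform over derangements and needs about e (roughly 2.72) shuffles on average for large n. The helper `cycle_lengths` is a hypothetical utility for inspecting the cycle-length statistics the abstract refers to.

```python
import random


def random_derangement(n, rng=random):
    """Uniform random derangement of range(n) by rejection (baseline sketch)."""
    if n == 1:
        raise ValueError("no derangement of a single element exists")
    items = list(range(n))
    while True:
        rng.shuffle(items)  # uniform random permutation
        if all(items[i] != i for i in range(n)):
            return list(items)  # accepted: no fixed points


def cycle_lengths(perm):
    """Sorted cycle lengths of a permutation given as a list of images."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        lengths.append(length)
    return sorted(lengths)
```

Every cycle of a derangement has length at least 2, which gives a quick sanity check on any sampler's output.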
Using parallel computation to improve Independent Metropolis-Hastings based estimation
In this paper, we consider the implications of the fact that parallel
raw power can be exploited by a generic Metropolis-Hastings algorithm if the
proposed values are independent. In particular, we present improvements to the
independent Metropolis-Hastings algorithm that significantly decrease the
variance of any estimator derived from the MCMC output, at no additional
computing cost, since those improvements are based on a fixed number of target
density evaluations. Furthermore, the techniques developed in this paper do not
jeopardize the Markovian convergence properties of the algorithm, since they
are based on the Rao-Blackwell principles of Gelfand and Smith (1990), already
exploited in Casella and Robert (1996), Atchade and Perron (2005) and Douc and
Robert (2010). We illustrate those improvements both on a toy normal example
and on a classical probit regression model, but stress the fact that they are
applicable in any case where the independent Metropolis-Hastings is applicable.
Comment: 19 pages, 8 figures, to appear in Journal of Computational and
Graphical Statistics
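The baseline that this abstract builds on can be sketched as follows. This is only the plain independent Metropolis-Hastings sampler, not the Rao-Blackwellized estimators of the paper; the function and argument names are hypothetical. Because proposals do not depend on the current state, all proposals and their target density values could be computed up front in parallel.

```python
import math
import random


def independent_mh(log_target, sample_proposal, log_proposal, n_steps, x0, rng=random):
    """Plain independent Metropolis-Hastings chain (baseline sketch).

    log_target and log_proposal may be unnormalized log-densities.
    """
    x = x0
    log_w = log_target(x) - log_proposal(x)  # log importance weight of x
    chain = []
    for _ in range(n_steps):
        y = sample_proposal(rng)  # drawn independently of the current x
        log_w_y = log_target(y) - log_proposal(y)
        # Accept with probability min(1, w(y) / w(x)).
        if math.log(1.0 - rng.random()) < log_w_y - log_w:
            x, log_w = y, log_w_y
        chain.append(x)
    return chain
```

For a toy normal target one might pass `log_target=lambda x: -0.5 * x * x` with a wider normal proposal, for example `sample_proposal=lambda r: r.gauss(0.0, 2.0)` and `log_proposal=lambda x: -x * x / 8.0`.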
Optimal Discrete Uniform Generation from Coin Flips, and Applications
This article introduces an algorithm to draw random discrete uniform
variables within a given range of size n from a source of random bits. The
algorithm aims to be simple to implement and optimal both with regard to the
number of random bits consumed and from a computational perspective, allowing
for faster and more efficient Monte-Carlo simulations in computational physics
and biology. I also provide a detailed analysis of the number of bits that are
spent per variate, and offer some extensions and applications, in particular to
the optimal random generation of permutations. Comment: first draft, 22 pages, 5 figures, C code implementation of algorithm
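The abstract does not reproduce the algorithm itself. The sketch below shows one well-known bit-by-bit scheme of this kind, usually called the Fast Dice Roller; treating it as the article's method is an assumption. The names `uniform_from_bits`, `shuffle_with_bits`, and `flip` (any callable returning a fair 0 or 1) are hypothetical. Leftover randomness is recycled after a rejection instead of being thrown away, and the second function plugs the generator into a Fisher-Yates shuffle.

```python
import random


def uniform_from_bits(n, flip):
    """Return an integer uniform on range(n), reading fair bits from flip()."""
    if n < 1:
        raise ValueError("n must be positive")
    v, c = 1, 0  # invariant: c is uniform on range(v)
    while True:
        v = 2 * v
        c = 2 * c + flip()  # append one fresh random bit
        if v >= n:
            if c < n:
                return c
            # Rejected: c - n is still uniform on range(v - n), so keep it.
            v -= n
            c -= n


def shuffle_with_bits(items, flip):
    """Fisher-Yates shuffle whose indices come from uniform_from_bits."""
    out = list(items)
    for i in range(len(out) - 1, 0, -1):
        j = uniform_from_bits(i + 1, flip)
        out[i], out[j] = out[j], out[i]
    return out
```

A bit source can be as simple as `rng = random.Random(); flip = lambda: rng.getrandbits(1)`.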