Rational Construction of Stochastic Numerical Methods for Molecular Sampling
In this article, we focus on the sampling of the configurational
Gibbs-Boltzmann distribution, that is, the calculation of averages of functions
of the position coordinates of a molecular $N$-body system modelled at constant
temperature. We show how a formal series expansion of the invariant measure of
a Langevin dynamics numerical method can be obtained in a straightforward way
using the Baker-Campbell-Hausdorff lemma. We then compare Langevin dynamics
integrators in terms of their invariant distributions and demonstrate a
superconvergence property (4th order accuracy where only 2nd order would be
expected) of one method in the high friction limit; this method, moreover, can
be reduced to a simple modification of the Euler-Maruyama method for Brownian
dynamics involving a non-Markovian (coloured noise) random process. In the
Brownian dynamics case, 2nd order accuracy of the invariant density is
achieved. All methods considered are efficient for molecular applications
(requiring one force evaluation per timestep) and of a simple form. In fully
resolved (long run) molecular dynamics simulations, for our favoured method, we
observe up to two orders of magnitude improvement in configurational sampling
accuracy for given stepsize with no evident reduction in the size of the
largest usable timestep compared to common alternative methods.
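A minimal sketch of the Brownian dynamics comparison mentioned above, in one dimension with the quadratic potential U(x) = x^2/2 at unit temperature. It assumes the "coloured noise" modification averages two successive Gaussian draws; that form, the function name `stationary_variance`, and all parameter values are illustrative assumptions and should be checked against the article itself.

```python
import math
import random


def stationary_variance(step, n_steps, averaged_noise, seed=0):
    """Estimate the long-run mean of x^2 for U(x) = x^2 / 2 at kT = 1."""
    rng = random.Random(seed)
    x = 0.0
    r_old = rng.gauss(0.0, 1.0)
    total = 0.0
    for _ in range(n_steps):
        r_new = rng.gauss(0.0, 1.0)
        force = -x  # -U'(x) for the quadratic potential
        if averaged_noise:
            # Assumed modification: average successive draws, so the noise
            # at neighbouring steps is correlated (non-Markovian).
            noise = math.sqrt(2.0 * step) * 0.5 * (r_old + r_new)
        else:
            # Plain Euler-Maruyama: independent noise at every step.
            noise = math.sqrt(2.0 * step) * r_new
        x = x + step * force + noise
        r_old = r_new
        total += x * x
    return total / n_steps


var_em = stationary_variance(0.5, 200_000, averaged_noise=False)
var_avg = stationary_variance(0.5, 200_000, averaged_noise=True)
print(f"Euler-Maruyama: {var_em:.3f}  averaged noise: {var_avg:.3f}  exact: 1.000")
```

For this quadratic potential a short calculation gives a stationary variance of 1/(1 - h/2) for Euler-Maruyama with stepsize h, and exactly 1 for the averaged-noise update, so the gap between the two printed estimates reflects discretization bias rather than sampling noise. The exactness is special to the quadratic case and says nothing about the general order of accuracy.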
Improved Stein Variational Gradient Descent with Importance Weights
Stein Variational Gradient Descent (SVGD) is a popular sampling algorithm
used in various machine learning tasks. It is well known that SVGD arises from
a discretization of the kernelized gradient flow of the Kullback-Leibler
divergence $D_{KL}(\cdot \mid \pi)$, where $\pi$ is the target
distribution. In this work, we propose to enhance SVGD via the introduction of
importance weights, which leads to a new method for which we coin the name
$\beta$-SVGD. In the continuous time and infinite particles regime, the time
for this flow to converge to the equilibrium distribution $\pi$, quantified by
the Stein Fisher information, depends on $\rho_0$ and $\pi$ very weakly. This
is very different from the kernelized gradient flow of Kullback-Leibler
divergence, whose time complexity depends on
$D_{KL}(\rho_0 \mid \pi)$. Under certain assumptions, we provide a
descent lemma for the population limit $\beta$-SVGD, which covers the descent
lemma for the population limit SVGD when $\beta \to 0$. We also illustrate the
advantages of $\beta$-SVGD over SVGD by simple experiments.
Comment: 24 pages
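For orientation, a minimal sketch of the baseline SVGD particle update in one dimension, using a Gaussian (RBF) kernel with a fixed bandwidth and a standard normal target. The importance weights of the variant proposed in the abstract are not reproduced here; the kernel choice, the bandwidth, the step size, and the names `svgd_step` and `run_svgd` are all assumptions made for illustration.

```python
import math


def svgd_step(xs, grad_log_target, step, bandwidth):
    """One SVGD update: kernel-weighted attraction plus kernel repulsion."""
    n = len(xs)
    scores = [grad_log_target(x) for x in xs]
    new_xs = []
    for xi in xs:
        phi = 0.0
        for xj, sj in zip(xs, scores):
            k = math.exp(-((xi - xj) ** 2) / (2.0 * bandwidth ** 2))
            phi += k * sj                          # pulls towards high density
            phi += k * (xi - xj) / bandwidth ** 2  # pushes particles apart
        new_xs.append(xi + step * phi / n)
    return new_xs


def run_svgd(n_particles=30, n_iters=1000, step=0.1, bandwidth=1.0):
    # Start far from the target: particles evenly spaced on [4, 6].
    xs = [4.0 + 2.0 * i / (n_particles - 1) for i in range(n_particles)]
    for _ in range(n_iters):
        # Standard normal target: d/dx log pi(x) = -x.
        xs = svgd_step(xs, lambda x: -x, step, bandwidth)
    return xs


particles = run_svgd()
mean = sum(particles) / len(particles)
var = sum((x - mean) ** 2 for x in particles) / len(particles)
print(f"particle mean {mean:.3f}, particle variance {var:.3f}")
```

The particles should drift from around 5 to around 0 and spread out to a variance of order 1. With a finite number of particles and a fixed bandwidth, the match to the target variance is only approximate.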
Rapid Mixing of Gibbs Sampling on Graphs that are Sparse on Average
In this work we show that for every $d < \infty$ and the Ising model defined
on $G(n, d/n)$, there exists a $\beta_d > 0$, such that for all $\beta < \beta_d$, with probability going to 1 as $n \to \infty$, the mixing time of the
dynamics on $G(n, d/n)$ is polynomial in $n$. Our results are the first
polynomial time mixing results proven for a natural model on $G(n, d/n)$ for $d > 1$ where the parameters of the model do not depend on $n$. They also provide
a rare example where one can prove a polynomial time mixing of Gibbs sampler in
a situation where the actual mixing time is slower than $n \, \mathrm{polylog}(n)$. Our
proof exploits in novel ways the local treelike structure of Erdős-Rényi
random graphs, comparison and block dynamics arguments and a recent result of
Weitz.
Our results extend to much more general families of graphs which are sparse
in some average sense and to much more general interactions. In particular,
they apply to any graph for which every vertex $v$ of the graph has a
neighborhood $N(v, r)$ of radius $O(\log n)$ in which the induced sub-graph is a
tree union at most $O(\log n)$ edges and where for each simple path in $N(v, r)$
the sum of the vertex degrees along the path is $O(\log n)$. Moreover, our
results also apply in the case of arbitrary external fields and provide the
first FPRAS for sampling the Ising distribution in this case. We finally
present a non-Markov-chain algorithm for sampling the distribution which is
effective for a wider range of parameters. In particular, for $G(n, d/n)$ it
applies for all external fields and $\beta < \beta_d$, where $d \tanh(\beta_d) = 1$ is the critical point for decay of correlation for the Ising model on
$G(n, d/n)$.
Comment: Corrected proof of Lemma 2.
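As a rough illustration of the dynamics whose mixing time is studied above, the following sketch runs single-site heat-bath (Gibbs) updates for the Ising model on a sparse random graph. It assumes the graph is generated by including each edge independently with probability d/n; the function names and parameter values are hypothetical, and nothing here reflects the proof techniques or the non-Markov-chain algorithm of the paper.

```python
import math
import random


def sparse_random_graph(n, d, rng):
    """Adjacency lists of a graph where each edge is present with probability d/n."""
    neighbors = [[] for _ in range(n)]
    p = d / n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbors[u].append(v)
                neighbors[v].append(u)
    return neighbors


def prob_plus(beta, neighbor_sum, field=0.0):
    """Conditional probability that a spin is +1 given the sum of its neighbours."""
    return 1.0 / (1.0 + math.exp(-2.0 * (beta * neighbor_sum + field)))


def gibbs_sweeps(neighbors, beta, n_sweeps, rng):
    """Run n_sweeps * n single-site updates from a uniformly random start."""
    n = len(neighbors)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(n_sweeps * n):
        v = rng.randrange(n)  # pick a uniformly random vertex
        s = sum(spins[u] for u in neighbors[v])
        spins[v] = 1 if rng.random() < prob_plus(beta, s) else -1
    return spins


rng = random.Random(0)
graph = sparse_random_graph(200, 3.0, rng)
spins = gibbs_sweeps(graph, beta=0.2, n_sweeps=50, rng=rng)
print("magnetization:", sum(spins) / len(spins))
```

Each update resamples one spin from its exact conditional distribution, so the Ising measure is stationary for the chain. How many sweeps are needed for the chain to get close to that measure is the question the abstract addresses.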
General bounds on Bayes errors for regression with Gaussian processes
Based on a simple convexity lemma, we develop bounds for different types of Bayesian prediction errors for regression with Gaussian processes. The basic bounds are formulated for a fixed training set. Simpler expressions are obtained for sampling from an input distribution which equals the weight function of the covariance kernel, yielding asymptotically tight results. The results are compared with numerical experiments.
Uniform Sampling for Matrix Approximation
Random sampling has become a critical tool in solving massive matrix
problems. For linear regression, a small, manageable set of data rows can be
randomly selected to approximate a tall, skinny data matrix, improving
processing time significantly. For theoretical performance guarantees, each row
must be sampled with probability proportional to its statistical leverage
score. Unfortunately, leverage scores are difficult to compute.
A simple alternative is to sample rows uniformly at random. While this often
works, uniform sampling will eliminate critical row information for many
natural instances. We take a fresh look at uniform sampling by examining what
information it does preserve. Specifically, we show that uniform sampling
yields a matrix that, in some sense, well approximates a large fraction of the
original. While this weak form of approximation is not enough for solving
linear regression directly, it is enough to compute a better approximation.
This observation leads to simple iterative row sampling algorithms for matrix
approximation that run in input-sparsity time and preserve row structure and
sparsity at all intermediate steps. In addition to an improved understanding of
uniform sampling, our main proof introduces a structural result of independent
interest: we show that every matrix can be made to have low coherence by
reweighting a small subset of its rows.
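To make the contrast between leverage-score sampling and uniform sampling concrete, here is a small sketch that computes leverage scores as the squared row norms of an orthonormal basis for the column space. It uses a hand-written Gram-Schmidt pass and assumes full column rank. The test matrix, in which a single row carries all the information about one column, is a made-up example, and the iterative algorithm of the paper is not reproduced.

```python
import random


def leverage_scores(rows):
    """Squared row norms of an orthonormal basis for the column space."""
    n, d = len(rows), len(rows[0])
    cols = [[rows[i][j] for i in range(n)] for j in range(d)]
    basis = []
    for col in cols:
        v = col[:]
        for q in basis:  # modified Gram-Schmidt: strip earlier directions
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = sum(a * a for a in v) ** 0.5  # nonzero under full column rank
        basis.append([a / norm for a in v])
    return [sum(q[i] ** 2 for q in basis) for i in range(n)]


rng = random.Random(0)
n, d = 500, 4
# Gaussian entries in the first d-1 columns, zeros in the last column ...
rows = [[rng.gauss(0.0, 1.0) for _ in range(d - 1)] + [0.0] for _ in range(n)]
# ... except row 0, the only row that touches the last column.
rows[0] = [0.0] * (d - 1) + [1.0]

scores = leverage_scores(rows)
miss_prob = (1.0 - 1.0 / n) ** 50  # chance 50 uniform draws never pick row 0
print(f"leverage of row 0: {scores[0]:.3f}, sum of scores: {sum(scores):.3f}")
print(f"probability a uniform sample of 50 rows misses row 0: {miss_prob:.2f}")
```

Row 0 has leverage 1 and the scores sum to the rank (4 here), yet a uniform sample of 50 rows with replacement misses that row with probability about 0.90. This is the kind of instance where uniform sampling discards critical row information.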
Local central limit theorems, the high-order correlations of rejective sampling and logistic likelihood asymptotics
Let I_1,...,I_n be independent but not necessarily identically distributed
Bernoulli random variables, and let X_n=\sum_{j=1}^nI_j. For \nu in a bounded
region, a local central limit theorem expansion of P(X_n=EX_n+\nu) is developed
to any given degree. By conditioning, this expansion provides information on
the high-order correlation structure of dependent, weighted sampling schemes of
a population E (a special case of which is simple random sampling), where a set
d\subset E is sampled with probability proportional to \prod_{A\in d}x_A, where
x_A are positive weights associated with individuals A\in E. These results are
used to determine the asymptotic information, and demonstrate the consistency
and asymptotic normality of the conditional and unconditional logistic
likelihood estimator for unmatched case-control study designs in which sets of
controls of the same size are sampled with equal probability.
Comment: Published at http://dx.doi.org/10.1214/009053604000000706 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
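A small numerical sketch of the setting in the abstract above: the exact distribution of a sum of independent, non-identically distributed Bernoulli variables, computed by repeated convolution, next to the leading Gaussian term of a local limit approximation. Only the leading term is shown; the higher-order expansion developed in the paper is not reproduced, and the success probabilities below are arbitrary illustrative values.

```python
import math


def bernoulli_sum_pmf(probs):
    """Exact pmf of X_n = I_1 + ... + I_n for independent Bernoulli(p_j)."""
    pmf = [1.0]  # distribution of the empty sum
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # I_j = 0
            nxt[k + 1] += mass * p      # I_j = 1
        pmf = nxt
    return pmf


def gaussian_leading_term(probs, k):
    """Normal density with matching mean and variance, evaluated at k."""
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    return math.exp(-((k - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


# Illustrative, non-identical success probabilities between 0.2 and 0.8.
probs = [0.2 + 0.6 * j / 199 for j in range(200)]
pmf = bernoulli_sum_pmf(probs)
k = round(sum(probs))
print(f"exact P(X_n = {k}) = {pmf[k]:.5f}")
print(f"Gaussian leading term = {gaussian_leading_term(probs, k):.5f}")
```

The two printed values should agree to within a few percent near the mean. The remaining gap is what the higher-order terms of a local expansion are meant to capture.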