
    Rational Construction of Stochastic Numerical Methods for Molecular Sampling

    In this article, we focus on the sampling of the configurational Gibbs-Boltzmann distribution, that is, the calculation of averages of functions of the position coordinates of a molecular $N$-body system modelled at constant temperature. We show how a formal series expansion of the invariant measure of a Langevin dynamics numerical method can be obtained in a straightforward way using the Baker-Campbell-Hausdorff lemma. We then compare Langevin dynamics integrators in terms of their invariant distributions and demonstrate a superconvergence property (4th order accuracy where only 2nd order would be expected) of one method in the high friction limit; this method, moreover, can be reduced to a simple modification of the Euler-Maruyama method for Brownian dynamics involving a non-Markovian (coloured noise) random process. In the Brownian dynamics case, 2nd order accuracy of the invariant density is achieved. All methods considered are efficient for molecular applications (requiring one force evaluation per timestep) and of a simple form. In fully resolved (long run) molecular dynamics simulations, for our favoured method, we observe up to two orders of magnitude improvement in configurational sampling accuracy for given stepsize with no evident reduction in the size of the largest usable timestep compared to common alternative methods.
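
    For orientation, here is a minimal sketch (Python/NumPy, written for this summary rather than taken from the paper) of a one-force-evaluation-per-step Langevin splitting integrator of the general kind the article compares; the BAOAB-style ordering shown is illustrative and should not be read as the paper's exact favoured scheme.

```python
import numpy as np

def langevin_splitting_step(q, p, f, force, dt, gamma, kT, mass=1.0,
                            rng=np.random.default_rng()):
    """One step of a BAOAB-style Langevin splitting (illustrative sketch).

    q, p  : positions and momenta
    f     : force at q carried over from the previous step, so only one
            new force evaluation is needed per timestep
    force : callable returning -grad U(q)
    gamma : friction coefficient; kT : temperature in energy units
    """
    p = p + 0.5 * dt * f                        # B: half kick
    q = q + 0.5 * dt * p / mass                 # A: half drift
    c1 = np.exp(-gamma * dt)                    # O: exact Ornstein-Uhlenbeck update
    c2 = np.sqrt((1.0 - c1 ** 2) * kT * mass)
    p = c1 * p + c2 * rng.standard_normal(np.shape(q))
    q = q + 0.5 * dt * p / mass                 # A: half drift
    f = force(q)                                # the single new force evaluation
    p = p + 0.5 * dt * f                        # B: half kick
    return q, p, f                              # return f for reuse in the next step
```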

    Improved Stein Variational Gradient Descent with Importance Weights

    Stein Variational Gradient Descent (SVGD) is a popular sampling algorithm used in various machine learning tasks. It is well known that SVGD arises from a discretization of the kernelized gradient flow of the Kullback-Leibler divergence $D_{KL}(\cdot \mid \pi)$, where $\pi$ is the target distribution. In this work, we propose to enhance SVGD via the introduction of importance weights, which leads to a new method for which we coin the name $\beta$-SVGD. In the continuous time and infinite particles regime, the time for this flow to converge to the equilibrium distribution $\pi$, quantified by the Stein Fisher information, depends on $\rho_0$ and $\pi$ very weakly. This is very different from the kernelized gradient flow of the Kullback-Leibler divergence, whose time complexity depends on $D_{KL}(\rho_0 \mid \pi)$. Under certain assumptions, we provide a descent lemma for the population limit $\beta$-SVGD, which covers the descent lemma for the population limit SVGD when $\beta \to 0$. We also illustrate the advantages of $\beta$-SVGD over SVGD by simple experiments.
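
    As a reference point, the sketch below (Python/NumPy, not from the paper) implements one step of standard SVGD with an RBF kernel, i.e. the baseline flow the abstract refers to; the proposed $\beta$-SVGD reweights these particle contributions with importance weights, which are not reproduced here.

```python
import numpy as np

def rbf_kernel(X, h=1.0):
    """RBF kernel matrix K[i, j] = k(x_i, x_j) and gradients w.r.t. x_i."""
    diff = X[:, None, :] - X[None, :, :]          # shape (n, n, d)
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))
    gradK = -diff / h ** 2 * K[:, :, None]        # gradK[i, j] = grad_{x_i} k(x_i, x_j)
    return K, gradK

def svgd_step(X, grad_log_pi, step=0.1, h=1.0):
    """One standard SVGD update: X[i] += step * phi(x_i), where
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log pi(x_j) + grad_{x_j} k(x_j, x_i) ]."""
    n = X.shape[0]
    K, gradK = rbf_kernel(X, h)
    phi = (K @ grad_log_pi(X) + gradK.sum(axis=0)) / n
    return X + step * phi
```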

    Rapid Mixing of Gibbs Sampling on Graphs that are Sparse on Average

    In this work we show that for every $d < \infty$ and the Ising model defined on $G(n,d/n)$, there exists a $\beta_d > 0$ such that for all $\beta < \beta_d$, with probability going to 1 as $n \to \infty$, the mixing time of the dynamics on $G(n,d/n)$ is polynomial in $n$. Our results are the first polynomial time mixing results proven for a natural model on $G(n,d/n)$ for $d > 1$ where the parameters of the model do not depend on $n$. They also provide a rare example where one can prove polynomial time mixing of a Gibbs sampler in a situation where the actual mixing time is slower than $n\,\mathrm{polylog}(n)$. Our proof exploits in novel ways the local treelike structure of Erdős-Rényi random graphs, comparison and block dynamics arguments, and a recent result of Weitz. Our results extend to much more general families of graphs which are sparse in some average sense and to much more general interactions. In particular, they apply to any graph for which every vertex $v$ of the graph has a neighborhood $N(v)$ of radius $O(\log n)$ in which the induced sub-graph is a tree union at most $O(\log n)$ edges and where for each simple path in $N(v)$ the sum of the vertex degrees along the path is $O(\log n)$. Moreover, our results also apply in the case of arbitrary external fields and provide the first FPRAS for sampling the Ising distribution in this case. We finally present a non-Markov-chain algorithm for sampling the distribution which is effective for a wider range of parameters. In particular, for $G(n,d/n)$ it applies for all external fields and $\beta < \beta_d$, where $d \tanh(\beta_d) = 1$ is the critical point for decay of correlation for the Ising model on $G(n,d/n)$.
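
    For concreteness, here is a minimal sketch (Python/NumPy, illustrative only) of the single-site Gibbs (Glauber) dynamics for the Ising model on a graph, the kind of dynamics whose mixing time the paper analyses; the random graph $G(n,d/n)$, the external field, and the restriction $\beta < \beta_d$ are inputs to this sketch, not part of it.

```python
import numpy as np

def gibbs_sweep(adj, spins, beta, h=0.0, rng=np.random.default_rng()):
    """One sweep of Glauber/Gibbs dynamics for the Ising model on a graph.

    adj   : adjacency list, adj[v] = iterable of neighbours of v
    spins : array of +/-1 spins, updated in place
    beta  : inverse temperature
    h     : external field (scalar or per-vertex array)
    """
    n = len(spins)
    field = np.broadcast_to(h, (n,))
    for v in rng.permutation(n):
        local = sum(spins[u] for u in adj[v]) + field[v]
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * local))   # P(spin_v = +1 | rest)
        spins[v] = 1 if rng.random() < p_plus else -1
    return spins
```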

    General bounds on Bayes errors for regression with Gaussian processes

    Based on a simple convexity lemma, we develop bounds for different types of Bayesian prediction errors for regression with Gaussian processes. The basic bounds are formulated for a fixed training set. Simpler expressions are obtained for sampling from an input distribution which equals the weight function of the covariance kernel, yielding asymptotically tight results. The results are compared with numerical experiments.
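
    For a fixed training set, the quantity such bounds control is the Bayesian mean-squared prediction error of Gaussian process regression, which for a correctly specified model coincides with the posterior predictive variance of the latent function; the short sketch below (Python/NumPy, not from the paper, with an assumed squared-exponential kernel) computes that baseline quantity, not the bounds themselves.

```python
import numpy as np

def rbf(A, B, h=1.0):
    """Squared-exponential covariance kernel (an assumed example choice)."""
    sq = np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * h ** 2))

def gp_predictive_variance(X_train, x_star, kernel=rbf, noise_var=0.1):
    """Posterior variance of the GP latent function at a 1-D test point x_star,
    i.e. the Bayes mean-squared prediction error for a fixed training set
    when the model is correctly specified."""
    K = kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    k_star = kernel(X_train, x_star[None, :])[:, 0]
    prior_var = kernel(x_star[None, :], x_star[None, :])[0, 0]
    return prior_var - k_star @ np.linalg.solve(K, k_star)
```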

    Uniform Sampling for Matrix Approximation

    Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. Unfortunately, leverage scores are difficult to compute. A simple alternative is to sample rows uniformly at random. While this often works, uniform sampling will eliminate critical row information for many natural instances. We take a fresh look at uniform sampling by examining what information it does preserve. Specifically, we show that uniform sampling yields a matrix that, in some sense, well approximates a large fraction of the original. While this weak form of approximation is not enough for solving linear regression directly, it is enough to compute a better approximation. This observation leads to simple iterative row sampling algorithms for matrix approximation that run in input-sparsity time and preserve row structure and sparsity at all intermediate steps. In addition to an improved understanding of uniform sampling, our main proof introduces a structural result of independent interest: we show that every matrix can be made to have low coherence by reweighting a small subset of its rows.
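
    As background for the comparison the abstract draws, here is a small sketch (Python/NumPy, illustrative, not the paper's algorithm) of leverage-score row sampling, the scheme with the standard guarantees; uniform sampling simply replaces the probabilities with 1/n for the same budget.

```python
import numpy as np

def leverage_scores(A):
    """Exact statistical leverage scores: squared row norms of an
    orthonormal basis for the column space of A."""
    Q, _ = np.linalg.qr(A)                  # thin QR of a tall, skinny A
    return np.sum(Q ** 2, axis=1)

def sample_rows(A, probs, k, rng=np.random.default_rng()):
    """Sample k rows with the given probabilities (summing to 1) and rescale
    them so the sketch S satisfies E[S.T @ S] = A.T @ A."""
    idx = rng.choice(A.shape[0], size=k, replace=True, p=probs)
    return A[idx] / np.sqrt(k * probs[idx])[:, None]

# Leverage-score sampling vs. plain uniform sampling of the same budget k:
# lev = leverage_scores(A); S_lev = sample_rows(A, lev / lev.sum(), k)
# S_uni = sample_rows(A, np.full(A.shape[0], 1.0 / A.shape[0]), k)
```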

    Local central limit theorems, the high-order correlations of rejective sampling and logistic likelihood asymptotics

    Let $I_1,\dots,I_n$ be independent but not necessarily identically distributed Bernoulli random variables, and let $X_n=\sum_{j=1}^n I_j$. For $\nu$ in a bounded region, a local central limit theorem expansion of $P(X_n=EX_n+\nu)$ is developed to any given degree. By conditioning, this expansion provides information on the high-order correlation structure of dependent, weighted sampling schemes of a population $E$ (a special case of which is simple random sampling), where a set $d\subset E$ is sampled with probability proportional to $\prod_{A\in d}x_A$, where $x_A$ are positive weights associated with individuals $A\in E$. These results are used to determine the asymptotic information, and demonstrate the consistency and asymptotic normality of the conditional and unconditional logistic likelihood estimator for unmatched case-control study designs in which sets of controls of the same size are sampled with equal probability.
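
    The weighted sampling scheme described, a set $d\subset E$ of fixed size drawn with probability proportional to $\prod_{A\in d}x_A$, is rejective (conditional Poisson) sampling; a minimal rejection-based sketch (Python/NumPy, for illustration only) is:

```python
import numpy as np

def rejective_sample(weights, n, max_tries=100000, rng=np.random.default_rng()):
    """Draw a size-n sample d with P(d) proportional to prod_{A in d} x_A
    (rejective / conditional Poisson sampling), by rejection.

    Including each unit A independently with probability x_A / (1 + x_A)
    makes P(subset d) proportional to prod_{A in d} x_A; conditioning on the
    realised sample size being exactly n preserves that proportionality.
    """
    w = np.asarray(weights, dtype=float)
    p = w / (1.0 + w)                       # inclusion probs of the Poisson design
    for _ in range(max_tries):
        include = rng.random(len(w)) < p
        if include.sum() == n:
            return np.flatnonzero(include)  # indices of the sampled individuals
    raise RuntimeError("rejection sampling did not hit the target size")
```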