    Information-geometric Markov Chain Monte Carlo methods using Diffusions

    Recent work incorporating geometric ideas in Markov chain Monte Carlo is reviewed in order to highlight these advances and their possible application in a range of domains beyond Statistics. A full exposition of Markov chains and their use in Monte Carlo simulation for statistical inference and molecular dynamics is provided, with particular emphasis on methods based on Langevin diffusions. After this, geometric concepts in Markov chain Monte Carlo are introduced. A full derivation of the Langevin diffusion on a Riemannian manifold is given, together with a discussion of appropriate Riemannian metric choice for different problems. A survey of applications is provided, and some open questions are discussed. Comment: 22 pages, 2 figures.
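    For concreteness, the simplest diffusion-based method this line of work builds on is the Metropolis-adjusted Langevin algorithm (MALA). Below is a minimal Euclidean sketch (not taken from the paper); `log_target`, `grad_log_target`, and the step size `h` are assumed user-supplied. The Riemannian variants the review discusses replace the identity preconditioner here with a position-dependent metric.

```python
import numpy as np

def mala_step(x, log_target, grad_log_target, h, rng):
    """One Metropolis-adjusted Langevin (MALA) step with step size h."""
    # Euler-Maruyama proposal from the overdamped Langevin diffusion
    mean_x = x + 0.5 * h * grad_log_target(x)
    y = mean_x + np.sqrt(h) * rng.standard_normal(x.shape)
    # Gaussian log proposal densities q(y|x) and q(x|y), up to shared constants
    mean_y = y + 0.5 * h * grad_log_target(y)
    log_q_y_given_x = -np.sum((y - mean_x) ** 2) / (2 * h)
    log_q_x_given_y = -np.sum((x - mean_y) ** 2) / (2 * h)
    # Metropolis-Hastings correction makes the target distribution exact
    log_alpha = log_target(y) - log_target(x) + log_q_x_given_y - log_q_y_given_x
    if np.log(rng.uniform()) < log_alpha:
        return y
    return x
```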

    Quasi-symplectic Langevin Variational Autoencoder

    The variational autoencoder (VAE) is a popular and well-investigated generative model widely used in neural learning research. Leveraging VAEs in practical tasks involving massive, high-dimensional datasets requires constructing low-variance evidence lower bounds (ELBOs). Markov chain Monte Carlo (MCMC) is one effective approach to tightening the ELBO when approximating the posterior distribution. The Hamiltonian Variational Autoencoder (HVAE) is an effective MCMC-inspired approach for constructing a low-variance ELBO that is also amenable to the reparameterization trick. In this work, we propose a quasi-symplectic Langevin variational autoencoder (Langevin-VAE) that incorporates gradient information into the inference process through Langevin dynamics. We show the effectiveness of the proposed approach on toy and real-world examples.
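    As a rough illustration of the general idea, not the paper's exact quasi-symplectic integrator, the sketch below refines an encoder sample with a few unadjusted Langevin steps. The names `grad_log_joint`, `eps`, and `n_steps` are hypothetical: the gradient of log p(x, z) in z for a fixed observation, the step size, and the step count.

```python
import numpy as np

def langevin_refine(z0, grad_log_joint, eps, n_steps, rng):
    """Push an initial VAE latent sample z0 toward the posterior with
    unadjusted Langevin steps, injecting gradient information into
    the inference process as in Langevin-based VAE schemes."""
    z = z0.copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z + 0.5 * eps * grad_log_joint(z) + np.sqrt(eps) * noise
    return z
```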

    Geometric ergodicity of the Random Walk Metropolis with position-dependent proposal covariance

    We consider a Metropolis-Hastings method with proposal kernel $\mathcal{N}(x, hG^{-1}(x))$, where $x$ is the current state. After discussing specific cases from the literature, we analyse the ergodicity properties of the resulting Markov chains. In one dimension we find that a suitable choice of $G^{-1}(x)$ can change the ergodicity properties compared to the Random Walk Metropolis case $\mathcal{N}(x, h\Sigma)$, either for the better or the worse. In higher dimensions we use a specific example to show that a judicious choice of $G^{-1}(x)$ can produce a chain which converges at a geometric rate to its limiting distribution when probability concentrates on an ever narrower ridge as $|x|$ grows, something which is not true for the Random Walk Metropolis. Comment: 15 pages + appendices, 4 figures.
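    A minimal sketch of one such transition, assuming a user-supplied `G_inv` that returns the proposal preconditioner at a given state: because the proposal covariance moves with the state, the acceptance ratio must include a Hastings correction with both proposal densities.

```python
import numpy as np

def pdrwm_step(x, log_target, G_inv, h, rng):
    """Metropolis-Hastings step with proposal N(x, h * G_inv(x))."""
    C_x = h * G_inv(x)
    y = rng.multivariate_normal(x, C_x)
    C_y = h * G_inv(y)

    def log_gauss(b, mean, C):
        # log N(b; mean, C) up to the shared -d/2 * log(2*pi) constant
        d = b - mean
        _, logdet = np.linalg.slogdet(C)
        return -0.5 * (logdet + d @ np.linalg.solve(C, d))

    # Hastings correction: q(y|x) != q(x|y) since the covariance is position-dependent
    log_alpha = (log_target(y) + log_gauss(x, y, C_y)
                 - log_target(x) - log_gauss(y, x, C_x))
    if np.log(rng.uniform()) < log_alpha:
        return y
    return x
```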

    Affine invariant interacting Langevin dynamics for Bayesian inference

    We propose a computational method (with acronym ALDI) for sampling from a given target distribution based on first-order (overdamped) Langevin dynamics which satisfies the property of affine invariance. The central idea of ALDI is to run an ensemble of particles with their empirical covariance serving as a preconditioner for their underlying Langevin dynamics. ALDI does not require taking the inverse or square root of the empirical covariance matrix, which enables application to high-dimensional sampling problems. The theoretical properties of ALDI are studied in terms of non-degeneracy and ergodicity. Furthermore, we study its connections to diffusions on Riemannian manifolds and Wasserstein gradient flows. Bayesian inference serves as a main application area for ALDI. In the case of a forward problem with additive Gaussian measurement errors, ALDI allows for a gradient-free implementation in the spirit of the ensemble Kalman filter. A computational comparison between gradient-free and gradient-based ALDI is provided for a PDE-constrained Bayesian inverse problem.
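    The inverse- and square-root-free trick can be sketched as follows: with the particles stacked in a J x d array, the matrix of centered deviations divided by sqrt(J) is itself a square-root factor of the empirical covariance. The step below is a minimal reading of such a scheme, not the paper's exact formulation; in particular the drift correction term is our understanding of the affine-invariant correction and should be checked against the paper.

```python
import numpy as np

def aldi_step(X, grad_log_target, dt, rng):
    """One Euler-Maruyama step of ensemble-preconditioned Langevin dynamics.

    X has shape (J, d): J particles in d dimensions. The empirical
    covariance C(X) preconditions each particle's drift, and its square
    root comes for free from the centered deviations, so no matrix
    inverse or square root is ever formed.
    """
    J, d = X.shape
    m = X.mean(axis=0)
    dev = (X - m) / np.sqrt(J)                 # dev.T @ dev == empirical covariance C(X)
    grads = np.stack([grad_log_target(x) for x in X])   # shape (J, d)
    drift = grads @ (dev.T @ dev)              # row i is C(X) @ grad_log_target(x_i)
    correction = ((d + 1) / J) * (X - m)       # drift correction keeping the target invariant
    noise = rng.standard_normal((J, J)) @ dev  # per-particle noise with covariance C(X)
    return X + dt * (drift + correction) + np.sqrt(2.0 * dt) * noise
```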

    Langevin and Hamiltonian based Sequential MCMC for Efficient Bayesian Filtering in High-dimensional Spaces

    Nonlinear, non-Gaussian state-space models arise in numerous applications in statistics and signal processing. In this context, one of the most successful and popular approximation techniques is the Sequential Monte Carlo (SMC) algorithm, also known as particle filtering. Nevertheless, this method tends to be inefficient when applied to high-dimensional problems. In this paper, we focus on another class of sequential inference methods, namely the Sequential Markov Chain Monte Carlo (SMCMC) techniques, which represent a promising alternative to SMC methods. After providing a unifying framework for the class of SMCMC approaches, we propose novel efficient strategies based on the principle of Langevin diffusion and Hamiltonian dynamics in order to cope with the increasing number of high-dimensional applications. Simulation results show that the proposed algorithms achieve significantly better performance compared to existing algorithms.
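    Of the two proposal families, the Hamiltonian one is sketched below as a single HMC move targeting the unnormalized time-t filtering posterior; a generic sketch rather than the paper's algorithm. `log_post` and `grad_log_post` are assumed user-supplied, and the leapfrog step size `eps` and step count `L` are tuning assumptions.

```python
import numpy as np

def hmc_move(x, log_post, grad_log_post, eps, L, rng):
    """One Hamiltonian Monte Carlo move: resample momentum, integrate
    Hamiltonian dynamics with L leapfrog steps of size eps, then
    accept or reject on the change in total energy."""
    p0 = rng.standard_normal(x.shape)
    q, p = x.copy(), p0.copy()
    p = p + 0.5 * eps * grad_log_post(q)       # half step for momentum
    for i in range(L):
        q = q + eps * p                        # full step for position
        if i < L - 1:
            p = p + eps * grad_log_post(q)     # full step for momentum
    p = p + 0.5 * eps * grad_log_post(q)       # final half step for momentum
    # Total energy H = -log_post + kinetic; accept with prob min(1, exp(H_old - H_new))
    h_new = -log_post(q) + 0.5 * np.sum(p ** 2)
    h_old = -log_post(x) + 0.5 * np.sum(p0 ** 2)
    if np.log(rng.uniform()) < h_old - h_new:
        return q
    return x
```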