Information-geometric Markov Chain Monte Carlo methods using Diffusions
Recent work incorporating geometric ideas in Markov chain Monte Carlo is
reviewed in order to highlight these advances and their possible application in
a range of domains beyond statistics. A full exposition of Markov chains and
their use in Monte Carlo simulation for statistical inference and molecular
dynamics is provided, with particular emphasis on methods based on Langevin
diffusions. After this, geometric concepts in Markov chain Monte Carlo are
introduced. A full derivation of the Langevin diffusion on a Riemannian
manifold is given, together with a discussion of the appropriate choice of
Riemannian metric for different problems. A survey of applications is provided,
and some open questions are discussed.
Comment: 22 pages, 2 figures
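As a concrete illustration of the Langevin-diffusion samplers this review covers, a minimal Metropolis-adjusted Langevin algorithm (MALA) can be sketched as follows. The target, step size, and function names here are illustrative choices, not taken from the paper:

```python
import numpy as np

def mala(log_pi, grad_log_pi, x0, step, n_iter, rng):
    """Metropolis-adjusted Langevin algorithm (MALA).

    Proposes y = x + step * grad log pi(x) + sqrt(2 * step) * noise,
    then accepts or rejects with the Metropolis-Hastings correction.
    """
    x = np.asarray(x0, dtype=float)
    samples = [x.copy()]

    def log_q(y, x):  # log density (up to a constant) of the proposal y | x
        mean = x + step * grad_log_pi(x)
        return -np.sum((y - mean) ** 2) / (4.0 * step)

    for _ in range(n_iter):
        y = x + step * grad_log_pi(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
        log_alpha = log_pi(y) - log_pi(x) + log_q(x, y) - log_q(y, x)
        if np.log(rng.uniform()) < log_alpha:
            x = y
        samples.append(x.copy())
    return np.array(samples)

# Standard 2-D Gaussian target, started away from the mode.
rng = np.random.default_rng(0)
log_pi = lambda x: -0.5 * np.sum(x ** 2)
grad_log_pi = lambda x: -x
chain = mala(log_pi, grad_log_pi, np.array([3.0, -3.0]), step=0.5, n_iter=5000, rng=rng)
print(chain.mean(axis=0))  # should be close to (0, 0)
```

The gradient term pulls proposals toward high-probability regions, which is the basic advantage of Langevin proposals over a blind random walk; the Riemannian-manifold versions surveyed in the paper replace the identity preconditioner with a position-dependent metric.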
Quasi-symplectic Langevin Variational Autoencoder
The variational autoencoder (VAE) is a popular and well-investigated
generative model widely used in neural learning research. To leverage VAEs in
practical tasks involving massive, high-dimensional datasets, one must deal
with the difficulty of building low-variance evidence lower bounds (ELBOs).
Markov chain Monte Carlo (MCMC) is one effective approach to tightening the
ELBO when approximating the posterior distribution. The Hamiltonian
Variational Autoencoder (HVAE) is an effective MCMC-inspired approach for
constructing a low-variance ELBO that is also amenable to the
reparameterization trick. In this work, we propose a Quasi-symplectic Langevin
Variational Autoencoder (Langevin-VAE) that incorporates gradient information
into the inference process through Langevin dynamics. We show the
effectiveness of the proposed approach on toy and real-world examples.
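The quasi-symplectic integrator itself is not reproduced here, but the generic idea of refining approximate posterior samples with Langevin dynamics can be sketched in a toy setting. The Gaussian posterior, step size, and number of steps below are illustrative assumptions, not the authors' scheme:

```python
import numpy as np

def langevin_refine(z0, grad_log_post, step, k, rng):
    """Run k unadjusted Langevin steps starting from an approximate sample z0.

    Each step moves z along the gradient of the log posterior plus injected
    noise, shrinking the gap between the crude initial distribution and the
    true posterior.
    """
    z = np.asarray(z0, dtype=float)
    for _ in range(k):
        z = z + step * grad_log_post(z) + np.sqrt(2 * step) * rng.standard_normal(z.shape)
    return z

# Toy posterior N(mu, 1) with mu = 2; a crude "encoder" proposes from N(0, 1).
rng = np.random.default_rng(1)
mu = 2.0
grad_log_post = lambda z: -(z - mu)
z0 = rng.standard_normal(1000)   # biased initial samples
z = langevin_refine(z0, grad_log_post, step=0.1, k=50, rng=rng)
print(z.mean())  # drifts from ~0 toward ~2
```

In a VAE this refinement would run in latent space with the gradient of the joint log density, and making each step differentiable is what keeps the resulting ELBO compatible with the reparameterization trick.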
Geometric ergodicity of the Random Walk Metropolis with position-dependent proposal covariance
We consider a Metropolis-Hastings method with proposal kernel
$\mathcal{N}(x, h\Sigma(x))$, where $x$ is the current state. After discussing
specific cases from the literature, we analyse the ergodicity properties of the
resulting Markov chains. In one dimension we find that suitable choice of
$\Sigma(x)$ can change the ergodicity properties compared to the Random Walk
Metropolis case $\Sigma(x) \equiv \Sigma$, either for the better or worse. In
higher dimensions we use a specific example to show that judicious choice of
$\Sigma(x)$ can produce a chain which will converge at a geometric rate to its
limiting distribution when probability concentrates on an ever narrower ridge
as the dimension grows, something which is not true for the Random Walk Metropolis.
Comment: 15 pages + appendices, 4 figures
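A one-dimensional sketch makes the mechanics concrete: when the proposal scale depends on the current state, the proposal is no longer symmetric, so the acceptance ratio must include the proposal densities. The heavy-tailed target and the particular scale function below are assumptions for illustration, not the examples from the paper:

```python
import numpy as np

def pd_rwm(log_pi, sigma, x0, n_iter, rng):
    """Random Walk Metropolis with position-dependent proposal scale.

    Proposal: y ~ N(x, sigma(x)^2). Because sigma depends on x, the kernel
    is asymmetric and the acceptance ratio needs q(x | y) / q(y | x).
    """
    x = float(x0)
    samples = []
    log_q = lambda y, x: -0.5 * ((y - x) / sigma(x)) ** 2 - np.log(sigma(x))
    for _ in range(n_iter):
        y = x + sigma(x) * rng.standard_normal()
        log_alpha = log_pi(y) - log_pi(x) + log_q(x, y) - log_q(y, x)
        if np.log(rng.uniform()) < log_alpha:
            x = y
        samples.append(x)
    return np.array(samples)

# Heavy-tailed 1-D target; growing the proposal scale with |x| lets the
# chain take larger steps out in the tails.
rng = np.random.default_rng(2)
log_pi = lambda x: -np.log1p(x ** 2)   # Cauchy-like target (unnormalised)
sigma = lambda x: 1.0 + abs(x)         # position-dependent proposal scale
chain = pd_rwm(log_pi, sigma, 0.0, 20000, rng)
print(np.median(chain))  # the target's median is 0
```

Larger tail steps of this kind are exactly the mechanism by which a well-chosen state-dependent covariance can restore geometric ergodicity where the constant-covariance Random Walk Metropolis fails.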
Affine invariant interacting Langevin dynamics for Bayesian inference
We propose a computational method (with acronym ALDI) for sampling from a given target distribution based on first-order (overdamped) Langevin dynamics which satisfies the property of affine invariance. The central idea of ALDI is to run an ensemble of particles with their empirical covariance serving as a preconditioner for their underlying Langevin dynamics. ALDI does not require taking the inverse or square root of the empirical covariance matrix, which enables application to high-dimensional sampling problems. The theoretical properties of ALDI are studied in terms of non-degeneracy and ergodicity. Furthermore, we study its connections to diffusions on Riemannian manifolds and Wasserstein gradient flows.
Bayesian inference serves as a main application area for ALDI. In the case of a forward problem with additive Gaussian measurement errors, ALDI allows for a gradient-free implementation in the spirit of the ensemble Kalman filter. A computational comparison between gradient-free and gradient-based ALDI is provided for a PDE-constrained Bayesian inverse problem.
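The ensemble-preconditioning idea can be sketched in a few lines. This is a simplified approximation of the stated dynamics on an assumed Gaussian target: it omits ALDI's finite-ensemble correction term, and all names and tuning values are illustrative:

```python
import numpy as np

def ensemble_langevin_step(X, grad_log_pi, dt, rng):
    """One step of covariance-preconditioned interacting Langevin dynamics.

    Each particle is driven by the ensemble's empirical covariance C(X)
    acting on its own gradient, and noise is injected through the centred
    particle deviations D, which act as a square root of C(X); no inverse
    or matrix square root of C(X) is ever formed. The O(1/J) correction
    term from the paper is omitted for brevity.
    """
    J, d = X.shape
    m = X.mean(axis=0)
    D = (X - m) / np.sqrt(J)     # J x d; D.T @ D is the empirical covariance
    C = D.T @ D
    G = np.array([grad_log_pi(x) for x in X])
    noise = rng.standard_normal((J, J))
    # noise @ D gives each particle Gaussian noise with covariance C
    return X + dt * G @ C + np.sqrt(2 * dt) * noise @ D

# Gaussian target N(mu, I): the particle cloud drifts toward mu.
rng = np.random.default_rng(3)
mu = np.array([1.0, -1.0])
grad_log_pi = lambda x: -(x - mu)
X = rng.standard_normal((50, 2)) * 3.0
for _ in range(500):
    X = ensemble_langevin_step(X, grad_log_pi, dt=0.05, rng=rng)
print(X.mean(axis=0))  # near (1, -1)
```

Because both the drift and the noise are built from the ensemble itself, the update transforms consistently under affine changes of coordinates, which is the property the method is named for.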
Langevin and Hamiltonian based Sequential MCMC for Efficient Bayesian Filtering in High-dimensional Spaces
Nonlinear non-Gaussian state-space models arise in numerous applications in
statistics and signal processing. In this context, one of the most successful
and popular approximation techniques is the Sequential Monte Carlo (SMC)
algorithm, also known as particle filtering. Nevertheless, this method tends to
be inefficient when applied to high-dimensional problems. In this paper, we
focus on another class of sequential inference methods, namely the Sequential
Markov Chain Monte Carlo (SMCMC) techniques, which represent a promising
alternative to SMC methods. After providing a unifying framework for the class
of SMCMC approaches, we propose novel efficient strategies based on the
principle of Langevin diffusion and Hamiltonian dynamics in order to cope with
the increasing number of high-dimensional applications. Simulation results show
that the proposed algorithms achieve significantly better performance than
existing algorithms.
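The Langevin-proposal flavour of sequential MCMC can be sketched for a toy one-dimensional linear-Gaussian model. The model, the single-sample approximation of the previous state, and all tuning constants below are illustrative assumptions, not the algorithms proposed in the paper:

```python
import numpy as np

def mala_kernel(x, log_pi, grad, step, rng):
    """One MALA transition targeting log_pi (gradient-informed proposal)."""
    y = x + step * grad(x) + np.sqrt(2 * step) * rng.standard_normal()
    lq = lambda b, a: -((b - a - step * grad(a)) ** 2) / (4 * step)
    if np.log(rng.uniform()) < log_pi(y) - log_pi(x) + lq(x, y) - lq(y, x):
        return y
    return x

# Toy 1-D state-space model:
#   x_t = 0.9 x_{t-1} + N(0, 1),   y_t = x_t + N(0, 0.5^2)
# At each time step a short Langevin-proposal MCMC chain targets
# p(x_t | x_{t-1}, y_t) instead of importance-weighting particles;
# here the previous state is crudely summarised by a single sample.
rng = np.random.default_rng(4)
T, n_mcmc = 50, 20
x_true, x_est = 0.0, 0.0
for t in range(T):
    x_true = 0.9 * x_true + rng.standard_normal()
    y = x_true + 0.5 * rng.standard_normal()
    log_pi = lambda x, xp=x_est, y=y: (-0.5 * (x - 0.9 * xp) ** 2
                                       - 0.5 * ((y - x) / 0.5) ** 2)
    grad = lambda x, xp=x_est, y=y: -(x - 0.9 * xp) + (y - x) / 0.25
    for _ in range(n_mcmc):
        x_est = mala_kernel(x_est, log_pi, grad, 0.1, rng)
print(abs(x_est - x_true))  # the chain tracks the latent state
```

Replacing the random-walk kernel with a gradient-informed (Langevin or Hamiltonian) kernel is what allows such sequential MCMC schemes to remain effective as the state dimension grows.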