Dimension-Independent MCMC Sampling for Inverse Problems with Non-Gaussian Priors
The computational complexity of MCMC methods for the exploration of complex
probability measures is a challenging and important problem. A challenge of
particular importance arises in Bayesian inverse problems where the target
distribution may be supported on an infinite dimensional space. In practice
this involves the approximation of measures defined on sequences of spaces of
increasing dimension. Motivated by an elliptic inverse problem with
non-Gaussian prior, we study the design of proposal chains for the
Metropolis-Hastings algorithm with dimension-independent performance.
Dimension-independent bounds on the Monte Carlo error of MCMC sampling for
Gaussian prior measures have already been established. In this paper we provide
a simple recipe to obtain these bounds for non-Gaussian prior measures. To
illustrate the theory we consider an elliptic inverse problem arising in
groundwater flow. We explicitly construct an efficient Metropolis-Hastings
proposal based on local proposals, and we provide numerical evidence which
supports the theory.
Comment: 26 pages, 7 figures
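The Metropolis-Hastings accept-reject mechanism with local proposals described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's elliptic inverse problem: the Gaussian random-walk proposal, step size, and standard-Gaussian target are assumptions made for the example.

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_steps, step=0.5, rng=None):
    """Random-walk Metropolis-Hastings: propose a local Gaussian move,
    accept with probability min(1, pi(x') / pi(x))."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    log_p = log_target(x)
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal(x.size)  # local proposal
        log_p_new = log_target(proposal)
        if np.log(rng.uniform()) < log_p_new - log_p:      # accept-reject step
            x, log_p = proposal, log_p_new
        samples[i] = x
    return samples

# Illustrative target: standard Gaussian in two dimensions.
samples = metropolis_hastings(lambda x: -0.5 * x @ x, np.zeros(2), 5000)
```

The dimension-dependence problem the abstract addresses shows up here in the step size: for a naive random walk, `step` must shrink as the dimension grows to keep the acceptance rate reasonable.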
Spectral gaps for a Metropolis-Hastings algorithm in infinite dimensions
We study the problem of sampling high and infinite dimensional target
measures arising in applications such as conditioned diffusions and inverse
problems. We focus on those that arise from approximating measures on Hilbert
spaces defined via a density with respect to a Gaussian reference measure. We
consider the Metropolis-Hastings algorithm that adds an accept-reject mechanism
to a Markov chain proposal in order to make the chain reversible with respect
to the target measure. We focus on cases where the proposal is either a
Gaussian random walk (RWM) with covariance equal to that of the reference
measure or an Ornstein-Uhlenbeck proposal (pCN) for which the reference measure
is invariant. Previous results in terms of scaling and diffusion limits
suggested that the pCN has a convergence rate that is independent of the
dimension while the RWM method has undesirable dimension-dependent behaviour.
We confirm this claim by exhibiting a dimension-independent Wasserstein
spectral gap for the pCN algorithm for a large class of target measures. In our
setting this Wasserstein spectral gap implies an L^2-spectral gap. We use
both spectral gaps to show that the ergodic average satisfies a strong law of
large numbers, the central limit theorem and nonasymptotic bounds on the mean
square error, all dimension independent. In contrast we show that the spectral
gap of the RWM algorithm applied to the reference measures degenerates as the
dimension tends to infinity.
Comment: Published at http://dx.doi.org/10.1214/13-AAP982 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
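For a target defined by a density exp(-Phi) with respect to a Gaussian reference measure N(0, C), one pCN step replaces the random-walk proposal with the Ornstein-Uhlenbeck move x' = sqrt(1 - beta^2) x + beta xi, xi ~ N(0, C), which preserves the reference measure, so the acceptance ratio involves only Phi. A minimal finite-dimensional sketch; the identity covariance, the quartic Phi, and the value of beta are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def pcn(phi, x0, n_steps, beta=0.2, rng=None):
    """Preconditioned Crank-Nicolson MH for targets with density
    proportional to exp(-phi(x)) w.r.t. the reference N(0, I).
    The OU proposal leaves the reference invariant, so the
    acceptance ratio depends only on phi, not the full density."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float)
    phi_x = phi(x)
    samples = np.empty((n_steps, x.size))
    for i in range(n_steps):
        xi = rng.standard_normal(x.size)                  # draw from N(0, I)
        prop = np.sqrt(1.0 - beta**2) * x + beta * xi     # OU (pCN) proposal
        phi_prop = phi(prop)
        if np.log(rng.uniform()) < phi_x - phi_prop:      # accept on phi alone
            x, phi_x = prop, phi_prop
        samples[i] = x
    return samples

# Illustrative phi: a mild quartic perturbation of the Gaussian reference.
samples = pcn(lambda x: 0.1 * np.sum(x**4), np.zeros(3), 5000)
```

Because the acceptance probability is independent of the discretisation level, this proposal is the one the abstract shows to have a dimension-independent spectral gap, in contrast to the random walk.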
The Emergence of Three Human Development Clubs
We examine the joint distribution of levels of income per capita, life expectancy, and years of schooling across countries in 1960 and in 2000. In 1960 countries were clustered in two groups: a rich, highly educated, high-longevity "developed" group and a poor, less educated, high-mortality "underdeveloped" group. By 2000, however, we see the emergence of three groups: one underdeveloped group remaining near 1960 levels, a developed group with higher levels of education, income, and health than in 1960, and an intermediate group lying between these two. This finding is consistent both with the idea of a new "middle-income trap" that countries face even if they escape the "low-income trap", and with the notion that countries which escaped the poverty trap form a temporary "transition regime" along their path to the "developed" group.
Multilevel Monte Carlo methods for the approximation of invariant measures of stochastic differential equations
We develop a framework that allows the use of the multi-level Monte Carlo
(MLMC) methodology (Giles, 2015) to calculate expectations with respect to the
invariant measure of an ergodic SDE. In that context, we study the
(over-damped) Langevin equations with a strongly concave potential. We show
that, when appropriate contracting couplings for the numerical integrators are
available, one can obtain a uniform in time estimate of the MLMC variance in
contrast to the majority of the results in the MLMC literature. As a
consequence, a root mean square error of O(ε) is achieved
with O(ε^{-2}) complexity on par with Markov Chain Monte
Carlo (MCMC) methods, which however can be computationally intensive when
applied to large data sets. Finally, we present a multi-level version of the
recently introduced Stochastic Gradient Langevin Dynamics (SGLD) method
(Welling and Teh, 2011), built for large-dataset applications. We show that
this is the first stochastic gradient MCMC method with complexity
O(ε^{-2} |log ε|^3), in contrast to the O(ε^{-3})
complexity of currently available methods.
Numerical experiments confirm our theoretical findings.
Comment: 25 pages, 8 figures
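The MLMC methodology referenced above rests on the telescoping identity E[P_L] = E[P_0] + sum_l E[P_l - P_{l-1}], where each correction term is estimated from a coupled pair of discretisations sharing the same Brownian increments. A minimal sketch; the Ornstein-Uhlenbeck test equation dX = -X dt + dW, the functional f(x) = x^2, and all parameter choices are illustrative assumptions, and the paper's uniform-in-time coupling analysis is not reproduced here:

```python
import numpy as np

def euler_pair(h, n_steps, rng):
    """One coupled sample of f(X_T) = X_T**2 at step sizes h (fine) and
    2h (coarse) for dX = -X dt + dW, X_0 = 1. The two discretisations
    share Brownian increments: this coupling makes the fine-coarse
    difference small, so correction levels need few samples."""
    xf = xc = 1.0
    for _ in range(n_steps // 2):
        dw1 = np.sqrt(h) * rng.standard_normal()
        dw2 = np.sqrt(h) * rng.standard_normal()
        xf += -xf * h + dw1                    # two fine Euler steps
        xf += -xf * h + dw2
        xc += -xc * 2 * h + (dw1 + dw2)        # one coarse step, same noise
    return xf**2, xc**2

def mlmc(T=1.0, L=4, n_per_level=4000, rng=None):
    """Telescoping MLMC estimator of E[X_T**2]:
    E[P_L] = E[P_0] + sum_{l=1}^{L} E[P_l - P_{l-1}]."""
    rng = np.random.default_rng(rng)
    # Level 0: plain (vectorised) Euler at the coarsest step size, 2 steps.
    h0 = T / 2
    x = np.full(n_per_level, 1.0)
    for _ in range(2):
        x += -x * h0 + np.sqrt(h0) * rng.standard_normal(n_per_level)
    est = np.mean(x**2)
    # Correction levels: averages of coupled fine-coarse differences.
    for l in range(1, L + 1):
        n_steps = 2 ** (l + 1)                 # fine level halves the step
        diffs = [np.subtract(*euler_pair(T / n_steps, n_steps, rng))
                 for _ in range(n_per_level)]
        est += np.mean(diffs)
    return est
```

For this OU process the exact reference value is E[X_1^2] = e^{-2} + (1 - e^{-2})/2 ≈ 0.568, which the estimator should approach as L and the sample sizes grow.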
Energy Discrepancies: A Score-Independent Loss for Energy-Based Models
Energy-based models are a simple yet powerful class of probabilistic models,
but their widespread adoption has been limited by the computational burden of
training them. We propose a novel loss function called Energy Discrepancy (ED)
which does not rely on the computation of scores or expensive Markov chain
Monte Carlo. We show that ED approaches the explicit score matching and
negative log-likelihood loss under different limits, effectively interpolating
between both. Consequently, minimum ED estimation overcomes the problem of
nearsightedness encountered in score-based estimation methods, while also
enjoying theoretical guarantees. Through numerical experiments, we demonstrate
that ED learns low-dimensional data distributions faster and more accurately
than explicit score matching or contrastive divergence. For high-dimensional
image data, we describe how the manifold hypothesis puts limitations on our
approach and demonstrate the effectiveness of energy discrepancy by training
the energy-based model as a prior of a variational decoder model.