Parallel Local Approximation MCMC for Expensive Models
Performing Bayesian inference via Markov chain Monte Carlo (MCMC) can be exceedingly expensive when posterior evaluations invoke the evaluation of a computationally expensive model, such as a system of PDEs. In recent work [J. Amer. Statist. Assoc., 111 (2016), pp. 1591-1607] we described a framework for constructing and refining local approximations of such models during an MCMC simulation. These posterior-adapted approximations harness regularity of the model to reduce the computational cost of inference while preserving asymptotic exactness of the Markov chain. Here we describe two extensions of that work. First, we prove that samplers running in parallel can collaboratively construct a shared posterior approximation while ensuring ergodicity of each associated chain, providing a novel opportunity for exploiting parallel computation in MCMC. Second, focusing on the Metropolis-adjusted Langevin algorithm, we describe how a proposal distribution can successfully employ gradients and other relevant information extracted from the approximation. We investigate the practical performance of our approach using two challenging inference problems, the first in subsurface hydrology and the second in glaciology. Using local approximations constructed via parallel chains, we successfully reduce the run time needed to characterize the posterior distributions in these problems from days to hours and from months to days, respectively, dramatically improving the tractability of Bayesian inference.
Funding: United States. Department of Energy. Office of Science. Scientific Discovery through Advanced Computing (SciDAC) Program (award DE-SC0007099); Natural Sciences and Engineering Research Council of Canada; United States. Office of Naval Research
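The idea of a Langevin proposal that takes its gradient from an approximation can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes a one-dimensional toy target, and the names `log_target`, `surrogate_grad`, `log_q` and `mala_step` are hypothetical. The gradient in the proposal comes from a deliberately inexact surrogate, while the accept/reject step uses the exact density, so the sketch remains an ordinary Metropolis-Hastings sampler.

```python
import math
import random


def log_target(x):
    # Hypothetical stand-in for an expensive log-posterior (standard normal).
    return -0.5 * x * x


def surrogate_grad(x):
    # Hypothetical cheap approximation of d/dx log_target; deliberately inexact.
    return -0.9 * x


def log_q(to, frm, h):
    # Log-density, up to a constant, of the Langevin proposal
    # N(frm + (h / 2) * surrogate_grad(frm), h) evaluated at `to`.
    mean = frm + 0.5 * h * surrogate_grad(frm)
    return -((to - mean) ** 2) / (2.0 * h)


def mala_step(x, h, rng):
    # Propose with a drift taken from the surrogate gradient.
    y = x + 0.5 * h * surrogate_grad(x) + math.sqrt(h) * rng.gauss(0.0, 1.0)
    # Metropolis-Hastings correction with the exact target and the
    # asymmetric proposal densities.
    log_alpha = log_target(y) - log_target(x) + log_q(x, y, h) - log_q(y, x, h)
    u = 1.0 - rng.random()  # in (0, 1], so math.log(u) is always defined
    return y if math.log(u) < log_alpha else x
```

Because the correction step accounts for the proposal density actually used, an inexact surrogate gradient affects only efficiency in this sketch, not the stationary distribution.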
Accelerating MCMC via Parallel Predictive Prefetching
We present a general framework for accelerating a large class of widely used
Markov chain Monte Carlo (MCMC) algorithms. Our approach exploits fast,
iterative approximations to the target density to speculatively evaluate many
potential future steps of the chain in parallel. The approach can accelerate
computation of the target distribution of a Bayesian inference problem, without
compromising exactness, by exploiting subsets of data. It takes advantage of
whatever parallel resources are available, but produces results exactly
equivalent to standard serial execution. In the initial burn-in phase of chain
evaluation, it achieves speedup over serial evaluation that is close to linear
in the number of available cores.
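The claim that speculative evaluation can reproduce serial execution exactly can be illustrated with a small sketch. It shows only the simplest exhaustive variant: a two-step lookahead that evaluates every candidate state of the next two Metropolis steps concurrently. It does not include the predictive use of approximations described in the abstract. The function names and the toy target are assumptions made for illustration.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def log_target(x):
    # Hypothetical stand-in for an expensive log-density (standard normal).
    return -0.5 * x * x


def serial_chain(x0, noise, unifs):
    # Plain random-walk Metropolis driven by pre-drawn random numbers.
    x, lx, out = x0, log_target(x0), []
    for e, u in zip(noise, unifs):
        y = x + e
        ly = log_target(y)
        if math.log(u) < ly - lx:
            x, lx = y, ly
        out.append(x)
    return out


def prefetch_chain(x0, noise, unifs, pool):
    # Same chain, advanced two steps at a time; len(noise) must be even.
    x, lx, out = x0, log_target(x0), []
    for i in range(0, len(noise), 2):
        e1, e2 = noise[i], noise[i + 1]
        y1 = x + e1
        # Step-1 proposal, then the step-2 proposal on the accept branch
        # and on the reject branch, all evaluated concurrently.
        cands = [y1, y1 + e2, x + e2]
        ly1, ly_acc, ly_rej = pool.map(log_target, cands)
        if math.log(unifs[i]) < ly1 - lx:
            x, lx = y1, ly1
            y2, ly2 = cands[1], ly_acc
        else:
            y2, ly2 = cands[2], ly_rej
        out.append(x)
        if math.log(unifs[i + 1]) < ly2 - lx:
            x, lx = y2, ly2
        out.append(x)
    return out
```

Given the same pre-drawn noise and uniforms, `prefetch_chain` returns the same states as `serial_chain`. One of the three evaluations per pair of steps is always discarded, which is the waste a predictive scheme aims to reduce. A thread pool is used only to keep the sketch short; a CPU-bound pure-Python target would likely need a process pool to see any speedup.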
Patterns of Scalable Bayesian Inference
Datasets are growing not just in size but in complexity, creating a demand
for rich models and quantification of uncertainty. Bayesian methods are an
excellent fit for this demand, but scaling Bayesian inference is a challenge.
In response to this challenge, there has been considerable recent work based on
varying assumptions about model structure, underlying computational resources,
and the importance of asymptotic correctness. As a result, there is a zoo of
ideas with few clear overarching principles.
In this paper, we seek to identify unifying principles, patterns, and
intuitions for scaling Bayesian inference. We review existing work on utilizing
modern computing resources with both MCMC and variational approximation
techniques. From this taxonomy of ideas, we characterize the general principles
that have proven successful for designing scalable inference procedures and
comment on the path forward.
Computational statistics using the Bayesian Inference Engine
This paper introduces the Bayesian Inference Engine (BIE), a general
parallel, optimised software package for parameter inference and model
selection. This package is motivated by the analysis needs of modern
astronomical surveys and the need to organise and reuse expensive derived data.
The BIE is the first platform for computational statistics designed explicitly
to enable Bayesian update and model comparison for astronomical problems.
Bayesian update is based on the representation of high-dimensional posterior
distributions using metric-ball-tree based kernel density estimation. Among its
algorithmic offerings, the BIE emphasises hybrid tempered MCMC schemes that
robustly sample multimodal posterior distributions in high-dimensional
parameter spaces. Moreover, the BIE implements a full persistence or
serialisation system that stores the full byte-level image of the running
inference and previously characterised posterior distributions for later use.
Two new algorithms to compute the marginal likelihood from the posterior
distribution, developed for and implemented in the BIE, enable model comparison
for complex models and data sets. Finally, the BIE was designed to be a
collaborative platform for applying Bayesian methodology to astronomy. It
includes an extensible object-oriented framework that
implements every aspect of Bayesian inference. By providing a variety of
statistical algorithms for all phases of the inference problem, the BIE lets a
scientist explore a variety of approaches with a single model and data implementation.
Additional technical details and download details are available from
http://www.astro.umass.edu/bie. The BIE is distributed under the GNU GPL.
Comment: Resubmitted version.
Ensemble Transport Adaptive Importance Sampling
Markov chain Monte Carlo methods are a powerful and commonly used family of
numerical methods for sampling from complex probability distributions. As
applications of these methods increase in size and complexity, the need for
efficient methods increases. In this paper, we present a particle ensemble
algorithm. At each iteration, an importance sampling proposal distribution is
formed using an ensemble of particles. A stratified sample is taken from this
distribution and weighted under the posterior; a state-of-the-art ensemble
transport resampling method is then used to create an evenly weighted sample
ready for the next iteration. We demonstrate that this ensemble transport
adaptive importance sampling (ETAIS) method outperforms MCMC methods with
equivalent proposal distributions for low dimensional problems, and in fact
shows better than linear improvements in convergence rates with respect to the
number of ensemble members. We also introduce a new resampling strategy,
multinomial transformation (MT), which, while not as accurate as the ensemble
transport resampler, is substantially less costly for large ensemble sizes and
can therefore be used in conjunction with ETAIS for complex problems. We also focus
on how algorithmic parameters regarding the mixture proposal can be quickly
tuned to optimise performance. In particular, we demonstrate this methodology's
superior sampling for multimodal problems, such as those arising from inference
for mixture models, and for problems with expensive likelihoods requiring the
solution of a differential equation, for which speed-ups of orders of magnitude
are demonstrated. Likelihood evaluations of the ensemble could be computed in a
distributed manner, suggesting that this methodology is a good candidate for
parallel Bayesian computations.
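One iteration of the loop described in this abstract (build a mixture proposal from the ensemble, draw a stratified sample, weight it under the target, resample to equal weights) can be sketched as follows. This is an illustration under stated assumptions, not the ETAIS implementation: the target is a one-dimensional toy density, the names `log_target`, `mixture_logpdf` and `ais_step` are hypothetical, and plain multinomial resampling stands in for both the ensemble transport resampler and the MT strategy, neither of which is specified in the abstract.

```python
import math
import random


def log_target(x):
    # Hypothetical unnormalised 1-D target (standard normal log-density).
    return -0.5 * x * x


def mixture_logpdf(x, centres, scale):
    # Log-density of an equally weighted Gaussian mixture with one
    # component centred on each particle (log-sum-exp for stability).
    log_norm = -math.log(scale * math.sqrt(2.0 * math.pi))
    terms = [log_norm - 0.5 * ((x - c) / scale) ** 2 for c in centres]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms)) - math.log(len(centres))


def ais_step(particles, scale, rng):
    # Stratified draw: exactly one sample from each mixture component.
    draws = [rng.gauss(c, scale) for c in particles]
    # Importance weights: target density over the full mixture density.
    log_w = [log_target(x) - mixture_logpdf(x, particles, scale) for x in draws]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]
    # Plain multinomial resampling to an evenly weighted ensemble
    # (a stand-in for the resamplers named in the abstract).
    new_particles = rng.choices(draws, weights=w, k=len(draws))
    return new_particles, draws, w
```

The target evaluations inside `ais_step` are independent of one another, which is the property the abstract points to when it suggests computing the ensemble's likelihoods in a distributed manner.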