
    Parallel Local Approximation MCMC for Expensive Models

    Performing Bayesian inference via Markov chain Monte Carlo (MCMC) can be exceedingly expensive when posterior evaluations invoke the evaluation of a computationally expensive model, such as a system of PDEs. In recent work [J. Amer. Statist. Assoc., 111 (2016), pp. 1591-1607] we described a framework for constructing and refining local approximations of such models during an MCMC simulation. These posterior-adapted approximations harness regularity of the model to reduce the computational cost of inference while preserving asymptotic exactness of the Markov chain. Here we describe two extensions of that work. First, we prove that samplers running in parallel can collaboratively construct a shared posterior approximation while ensuring ergodicity of each associated chain, providing a novel opportunity for exploiting parallel computation in MCMC. Second, focusing on the Metropolis-adjusted Langevin algorithm, we describe how a proposal distribution can successfully employ gradients and other relevant information extracted from the approximation. We investigate the practical performance of our approach using two challenging inference problems, the first in subsurface hydrology and the second in glaciology. Using local approximations constructed via parallel chains, we successfully reduce the run time needed to characterize the posterior distributions in these problems from days to hours and from months to days, respectively, dramatically improving the tractability of Bayesian inference.
    Funding: United States Department of Energy, Office of Science, Scientific Discovery through Advanced Computing (SciDAC) Program (award DE-SC0007099); Natural Sciences and Engineering Research Council of Canada; United States Office of Naval Research
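
    As a rough illustration of the Metropolis-adjusted Langevin algorithm named in the abstract above, the following is a minimal sketch of a single MALA step and a short driver. It is not the authors' implementation: the target (a standard normal), its exact gradient, and the names `log_post`, `grad_log_post`, `mala_step`, and `run_mala` are all assumptions made for this example. In the paper's setting the gradient would presumably be extracted from the local approximation rather than computed exactly.

```python
import numpy as np

def log_post(x):
    # Assumed toy target: standard normal log-density, up to a constant.
    return -0.5 * np.sum(x ** 2)

def grad_log_post(x):
    # Exact gradient of the toy target; a stand-in for a gradient
    # obtained from an approximation of an expensive model.
    return -x

def mala_step(x, step, rng):
    """One Metropolis-adjusted Langevin step with step size `step`."""
    def log_q(x_to, x_from):
        # Log-density (up to a constant) of the Langevin proposal.
        mean = x_from + 0.5 * step * grad_log_post(x_from)
        return -np.sum((x_to - mean) ** 2) / (2.0 * step)

    noise = rng.standard_normal(x.shape)
    prop = x + 0.5 * step * grad_log_post(x) + np.sqrt(step) * noise
    # Metropolis-Hastings correction for the asymmetric proposal.
    log_alpha = (log_post(prop) - log_post(x)
                 + log_q(x, prop) - log_q(prop, x))
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return x, False

def run_mala(n_steps, step, seed, dim=2):
    """Run a chain from the origin; return samples and acceptance rate."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    samples = np.empty((n_steps, dim))
    accepted = 0
    for i in range(n_steps):
        x, ok = mala_step(x, step, rng)
        samples[i] = x
        accepted += ok
    return samples, accepted / n_steps
```

    Calling `run_mala(20000, 0.5, 0)` returns a `(20000, 2)` array whose sample mean and variance should lie close to 0 and 1 for this toy target.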

    Accelerating MCMC via Parallel Predictive Prefetching

    We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. Our approach exploits fast, iterative approximations to the target density to speculatively evaluate many potential future steps of the chain in parallel. The approach can accelerate computation of the target distribution of a Bayesian inference problem, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores.
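
    The idea of speculatively evaluating future steps while reproducing the serial chain exactly can be sketched as follows. This is a naive two-step prefetch that evaluates both the accept and the reject branch, not the predictive scheme of the abstract, which uses approximations to choose which branches to evaluate. The toy target and the names `log_target`, `serial_chain`, `prefetch_chain`, and `demo_equivalence` are assumptions for this example; with a cheap pure-Python target, threads give no real speedup and serve only to show the structure.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def log_target(x):
    # Stand-in for an expensive log-density (assumed: standard normal).
    return -0.5 * x * x

def serial_chain(x0, z, u, scale):
    """Random-walk Metropolis driven by pre-drawn noise z and uniforms u."""
    x, fx, out = x0, log_target(x0), []
    for zt, ut in zip(z, u):
        y = x + scale * zt
        fy = log_target(y)
        if np.log(ut) < fy - fx:
            x, fx = y, fy
        out.append(x)
    return np.array(out)

def prefetch_chain(x0, z, u, scale, pool):
    """Same chain, advancing two steps per parallel round (len(z) even)."""
    x, fx, out = x0, log_target(x0), []
    for t in range(0, len(z) - 1, 2):
        y1 = x + scale * z[t]
        y2_acc = y1 + scale * z[t + 1]  # next proposal if y1 is accepted
        y2_rej = x + scale * z[t + 1]   # next proposal if y1 is rejected
        # Evaluate the current proposal and both speculative branches.
        f1, f2_acc, f2_rej = pool.map(log_target, [y1, y2_acc, y2_rej])
        if np.log(u[t]) < f1 - fx:
            x, fx = y1, f1
            y2, f2 = y2_acc, f2_acc
        else:
            y2, f2 = y2_rej, f2_rej
        out.append(x)
        if np.log(u[t + 1]) < f2 - fx:
            x, fx = y2, f2
        out.append(x)
    return np.array(out)

def demo_equivalence(n, seed):
    """Run both versions on the same random stream (n must be even)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    u = rng.uniform(size=n)
    with ThreadPoolExecutor(max_workers=3) as pool:
        fast = prefetch_chain(0.0, z, u, 1.0, pool)
    return serial_chain(0.0, z, u, 1.0), fast
```

    Because both versions consume the same pre-drawn random numbers and evaluate the target at identical points, the two returned arrays should match element for element; one of the two speculative evaluations per round is discarded, which is the cost the predictive approach aims to reduce.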

    Patterns of Scalable Bayesian Inference

    Datasets are growing not just in size but in complexity, creating a demand for rich models and quantification of uncertainty. Bayesian methods are an excellent fit for this demand, but scaling Bayesian inference is a challenge. In response to this challenge, there has been considerable recent work based on varying assumptions about model structure, underlying computational resources, and the importance of asymptotic correctness. As a result, there is a zoo of ideas with few clear overarching principles. In this paper, we seek to identify unifying principles, patterns, and intuitions for scaling Bayesian inference. We review existing work on utilizing modern computing resources with both MCMC and variational approximation techniques. From this taxonomy of ideas, we characterize the general principles that have proven successful for designing scalable inference procedures and comment on the path forward.

    Computational statistics using the Bayesian Inference Engine

    This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimised software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organise and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasises hybrid tempered MCMC schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialisation system that stores the full byte-level image of the running inference and previously characterised posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible object-oriented framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical details and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU GPL.
    Comment: Resubmitted version.

    Ensemble Transport Adaptive Importance Sampling

    Markov chain Monte Carlo methods are a powerful and commonly used family of numerical methods for sampling from complex probability distributions. As applications of these methods increase in size and complexity, the need for efficient methods increases. In this paper, we present a particle ensemble algorithm. At each iteration, an importance sampling proposal distribution is formed using an ensemble of particles. A stratified sample is taken from this distribution and weighted under the posterior; a state-of-the-art ensemble transport resampling method is then used to create an evenly weighted sample ready for the next iteration. We demonstrate that this ensemble transport adaptive importance sampling (ETAIS) method outperforms MCMC methods with equivalent proposal distributions for low dimensional problems, and in fact shows better than linear improvements in convergence rates with respect to the number of ensemble members. We also introduce a new resampling strategy, multinomial transformation (MT), which, while not as accurate as the ensemble transport resampler, is substantially less costly for large ensemble sizes, and can then be used in conjunction with ETAIS for complex problems. We also focus on how algorithmic parameters regarding the mixture proposal can be quickly tuned to optimise performance. In particular, we demonstrate this methodology's superior sampling for multimodal problems, such as those arising from inference for mixture models, and for problems with expensive likelihoods requiring the solution of a differential equation, for which speed-ups of orders of magnitude are demonstrated. Likelihood evaluations of the ensemble could be computed in a distributed manner, suggesting that this methodology is a good candidate for parallel Bayesian computations.
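
    The iteration described in the abstract above (mixture proposal built from the ensemble, stratified sample, posterior weighting, resampling) can be sketched in one dimension as follows. This is a simplified stand-in rather than ETAIS itself: it uses plain multinomial resampling in place of the ensemble transport resampler or the MT scheme, and the toy posterior, the Gaussian kernel, its fixed `width`, and the names `log_post`, `ais_iteration`, and `run_ais_sketch` are assumptions for this example.

```python
import numpy as np

def log_post(x):
    # Assumed toy posterior: standard normal, up to a constant.
    return -0.5 * x ** 2

def ais_iteration(particles, width, rng):
    """One adaptive importance sampling iteration on a 1-D ensemble."""
    m = particles.size
    # Stratified draw from the mixture: one sample per Gaussian component.
    samples = particles + width * rng.standard_normal(m)
    # Log-density of the full equal-weight mixture at each sample.
    diffs = samples[:, None] - particles[None, :]
    log_comp = (-0.5 * (diffs / width) ** 2
                - np.log(width * np.sqrt(2.0 * np.pi)))
    log_q = np.logaddexp.reduce(log_comp, axis=1) - np.log(m)
    # Self-normalised importance weights under the posterior.
    log_w = log_post(samples) - log_q
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Multinomial resampling gives an evenly weighted ensemble.
    new_particles = rng.choice(samples, size=m, p=w)
    return new_particles, samples, w

def run_ais_sketch(m, n_iters, seed, width=0.5):
    """Iterate from a poorly placed ensemble; return last samples, weights."""
    rng = np.random.default_rng(seed)
    particles = 3.0 + 0.1 * rng.standard_normal(m)
    for _ in range(n_iters):
        particles, samples, w = ais_iteration(particles, width, rng)
    return samples, w
```

    The weighted samples from the final iteration estimate posterior expectations, for example `np.sum(w * samples)` for the mean. The `log_post` calls are independent across samples, which is the step that could be distributed when the likelihood is expensive.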