
    A Single SMC Sampler on MPI that Outperforms a Single MCMC Sampler

    Markov Chain Monte Carlo (MCMC) is a well-established family of algorithms, used primarily in Bayesian statistics to sample from a target distribution when direct sampling is challenging. Single instances of MCMC methods are widely considered hard to parallelise in a problem-agnostic fashion and hence unsuitable for meeting the twin constraints of high accuracy and high throughput. Sequential Monte Carlo (SMC) Samplers can address the same problem but are parallelisable: they share with Particle Filters the same key tasks and the same bottleneck. Although a rich literature exists on MCMC methods, SMC Samplers are relatively underexplored, and no parallel implementation is currently available. In this paper, we first propose a parallel MPI version of the SMC Sampler, including an optimised implementation of the bottleneck, and then compare it with single-core Metropolis-Hastings. The goal is to show that SMC Samplers may be a promising alternative to MCMC methods, with high potential for future improvements. We demonstrate that a basic SMC Sampler with 512 cores is up to 85 times faster or up to 8 times more accurate than Metropolis-Hastings.
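For readers unfamiliar with the single-core baseline used in this comparison, a random-walk Metropolis-Hastings chain can be sketched in a few lines of Python. This is an illustrative textbook version, not the paper's implementation; all names are chosen for this sketch:

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings: one long, inherently sequential chain."""
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal around the current state.
        x_new = x + rng.gauss(0.0, step)
        lp_new = log_target(x_new)
        # Accept with probability min(1, target(x_new) / target(x)).
        if math.log(rng.random()) < lp_new - lp:
            x, lp = x_new, lp_new
        samples.append(x)
    return samples

# Sample a standard normal target (log-density up to a constant).
draws = metropolis_hastings(lambda v: -0.5 * v * v, x0=0.0, n_samples=20000)
mean = sum(draws) / len(draws)
```

Each iteration depends on the previous state, which is why a single such chain cannot be split across cores without changing the algorithm, the limitation the SMC Sampler avoids.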

    An O(log2N) SMC² Algorithm on Distributed Memory with an Approx. Optimal L-Kernel

    Calibrating statistical models using Bayesian inference often requires both accurate and timely estimates of parameters of interest. Particle Markov Chain Monte Carlo (p-MCMC) and Sequential Monte Carlo Squared (SMC²) are two methods that use an unbiased estimate of the log-likelihood obtained from a particle filter (PF) to evaluate the target distribution. P-MCMC constructs a single Markov chain which is sequential by nature, so it cannot be readily parallelized on Distributed Memory (DM) architectures. This is in contrast to SMC², which includes processes, such as importance sampling, that are embarrassingly parallel. However, difficulties arise when attempting to parallelize resampling. Nonetheless, the choice of backward kernel, the recycling scheme, and compatibility with DM architectures make SMC² an attractive option when compared with p-MCMC. In this paper, we present an SMC² framework that includes the following features: an optimal (in terms of time complexity) O(log2N) parallelization for DM architectures, an approximately optimal (in terms of accuracy) backward kernel, and an efficient recycling scheme. On a cluster of 128 DM processors, results on a biomedical application show that SMC² achieves up to a 70× speed-up vs its sequential implementation. It is also more accurate and roughly 54× faster than p-MCMC. A GitHub link is given which provides access to the code. Comment: 8 pages, 6 figures, accepted to the Combined SDF and MFI Conference 2023.
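The unbiased likelihood estimate that both p-MCMC and SMC² consume can be illustrated with a minimal bootstrap particle filter. The model and all names below are a generic textbook sketch (a simple linear-Gaussian state-space model with multinomial resampling), not the paper's code:

```python
import math
import random

def pf_loglik(ys, phi, q, r, n_particles=4000, seed=0):
    """Bootstrap particle filter for x_t = phi * x_{t-1} + N(0, q),
    y_t = x_t + N(0, r); returns the log of an unbiased estimate of the
    likelihood of the observation sequence ys."""
    rng = random.Random(seed)
    # Initial particles from the state prior N(0, q).
    xs = [rng.gauss(0.0, math.sqrt(q)) for _ in range(n_particles)]
    loglik = 0.0
    for y in ys:
        # Weight each particle by the observation density N(y; x, r).
        ws = [math.exp(-0.5 * (y - x) ** 2 / r) / math.sqrt(2 * math.pi * r)
              for x in xs]
        loglik += math.log(sum(ws) / n_particles)
        # Multinomial resampling: the step whose parallelisation is the
        # difficulty discussed in the abstract.
        xs = rng.choices(xs, weights=ws, k=n_particles)
        # Propagate through the state dynamics.
        xs = [phi * x + rng.gauss(0.0, math.sqrt(q)) for x in xs]
    return loglik

# Sanity check: with phi = 0 the observations are i.i.d. N(0, q + r),
# so the estimate can be compared against the exact log-likelihood.
ys = [0.5, -0.3, 1.0]
est = pf_loglik(ys, phi=0.0, q=1.0, r=1.0)
exact = sum(-0.5 * math.log(2 * math.pi * 2.0) - y * y / 4.0 for y in ys)
```

In p-MCMC this estimate is plugged into a single acceptance ratio; in SMC² it weights an outer population of parameter particles, which is what exposes the parallelism.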

    Neural adaptive sequential Monte Carlo

    Sequential Monte Carlo (SMC), or particle filtering, is a popular class of methods for sampling from an intractable target distribution using a sequence of simpler intermediate distributions. Like other importance sampling-based methods, performance is critically dependent on the proposal distribution: a bad proposal can lead to arbitrarily inaccurate estimates of the target distribution. This paper presents a new method for automatically adapting the proposal using an approximation of the Kullback-Leibler divergence between the true posterior and the proposal distribution. The method is very flexible, applicable to any parameterized proposal distribution, and supports online and batch variants. We use the new framework to adapt powerful proposal distributions with rich parameterizations based upon neural networks, leading to Neural Adaptive Sequential Monte Carlo (NASMC). Experiments indicate that NASMC significantly improves inference in a non-linear state space model, outperforming adaptive proposal methods including the Extended Kalman and Unscented Particle Filters. Experiments also indicate that improved inference translates into improved parameter learning when NASMC is used as a subroutine of Particle Marginal Metropolis-Hastings. Finally, we show that NASMC is able to train a latent variable recurrent neural network (LV-RNN), achieving results that compete with the state-of-the-art for polyphonic music modelling. NASMC can be seen as bridging the gap between adaptive SMC methods and recent work in scalable, black-box variational inference.
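The sensitivity to the proposal that motivates NASMC can be seen in a tiny importance-sampling experiment, unrelated to the paper's code: the effective sample size (ESS) of the weights collapses as the proposal drifts away from the target. All function names here are illustrative:

```python
import math
import random

def effective_sample_size(log_weights):
    """ESS = (sum w)^2 / sum(w^2), computed stably in log space."""
    m = max(log_weights)
    ws = [math.exp(lw - m) for lw in log_weights]
    return sum(ws) ** 2 / sum(w * w for w in ws)

def importance_ess(proposal_std, n=5000, seed=0):
    """Importance-sample a standard normal target from a N(0, proposal_std^2)
    proposal and report the effective sample size of the weights."""
    rng = random.Random(seed)
    log_w = []
    for _ in range(n):
        x = rng.gauss(0.0, proposal_std)
        log_target = -0.5 * x * x  # N(0, 1), up to a constant
        log_prop = -0.5 * (x / proposal_std) ** 2 - math.log(proposal_std)
        log_w.append(log_target - log_prop)
    return effective_sample_size(log_w)

good = importance_ess(1.0)  # proposal matches target: all weights equal
bad = importance_ess(5.0)   # overdispersed proposal: most samples wasted
```

When the proposal equals the target, every weight is identical and the ESS equals the number of samples; a mismatched proposal concentrates the weight on a few samples, which is exactly the degeneracy an adapted (here, neural) proposal is meant to prevent.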

    Streaming Multi-core Sample-based Bayesian Analysis

    Sequential Monte Carlo (SMC) methods are a well-established family of Bayesian inference algorithms for performing state estimation for non-linear non-Gaussian models. As the models become more accurate, the run-time of SMC applications becomes increasingly slow. Parallel computing can be used to compensate for this side-effect. However, an efficient parallelisation of SMC is hard to achieve, due to the challenges involved in parallelising the bottleneck, resampling, and its constituent redistribute step. While redistribution can be performed in O((N/T) x logN) on a Shared Memory Architecture (SMA) using T parallel threads (e.g. a GPU or mainstream CPUs), a state-of-the-art redistribute takes O((logN)^2) computations on the Distributed Memory Architectures (DMAs) which most supercomputers are made of. This thesis focuses on three major goals. First, it proposes a novel parallel redistribute for DMAs which achieves O(logN) time complexity. It is shown that, on the Message Passing Interface (MPI), the novel redistribute is up to eight times faster than the O((logN)^2) one. On a cluster of 256 cores, an SMC method employing the O((logN)^2) redistribute becomes up to six times faster when switching to the novel redistribution, which is also shown to no longer be the bottleneck. For the same number of cores, the maximum reported speed-up vs a single-core SMC method is 160. A patent application on this invention has been filed. Second, the thesis describes a novel parallel redistribute for SMAs which takes O((N/T) + logN) steps and fully exploits the computational power of SMAs. The proposed approach is up to six times faster than the O((N/T) x logN) one. This shared memory implementation is then combined with the MPI O(logN) redistribution to obtain a hybrid distributed-shared memory parallel redistribute that fully exploits the large parallelism that modern supercomputers offer.
Finally, to make these advances widely available, this thesis presents Streaming-Stan and SMC-Stan, two extension packages for Stan, a popular statistical programming language. Streaming-Stan and SMC-Stan let users describe models with the same intuitive syntax as regular Stan, but are also equipped with the aforementioned High Performance Computing (HPC) SMC methods, in the form of Fixed-Lag SMC and an SMC Sampler respectively. The same SMC methods also provide a vast choice of proposal distributions, including (in Streaming-Stan) two novel ones, presented in this thesis, which combine the main features of Fixed-Lag SMC methods with Hamiltonian Monte Carlo (HMC) and the No-U-Turn Sampler (NUTS).
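The redistribute step at the centre of these results can be sketched sequentially. The key structural fact is that an exclusive prefix sum of the offspring counts gives each particle's first output slot, so every copy can be placed independently; the thesis's parallel algorithms exploit exactly this cumulative-sum structure. The sketch below is an illustration of the operation only, not the patented algorithm:

```python
from itertools import accumulate

def redistribute(particles, counts):
    """Copy particles[i] exactly counts[i] times, with sum(counts) == N.
    starts[i] (an exclusive prefix sum of counts) is the first output slot
    owned by particle i, so each placement is independent of the others."""
    n = len(particles)
    assert sum(counts) == n
    starts = [0] + list(accumulate(counts))[:-1]  # exclusive prefix sum
    out = [None] * n
    for p, c, s in zip(particles, counts, starts):
        for j in range(c):
            out[s + j] = p
    return out

# e.g. redistribute(['a', 'b', 'c', 'd'], [2, 0, 1, 1]) -> ['a', 'a', 'c', 'd']
```

On shared memory the placement loop parallelises directly over output slots; on distributed memory the difficulty is that the copies must move between nodes, which is where the O(logN) result applies.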

    Bayesian computation in astronomy: novel methods for parallel and gradient-free inference

    The goal of this thesis is twofold: to introduce the fundamentals of Bayesian inference and computation, focusing on astronomical and cosmological applications, and to present recent advances in probabilistic computational methods developed by the author that aim to facilitate Bayesian data analysis for the next generation of astronomical observations and theoretical models. The first part of this thesis familiarises the reader with the notion of probability and its relevance for science through the prism of Bayesian reasoning, introducing the key constituents of the theory and discussing its best practices. The second part includes a pedagogical introduction to the principles of Bayesian computation, motivated by the geometric characteristics of probability distributions and followed by a detailed exposition of various methods including Markov chain Monte Carlo (MCMC), Sequential Monte Carlo (SMC) and Nested Sampling (NS). Finally, the third part presents two novel computational methods and their respective software implementations. The first such development is Ensemble Slice Sampling (ESS), a new class of MCMC algorithms that extends the applicability of the standard Slice Sampler by adaptively tuning its only hyperparameter and utilising an ensemble of parallel walkers in order to efficiently handle strong correlations between parameters. The parallel, black-box and gradient-free nature of the method renders it ideal for use in combination with the computationally expensive and non-differentiable models often met in astronomy. ESS is implemented in Python in the well-tested and open-source software package called zeus, which is specifically designed to tackle the computational challenges posed by modern astronomical and cosmological analyses. In particular, use of the code requires minimal, if any, hand-tuning of hyperparameters, its performance is insensitive to linear correlations, and it can scale up to thousands of CPUs without any extra effort.
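The standard univariate slice sampler that ESS builds on can be sketched as follows, using the stepping-out and shrinkage procedures from Neal's formulation. This is a textbook sketch with illustrative names, not the zeus implementation (which, as described above, additionally tunes the width w and moves along directions defined by an ensemble of walkers):

```python
import math
import random

def slice_sample(log_f, x0, n, w=1.0, seed=0):
    """Univariate slice sampler with stepping-out and shrinkage (Neal, 2003).
    w is the initial interval width, the method's only hyperparameter."""
    rng = random.Random(seed)
    x, out = x0, []
    for _ in range(n):
        # Slice height: u ~ Uniform(0, f(x)), kept in log space.
        log_y = log_f(x) + math.log(rng.random())
        # Step out until both interval ends are off the slice.
        left = x - w * rng.random()
        right = left + w
        while log_f(left) > log_y:
            left -= w
        while log_f(right) > log_y:
            right += w
        # Shrink the interval until a proposal lands on the slice.
        while True:
            x_new = rng.uniform(left, right)
            if log_f(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        out.append(x)
    return out

# Draw from a standard normal target (log-density up to a constant).
draws = slice_sample(lambda v: -0.5 * v * v, x0=0.0, n=5000)
```

Every proposal is eventually accepted, which is why the only quantity left to tune is w; adapting it automatically is one of the two extensions ESS contributes.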
The next contribution is the introduction of Preconditioned Monte Carlo (PMC), a novel Monte Carlo method for Bayesian inference that facilitates effective sampling of probability distributions with non-trivial geometry. PMC utilises a Normalising Flow (NF) to decorrelate the parameters of the distribution and then proceeds by sampling from the preconditioned target distribution using an adaptive SMC scheme. PMC, through its Python implementation pocoMC, achieves excellent sampling performance, including accurate estimation of the model evidence, for highly correlated, non-Gaussian, and multimodal target distributions. Finally, the code is directly parallelisable, manifesting linear scaling up to thousands of CPUs.

    Towards Bayesian System Identification: With Application to SHM of Offshore Structures

    Within the offshore industry, Structural Health Monitoring remains a growing area of interest. The oil and gas sectors are faced with ageing infrastructure and are driven by the desire for reliable lifetime extension, whereas the wind energy sector is investing heavily in a large number of structures. This leads to a number of distinct challenges for Structural Health Monitoring which are brought together by one unifying theme: uncertainty. The offshore environment is highly uncertain; existing structures have not been monitored from construction, and the loading and operational conditions they have experienced (among other factors) are not known. For the wind energy sector, the high number of structures makes traditional inspection methods costly and, in some cases, dangerous due to the inaccessibility of many wind farms. Structural Health Monitoring attempts to address these issues by providing tools for automated online assessment of the condition of structures to aid decision making. The work of this thesis presents a number of Bayesian methods which allow system identification, for Structural Health Monitoring, under uncertainty. The Bayesian approach explicitly incorporates the prior knowledge that is available and combines it with evidence from observed data to form updated beliefs. This is a natural way to approach Structural Health Monitoring, or indeed many engineering problems. It is reasonable to assume that the engineer has some knowledge available before attempting to detect, locate, classify, or model damage on a structure. Having a framework where this knowledge can be exploited, and where the uncertainty in that knowledge can be handled rigorously, is powerful. The problem is that the actual computation of Bayesian results can pose a significant challenge, both computationally and in terms of specifying appropriate models.
This thesis aims to present a number of Bayesian tools, each of which leverages the power of the Bayesian paradigm to address a different Structural Health Monitoring challenge. Within this work, the use of Gaussian Process models is presented as a flexible nonparametric Bayesian approach to regression, which is extended to handle dynamic models within the Gaussian Process NARX framework. The challenge of training Gaussian Process models is seldom discussed, and the work shown here aims to offer a quantitative assessment of different learning techniques, including discussions on the choice of cost function for optimisation of hyperparameters and the choice of the optimisation algorithm itself. Although rarely considered, the effects of these choices are shown to be important, and they inform the use of a Gaussian Process NARX model for wave load identification on offshore structures. The work is not restricted to Gaussian Process models; Bayesian state-space models are also used. The novel use of Particle Gibbs for the identification of nonlinear oscillators is shown, and modifications to this algorithm are applied to handle its specific use in Structural Health Monitoring. Alongside this, the Bayesian state-space model is used to perform joint input-state-parameter inference for Operational Modal Analysis, where the use of priors over the parameters and the forcing function (in the form of a Gaussian Process transformed into a state-space representation) provides a methodology for this output-only identification under parameter uncertainty. Interestingly, this method is shown to recover the parameter distributions of the model without compromising the recovery of the loading time-series signal when compared to the case where the parameters are known. Finally, a novel use of an online Bayesian clustering method is presented for performing Structural Health Monitoring in the absence of any available training data.
This online method does not require a pre-collected training dataset, nor a model of the structure, and is capable of detecting and classifying a range of operational and damage conditions while in service. This leaves the reader with a toolbox of methods which can be applied, where appropriate, to the identification of dynamic systems with a view to Structural Health Monitoring problems within the offshore industry and across engineering.

    An O(log2N) Fully-Balanced Resampling Algorithm for Particle Filters on Distributed Memory Architectures

    Resampling is a well-known statistical algorithm that is commonly applied in the context of Particle Filters (PFs) in order to perform state estimation for non-linear non-Gaussian dynamic models. As the models become more complex and accurate, the run-time of PF applications becomes increasingly slow. Parallel computing can help to address this. However, resampling (and, hence, PFs as well) necessarily involves a bottleneck, the redistribution step, which is notoriously challenging to parallelize if using textbook parallel computing techniques. A state-of-the-art redistribution takes O((log2N)^2) computations on Distributed Memory (DM) architectures, which most supercomputers adopt, whereas redistribution can be performed in O(log2N) on Shared Memory (SM) architectures, such as GPUs or mainstream CPUs. In this paper, we propose a novel parallel redistribution for DM that achieves an O(log2N) time complexity. We also present empirical results that indicate that our novel approach outperforms the O((log2N)^2) approach.
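The resampling context can be illustrated with a sequential systematic resampler, a common textbook scheme (this sketch is not the paper's contribution; the paper parallelises the redistribution of the selected particles, not the index selection shown here):

```python
import random

def systematic_resample(weights, seed=0):
    """Systematic resampling: choose which particle index each of the N
    output particles copies, using a single uniform draw and N evenly
    spaced points. Redistribution (physically moving the particle states
    to match these indices) is the step that is hard to parallelise."""
    n = len(weights)
    total = sum(weights)
    rng = random.Random(seed)
    u = rng.random() / n  # one uniform draw in [0, 1/n)
    indices, i, cum = [], 0, weights[0] / total
    for _ in range(n):
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i] / total
        indices.append(i)
        u += 1.0 / n  # evenly spaced points across [0, 1)
    return indices

# With uniform weights every particle survives exactly once.
idx = systematic_resample([0.25, 0.25, 0.25, 0.25])  # -> [0, 1, 2, 3]
```

High-weight particles are duplicated and low-weight ones are dropped; turning the resulting index list into a balanced movement of particle data across DM nodes is the problem the paper solves in O(log2N).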

    On a quasi-stationary approach to Bayesian computation, with application to tall data

    Markov Chain Monte Carlo (MCMC) techniques have traditionally been used in Bayesian inference to simulate from an intractable distribution of parameters. However, the current age of Big Data demands more scalable and robust algorithms for inference to be computationally feasible. Existing MCMC-based scalable methodologies often use discretisation within their construction and are hence inexact. The newly proposed field of Quasi-Stationary Monte Carlo (QSMC) methodology has paved the way for scalable Bayesian inference in a Big Data setting while keeping exactness intact. Contrary to MCMC, a QSMC method constructs a Markov process whose quasi-stationary distribution is given by the target. A recently proposed QSMC method, the Scalable Langevin Exact (ScaLE) algorithm, combines the exact method of diffusion, Sequential Monte Carlo methodology for quasi-stationarity, and sub-sampling ideas to produce a sub-linear cost in a Big Data setting. This thesis uses the mathematical foundations of the ScaLE methodology as a building block and carefully combines it with a recently proposed regenerative mechanism for quasi-stationarity to produce a new class of QSMC algorithm called the Regenerating ScaLE (ReScaLE). Further, it provides various empirical results on the sub-linear scalability of ReScaLE and illustrates its application to a real-world big-data problem where a traditional MCMC method is likely to incur a huge computational cost. The work then makes further inroads into some current limitations of ReScaLE and proposes various algorithmic modifications for targeting quasi-stationarity. The empirical evidence suggests that these modifications reduce the computational cost and improve the speed of convergence.