344 research outputs found

    Accelerating inference for stochastic kinetic models

    Full text link
    Stochastic kinetic models (SKMs) are increasingly used to account for the inherent stochasticity exhibited by interacting populations of species in areas such as epidemiology, population ecology and systems biology. Species numbers are modelled using a continuous-time stochastic process, and, depending on the application area of interest, this will typically take the form of a Markov jump process or an It\^o diffusion process. Widespread use of these models is typically precluded by their computational complexity. In particular, performing exact fully Bayesian inference in either modelling framework is challenging due to the intractability of the observed data likelihood, necessitating the use of computationally intensive techniques such as particle Markov chain Monte Carlo (particle MCMC). It is proposed to increase the computational and statistical efficiency of this approach by leveraging the tractability of an inexpensive surrogate derived directly from either the jump or diffusion process. The surrogate is used in three ways: in the design of a gradient-based parameter proposal, to construct an appropriate bridge and in the first stage of a delayed-acceptance step. The resulting approach, which exactly targets the posterior of interest, offers substantial gains in efficiency over a standard particle MCMC implementation.Comment: 29 page

    Augmented pseudo-marginal Metropolis-Hastings for partially observed diffusion processes

    Get PDF
    We consider the problem of inference for nonlinear, multivariate diffusion processes, satisfying Itô stochastic differential equations (SDEs), using data at discrete times that may be incomplete and subject to measurement error. Our starting point is a state-of-the-art correlated pseudo-marginal Metropolis–Hastings algorithm, that uses correlated particle filters to induce strong and positive correlation between successive likelihood estimates. However, unless the measurement error or the dimension of the SDE is small, correlation can be eroded by the resampling steps in the particle filter. We therefore propose a novel augmentation scheme, that allows for conditioning on values of the latent process at the observation times, completely avoiding the need for resampling steps. We integrate over the uncertainty at the observation times with an additional Gibbs step. Connections between the resulting pseudo-marginal scheme and existing inference schemes for diffusion processes are made, giving a unified inference framework that encompasses Gibbs sampling and pseudo marginal schemes. The methodology is applied in three examples of increasing complexity. We find that our approach offers substantial increases in overall efficiency, compared to competing methods

    Accelerating Bayesian inference for stochastic epidemic models using incidence data

    Get PDF
    We consider the case of performing Bayesian inference for stochastic epidemic compartment models, using incomplete time course data consisting of incidence counts that are either the number of new infections or removals in time intervals of fixed length. We eschew the most natural Markov jump process representation for reasons of computational efficiency, and focus on a stochastic differential equation representation. This is further approximated to give a tractable Gaussian process, that is, the linear noise approximation (LNA). Unless the observation model linking the LNA to data is both linear and Gaussian, the observed data likelihood remains intractable. It is in this setting that we consider two approaches for marginalising over the latent process: a correlated pseudo-marginal method and analytic marginalisation via a Gaussian approximation of the observation model. We compare and contrast these approaches using synthetic data before applying the best performing method to real data consisting of removal incidence of oak processionary moth nests in Richmond Park, London. Our approach further allows comparison between various competing compartment models

    Bayesian inference for stochastic kinetic models using data on proportions of cell death

    Get PDF
    PhD ThesisThe PolyQ model is a large stochastic kinetic model that describes protein aggregation within human cells as they undergo ageing. The presence of protein aggregates in cells is a known feature in many age-related diseases, such as Huntington's. Experimental data are available consisting of the proportions of cell death over time. This thesis is motivated by the need to make inference for the rate parameters of the PolyQ model. Ideally observations would be obtained on all chemical species, observed continuously in time. More realistically, it would be hoped that partial observations were available on the chemical species observed discretely in time. However, current experimental techniques only allow noisy observations on the proportions of cell death at a few discrete time points. This presents an ambitious inference problem. The model has a large state space and it is not possible to evaluate the data likelihood analytically. However, realisations from the model can be obtained using a stochastic simulator such as the Gillespie algorithm. The time evolution of a cell can be repeatedly simulated, giving an estimate of the proportion of cell death. Various MCMC schemes can be constructed targeting the posterior distribution of the rate parameters. Although evaluating the marginal likelihood is challenging, a pseudo-marginal approach can be used to replace the marginal likelihood with an easy to construct unbiased estimate. Another alternative which allows for the sampling error in the simulated proportions is also considered. Unfortunately, in practice, simulation from the model is too slow to be used in an MCMC inference scheme. A fast Gaussian process emulator is used to approximate the simulator. This emulator produces fully probabilistic predictions of the simulator output and can be embedded into inference schemes for the rate parameters. The methods developed are illustrated in two smaller models; the birth-death model and a medium sized model of mitochondrial DNA. Finally, inference on the large PolyQ model is considered.Engineering and Physical Sciences Research Counci

    Differential geometric MCMC methods and applications

    Get PDF
    This thesis presents novel Markov chain Monte Carlo methodology that exploits the natural representation of a statistical model as a Riemannian manifold. The methods developed provide generalisations of the Metropolis-adjusted Langevin algorithm and the Hybrid Monte Carlo algorithm for Bayesian statistical inference, and resolve many shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlation structure. The performance of these Riemannian manifold Markov chain Monte Carlo algorithms is rigorously assessed by performing Bayesian inference on logistic regression models, log-Gaussian Cox point process models, stochastic volatility models, and both parameter and model level inference of dynamical systems described by nonlinear differential equations

    Ensemble MCMC: Accelerating Pseudo-Marginal MCMC for State Space Models using the Ensemble Kalman Filter

    Get PDF
    Particle Markov chain Monte Carlo (pMCMC) is now a popular method for performing Bayesian statistical inference on challenging state space models (SSMs) with unknown static parameters. It uses a particle filter (PF) at each iteration of an MCMC algorithm to unbiasedly estimate the likelihood for a given static parameter value. However, pMCMC can be computationally intensive when a large number of particles in the PF is required, such as when the data are highly informative, the model is misspecified and/or the time series is long. In this paper we exploit the ensemble Kalman filter (EnKF) developed in the data assimilation literature to speed up pMCMC. We replace the unbiased PF likelihood with the biased EnKF likelihood estimate within MCMC to sample over the space of the static parameter. On a wide class of different non-linear SSM models, we demonstrate that our extended ensemble MCMC (eMCMC) methods can significantly reduce the computational cost whilst maintaining reasonable accuracy. We also propose several extensions of the vanilla eMCMC algorithm to further improve computational efficiency. Computer code to implement our methods on all the examples can be downloaded from https://github.com/cdrovandi/Ensemble-MCMC

    Bayesian inference for indirectly observed stochastic processes, applications to epidemic modelling

    Get PDF
    Stochastic processes are mathematical objects that offer a probabilistic representation of how some quantities evolve in time. In this thesis we focus on estimating the trajectory and parameters of dynamical systems in cases where only indirect observations of the driving stochastic process are available. We have first explored means to use weekly recorded numbers of cases of Influenza to capture how the frequency and nature of contacts made with infected individuals evolved in time. The latter was modelled with diffusions and can be used to quantify the impact of varying drivers of epidemics as holidays, climate, or prevention interventions. Following this idea, we have estimated how the frequency of condom use has evolved during the intervention of the Gates Foundation against HIV in India. In this setting, the available estimates of the proportion of individuals infected with HIV were not only indirect but also very scarce observations, leading to specific difficulties. At last, we developed a methodology for fractional Brownian motions (fBM), here a fractional stochastic volatility model, indirectly observed through market prices. The intractability of the likelihood function, requiring augmentation of the parameter space with the diffusion path, is ubiquitous in this thesis. We aimed for inference methods robust to refinements in time discretisations, made necessary to enforce accuracy of Euler schemes. The particle Marginal Metropolis Hastings (PMMH) algorithm exhibits this mesh free property. We propose the use of fast approximate filters as a pre-exploration tool to estimate the shape of the target density, for a quicker and more robust adaptation phase of the asymptotically exact algorithm. The fBM problem could not be treated with the PMMH, which required an alternative methodology based on reparameterisation and advanced Hamiltonian Monte Carlo techniques on the diffusion pathspace, that would also be applicable in the Markovian setting
    corecore