
    Optimization Monte Carlo: Efficient and Embarrassingly Parallel Likelihood-Free Inference

    We describe an embarrassingly parallel, anytime Monte Carlo method for likelihood-free models. The algorithm starts from the view that the stochasticity of the pseudo-samples generated by the simulator can be controlled externally by a vector of random numbers u, such that, knowing u, the outcome is deterministic. For each instantiation of u we run an optimization procedure to minimize the distance between the summary statistics of the simulator and the data. After reweighting these samples using the prior and the Jacobian (accounting for the change of volume in transforming from the space of summary statistics to the space of parameters), we show that this weighted ensemble represents a Monte Carlo estimate of the posterior distribution. The procedure can be run embarrassingly parallel (each node handling one sample) and anytime (by allocating resources to the worst-performing sample). The procedure is validated on six experiments. (NIPS 2015 camera-ready.)
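The optimize-then-reweight scheme described above can be sketched on a toy one-dimensional problem (this is an illustrative simulator and prior of our own choosing, not one of the paper's experiments). For each fixed random input u the simulated summary is deterministic in theta, the "optimization" step here has a closed-form solution, and each optimum is weighted by the prior density divided by the Jacobian of the simulator output with respect to theta:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy simulator: f(theta, u) = exp(theta) + u, with u the externally
# controlled random input. Given u, the outcome is deterministic in theta.
x_obs = np.exp(1.0)          # "observed" summary statistic (true theta = 1)
sigma_u = 0.1
n_samples = 2000

us = sigma_u * rng.standard_normal(n_samples)

# For each u, optimize theta so the simulated summary matches the data.
# Here the optimum is available in closed form (exp(theta) + u = x_obs);
# in general this step is a numerical optimization per sample.
theta_star = np.log(x_obs - us)

# Reweight by prior density times inverse Jacobian |df/dtheta|^{-1}.
# Illustrative prior: theta ~ N(0, 2^2); Jacobian of f is exp(theta).
prior = np.exp(-theta_star**2 / (2 * 2.0**2))
jacobian = np.exp(theta_star)
weights = prior / jacobian
weights /= weights.sum()

posterior_mean = np.sum(weights * theta_star)
```

The weighted ensemble `(theta_star, weights)` is the Monte Carlo posterior estimate; since each u is handled independently, the loop over samples parallelizes trivially.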

    Exact Bayesian inference via data augmentation

    Data augmentation is a common tool in Bayesian statistics, especially in the application of MCMC. It is used where direct computation of the posterior density π(θ|x) of the parameters θ, given the observed data x, is not possible. We show that for a range of problems it is possible to augment the data by y such that π(θ|x,y) is known and π(y|x) can easily be computed. In particular, π(y|x) is obtained by collapsing π(y,θ|x) through integrating out θ. This allows the exact computation of π(θ|x) as a mixture distribution, without recourse to approximate methods such as MCMC. Useful by-products of the exact posterior distribution are the marginal likelihood of the model and the exact predictive distribution.
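A classical example of this construction (the genetic-linkage multinomial, chosen here for illustration rather than taken from the abstract) makes the mixture concrete: augmenting the first cell count by a latent split y gives a conjugate Beta posterior π(θ|x,y), and π(y|x) follows by integrating θ out, so π(θ|x) is an exact finite Beta mixture with no MCMC:

```python
from math import comb, exp, lgamma, log

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

# Genetic-linkage counts with cell probabilities
# ((2+theta)/4, (1-theta)/4, (1-theta)/4, theta/4) and a uniform prior.
x1, x2, x3, x4 = 125, 18, 20, 34

# Augment x1 = y + (x1 - y): y counts from the theta/4 part, the rest from
# the constant 1/2 part. Then theta | x, y ~ Beta(y + x4 + 1, x2 + x3 + 1),
# and pi(y | x) comes from integrating theta out of pi(y, theta | x).
log_w = [log(comb(x1, y)) + (x1 - y) * log(0.5) + y * log(0.25)
         + log_beta(y + x4 + 1, x2 + x3 + 1)
         for y in range(x1 + 1)]
m = max(log_w)
w = [exp(lw - m) for lw in log_w]
total = sum(w)
p_y = [wi / total for wi in w]           # exact pi(y | x)

# Exact posterior mean of theta as a weighted average of Beta means.
post_mean = sum(p * (y + x4 + 1) / (y + x4 + 1 + x2 + x3 + 1)
                for y, p in zip(range(x1 + 1), p_y))
```

The normalizing constant of `p_y` is, up to known factors, the marginal likelihood, and predictive probabilities follow by averaging Beta-binomial predictions over `p_y`, matching the by-products mentioned in the abstract.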

    Simulation-based Bayesian inference for epidemic models

    This is the author pre-print version; the final version is available from the publisher via the DOI in this record.

    A powerful and flexible method for fitting dynamic models to missing and censored data is to use the Bayesian paradigm via data-augmented Markov chain Monte Carlo (DA-MCMC). This samples from the joint posterior for the parameters and missing data, but requires high memory overheads for large-scale systems. In addition, designing efficient proposal distributions for the missing data is typically challenging. Pseudo-marginal methods instead integrate across the missing data using a Monte Carlo estimate of the likelihood, generated from multiple independent simulations from the model. These techniques can avoid the high memory requirements of DA-MCMC and, under certain conditions, produce the exact marginal posterior distribution for the parameters. A novel method is presented for implementing importance sampling for dynamic epidemic models, by conditioning the simulations on sets of validity criteria (based on the model structure) as well as the observed data. The flexibility of these techniques is illustrated using both removal-time and final-size data from an outbreak of smallpox. It is shown that these approaches can circumvent the need for reversible-jump MCMC, and can allow inference in situations where DA-MCMC is impossible due to computationally infeasible likelihoods. © 2013 Elsevier B.V. All rights reserved.

    T. J. M. was in part supported by the Department for Environment, Food and Rural Affairs/Higher Education Funding Council for England (grant number VT0105) and a BBSRC grant (BB/I012192/1). J. V. R. was in part supported by the Australian Research Council's Discovery Projects funding scheme (project number DP110102893). R. D. was in part supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada's Discovery Grants Program. A. R. C. was in part supported by the National Medical Research Council (NMRC/HINIR/005/2009) and the NUS Initiative to Improve Health in Asia. The authors would like to thank Andrew Conlan and Theo Kypraios for useful discussions.
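The pseudo-marginal idea described above can be sketched with a deliberately simplified model (a Reed-Frost chain-binomial epidemic and final-size data; the model, observed value, and tuning constants here are illustrative assumptions, not the paper's). The likelihood of the observed final size is replaced by an unbiased Monte Carlo estimate from repeated simulation, and the key implementation detail is that the estimate for the current state is stored and reused, never recomputed:

```python
import random

random.seed(1)

def final_size(p, s0=20, i0=1):
    # Reed-Frost chain binomial: per generation, each susceptible is
    # infected with probability 1 - (1 - p)^i given i current infectives.
    s, i, total = s0, i0, i0
    while i > 0 and s > 0:
        risk = 1 - (1 - p) ** i
        new_i = sum(random.random() < risk for _ in range(s))
        s -= new_i
        i = new_i
        total += new_i
    return total

def like_hat(p, obs, m=200):
    # Unbiased Monte Carlo estimate of P(final size = obs | p).
    return sum(final_size(p) == obs for _ in range(m)) / m

obs = 12                     # hypothetical observed final size
p, lhat = 0.10, 0.0
while lhat == 0.0:           # initialise at a point with a positive estimate
    lhat = like_hat(p, obs)

samples = []
for _ in range(300):
    p_new = p + random.gauss(0.0, 0.02)
    if 0.0 < p_new < 1.0:    # uniform(0, 1) prior: reject outside support
        lhat_new = like_hat(p_new, obs)
        # Pseudo-marginal MH: reuse the stored estimate for the current
        # state; re-estimating it would destroy exactness.
        if lhat_new > 0 and random.random() < lhat_new / lhat:
            p, lhat = p_new, lhat_new
    samples.append(p)
```

Conditioning the simulations on validity criteria, as the paper proposes, would sharpen `like_hat` by discarding simulations that cannot match the data; the plain estimator above is the unconditioned baseline.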

    Sequential Monte Carlo methods for epidemic data

    Epidemics often occur rapidly, with new cases being observed daily. Due to the frequently severe social and economic consequences of an outbreak, this is an area of research that benefits greatly from online inference. This motivates the development of fast, adaptive methods for performing real-time statistical analysis of epidemic data. The aim of this thesis is to develop sequential Monte Carlo (SMC) methods for infectious disease outbreaks. These methods utilise the observed removal times of individuals, obtained throughout the outbreak. The SMC algorithm adaptively generates samples from the evolving posterior distribution, allowing real-time estimation of the parameters underpinning the outbreak. This is achieved by transforming the samples when new data arrive, so that they represent samples from the posterior distribution incorporating all of the data. To assess the performance of the SMC algorithm we additionally develop a novel Markov chain Monte Carlo (MCMC) algorithm, utilising adaptive proposal schemes to improve its mixing. We test the SMC and MCMC algorithms on various simulated outbreaks, finding that the two methods produce comparable results in terms of parameter estimation and disease dynamics; however, due to its parallel nature, the SMC algorithm is computationally much faster. The SMC and MCMC algorithms are applied to the 2001 UK Foot-and-Mouth outbreak, notable for its rapid spread and the control measures required to contain it, which makes it an ideal candidate for real-time analysis. We find good agreement between the two methods, with the SMC algorithm again much quicker than the MCMC algorithm, and the inference matches well with previous work on this data set. Overall, we find that the SMC algorithm developed is suitable for the real-time analysis of an epidemic and is highly competitive with the current gold standard of MCMC methods, whilst being computationally much quicker.
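The sequential reweight-resample pattern underlying such an SMC scheme can be sketched on a toy streaming problem (exponential inter-removal times with a conjugate Gamma prior, chosen so the exact posterior is available as a check; the model and tuning constants are illustrative, not the thesis's). Each arriving observation updates the particle weights, and a resample-move step with a small jitter is triggered when the effective sample size degenerates:

```python
import numpy as np

rng = np.random.default_rng(2)

true_gamma = 0.5
times = rng.exponential(1.0 / true_gamma, size=60)  # streaming inter-removal times

# Particle approximation of the posterior for the removal rate gamma
# under a Gamma(2, 2) prior, updated one observation at a time.
n = 5000
particles = rng.gamma(2.0, 1.0 / 2.0, size=n)       # draws from the prior
logw = np.zeros(n)

a, b = 2.0, 2.0        # exact conjugate posterior, tracked for reference
for t in times:
    logw += np.log(particles) - particles * t       # Exp(gamma) log-likelihood
    a, b = a + 1.0, b + t
    w = np.exp(logw - logw.max())
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)
    if ess < n / 2:                                 # resample-move on degeneracy
        idx = rng.choice(n, size=n, p=w)
        particles = particles[idx]
        particles *= np.exp(0.02 * rng.standard_normal(n))  # small jitter move
        logw = np.zeros(n)

w = np.exp(logw - logw.max())
w /= w.sum()
smc_mean = np.sum(w * particles)
exact_mean = a / b                                  # Gamma(a, b) posterior mean
```

Because each particle's weight update is independent, the per-observation step parallelizes across particles, which is the source of the speed advantage the abstract reports over MCMC.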

    Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies

    Pandemic influenza has the epidemic potential to kill millions of people. While various preventive measures exist (e.g., vaccination and school closures), deciding on strategies that lead to their most effective and efficient use remains challenging. To this end, individual-based epidemiological models are essential to assist decision makers in determining the best strategy to curb epidemic spread. However, individual-based models are computationally intensive, and it is therefore pivotal to identify the optimal strategy using a minimal number of model evaluations. Additionally, as epidemiological modeling experiments need to be planned, a computational budget needs to be specified a priori. Consequently, we present a new sampling technique to optimize the evaluation of preventive strategies using fixed-budget best-arm identification algorithms. We use epidemiological modeling theory to derive knowledge about the reward distribution, which we exploit using Bayesian best-arm identification algorithms (i.e., Top-two Thompson sampling and BayesGap). We evaluate these algorithms in a realistic experimental setting and demonstrate that it is possible to identify the optimal strategy using only a limited number of model evaluations, i.e., two to three times faster than uniform sampling, the predominant technique used for epidemiological decision making in the literature. Finally, we contribute and evaluate a statistic for Top-two Thompson sampling to inform decision makers about the confidence of an arm recommendation.
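Top-two Thompson sampling, one of the two algorithms named above, can be sketched with Bernoulli rewards (the arm probabilities below are illustrative stand-ins for "strategy contains the outbreak" outcomes, not values from the paper). Each arm keeps a Beta posterior; at each step a Thompson draw picks a leader, and with some probability the algorithm instead plays a resampled challenger, forcing exploration of the runner-up:

```python
import random

random.seed(3)

# Hypothetical success probabilities of three mitigation strategies.
true_p = [0.30, 0.50, 0.80]
alpha = [1.0] * 3            # Beta(1, 1) posterior per arm
beta_ = [1.0] * 3
beta_frac = 0.5              # probability of playing the leader directly

def sample_best():
    # One Thompson draw per arm; return the argmax.
    draws = [random.betavariate(alpha[k], beta_[k]) for k in range(3)]
    return max(range(3), key=lambda k: draws[k])

budget = 500                 # fixed evaluation budget, set a priori
for _ in range(budget):
    arm = sample_best()
    if random.random() > beta_frac:
        # Top-two step: resample until a different arm wins and play it
        # (capped attempts; fall back to the leader if none differs).
        for _ in range(100):
            challenger = sample_best()
            if challenger != arm:
                arm = challenger
                break
    reward = random.random() < true_p[arm]   # one model evaluation
    alpha[arm] += reward
    beta_[arm] += 1 - reward

# Recommend the arm with the highest posterior mean.
posterior_means = [alpha[k] / (alpha[k] + beta_[k]) for k in range(3)]
best = max(range(3), key=lambda k: posterior_means[k])
```

In the paper's setting each "pull" would be one run of the individual-based model, which is exactly why spending the fixed budget on promising arms, rather than uniformly, matters.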

    Design and Analysis of Infectious Disease Studies

    The fourth workshop on this theme is devoted to the statistical problems of planning and analyzing studies in infectious disease epidemiology.