595 research outputs found
Optimization Monte Carlo: Efficient and Embarrassingly Parallel Likelihood-Free Inference
We describe an embarrassingly parallel, anytime Monte Carlo method for
likelihood-free models. The algorithm starts with the view that the
stochasticity of the pseudo-samples generated by the simulator can be
controlled externally by a vector of random numbers u, in such a way that the
outcome, knowing u, is deterministic. For each instantiation of u we run an
optimization procedure to minimize the distance between summary statistics of
the simulator and the data. After reweighing these samples using the prior and
the Jacobian (accounting for the change of volume in transforming from the
space of summary statistics to the space of parameters) we show that this
weighted ensemble represents a Monte Carlo estimate of the posterior
distribution. The procedure can be run embarrassingly parallel (each node
handling one sample) and anytime (by allocating resources to the worst
performing sample). The procedure is validated on six experiments.
Comment: NIPS 2015 camera-ready
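The idea above can be sketched on a toy problem. This is a minimal illustration, not the authors' code: we assume a one-dimensional Gaussian simulator whose randomness is a fixed noise vector u, a sample-mean summary statistic, and a broad normal prior; for each draw of u we optimise the distance to the observed summary and weight by the prior over the (finite-difference) Jacobian.

```python
# Hedged toy sketch of Optimization Monte Carlo: simulator, summary
# statistic, and prior are all assumptions made for this example.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(0)

def simulate(theta, u):
    # Simulator with externally controlled randomness u:
    # the output is deterministic once u is fixed.
    return theta + u  # pseudo-samples from N(theta, 1)

def summary(x):
    return np.mean(x)

# "Observed" data generated at theta = 2.0, reduced to its summary
s_obs = summary(simulate(2.0, rng.standard_normal(50)))

prior = norm(0.0, 5.0)          # assumed prior over theta
thetas, weights = [], []
for _ in range(200):            # one optimisation per random-number draw
    u = rng.standard_normal(50)
    res = minimize_scalar(lambda t: (summary(simulate(t, u)) - s_obs) ** 2)
    theta_star = res.x
    # Jacobian d(summary)/d(theta) by central finite differences (scalar case)
    eps = 1e-5
    jac = (summary(simulate(theta_star + eps, u))
           - summary(simulate(theta_star - eps, u))) / (2 * eps)
    thetas.append(theta_star)
    weights.append(prior.pdf(theta_star) / abs(jac))

weights = np.array(weights) / np.sum(weights)
posterior_mean = float(np.dot(weights, thetas))
```

Each loop iteration is independent, which is what makes the scheme embarrassingly parallel: every (u, optimisation) pair could run on its own node.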
Exact Bayesian inference via data augmentation
Data augmentation is a common tool in Bayesian statistics, especially in applications of MCMC. It is used where direct computation of the posterior density, π(θ|x), of the parameters θ given the observed data x is not possible. We show that for a range of problems it is possible to augment the data by y such that π(θ|x,y) is known and π(y|x) can easily be computed. In particular, π(y|x) is obtained by integrating θ out of π(y,θ|x). This allows the exact computation of π(θ|x) as a mixture distribution, without recourse to approximate methods such as MCMC. Useful by-products of the exact posterior distribution are the marginal likelihood of the model and the exact predictive distribution.
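The mixture construction can be made concrete on the classic genetic-linkage model (our choice of illustration, not necessarily the paper's example): counts x have cell probabilities (1/2 + θ/4, (1−θ)/4, (1−θ)/4, θ/4), and augmenting the first cell by a latent split y makes π(θ|x,y) a Beta density, while π(y|x) follows in closed form by integrating θ out.

```python
# Hedged sketch: exact posterior as a finite mixture of Betas via
# data augmentation, on the genetic-linkage example with Rao's data.
import math
from scipy.stats import beta

x1, x2, x3, x4 = 125, 18, 20, 34   # observed multinomial counts
a, b = 1.0, 1.0                    # flat Beta(1, 1) prior on theta

# Augment x1 = y + (x1 - y): given (x, y) the posterior is
# Beta(y + x4 + a, x2 + x3 + b).  Integrating theta out gives the
# exact mixture weights p(y | x) up to a normalising constant.
log_w = []
for y in range(x1 + 1):
    log_w.append(
        math.lgamma(x1 + 1) - math.lgamma(y + 1) - math.lgamma(x1 - y + 1)
        + (x1 - y) * math.log(0.5) + y * math.log(0.25)
        + math.lgamma(y + x4 + a) + math.lgamma(x2 + x3 + b)
        - math.lgamma(y + x4 + a + x2 + x3 + b)
    )
m = max(log_w)
w = [math.exp(lw - m) for lw in log_w]
s = sum(w)
w = [wi / s for wi in w]           # exact p(y | x)

def posterior_pdf(t):
    # Exact posterior density of theta: a mixture of x1 + 1 Betas.
    return sum(wi * beta.pdf(t, y + x4 + a, x2 + x3 + b)
               for y, wi in enumerate(w))

post_mean = sum(wi * (y + x4 + a) / (y + x4 + a + x2 + x3 + b)
                for y, wi in enumerate(w))
```

Because the mixture is finite and exact, the posterior mean (and the marginal likelihood, from the same normalising constant) comes out without any MCMC.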
Simulation-based Bayesian inference for epidemic models
A powerful and flexible method for fitting dynamic models to missing and censored data is to use the Bayesian paradigm via data-augmented Markov chain Monte Carlo (DA-MCMC). This samples from the joint posterior for the parameters and missing data, but requires high memory overheads for large-scale systems. In addition, designing efficient proposal distributions for the missing data is typically challenging. Pseudo-marginal methods instead integrate across the missing data using a Monte Carlo estimate of the likelihood, generated from multiple independent simulations from the model. These techniques can avoid the high memory requirements of DA-MCMC and, under certain conditions, produce the exact marginal posterior distribution for the parameters. A novel method is presented for implementing importance sampling for dynamic epidemic models, by conditioning the simulations on sets of validity criteria (based on the model structure) as well as the observed data. The flexibility of these techniques is illustrated using both removal-time and final-size data from an outbreak of smallpox. It is shown that these approaches can circumvent the need for reversible-jump MCMC, and can allow inference in situations where DA-MCMC is impossible due to computationally infeasible likelihoods.
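The pseudo-marginal idea can be sketched on a minimal final-size example (our illustrative stand-in for the paper's importance-sampling scheme; the Reed-Frost model, constants, and names are assumptions): the likelihood of the observed final size is replaced by an unbiased Monte Carlo estimate, which is then plugged into a Metropolis-Hastings acceptance ratio.

```python
# Hedged pseudo-marginal Metropolis-Hastings sketch for a Reed-Frost
# chain-binomial epidemic observed through its final size only.
import numpy as np

rng = np.random.default_rng(1)
N, OBS_FINAL = 20, 12           # population size, observed final size

def simulate_final_size(p):
    # Reed-Frost: each infective infects each susceptible
    # independently with probability p per generation.
    s, i, total = N - 1, 1, 1
    while i > 0:
        new_i = rng.binomial(s, 1.0 - (1.0 - p) ** i)
        s -= new_i
        total += new_i
        i = new_i
    return total

def lik_hat(p, m=100):
    # Unbiased likelihood estimate for discrete data: the fraction of
    # simulations that reproduce the observed final size.
    return np.mean([simulate_final_size(p) == OBS_FINAL for _ in range(m)])

# Pseudo-marginal MH with a Uniform(0, 1) prior and symmetric proposal
p, lh = 0.1, lik_hat(0.1)
chain = []
for _ in range(1000):
    prop = p + 0.02 * rng.standard_normal()
    if 0.0 < prop < 1.0:
        lh_prop = lik_hat(prop)
        if rng.random() < lh_prop / max(lh, 1e-300):
            p, lh = prop, lh_prop
    chain.append(p)

posterior_mean_p = float(np.mean(chain[200:]))
```

Replacing the exact likelihood by this unbiased estimate still targets the correct marginal posterior, which is the key pseudo-marginal property the abstract relies on; conditioning simulations on validity criteria (as in the paper) would sharpen the estimator further.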
Sequential Monte Carlo methods for epidemic data
Epidemics often occur rapidly, with new cases being observed daily. Due to the frequently severe social and economic consequences of an outbreak, this is an area of research that benefits greatly from online inference. This motivates research into the construction of fast, adaptive methods for performing real-time statistical analysis of epidemic data. The aim of this thesis is to develop sequential Monte Carlo (SMC) methods for infectious disease outbreaks. These methods utilise the observed removal times of individuals, obtained throughout the outbreak. The SMC algorithm adaptively generates samples from the evolving posterior distribution, allowing real-time estimation of the parameters underpinning the outbreak. This is achieved by transforming the samples when new data arrive, so that they represent samples from the posterior distribution incorporating all of the data. To assess the performance of the SMC algorithm we additionally develop a novel Markov chain Monte Carlo (MCMC) algorithm, utilising adaptive proposal schemes to improve its mixing. We test the SMC and MCMC algorithms on various simulated outbreaks, finding that the two methods produce comparable results in terms of parameter estimation and disease dynamics. However, due to the parallel nature of the SMC algorithm, it is computationally much faster. The SMC and MCMC algorithms are applied to the 2001 UK Foot-and-Mouth outbreak, notable for its rapid spread and the control measures required to contain it; this makes it an ideal candidate for real-time analysis. We find good agreement between the two methods, with the SMC algorithm again much quicker than the MCMC algorithm. Additionally, the performed inference matches well with previous work conducted on this data set.
Overall, we find that the SMC algorithm developed is suitable for the real-time analysis of an epidemic and is highly competitive with the current gold standard of MCMC methods, whilst being computationally much quicker.
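The reweight-and-transform idea behind the thesis can be sketched with a resample-move particle update on a deliberately simplified model (exponential inter-removal times with unknown rate; the model, prior, and jitter scheme are our assumptions, not the thesis's algorithm):

```python
# Hedged SMC sketch: sequentially update a posterior over a removal
# rate gamma as removal times arrive one at a time.
import numpy as np

rng = np.random.default_rng(2)
true_gamma = 0.5
inter_removal_times = rng.exponential(1.0 / true_gamma, size=40)

n_particles = 2000
particles = rng.gamma(2.0, 1.0, size=n_particles)   # prior draws for gamma
log_w = np.zeros(n_particles)

for t in inter_removal_times:            # new data point arrives
    # Reweight by the exponential likelihood of the new observation
    log_w += np.log(particles) - particles * t
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)
    if ess < n_particles / 2:            # resample-move when ESS degenerates
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
        # Small multiplicative jitter to restore particle diversity
        particles *= np.exp(0.05 * rng.standard_normal(n_particles))
        log_w[:] = 0.0

w = np.exp(log_w - log_w.max())
w /= w.sum()
posterior_mean_gamma = float(np.dot(w, particles))
```

Each arrival costs one vectorised reweighting over the particle set, which is what makes the approach attractive for online inference and easy to parallelise across particles.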
Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies
Pandemic influenza has the epidemic potential to kill millions of people.
While various preventive measures exist (e.g., vaccination and school
closures), deciding on strategies that lead to their most effective and
efficient use remains challenging. To this end, individual-based
epidemiological models are essential to assist decision makers in determining
the best strategy to curb epidemic spread. However, individual-based models are
computationally intensive and it is therefore pivotal to identify the optimal
strategy using a minimal amount of model evaluations. Additionally, as
epidemiological modeling experiments need to be planned, a computational budget
needs to be specified a priori. Consequently, we present a new sampling
technique to optimize the evaluation of preventive strategies using fixed
budget best-arm identification algorithms. We use epidemiological modeling
theory to derive knowledge about the reward distribution which we exploit using
Bayesian best-arm identification algorithms (i.e., Top-two Thompson sampling
and BayesGap). We evaluate these algorithms in a realistic experimental setting
and demonstrate that it is possible to identify the optimal strategy using only
a limited number of model evaluations, i.e., 2-to-3 times faster compared to
the uniform sampling method, the predominant technique used for epidemiological
decision making in the literature. Finally, we contribute and evaluate a
statistic for Top-two Thompson sampling to inform decision makers about the
confidence of an arm recommendation.
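Top-two Thompson sampling, one of the two algorithms the abstract names, can be sketched for Bernoulli rewards (our stand-in for a stochastic epidemic-model evaluation; all constants and the Beta-Bernoulli setup are assumptions made for the example):

```python
# Hedged sketch of Top-two Thompson sampling for fixed-budget
# best-arm identification with Bernoulli rewards.
import numpy as np

rng = np.random.default_rng(3)
true_means = [0.30, 0.45, 0.60, 0.40]   # unknown mean reward per strategy
K, budget, beta_frac = len(true_means), 600, 0.5
alpha = np.ones(K)                      # Beta(1, 1) posterior per arm
b = np.ones(K)

for _ in range(budget):
    theta = rng.beta(alpha, b)          # one posterior sample per arm
    arm = int(np.argmax(theta))         # candidate "top" arm
    if rng.random() > beta_frac:        # with prob 1 - beta, play a challenger:
        for _ in range(100):            # resample until a different arm tops
            theta = rng.beta(alpha, b)
            if int(np.argmax(theta)) != arm:
                arm = int(np.argmax(theta))
                break
    reward = rng.random() < true_means[arm]   # one model evaluation
    alpha[arm] += reward
    b[arm] += 1 - reward

recommended = int(np.argmax(alpha / (alpha + b)))
```

The resampling step forces exploration of the best challenger rather than pure exploitation, which is what gives the top-two variant its sample-efficiency for identification (as opposed to regret minimisation); the fixed budget is known a priori, matching the abstract's planning constraint.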
Design and Analysis of Infectious Disease Studies
The fourth workshop on this theme is devoted to the statistical problems of planning and analysing studies in infectious disease epidemiology.