20 research outputs found

    Hamiltonian Monte Carlo with energy conserving subsampling

    © 2019 Khue-Dung Dang, Matias Quiroz, Robert Kohn, Minh-Ngoc Tran, Mattias Villani. Hamiltonian Monte Carlo (HMC) samples efficiently from high-dimensional posterior distributions with proposed parameter draws obtained by iterating on a discretized version of the Hamiltonian dynamics. The iterations make HMC computationally costly, especially in problems with large data sets, since it is necessary to compute posterior densities and their derivatives with respect to the parameters. Naively computing the Hamiltonian dynamics on a subset of the data causes HMC to lose its key ability to generate distant parameter proposals with high acceptance probability. The key insight in our article is that efficient subsampling HMC for the parameters is possible if both the dynamics and the acceptance probability are computed from the same data subsample in each complete HMC iteration. We show that this is possible to do in a principled way in an HMC-within-Gibbs framework where the subsample is updated using a pseudo-marginal MH step and the parameters are then updated using an HMC step, based on the current subsample. We show that our subsampling methods are fast and compare favorably to two popular sampling algorithms that use gradient estimates from data subsampling. We also explore the current limitations of subsampling HMC algorithms by varying the quality of the variance-reducing control variates used in the estimators of the posterior density and its gradients.
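The abstract above mentions variance-reducing control variates in the subsampled estimators of the posterior density, but it does not state their form. The following is a minimal sketch of one common choice, a difference estimator of the log-likelihood with second-order Taylor control variates. Everything specific here is an assumption for illustration: the scalar logistic regression model, the expansion point `theta_star`, and the function names are all hypothetical and are not taken from the article.

```python
import math
import random


def log_lik_i(theta, x, y):
    """Log-likelihood of one observation in a scalar logistic regression (toy model)."""
    eta = x * theta
    return y * eta - math.log1p(math.exp(eta))


def control_variate(theta, theta_star, x, y):
    """Second-order Taylor expansion of log_lik_i around theta_star (assumed form)."""
    eta = x * theta_star
    p = 1.0 / (1.0 + math.exp(-eta))
    value = y * eta - math.log1p(math.exp(eta))
    grad = x * (y - p)
    hess = -x * x * p * (1.0 - p)
    delta = theta - theta_star
    return value + grad * delta + 0.5 * hess * delta * delta


def difference_estimate(theta, theta_star, xs, ys, m, rng):
    """Estimate the full-data log-likelihood from m observations drawn with replacement.

    estimate = sum_i q_i(theta) + (n / m) * sum_{i in subsample} (l_i(theta) - q_i(theta))
    """
    n = len(xs)
    # Sum of control variates over all data. It is recomputed here for clarity;
    # because q_i is quadratic in theta, the three Taylor sums could be cached once.
    q_sum = sum(control_variate(theta, theta_star, x, y) for x, y in zip(xs, ys))
    idx = [rng.randrange(n) for _ in range(m)]
    diff_sum = sum(
        log_lik_i(theta, xs[i], ys[i]) - control_variate(theta, theta_star, xs[i], ys[i])
        for i in idx
    )
    return q_sum + n * diff_sum / m
```

The closer `theta` is to `theta_star`, the smaller the differences `l_i - q_i` and the lower the variance of the estimate, which is one way to read the abstract's remark about varying the quality of the control variates.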

    Subsampling sequential Monte Carlo for static Bayesian models

    © 2020, Springer Science+Business Media, LLC, part of Springer Nature. We show how to speed up sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from such as the prior and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis-within-Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate both the usefulness and limitations of the methodology for estimating four generalized linear models and a generalized additive model with large datasets.
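The three steps named in the abstract above (reweighting, resampling, moving) can be sketched on a toy problem. This is an illustration under stated assumptions, not the article's method: it uses the full-data likelihood rather than a subsampled estimator, a random-walk Metropolis move in place of the Hamiltonian Monte Carlo update, a linear temperature ladder, and a hypothetical Gaussian-mean model with a N(0, 2^2) prior. All function names are made up for this sketch.

```python
import math
import random


def log_lik(theta, data):
    """Gaussian log-likelihood with unit variance, up to an additive constant."""
    return -0.5 * sum((y - theta) ** 2 for y in data)


def log_prior(theta, prior_sd=2.0):
    """N(0, prior_sd^2) log-density, up to an additive constant."""
    return -0.5 * (theta / prior_sd) ** 2


def smc_tempering(data, n_particles=500, n_temps=20, n_moves=3, seed=0):
    """Move particles from the prior to the posterior through prior * lik^t, t in [0, 1]."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]  # draws from the prior
    temps = [k / n_temps for k in range(n_temps + 1)]
    for t_prev, t_next in zip(temps[:-1], temps[1:]):
        # Reweight: incremental weight is the likelihood raised to the temperature step.
        logw = [(t_next - t_prev) * log_lik(p, data) for p in particles]
        top = max(logw)
        weights = [math.exp(lw - top) for lw in logw]
        # Resample: multinomial resampling returns an equally weighted cloud.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Move: random-walk Metropolis steps that leave prior * lik^t_next invariant.
        mean = sum(particles) / n_particles
        spread = math.sqrt(sum((p - mean) ** 2 for p in particles) / n_particles)
        step = max(spread, 1e-6)
        for _ in range(n_moves):
            for i, current in enumerate(particles):
                proposal = current + rng.gauss(0.0, step)
                log_ratio = (
                    log_prior(proposal) + t_next * log_lik(proposal, data)
                    - log_prior(current) - t_next * log_lik(current, data)
                )
                if rng.random() < math.exp(min(0.0, log_ratio)):
                    particles[i] = proposal
    return particles
```

The move step dominates the cost here because each proposal evaluates the likelihood over all observations, which is the part the abstract proposes to speed up by data subsampling.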

    Subsampling MCMC - an Introduction for the Survey Statistician

    © 2018, Indian Statistical Institute. The rapid development of computing power and efficient Markov Chain Monte Carlo (MCMC) simulation algorithms have revolutionized Bayesian statistics, making it a highly practical inference method in applied work. However, MCMC algorithms tend to be computationally demanding, and are particularly slow for large datasets. Data subsampling has recently been suggested as a way to make MCMC methods scalable on massively large data, utilizing efficient sampling schemes and estimators from the survey sampling literature. These developments tend to be unknown to many survey statisticians who traditionally work with non-Bayesian methods, and rarely use MCMC. Our article explains the idea of data subsampling in MCMC by reviewing one strand of work, Subsampling MCMC, a so-called Pseudo-Marginal MCMC approach to speeding up MCMC through data subsampling. The review is written for a survey statistician without previous knowledge of MCMC methods since our aim is to motivate survey sampling experts to contribute to the growing Subsampling MCMC literature.

    The Block-Poisson Estimator for Optimally Tuned Exact Subsampling MCMC

    Speeding up Markov chain Monte Carlo (MCMC) for datasets with many observations by data subsampling has recently received considerable attention. A pseudo-marginal MCMC method is proposed that estimates the likelihood by data subsampling using a block-Poisson estimator. The estimator is a product of Poisson estimators, allowing us to update a single block of subsample indicators in each MCMC iteration so that a desired correlation is achieved between the logs of successive likelihood estimates. This is important since pseudo-marginal MCMC with positively correlated likelihood estimates can use substantially smaller subsamples without adversely affecting the sampling efficiency. The block-Poisson estimator is unbiased but not necessarily positive, so the algorithm runs the MCMC on the absolute value of the likelihood estimator and uses an importance sampling correction to obtain consistent estimates of the posterior mean of any function of the parameters. Our article derives guidelines to select the optimal tuning parameters for our method and shows that it compares very favorably to regular MCMC without subsampling, and to two other recently proposed exact subsampling approaches in the literature. Supplementary materials for this article are available online.
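The abstract above describes the estimator only as a product of Poisson estimators in which a single block is refreshed per iteration. The sketch below shows one way such a product can be built so that it is unbiased for exp(d), given an unbiased estimator of d. The per-block form, the Poisson(1) draw counts, the tuning constants `a` and `lam`, and all function names are assumptions for illustration and should be checked against the article before use.

```python
import math
import random


def poisson1(rng):
    """Draw from a Poisson distribution with mean 1 (Knuth's multiplication method)."""
    limit = math.exp(-1.0)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def block_factor(d_hat, a, lam, rng):
    """One block: exp((a + lam) / lam) * prod_{h=1}^{X} (d_hat - a) / lam, X ~ Poisson(1).

    If d_hat(rng) is unbiased for d, the expectation of this factor is exp(d / lam).
    """
    factor = math.exp((a + lam) / lam)
    for _ in range(poisson1(rng)):
        factor *= (d_hat(rng) - a) / lam
    return factor


def block_poisson(d_hat, a, lam, rng):
    """Return lam independent block factors; their product is unbiased for exp(d)."""
    return [block_factor(d_hat, a, lam, rng) for _ in range(lam)]


def refresh_one_block(blocks, d_hat, a, lam, rng):
    """Redraw one randomly chosen block, keeping the others fixed.

    Successive estimates share lam - 1 factors, which induces positive correlation.
    """
    updated = list(blocks)
    updated[rng.randrange(lam)] = block_factor(d_hat, a, lam, rng)
    return updated
```

A factor can be negative whenever a draw of `d_hat` falls below `a`, which is consistent with the abstract's statement that the estimator is unbiased but not necessarily positive. The estimate itself is `math.prod(blocks)`.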

    Practices Concerning Revisional Bariatric Surgery: a Survey of 460 Surgeons.

    No full text
    BACKGROUND: There is currently little evidence available on various aspects of Revisional Bariatric Surgery (RBS) and no published consensus amongst experts. The purpose of this study was to understand variation in practices concerning RBS. METHODS: Bariatric surgeons from around the world who perform RBS were invited to participate in a questionnaire-based survey on SurveyMonkey®. RESULTS: A total of 460 respondents from 62 countries took the survey. For revision after gastric banding, Roux-en-Y gastric bypass (RYGB) (75.5%, n = 345) emerged as the commonest choice followed by sleeve gastrectomy (SG) (56.9%, n = 260) and one anastomosis gastric bypass (OAGB) (37.2%, n = 170). For revision after SG, RYGB (77.7%, n = 355) was the commonest option followed by OAGB (42.45%, n = 194) and re-sleeve (22.32%, n = 102). For revision after RYGB, surgical pouch reduction (49.1%, n = 223), prolongation of bilio-pancreatic limb (30.0%, n = 136), and surgical stoma size reduction (26.43%, n = 120) were the most preferred options. Approximately 90.0% of respondents (n = 406/454) routinely perform an upper gastrointestinal endoscopy before an RBS, and 85.6% (n = 388/453) routinely perform a contrast study. Ninety percent (n = 403/445) reported that the demand for RBS was usually patient-driven, and there was wide variation in criteria used to define successful response, non-responders, and significant weight regain. CONCLUSIONS: This survey is the first attempt to understand various aspects of RBS. The findings will help in identifying areas for research and allow consensus building amongst experts.

    Soyfood and isoflavone intake and risk of type 2 diabetes in Vietnamese adults.

    No full text
    BACKGROUND/OBJECTIVES: Animal studies have demonstrated that soy isoflavones exert antidiabetic effects. However, evidence regarding the association between soyfood intake, a unique source of isoflavones, and type 2 diabetes remains inconclusive. This study assessed the relationship between habitual intakes of soyfoods and major isoflavones and risk of type 2 diabetes in Vietnamese adults. SUBJECTS/METHODS: A hospital-based case-control study was conducted in Vietnam during 2013-2015. A total of 599 newly diagnosed diabetic cases (age 40-65 years) and 599 hospital-based controls, frequency matched by age and sex, were recruited in Hanoi, capital city of Vietnam. Information on frequency and quantity of soyfood and isoflavone intake, together with demographics, habitual diet and lifestyle characteristics, was obtained from direct interviews using a validated and reliable questionnaire. Unconditional logistic regression analyses were performed to examine the association between soy variables and type 2 diabetes risk. RESULTS: Higher intake of total soyfoods was significantly associated with a lower risk of type 2 diabetes; the adjusted odds ratio (OR) for the highest versus the lowest intake was 0.31 (95% confidence interval (CI): 0.21-0.46; P<0.001). An inverse dose-response relationship of similar magnitude was also observed for total isoflavone intake (OR: 0.35; 95% CI: 0.24-0.49; P<0.001). In addition, inverse associations of specific soyfoods (soy milk, tofu and mung bean sprout) and major isoflavones (daidzein, genistein and glycitein) with the type 2 diabetes risk were evident. CONCLUSIONS: Soyfood and isoflavone intake was associated with a lower type 2 diabetes risk in Vietnamese adults. European Journal of Clinical Nutrition advance online publication, 10 May 2017; doi:10.1038/ejcn.2017.76