13,749 research outputs found

    Accelerating MCMC via Parallel Predictive Prefetching

    Full text link
    We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. Our approach exploits fast, iterative approximations to the target density to speculatively evaluate many potential future steps of the chain in parallel. The approach can accelerate computation of the target distribution of a Bayesian inference problem, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores

    An alternative marginal likelihood estimator for phylogenetic models

    Get PDF
    Bayesian phylogenetic methods are generating noticeable enthusiasm in the field of molecular systematics. Many phylogenetic models are often at stake and different approaches are used to compare them within a Bayesian framework. The Bayes factor, defined as the ratio of the marginal likelihoods of two competing models, plays a key role in Bayesian model selection. We focus on an alternative estimator of the marginal likelihood whose computation is still a challenging problem. Several computational solutions have been proposed none of which can be considered outperforming the others simultaneously in terms of simplicity of implementation, computational burden and precision of the estimates. Practitioners and researchers, often led by available software, have privileged so far the simplicity of the harmonic mean estimator (HM) and the arithmetic mean estimator (AM). However it is known that the resulting estimates of the Bayesian evidence in favor of one model are biased and often inaccurate up to having an infinite variance so that the reliability of the corresponding conclusions is doubtful. Our new implementation of the generalized harmonic mean (GHM) idea recycles MCMC simulations from the posterior, shares the computational simplicity of the original HM estimator, but, unlike it, overcomes the infinite variance issue. The alternative estimator is applied to simulated phylogenetic data and produces fully satisfactory results outperforming those simple estimators currently provided by most of the publicly available software

    Likelihood-Free Parallel Tempering

    Full text link
    Approximate Bayesian Computational (ABC) methods (or likelihood-free methods) have appeared in the past fifteen years as useful methods to perform Bayesian analyses when the likelihood is analytically or computationally intractable. Several ABC methods have been proposed: Monte Carlo Markov Chains (MCMC) methods have been developped by Marjoramet al. (2003) and by Bortotet al. (2007) for instance, and sequential methods have been proposed among others by Sissonet al. (2007), Beaumont et al. (2009) and Del Moral et al. (2009). Until now, while ABC-MCMC methods remain the reference, sequential ABC methods have appeared to outperforms them (see for example McKinley et al. (2009) or Sisson et al. (2007)). In this paper a new algorithm combining population-based MCMC methods with ABC requirements is proposed, using an analogy with the Parallel Tempering algorithm (Geyer, 1991). Performances are compared with existing ABC algorithms on simulations and on a real example

    Bayesian inference on compact binary inspiral gravitational radiation signals in interferometric data

    Full text link
    Presented is a description of a Markov chain Monte Carlo (MCMC) parameter estimation routine for use with interferometric gravitational radiational data in searches for binary neutron star inspiral signals. Five parameters associated with the inspiral can be estimated, and summary statistics are produced. Advanced MCMC methods were implemented, including importance resampling and prior distributions based on detection probability, in order to increase the efficiency of the code. An example is presented from an application using realistic, albeit fictitious, data.Comment: submitted to Classical and Quantum Gravity. 14 pages, 5 figure

    Analytic Continuation of Quantum Monte Carlo Data by Stochastic Analytical Inference

    Full text link
    We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data which is strictly based on principles of Bayesian statistical inference. Within this framework we are able to obtain an explicit expression for the calculation of a weighted average over possible energy spectra, which can be evaluated by standard Monte Carlo simulations, yielding as by-product also the distribution function as function of the regularization parameter. Our algorithm thus avoids the usual ad-hoc assumptions introduced in similar algortihms to fix the regularization parameter. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum entropy calculation

    A Bayesian approach to the semi-analytic model of galaxy formation: methodology

    Get PDF
    We believe that a wide range of physical processes conspire to shape the observed galaxy population but we remain unsure of their detailed interactions. The semi-analytic model (SAM) of galaxy formation uses multi-dimensional parameterisations of the physical processes of galaxy formation and provides a tool to constrain these underlying physical interactions. Because of the high dimensionality, the parametric problem of galaxy formation may be profitably tackled with a Bayesian-inference based approach, which allows one to constrain theory with data in a statistically rigorous way. In this paper we develop a SAM in the framework of Bayesian inference. We show that, with a parallel implementation of an advanced Markov-Chain Monte-Carlo algorithm, it is now possible to rigorously sample the posterior distribution of the high-dimensional parameter space of typical SAMs. As an example, we characterise galaxy formation in the current Λ\LambdaCDM cosmology using the stellar mass function of galaxies as an observational constraint. We find that the posterior probability distribution is both topologically complex and degenerate in some important model parameters, suggesting that thorough explorations of the parameter space are needed to understand the models. We also demonstrate that because of the model degeneracy, adopting a narrow prior strongly restricts the model. Therefore, the inferences based on SAMs are conditional to the model adopted. Using synthetic data to mimic systematic errors in the stellar mass function, we demonstrate that an accurate observational error model is essential to meaningful inference.Comment: revised version to match published article published in MNRA

    Priors for Random Count Matrices Derived from a Family of Negative Binomial Processes

    Full text link
    We define a family of probability distributions for random count matrices with a potentially unbounded number of rows and columns. The three distributions we consider are derived from the gamma-Poisson, gamma-negative binomial, and beta-negative binomial processes. Because the models lead to closed-form Gibbs sampling update equations, they are natural candidates for nonparametric Bayesian priors over count matrices. A key aspect of our analysis is the recognition that, although the random count matrices within the family are defined by a row-wise construction, their columns can be shown to be i.i.d. This fact is used to derive explicit formulas for drawing all the columns at once. Moreover, by analyzing these matrices' combinatorial structure, we describe how to sequentially construct a column-i.i.d. random count matrix one row at a time, and derive the predictive distribution of a new row count vector with previously unseen features. We describe the similarities and differences between the three priors, and argue that the greater flexibility of the gamma- and beta- negative binomial processes, especially their ability to model over-dispersed, heavy-tailed count data, makes these well suited to a wide variety of real-world applications. As an example of our framework, we construct a naive-Bayes text classifier to categorize a count vector to one of several existing random count matrices of different categories. The classifier supports an unbounded number of features, and unlike most existing methods, it does not require a predefined finite vocabulary to be shared by all the categories, and needs neither feature selection nor parameter tuning. Both the gamma- and beta- negative binomial processes are shown to significantly outperform the gamma-Poisson process for document categorization, with comparable performance to other state-of-the-art supervised text classification algorithms.Comment: To appear in Journal of the American Statistical Association (Theory and Methods). 31 pages + 11 page supplement, 5 figure
    • …
    corecore