
    Accelerating MCMC via Parallel Predictive Prefetching

    We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. Our approach exploits fast, iterative approximations to the target density to speculatively evaluate many potential future steps of the chain in parallel. It can accelerate computation of the target distribution of a Bayesian inference problem, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, yet produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores.
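The scheme leaves the underlying transition kernel untouched. A minimal serial Metropolis-Hastings sketch (with an illustrative Gaussian target, not the paper's code) shows where the speculative evaluation would enter:

```python
import math
import random

def target_log_density(x):
    # Illustrative standard-normal target; in the paper this would be an
    # expensive Bayesian posterior whose evaluation dominates each step.
    return -0.5 * x * x

def metropolis_hastings(n_steps, x0=0.0, step=1.0, seed=0):
    # Plain serial Metropolis-Hastings. Predictive prefetching reproduces
    # this exact serial trajectory: while one core evaluates the current
    # proposal's density, other cores speculatively evaluate densities for
    # possible *future* proposals (the accept/reject tree), guided by a
    # cheap approximation that predicts which branch will be taken.
    rng = random.Random(seed)
    x, logp = x0, target_log_density(x0)
    chain = [x]
    for _ in range(n_steps):
        prop = x + rng.gauss(0.0, step)
        logp_prop = target_log_density(prop)
        # Standard accept/reject step; this is the branch point that
        # prefetching evaluates speculatively in parallel.
        if math.log(rng.random()) < logp_prop - logp:
            x, logp = prop, logp_prop
        chain.append(x)
    return chain

chain = metropolis_hastings(5000)
```

Because speculation only reorders density evaluations, the accepted states are bit-for-bit identical to the serial chain.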

    Priors for Random Count Matrices Derived from a Family of Negative Binomial Processes

    We define a family of probability distributions for random count matrices with a potentially unbounded number of rows and columns. The three distributions we consider are derived from the gamma-Poisson, gamma-negative binomial, and beta-negative binomial processes. Because the models lead to closed-form Gibbs sampling update equations, they are natural candidates for nonparametric Bayesian priors over count matrices. A key aspect of our analysis is the recognition that, although the random count matrices within the family are defined by a row-wise construction, their columns can be shown to be i.i.d. This fact is used to derive explicit formulas for drawing all the columns at once. Moreover, by analyzing these matrices' combinatorial structure, we describe how to sequentially construct a column-i.i.d. random count matrix one row at a time, and derive the predictive distribution of a new row count vector with previously unseen features. We describe the similarities and differences between the three priors, and argue that the greater flexibility of the gamma- and beta-negative binomial processes, especially their ability to model over-dispersed, heavy-tailed count data, makes them well suited to a wide variety of real-world applications. As an example of our framework, we construct a naive-Bayes text classifier that assigns a count vector to one of several existing random count matrices of different categories. The classifier supports an unbounded number of features and, unlike most existing methods, requires neither a predefined finite vocabulary shared by all categories, nor feature selection, nor parameter tuning. Both the gamma- and beta-negative binomial processes are shown to significantly outperform the gamma-Poisson process for document categorization, with performance comparable to other state-of-the-art supervised text classification algorithms.
    Comment: To appear in the Journal of the American Statistical Association (Theory and Methods). 31 pages + 11-page supplement, 5 figures.
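The column-i.i.d. property can be illustrated with a finite truncation of the gamma-Poisson case: each column receives an independent gamma-distributed rate, and the counts within that column are i.i.d. Poisson given the rate. The truncation level and hyperparameters below are illustrative choices, not the paper's:

```python
import math
import random

def sample_count_matrix(n_rows, n_cols, shape=1.0, scale=1.0, seed=0):
    # Finite-dimensional sketch of a gamma-Poisson random count matrix:
    # column k gets an i.i.d. rate r_k ~ Gamma(shape, scale), and the
    # counts in that column are i.i.d. Poisson(r_k) across rows, so the
    # columns of the resulting matrix are i.i.d., as in the paper's
    # analysis (which handles the unbounded-column limit).
    rng = random.Random(seed)
    rates = [rng.gammavariate(shape, scale) for _ in range(n_cols)]

    def poisson(lam):
        # Knuth's inversion-by-multiplication sampler; adequate for the
        # small rates used in this sketch.
        threshold, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= threshold:
                return k
            k += 1

    # Row-wise construction: each row draws one count per column.
    return [[poisson(r) for r in rates] for _ in range(n_rows)]

M = sample_count_matrix(n_rows=4, n_cols=6)
```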

    Bayesian inference on compact binary inspiral gravitational radiation signals in interferometric data

    We describe a Markov chain Monte Carlo (MCMC) parameter estimation routine for use with interferometric gravitational radiation data in searches for binary neutron star inspiral signals. Five parameters associated with the inspiral can be estimated, and summary statistics are produced. Advanced MCMC methods were implemented, including importance resampling and prior distributions based on detection probability, in order to increase the efficiency of the code. An example is presented from an application using realistic, albeit fictitious, data.
    Comment: Submitted to Classical and Quantum Gravity. 14 pages, 5 figures.
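Importance resampling, one of the efficiency techniques mentioned, can be sketched in a few lines: draw from an easy proposal, weight each draw by the target-to-proposal density ratio, then resample with those weights. The Gaussian target and proposal here are illustrative, not the paper's inspiral posterior:

```python
import math
import random

def importance_resample(samples, log_target, log_proposal, n_out, seed=1):
    # Weight proposal draws by target/proposal (computed stably in log
    # space, up to a common constant), then resample with those weights
    # to obtain approximate draws from the target.
    rng = random.Random(seed)
    logw = [log_target(x) - log_proposal(x) for x in samples]
    m = max(logw)  # subtract the max before exponentiating, for stability
    weights = [math.exp(lw - m) for lw in logw]
    return rng.choices(samples, weights=weights, k=n_out)

# Illustrative example: wide N(0, 2) proposal, N(0, 1) target.
rng = random.Random(0)
draws = [rng.gauss(0.0, 2.0) for _ in range(20000)]
log_t = lambda x: -0.5 * x * x            # N(0, 1), up to a constant
log_q = lambda x: -0.5 * (x / 2.0) ** 2   # N(0, 2), up to a constant
out = importance_resample(draws, log_t, log_q, n_out=5000)
```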

    A Bayesian approach to the semi-analytic model of galaxy formation: methodology

    We believe that a wide range of physical processes conspire to shape the observed galaxy population, but we remain unsure of their detailed interactions. The semi-analytic model (SAM) of galaxy formation uses multi-dimensional parameterisations of the physical processes of galaxy formation and provides a tool to constrain these underlying physical interactions. Because of the high dimensionality, the parametric problem of galaxy formation may be profitably tackled with a Bayesian-inference based approach, which allows one to constrain theory with data in a statistically rigorous way. In this paper we develop a SAM in the framework of Bayesian inference. We show that, with a parallel implementation of an advanced Markov chain Monte Carlo algorithm, it is now possible to rigorously sample the posterior distribution of the high-dimensional parameter space of typical SAMs. As an example, we characterise galaxy formation in the current ΛCDM cosmology using the stellar mass function of galaxies as an observational constraint. We find that the posterior probability distribution is both topologically complex and degenerate in some important model parameters, suggesting that thorough explorations of the parameter space are needed to understand the models. We also demonstrate that, because of the model degeneracy, adopting a narrow prior strongly restricts the model. Therefore, the inferences based on SAMs are conditional on the model adopted. Using synthetic data to mimic systematic errors in the stellar mass function, we demonstrate that an accurate observational error model is essential to meaningful inference.
    Comment: Revised version to match the article published in MNRAS.

    Analytic Continuation of Quantum Monte Carlo Data by Stochastic Analytical Inference

    We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data which is strictly based on principles of Bayesian statistical inference. Within this framework we obtain an explicit expression for the calculation of a weighted average over possible energy spectra, which can be evaluated by standard Monte Carlo simulations, yielding as a by-product the distribution of the regularization parameter. Our algorithm thus avoids the ad hoc assumptions introduced in similar algorithms to fix the regularization parameter. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum entropy calculation.

    Likelihood-Free Parallel Tempering

    Approximate Bayesian Computation (ABC) methods (also called likelihood-free methods) have emerged over the past fifteen years as useful tools for performing Bayesian analyses when the likelihood is analytically or computationally intractable. Several ABC methods have been proposed: Markov chain Monte Carlo (MCMC) methods were developed by Marjoram et al. (2003) and Bortot et al. (2007), for instance, and sequential methods have been proposed by, among others, Sisson et al. (2007), Beaumont et al. (2009), and Del Moral et al. (2009). While ABC-MCMC methods remain the reference, sequential ABC methods have been shown to outperform them (see for example McKinley et al. (2009) or Sisson et al. (2007)). In this paper a new algorithm combining population-based MCMC methods with ABC requirements is proposed, using an analogy with the Parallel Tempering algorithm (Geyer, 1991). Performance is compared with existing ABC algorithms on simulations and on a real example.
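The ABC-MCMC building block the paper extends replaces the intractable likelihood with a simulation step: a proposed parameter is accepted only if data simulated under it produce a summary statistic within a tolerance of the observed one. A minimal sketch in the style of Marjoram et al. (2003), with an illustrative Gaussian model, prior, and summary (none of which are the paper's):

```python
import math
import random

def abc_mcmc(observed_mean, n_steps, eps=0.1, step=0.5, n_sim=100, seed=0):
    # Likelihood-free MCMC: propose a parameter, simulate a dataset under
    # it, and accept only if the simulated summary lands within eps of
    # the observed one. Illustrative assumptions: data ~ N(theta, 1),
    # prior theta ~ N(0, 10), summary = sample mean.
    rng = random.Random(seed)

    def summary(theta):
        return sum(rng.gauss(theta, 1.0) for _ in range(n_sim)) / n_sim

    def log_prior(theta):
        return -0.5 * (theta / 10.0) ** 2

    theta = 0.0
    chain = [theta]
    for _ in range(n_steps):
        prop = theta + rng.gauss(0.0, step)
        # Symmetric random-walk proposal, so the Metropolis-Hastings
        # ratio reduces to the prior ratio, gated by the ABC distance.
        if (abs(summary(prop) - observed_mean) < eps
                and math.log(rng.random()) < log_prior(prop) - log_prior(theta)):
            theta = prop
        chain.append(theta)
    return chain

chain = abc_mcmc(observed_mean=1.0, n_steps=2000)
```

The paper's contribution is to run a population of such chains at different tolerances eps, with parallel-tempering-style swaps between them.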

    Quantum machine learning: a classical perspective

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical machine learning algorithms. Here we review the literature on quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis is placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts, and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, such as how to upload classical data into quantum form, are also addressed.
    Comment: v3, 33 pages; typos corrected and references added.