
    Scalable Rejection Sampling for Bayesian Hierarchical Models

    Bayesian hierarchical modeling is a popular approach to capturing unobserved heterogeneity across individual units. However, standard estimation methods such as Markov chain Monte Carlo (MCMC) can be impracticable for modeling outcomes from a large number of units. We develop a new method to sample from posterior distributions of Bayesian models without using MCMC. Samples are independent, so they can be collected in parallel, and issues such as chain convergence and autocorrelation do not arise. The algorithm is scalable under the weak assumption that individual units are conditionally independent, making it applicable to large datasets. It can also be used to compute marginal likelihoods.

    Distributed Bayesian Matrix Factorization with Limited Communication

    Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices, predicting missing values, and providing confidence intervals. Scaling up posterior inference for massive-scale matrices is challenging and requires distributing both data and computation over many workers, making communication the main computational bottleneck. Embarrassingly parallel inference would remove the need for communication by using completely independent computations on different data subsets, but it suffers from the inherent unidentifiability of BMF solutions. We introduce a hierarchical decomposition of the joint posterior distribution, which couples the subset inferences, allowing for embarrassingly parallel computations in a sequence of at most three stages. Using an efficient approximate implementation, we show improvements empirically on both real and simulated data. Our distributed approach is able to achieve a speed-up of almost an order of magnitude over the full posterior, with a negligible effect on predictive accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC methods in accuracy, and achieves results competitive with other available distributed and parallel implementations of BMF.
    Comment: 28 pages, 8 figures. The paper is published in the Machine Learning journal. An implementation of the method is available in the SMURFF software on GitHub (bmfpp branch): https://github.com/ExaScience/smurf

    New insight on galaxy structure from GALPHAT I. Motivation, methodology, and benchmarks for Sersic models

    We introduce a new galaxy image decomposition tool, GALPHAT (GALaxy PHotometric ATtributes), to provide full posterior probability distributions and reliable confidence intervals for all model parameters. GALPHAT is designed to yield a fast and accurate likelihood computation, using grid interpolation and Fourier rotation. We benchmark this approach using an ensemble of simulated Sersic model galaxies over a wide range of observational conditions: the signal-to-noise ratio S/N, the ratio of galaxy size to the PSF and the image size, and errors in the assumed PSF; and a range of structural parameters: the half-light radius $r_e$ and the Sersic index $n$. We characterise the strength of parameter covariance in the Sersic model, which increases with S/N and $n$, and the results strongly motivate the need for the full posterior probability distribution in galaxy morphology analyses and later inferences. The test results for simulated galaxies successfully demonstrate that, with a careful choice of Markov chain Monte Carlo algorithms and fast model image generation, GALPHAT is a powerful analysis tool for reliably inferring morphological parameters from a large ensemble of galaxies over a wide range of different observational conditions. (abridged)
    Comment: Submitted to MNRAS. The submitted version with high resolution figures can be downloaded from http://www.astro.umass.edu/~iyoon/GALPHAT/galphat1.pd