Scalable Rejection Sampling for Bayesian Hierarchical Models
Bayesian hierarchical modeling is a popular approach to capturing unobserved
heterogeneity across individual units. However, standard estimation methods
such as Markov chain Monte Carlo (MCMC) can be impracticable for modeling
outcomes from a large number of units. We develop a new method to sample from
posterior distributions of Bayesian models, without using MCMC. Samples are
independent, so they can be collected in parallel, and we do not need to be
concerned with issues like chain convergence and autocorrelation. The algorithm
is scalable under the weak assumption that individual units are conditionally
independent, making it applicable for large datasets. It can also be used to
compute marginal likelihoods.
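As a rough illustration of why rejection sampling yields independent posterior draws, the following is a minimal sketch of textbook rejection sampling on a toy Beta-Binomial model. It is not the algorithm from the paper above: the model, the uniform prior, and the function names `log_likelihood` and `rejection_sample_posterior` are all assumptions chosen for illustration. Draws are proposed from the prior and accepted with probability equal to the likelihood divided by its maximum.

```python
import math
import random


def log_likelihood(theta, successes, trials):
    """Binomial log-likelihood (up to a constant) for success probability theta."""
    if theta <= 0.0 or theta >= 1.0:
        return float("-inf")
    return (successes * math.log(theta)
            + (trials - successes) * math.log(1.0 - theta))


def rejection_sample_posterior(successes, trials, n_samples, seed=0):
    """Draw independent posterior samples under a Uniform(0, 1) prior.

    Illustrative assumption: 0 < successes < trials, so the likelihood
    is maximised at the interior point successes / trials.
    """
    if not 0 < successes < trials:
        raise ValueError("this sketch requires 0 < successes < trials")
    rng = random.Random(seed)
    # The maximum of the likelihood bounds the acceptance ratio by 1.
    log_bound = log_likelihood(successes / trials, successes, trials)
    samples = []
    while len(samples) < n_samples:
        theta = rng.random()  # proposal drawn from the uniform prior
        log_accept = log_likelihood(theta, successes, trials) - log_bound
        if rng.random() < math.exp(log_accept):
            samples.append(theta)  # accepted draws do not depend on each other
    return samples


if __name__ == "__main__":
    draws = rejection_sample_posterior(successes=7, trials=20, n_samples=2000)
    print(sum(draws) / len(draws))  # should be near the exact mean 8/22
```

Because each accepted draw depends only on its own proposal, the loop could be split across workers with different seeds and the results concatenated, with no convergence or autocorrelation diagnostics required. The acceptance rate of this naive scheme falls quickly as the data grow, which is the kind of scaling problem a specialised method would need to address.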
Distributed Bayesian Matrix Factorization with Limited Communication
Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank
representations of matrices and for predicting missing values and providing
confidence intervals. Scaling up the posterior inference for massive-scale
matrices is challenging and requires distributing both data and computation
over many workers, making communication the main computational bottleneck.
Embarrassingly parallel inference would remove the communication needed, by
using completely independent computations on different data subsets, but it
suffers from the inherent unidentifiability of BMF solutions. We introduce a
hierarchical decomposition of the joint posterior distribution, which couples
the subset inferences, allowing for embarrassingly parallel computations in a
sequence of at most three stages. Using an efficient approximate
implementation, we show improvements empirically on both real and simulated
data. Our distributed approach is able to achieve a speed-up of almost an order
of magnitude over the full posterior, with a negligible effect on predictive
accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC
methods in accuracy, and achieves results competitive with other available
distributed and parallel implementations of BMF.
Comment: 28 pages, 8 figures. The paper is published in the Machine Learning
journal. An implementation of the method is available in the SMURFF software
on GitHub (bmfpp branch): https://github.com/ExaScience/smurf
New insight on galaxy structure from GALPHAT I. Motivation, methodology, and benchmarks for Sersic models
We introduce a new galaxy image decomposition tool, GALPHAT (GALaxy
PHotometric ATtributes), to provide full posterior probability distributions
and reliable confidence intervals for all model parameters. GALPHAT is designed
to yield a high-speed, accurate likelihood computation, using grid
interpolation and Fourier rotation. We benchmark this approach using an
ensemble of simulated Sersic model galaxies over a wide range of observational
conditions: the signal-to-noise ratio S/N, the ratio of galaxy size to the PSF
and the image size, and errors in the assumed PSF; and a range of structural
parameters: the half-light radius and the Sersic index. We
characterise the strength of parameter covariance in the Sersic model, which
increases with S/N and the Sersic index, and the results strongly motivate the need for the
full posterior probability distribution in galaxy morphology analyses and later
inferences.
The test results for simulated galaxies successfully demonstrate that, with a
careful choice of Markov chain Monte Carlo algorithms and fast model image
generation, GALPHAT is a powerful analysis tool for reliably inferring
morphological parameters from a large ensemble of galaxies over a wide range of
different observational conditions. (abridged)
Comment: Submitted to MNRAS. The submitted version with high-resolution
figures can be downloaded from
http://www.astro.umass.edu/~iyoon/GALPHAT/galphat1.pd