Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification
Gaussian processes are a natural way of defining prior distributions over
functions of one or more input variables. In a simple nonparametric regression
problem, where such a function gives the mean of a Gaussian distribution for an
observed response, a Gaussian process model can easily be implemented using
matrix computations that are feasible for datasets of up to about a thousand
cases. Hyperparameters that define the covariance function of the Gaussian
process can be sampled using Markov chain methods. Regression models where the
noise has a t distribution and logistic or probit models for classification
applications can be implemented by also sampling the latent values
underlying the observations. Software is now available that implements these
methods using covariance functions with hierarchical parameterizations. Models
defined in this way can discover high-level properties of the data, such as
which inputs are relevant to predicting the response.
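The matrix computations mentioned in this abstract can be illustrated with a minimal sketch. It assumes scalar inputs, Gaussian noise, and a squared-exponential covariance with fixed hyperparameters; the abstract instead samples the hyperparameters by Markov chain methods, which is not shown here. The function names `sq_exp_kernel`, `cholesky`, `cholesky_solve`, and `gp_predict_mean` are hypothetical and are not taken from the software the abstract refers to.

```python
import math

def sq_exp_kernel(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed form)."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def cholesky_solve(l, b):
    """Solve (L L^T) x = b by forward, then backward, substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

def gp_predict_mean(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    """Posterior predictive mean k_*^T (K + noise_var * I)^{-1} y."""
    n = len(x_train)
    k = [[sq_exp_kernel(x_train[i], x_train[j], length_scale)
          + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = cholesky_solve(cholesky(k), y_train)
    return [sum(sq_exp_kernel(xs, x_train[i], length_scale) * alpha[i]
                for i in range(n))
            for xs in x_test]

# Toy usage: noisy-free samples of sin(x), predicted at a training input
# and at an unseen input.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [math.sin(x) for x in x_train]
pred = gp_predict_mean(x_train, y_train, [1.0, 2.5])
```

The cost is dominated by the Cholesky factorisation, which scales cubically in the number of cases; this is consistent with the abstract's remark that the approach is feasible for datasets of up to about a thousand cases.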
Computational statistics using the Bayesian Inference Engine
This paper introduces the Bayesian Inference Engine (BIE), a general
parallel, optimised software package for parameter inference and model
selection. This package is motivated by the analysis needs of modern
astronomical surveys and the need to organise and reuse expensive derived data.
The BIE is the first platform for computational statistics designed explicitly
to enable Bayesian update and model comparison for astronomical problems.
Bayesian update is based on the representation of high-dimensional posterior
distributions using metric-ball-tree based kernel density estimation. Among its
algorithmic offerings, the BIE emphasises hybrid tempered MCMC schemes that
robustly sample multimodal posterior distributions in high-dimensional
parameter spaces. Moreover, the BIE implements a full persistence or
serialisation system that stores the full byte-level image of the running
inference and previously characterised posterior distributions for later use.
Two new algorithms to compute the marginal likelihood from the posterior
distribution, developed for and implemented in the BIE, enable model comparison
for complex models and data sets. Finally, the BIE was designed to be a
collaborative platform for applying Bayesian methodology to astronomy. It
includes an extensible, object-oriented framework that implements every
aspect of Bayesian inference. Because the BIE provides a variety of
statistical algorithms for all phases of the inference problem, a scientist may
explore a variety of approaches with a single model and data implementation.
Additional technical details and download details are available from
http://www.astro.umass.edu/bie. The BIE is distributed under the GNU GPL.
Comment: Resubmitted version.
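To illustrate why tempering helps with multimodal posteriors, here is a minimal sketch of generic parallel tempering on a one-dimensional two-mode target. It is not the BIE's hybrid scheme, whose details the abstract does not give; the target density, the temperature ladder `betas`, and the function names `log_target` and `parallel_tempering` are all assumptions made for illustration.

```python
import math
import random

def log_target(x):
    """Log density (up to a constant) of an equal mixture of N(-4, 1) and N(4, 1)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def parallel_tempering(n_steps, betas, step=1.0, seed=0):
    """Run one chain per inverse temperature; return the beta = betas[0] samples."""
    rng = random.Random(seed)
    states = [-4.0 for _ in betas]  # every chain starts in the left mode
    samples = []
    for _ in range(n_steps):
        # Within-chain random-walk Metropolis update targeting target**beta.
        for i, beta in enumerate(betas):
            prop = states[i] + rng.gauss(0.0, step)
            log_acc = beta * (log_target(prop) - log_target(states[i]))
            if rng.random() < math.exp(min(0.0, log_acc)):
                states[i] = prop
        # Propose swapping the states of one random pair of adjacent temperatures.
        i = rng.randrange(len(betas) - 1)
        log_swap = (betas[i] - betas[i + 1]) * (
            log_target(states[i + 1]) - log_target(states[i]))
        if rng.random() < math.exp(min(0.0, log_swap)):
            states[i], states[i + 1] = states[i + 1], states[i]
        samples.append(states[0])
    return samples

# betas[0] = 1.0 is the untempered chain; smaller beta flattens the target.
samples = parallel_tempering(20000, [1.0, 0.5, 0.25, 0.1])
frac_right = sum(1 for s in samples if s > 0.0) / len(samples)
```

The flattened chains cross between modes easily and pass those states down the ladder through swaps, so the untempered chain visits both modes even though it starts in only one.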
A sampling algorithm to estimate the effect of fluctuations in particle physics data
Background properties in experimental particle physics are typically
estimated using large data sets. However, different events can exhibit
different features because of the quantum mechanical nature of the underlying
physics processes. While signal and background fractions in a given data set
can be evaluated using a maximum likelihood estimator, the shapes of the
corresponding distributions are traditionally obtained using high-statistics
control samples, which normally neglects the effect of fluctuations. On the
other hand, if it were possible to subtract background using templates that take
fluctuations into account, this would be expected to improve the resolution of
the observables of interest, and to reduce systematics depending on the
analysis. This study is an initial step in this direction. We propose a novel
algorithm inspired by the Gibbs sampler that makes it possible to estimate the
shapes of signal and background probability density functions from a given
collection of particles, using control sample templates as initial conditions
and refining them to take into account the effect of fluctuations. Results on
Monte Carlo data are presented, and the prospects for future development are
discussed.
Comment: 6 pages, 1 figure. Edited to improve readability in line with the
published article. This is based on a condensed version for publication in
the Proceedings of the International Conference on Mathematical Modelling in
the Physical Sciences, IC-MSQUARE 2012, Budapest, Hungary. A more detailed
discussion can be found in the preceding version of this arXiv record.
Analyze This! A Cosmological Constraint Package for CMBEASY
We introduce a Markov Chain Monte Carlo simulation and data analysis package
that extends the CMBEASY software. We have taken special care in implementing
an adaptive step algorithm for the Markov Chain Monte Carlo in order to improve
convergence. Data analysis routines are provided that allow models of the
Universe to be tested against measurements of the cosmic microwave background,
supernovae Ia and large scale structure. We present constraints on cosmological
parameters derived from these measurements for a ΛCDM cosmology and
discuss the impact of the different observational data sets on the parameters.
The package is publicly available as part of the CMBEASY software at
www.cmbeasy.org.
Comment: Published version, JCAP style, 16 pages, 7 figures.
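The idea of adapting the step size of a Markov Chain Monte Carlo sampler can be sketched as follows. This is a generic random-walk Metropolis sampler that tunes its proposal width toward a target acceptance rate during burn-in only; it is not the adaptive step algorithm implemented in the package, which the abstract does not describe. The one-dimensional Gaussian `log_post`, the target rate of 0.4, the scaling factors, and the name `adaptive_metropolis` are assumptions for illustration.

```python
import math
import random

def log_post(x):
    """Toy log posterior: Gaussian with mean 2.0 and standard deviation 0.5."""
    return -0.5 * ((x - 2.0) / 0.5) ** 2

def adaptive_metropolis(log_post, x0, n_burn, n_keep,
                        step=1.0, target_acc=0.4, batch=50, seed=0):
    """Random-walk Metropolis with step-size tuning during burn-in only."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    accepted = 0
    samples = []
    for it in range(1, n_burn + n_keep + 1):
        prop = x + rng.gauss(0.0, step)
        lp_prop = log_post(prop)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
            accepted += 1
        if it <= n_burn:
            # After each batch, widen the step if too many proposals were
            # accepted and narrow it if too few were.
            if it % batch == 0:
                rate = accepted / batch
                step *= 1.1 if rate > target_acc else 0.9
                accepted = 0
        else:
            # The step is frozen here, so the kept samples come from an
            # ordinary fixed-kernel Metropolis chain.
            samples.append(x)
    return samples, step

samples, final_step = adaptive_metropolis(log_post, 0.0, 5000, 20000)
mean_est = sum(samples) / len(samples)
```

Freezing the step after burn-in is a deliberate simplification: adapting throughout the run requires extra conditions to keep the chain's stationary distribution correct.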
A Shuffled Complex Evolution Metropolis algorithm for optimization and uncertainty assessment of hydrologic model parameters
Markov Chain Monte Carlo (MCMC) methods have become increasingly popular for estimating the posterior probability distribution of parameters in hydrologic models. However, MCMC methods require the a priori definition of a proposal or sampling distribution, which determines the explorative capabilities and efficiency of the sampler and therefore the statistical properties of the Markov Chain and its rate of convergence. In this paper we present an MCMC sampler called the Shuffled Complex Evolution Metropolis algorithm (SCEM-UA), which is well suited to inferring the posterior distribution of hydrologic model parameters. The SCEM-UA algorithm is a modified version of the original SCE-UA global optimization algorithm developed by Duan et al. [1992]. The SCEM-UA algorithm operates by merging the strengths of the Metropolis algorithm, controlled random search, competitive evolution, and complex shuffling in order to continuously update the proposal distribution and evolve the sampler to the posterior target distribution. Three case studies demonstrate that the adaptive capability of the SCEM-UA algorithm significantly reduces the number of model simulations needed to infer the posterior distribution of the parameters compared with traditional Metropolis-Hastings samplers.
Efficient learning in ABC algorithms
Approximate Bayesian Computation has been successfully used in population
genetics to bypass the calculation of the likelihood. These methods provide
accurate estimates of the posterior distribution by comparing the observed
dataset to a sample of datasets simulated from the model. Although
parallelization is easily achieved, computation times for ensuring a suitable
approximation quality of the posterior distribution are still high. To
alleviate the computational burden, we propose an adaptive, sequential
algorithm that runs faster than other ABC algorithms but maintains accuracy of
the approximation. This proposal relies on the sequential Monte Carlo sampler
of Del Moral et al. (2012) but is calibrated to reduce the number of
simulations from the model. The paper concludes with numerical experiments on a
toy example and on a population genetic study of Apis mellifera, where our
algorithm was shown to be faster than traditional ABC schemes.
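The basic comparison of observed and simulated datasets that underlies ABC can be sketched with plain rejection sampling. This is the simplest ABC variant, not the adaptive sequential algorithm the abstract proposes. The Gaussian toy model, the uniform prior on [-5, 5], the sample mean as summary statistic, the tolerance `eps`, and the name `abc_rejection` are all assumptions for illustration.

```python
import random
import statistics

def abc_rejection(observed, n_accept, eps, seed=0):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    rng = random.Random(seed)
    obs_mean = statistics.fmean(observed)
    n = len(observed)
    accepted = []
    while len(accepted) < n_accept:
        theta = rng.uniform(-5.0, 5.0)                    # draw from the prior
        sim = [rng.gauss(theta, 1.0) for _ in range(n)]   # simulate a dataset
        if abs(statistics.fmean(sim) - obs_mean) < eps:   # compare summaries
            accepted.append(theta)
    return accepted

# Toy observed data: 50 draws from a Gaussian with mean 1.5 and unit variance.
data_rng = random.Random(1)
observed = [data_rng.gauss(1.5, 1.0) for _ in range(50)]

posterior = abc_rejection(observed, n_accept=200, eps=0.1)
post_mean = statistics.fmean(posterior)
```

No likelihood is evaluated anywhere; the price is that most simulated datasets are rejected, which is the computational burden that sequential schemes such as the one in the abstract aim to reduce.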