Hidden Gibbs random fields model selection using Block Likelihood Information Criterion
Performing model selection between Gibbs random fields is a very challenging
task. Indeed, due to the Markovian dependence structure, the normalizing
constant of the fields cannot be computed using standard analytical or
numerical methods. Furthermore, such unobserved fields cannot be integrated out
and the likelihood evaluation is a doubly intractable problem. This forms a
central issue in picking the model that best fits the observed data. We introduce a
new approximate version of the Bayesian Information Criterion. We partition the
lattice into contiguous rectangular blocks and we approximate the probability
measure of the hidden Gibbs field by the product of some Gibbs distributions
over the blocks. On that basis, we estimate the likelihood and derive the Block
Likelihood Information Criterion (BLIC) that answers model choice questions
such as the selection of the dependency structure or the number of latent
states. We study the performance of BLIC for these questions. In addition, we
present a comparison with ABC algorithms to show that the novel criterion
offers a better trade-off between time efficiency and reliability of results.
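
As a rough illustration of the block-likelihood idea (a sketch, not the paper's implementation), the following Python code evaluates a BIC-style criterion for a fully observed binary Gibbs field: the lattice is split into disjoint rectangular blocks, each block's normalizing constant is computed exactly by enumeration, and the field's likelihood is approximated by the product of the block distributions. The interaction form, block shape, and penalty are illustrative assumptions.

```python
# Sketch of a block-likelihood BIC for a binary Gibbs field.
# Assumptions (not from the paper): agreement potential, 3x3 blocks,
# free boundaries between blocks, fully observed (noise-free) field.
import itertools
import numpy as np

def _energy(block, beta):
    # Potential: beta times the number of agreeing nearest-neighbour pairs.
    h = np.sum(block[:, :-1] == block[:, 1:])
    v = np.sum(block[:-1, :] == block[1:, :])
    return beta * (h + v)

def _log_Z(beta, bh, bw):
    # Exact normalizing constant by enumerating all 2^(bh*bw) block
    # configurations -- feasible only because blocks are tiny.
    energies = [_energy(np.array(c).reshape(bh, bw), beta)
                for c in itertools.product([0, 1], repeat=bh * bw)]
    m = max(energies)
    return m + np.log(sum(np.exp(e - m) for e in energies))

def block_log_lik(field, beta, bh=3, bw=3):
    # Approximate the log-likelihood by summing exact Gibbs log-probabilities
    # over disjoint bh-by-bw blocks.
    H, W = field.shape
    logZ, ll, nblocks = _log_Z(beta, bh, bw), 0.0, 0
    for i in range(0, H - bh + 1, bh):
        for j in range(0, W - bw + 1, bw):
            ll += _energy(field[i:i+bh, j:j+bw], beta)
            nblocks += 1
    return ll - nblocks * logZ

def blic(field, beta_grid, n_params=1):
    # BIC-style criterion: maximized block likelihood plus a penalty.
    ll = max(block_log_lik(field, b) for b in beta_grid)
    return -2.0 * ll + n_params * np.log(field.size)

rng = np.random.default_rng(0)
field = rng.integers(0, 2, size=(30, 30))        # toy "observed" field
print(blic(field, beta_grid=np.linspace(0.0, 1.0, 11)))
```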
Pre-processing for approximate Bayesian computation in image analysis
Most of the existing algorithms for approximate Bayesian computation (ABC)
assume that it is feasible to simulate pseudo-data from the model at each
iteration. However, the computational cost of these simulations can be
prohibitive for high dimensional data. An important example is the Potts model,
which is commonly used in image analysis. Images encountered in real world
applications can have millions of pixels, therefore scalability is a major
concern. We apply ABC with a synthetic likelihood to the hidden Potts model
with additive Gaussian noise. Using a pre-processing step, we fit a binding
function to model the relationship between the model parameters and the
synthetic likelihood parameters. Our numerical experiments demonstrate that the
precomputed binding function dramatically improves the scalability of ABC,
reducing the average runtime required for model fitting from 71 hours to only 7
minutes. We also illustrate the method by estimating the smoothing parameter
for remotely sensed satellite imagery. Without precomputation, Bayesian
inference is impractical for datasets of that scale.
Comment: 5th IMS-ISBA joint meeting (MCMSki IV)
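
A minimal sketch of the pre-computed binding-function idea, using a cheap toy simulator in place of the hidden Potts model: summary statistics are simulated on a parameter grid once, regressions for their mean and standard deviation (the binding function) are fitted, and the Gaussian synthetic likelihood is then evaluated at inference time with no further simulation. All names and the polynomial form of the binding function are illustrative assumptions.

```python
# Sketch of binding-function pre-processing for synthetic-likelihood ABC.
import numpy as np

rng = np.random.default_rng(1)

def simulate_summary(theta, n=50):
    # Toy stand-in for the (expensive) model simulator: n draws of a
    # scalar summary statistic under parameter theta.
    return theta**2 + (0.1 + 0.3 * theta) * rng.standard_normal(n)

# --- Pre-processing: fit the binding function on a parameter grid ---
grid = np.linspace(0.1, 2.0, 40)
mu_emp = np.array([simulate_summary(t).mean() for t in grid])
sd_emp = np.array([simulate_summary(t).std(ddof=1) for t in grid])
mu_fit = np.polynomial.Polynomial.fit(grid, mu_emp, deg=3)
sd_fit = np.polynomial.Polynomial.fit(grid, sd_emp, deg=3)

# --- Inference: Gaussian synthetic likelihood, no fresh simulation ---
def log_synth_lik(theta, s_obs):
    mu, sd = mu_fit(theta), max(sd_fit(theta), 1e-6)
    return -0.5 * ((s_obs - mu) / sd) ** 2 - np.log(sd)

s_obs = simulate_summary(1.3).mean()     # pretend observed summary statistic
thetas = np.linspace(0.1, 2.0, 400)
post = np.exp([log_synth_lik(t, s_obs) for t in thetas])
post /= post.sum()
print("posterior mean:", np.sum(thetas * post))
```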
Bayesian Parameter Estimation for Latent Markov Random Fields and Social Networks
Undirected graphical models are widely used in statistics, physics and
machine vision. However, Bayesian parameter estimation for undirected models is
extremely challenging, since evaluation of the posterior typically involves the
calculation of an intractable normalising constant. This problem has received
much attention, but very little of this has focussed on the important practical
case where the data consists of noisy or incomplete observations of the
underlying hidden structure. This paper specifically addresses this problem,
comparing two alternative methodologies. In the first of these approaches,
particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently
explore the parameter space, combined with the exchange algorithm (Murray et
al., 2006) for avoiding the calculation of the intractable normalising constant
(a proof showing that this combination targets the correct distribution is
found in a supplementary appendix online). This approach is compared with
approximate Bayesian computation (Pritchard et al., 1999). Applications to
estimating the parameters of Ising models and exponential random graphs from
noisy data are presented. Each algorithm used in the paper targets an
approximation to the true posterior, owing to the use of MCMC to simulate from
the latent graphical model in place of exact simulation, which is not possible
in general.
The supplementary appendix also describes the nature of the resulting
approximation.
Comment: 26 pages, 2 figures, accepted in Journal of Computational and Graphical Statistics (http://www.amstat.org/publications/jcgs.cfm)
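
The exchange move at the heart of the first approach can be sketched for the simplified case of a fully observed Ising field (the paper's setting adds noisy observations and particle MCMC on top). The auxiliary draw below uses a few Gibbs sweeps rather than a perfect sampler, so, as the paper itself notes, the chain targets an approximation to the true posterior; lattice size, prior, and tuning constants are illustrative.

```python
# Sketch of the exchange algorithm (Murray et al., 2006) for the coupling
# parameter of a small Ising model, fully observed case.
import numpy as np

rng = np.random.default_rng(2)

def suff_stat(x):
    # Sufficient statistic: number of agreeing nearest-neighbour pairs.
    return np.sum(x[:, :-1] == x[:, 1:]) + np.sum(x[:-1, :] == x[1:, :])

def gibbs_sample(beta, shape=(12, 12), sweeps=20):
    # Approximate draw from the Ising model via single-site Gibbs updates
    # (this is the approximation mentioned in the abstract).
    x = rng.integers(0, 2, size=shape)
    H, W = shape
    nbrs = lambda i, j: [(a, b) for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                         if 0 <= a < H and 0 <= b < W]
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                ns = nbrs(i, j)
                s = sum(x[a, b] for a, b in ns)       # neighbours equal to 1
                p1 = np.exp(beta * s)                 # agreements if x_ij = 1
                p0 = np.exp(beta * (len(ns) - s))     # agreements if x_ij = 0
                x[i, j] = rng.random() < p1 / (p0 + p1)
    return x

def exchange(y, n_iter=150, step=0.1):
    beta, S_y, trace = 0.5, suff_stat(y), []
    for _ in range(n_iter):
        prop = beta + step * rng.standard_normal()
        if prop > 0:                                  # flat prior on (0, inf)
            w = gibbs_sample(prop, y.shape)           # auxiliary variable
            # Unnormalized ratio; the intractable constants cancel exactly.
            log_a = (prop - beta) * (S_y - suff_stat(w))
            if np.log(rng.random()) < log_a:
                beta = prop
        trace.append(beta)
    return np.array(trace)

y = gibbs_sample(0.7)                                 # synthetic observed field
print("posterior mean of beta:", exchange(y)[50:].mean())
```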
Bayesian Computation with Intractable Likelihoods
This article surveys computational methods for posterior inference with
intractable likelihoods, that is, where the likelihood function is unavailable
in closed form, or where evaluation of the likelihood is infeasible. We review
recent developments in pseudo-marginal methods, approximate Bayesian
computation (ABC), the exchange algorithm, thermodynamic integration, and
composite likelihood, paying particular attention to advancements in
scalability for large datasets. We also mention R and MATLAB source code for
implementations of these algorithms, where they are available.
Comment: arXiv admin note: text overlap with arXiv:1503.0806
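
As an illustration of the pseudo-marginal idea the survey covers, the sketch below runs Metropolis-Hastings with an unbiased importance-sampling estimate of the likelihood, in a toy latent-variable model where the exact posterior is known. The model and tuning constants are illustrative, not from the survey.

```python
# Sketch of a pseudo-marginal Metropolis-Hastings sampler. Toy model:
# y | z ~ N(z, 1), z | theta ~ N(theta, 1), prior theta ~ N(0, 1),
# so the exact posterior mean is y_obs / 3.
import numpy as np

rng = np.random.default_rng(3)
y_obs = 1.5

def lik_hat(theta, n=30):
    # Unbiased Monte Carlo estimate of p(y_obs | theta): average the
    # conditional density over draws z_i ~ N(theta, 1).
    z = theta + rng.standard_normal(n)
    return np.mean(np.exp(-0.5 * (y_obs - z) ** 2) / np.sqrt(2 * np.pi))

def pseudo_marginal_mh(n_iter=5000, step=0.8):
    theta, L, trace = 0.0, lik_hat(0.0), []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal()
        L_prop = lik_hat(prop)               # fresh estimate for the proposal
        # The noisy estimate L is recycled for the current state; this is
        # what makes the chain target the exact posterior.
        log_a = (np.log(L_prop) - np.log(L)
                 - 0.5 * prop**2 + 0.5 * theta**2)
        if np.log(rng.random()) < log_a:
            theta, L = prop, L_prop
        trace.append(theta)
    return np.array(trace)

trace = pseudo_marginal_mh()
print("posterior mean:", trace.mean())       # exact value: y_obs / 3 = 0.5
```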
Sequential Bayesian inference for implicit hidden Markov models and current limitations
Hidden Markov models can describe time series arising in various fields of
science, by treating the data as noisy measurements of an arbitrarily complex
Markov process. Sequential Monte Carlo (SMC) methods have become standard tools
to estimate the hidden Markov process given the observations and a fixed
parameter value. We review some of the recent developments allowing the
inclusion of parameter uncertainty as well as model uncertainty. The
shortcomings of the currently available methodology are emphasised from an
algorithmic complexity perspective. The statistical objects of interest for
time series analysis are illustrated on a toy "Lotka-Volterra" model used in
population ecology. Some open challenges are discussed regarding the
scalability of the reviewed methodology to longer time series,
higher-dimensional state spaces and more flexible models.
Comment: Review article written for ESAIM: Proceedings and Surveys. 25 pages, 10 figures
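
For readers new to SMC, the sketch below implements a bootstrap particle filter, the basic building block this review starts from, on a standard toy nonlinear state-space model rather than the paper's Lotka-Volterra example. It returns filtered state means and an estimate of the log-likelihood; the model and constants are illustrative.

```python
# Sketch of a bootstrap particle filter on the classic toy model
# x_t = 0.9 x_{t-1} + 8 cos(1.2 t) / (1 + x_{t-1}^2) + v_t,
# y_t = x_t^2 / 20 + w_t, with Gaussian noise v_t, w_t.
import numpy as np

rng = np.random.default_rng(4)
T, sig_v, sig_w = 50, 1.0, 1.0

# Simulate synthetic observations from the model.
x, ys = 0.0, []
for t in range(T):
    x = 0.9 * x + 8 * np.cos(1.2 * t) / (1 + x**2) + sig_v * rng.standard_normal()
    ys.append(x**2 / 20 + sig_w * rng.standard_normal())

def bootstrap_filter(ys, n_part=1000):
    parts = rng.standard_normal(n_part)          # initial particles
    loglik, means = 0.0, []
    for t, y in enumerate(ys):
        # Propagate through the state transition (the "bootstrap" proposal).
        parts = (0.9 * parts + 8 * np.cos(1.2 * t) / (1 + parts**2)
                 + sig_v * rng.standard_normal(n_part))
        # Weight by the observation density, accumulate the likelihood,
        # then resample multinomially.
        logw = -0.5 * ((y - parts**2 / 20) / sig_w) ** 2
        w = np.exp(logw - logw.max())
        loglik += (np.log(w.mean()) + logw.max()
                   - 0.5 * np.log(2 * np.pi * sig_w**2))
        w /= w.sum()
        means.append(np.sum(w * parts))
        parts = parts[rng.choice(n_part, size=n_part, p=w)]
    return np.array(means), loglik

means, loglik = bootstrap_filter(ys)
print("log-likelihood estimate:", loglik)
```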
Reliable ABC model choice via random forests
Approximate Bayesian computation (ABC) methods provide an elaborate approach
to Bayesian inference on complex models, including model choice. Both
theoretical arguments and simulation experiments indicate, however, that model
posterior probabilities may be poorly evaluated by standard ABC techniques. We
propose a novel approach based on a machine learning tool named random forests
to conduct selection among the highly complex models covered by ABC algorithms.
We thus modify the way Bayesian model selection is both understood and
carried out: we rephrase the inferential goal as a classification problem,
first predicting with random forests the model that best fits the data, and
then deferring the approximation of the posterior probability of the predicted
model to a second stage, also based on random forests. Compared with earlier
implementations of ABC model choice, the ABC random forest approach offers
several potential improvements: (i) it often has a larger discriminative power
among the competing models, (ii) it is more robust against the number and
choice of statistics summarizing the data, (iii) the computing effort is
drastically reduced (with a gain in computational efficiency of at least a factor of fifty),
and (iv) it includes an approximation of the posterior probability of the
selected model. The use of random forests will undoubtedly extend the range of
dataset sizes and model complexities that ABC can handle. We illustrate
the power of this novel methodology by analyzing controlled experiments as well
as genuine population genetics datasets. The proposed methodologies are
implemented in the R package abcrf, available on CRAN.
Comment: 39 pages, 15 figures, 6 tables
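
A minimal sketch of ABC model choice via a random forest classifier, written in Python with scikit-learn rather than the paper's abcrf package: a reference table is simulated from two toy models under their priors, a forest is trained to predict the model index from summary statistics, and the fitted forest classifies the observed summaries. (The paper's second stage estimates the posterior probability of the selected model with an additional regression forest; the class votes printed here are only a crude proxy.)

```python
# Sketch of random-forest ABC model choice with two toy models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)

def summaries(sample):
    # A few simple summary statistics of the simulated dataset.
    return [sample.mean(), sample.std(ddof=1), np.median(sample)]

def simulate(model, n=100):
    # Model 0: Gaussian; model 1: Laplace (same mean, same variance).
    # The location parameter is drawn from its prior.
    mu = rng.normal(0, 2)
    x = rng.normal(mu, 1, n) if model == 0 else rng.laplace(mu, 1 / np.sqrt(2), n)
    return summaries(x)

# Reference table: model indices drawn from the model prior, then simulated.
labels = rng.integers(0, 2, size=5000)
table = np.array([simulate(m) for m in labels])

# Train the classifier on the reference table.
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(table, labels)

# Classify pretend observed data (generated here from the Laplace model).
y_obs = summaries(rng.laplace(1.0, 1 / np.sqrt(2), 100))
print("predicted model:", rf.predict([y_obs])[0])
print("class votes (crude posterior proxy):", rf.predict_proba([y_obs])[0])
```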