Distributed Parameter Estimation in Probabilistic Graphical Models
This paper presents foundational theoretical results on distributed parameter
estimation for undirected probabilistic graphical models. It introduces a
general condition on composite likelihood decompositions of these models which
guarantees the global consistency of distributed estimators, provided the local
estimators are consistent.
Bayesian model selection for exponential random graph models via adjusted pseudolikelihoods
Models with intractable likelihood functions arise in areas including network
analysis and spatial statistics, especially those involving Gibbs random
fields. Posterior parameter estimation in these settings is termed a
doubly-intractable problem because both the likelihood function and the
posterior distribution are intractable. The comparison of Bayesian models is
often based on the statistical evidence, the integral of the un-normalised
posterior distribution over the model parameters which is rarely available in
closed form. For doubly-intractable models, estimating the evidence adds
another layer of difficulty. Consequently, the selection of the model that best
describes an observed network among a collection of exponential random graph
models for network analysis is a daunting task. Pseudolikelihoods offer a
tractable approximation to the likelihood but should be treated with caution
because they can lead to an unreasonable inference. This paper specifies a
method to adjust pseudolikelihoods in order to obtain a reasonable, yet
tractable, approximation to the likelihood. This allows implementation of
widely used computational methods for evidence estimation and pursuit of
Bayesian model selection of exponential random graph models for the analysis of
social networks. Empirical comparisons to existing methods show that our
procedure yields similar evidence estimates, but at a lower computational cost.
Comment: Supplementary material attached. To view attachments, please download
and extract the gzipped source file listed under "Other formats".
Targeting Bayes factors with direct-path non-equilibrium thermodynamic integration
Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates. In the present article, we propose a thermodynamic integration scheme that directly targets the log Bayes factor. The method is based on a modified annealing path between the posterior distributions of the two models compared, which systematically avoids the high-variance prior regime. We combine this scheme with the concept of non-equilibrium TI to minimise discretisation errors from numerical integration. Results obtained on Bayesian regression models applied to standard benchmark data, and a complex hierarchical model applied to biopathway inference, demonstrate a significant reduction in estimator variance over state-of-the-art TI methods.
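As an illustration of the standard prior-to-posterior path that the abstract takes as its starting point (not the direct-path Bayes factor scheme the paper proposes), the following is a minimal sketch of thermodynamic integration on a toy conjugate model. Everything here is an assumption made for the example: the model (y_i ~ N(theta, 1) with prior theta ~ N(0, 1)), the fifth-power temperature ladder, the trapezoid rule, and the function names. The model is chosen because its power posteriors are Gaussian and can be sampled exactly, and because the exact log evidence is available for comparison.

```python
import numpy as np

def log_lik(theta, y):
    """Log-likelihood of y_i ~ N(theta, 1) for each value in theta (shape (N,))."""
    resid = y[None, :] - theta[:, None]
    return -0.5 * y.size * np.log(2 * np.pi) - 0.5 * np.sum(resid**2, axis=1)

def ti_log_evidence(y, n_temps=50, n_samples=2000, seed=0):
    """Estimate log p(y) as the integral over t in [0, 1] of E_t[log p(y | theta)]."""
    rng = np.random.default_rng(seed)
    # Assumed ladder: the power schedule places more temperatures near the prior (t = 0),
    # where the integrand changes fastest and is most variable.
    temps = np.linspace(0.0, 1.0, n_temps + 1) ** 5
    means = np.empty_like(temps)
    for k, t in enumerate(temps):
        # Power posterior p_t(theta) ∝ p(y | theta)^t p(theta) is Gaussian in this toy model.
        var = 1.0 / (1.0 + t * y.size)
        mean = t * y.sum() * var
        theta = rng.normal(mean, np.sqrt(var), size=n_samples)
        means[k] = log_lik(theta, y).mean()
    # Trapezoid rule over the temperature ladder.
    return float(np.sum(0.5 * (means[1:] + means[:-1]) * np.diff(temps)))

def exact_log_evidence(y):
    """Closed form: marginally, y ~ N(0, I + 11^T)."""
    n, s = y.size, y.sum()
    quad = np.sum(y**2) - s**2 / (1 + n)
    return float(-0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(1 + n) - 0.5 * quad)

y = np.random.default_rng(1).normal(1.0, 1.0, size=20)
print(ti_log_evidence(y), exact_log_evidence(y))
```

In this sketch the samples drawn at temperatures near t = 0 have by far the largest spread in log-likelihood, which is the "prior regime" variability the abstract refers to.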
Accelerating delayed-acceptance Markov chain Monte Carlo algorithms
Delayed-acceptance Markov chain Monte Carlo (DA-MCMC) samples from a
probability distribution via a two-stage version of the Metropolis-Hastings
algorithm, by combining the target distribution with a "surrogate" (i.e. an
approximate and computationally cheaper version) of said distribution. DA-MCMC
accelerates MCMC sampling in complex applications, while still targeting the
exact distribution. We design a computationally faster, albeit approximate,
DA-MCMC algorithm. We consider parameter inference in a Bayesian setting where
a surrogate likelihood function is introduced in the delayed-acceptance scheme.
When the evaluation of the likelihood function is computationally intensive,
our scheme produces a 2-4 times speed-up, compared to standard DA-MCMC.
However, the acceleration is highly problem dependent. Inference results for
the standard delayed-acceptance algorithm and our approximated version are
similar, indicating that our algorithm can return reliable Bayesian inference.
As a computationally intensive case study, we introduce a novel stochastic
differential equation model for protein folding data.
Comment: 40 pages, 21 figures, 10 tables
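The two-stage accept/reject mechanism of standard delayed-acceptance MCMC can be sketched as follows. This is a minimal illustration of the standard, exact scheme, not the accelerated approximate variant proposed in the paper. The target (a standard normal), the deliberately mismatched surrogate, the random-walk proposal, and all function names are assumptions made for the example.

```python
import numpy as np

def log_target(x):
    """Stand-in for the expensive exact log density (unnormalised standard normal)."""
    return -0.5 * x**2

def log_surrogate(x):
    """Cheap, deliberately inexact approximation of the target (hypothetical)."""
    return -0.5 * ((x - 0.3) / 1.3) ** 2

def da_mcmc(n_iter=40000, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.0
    lt, ls = log_target(x), log_surrogate(x)
    chain = np.empty(n_iter)
    n_expensive = 0  # number of exact-target evaluations
    for i in range(n_iter):
        prop = x + step * rng.normal()  # symmetric random-walk proposal
        ls_prop = log_surrogate(prop)
        # Stage 1: screen the proposal using the surrogate only.
        if np.log(rng.uniform()) < ls_prop - ls:
            lt_prop = log_target(prop)
            n_expensive += 1
            # Stage 2: correct for the surrogate error so the chain targets the exact density.
            if np.log(rng.uniform()) < (lt_prop - lt) - (ls_prop - ls):
                x, lt, ls = prop, lt_prop, ls_prop
        chain[i] = x
    return chain, n_expensive

chain, n_expensive = da_mcmc()
print(chain.mean(), chain.var(), n_expensive / chain.size)
```

Proposals rejected at stage 1 never reach the exact target, which is where the saving comes from when the exact likelihood is costly. The stage 2 ratio divides out the surrogate, so the chain still samples the exact distribution despite the surrogate's bias.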
Linear and Parallel Learning of Markov Random Fields
We introduce a new embarrassingly parallel parameter learning algorithm for
Markov random fields with untied parameters which is efficient for a large
class of practical models. Our algorithm parallelizes naturally over cliques
and, for graphs of bounded degree, its complexity is linear in the number of
cliques. Unlike its competitors, our algorithm is fully parallel and for
log-linear models it is also data efficient, requiring only the local
sufficient statistics of the data to estimate parameters.
Hidden Gibbs random fields model selection using Block Likelihood Information Criterion
Performing model selection between Gibbs random fields is a very challenging
task. Indeed, due to the Markovian dependence structure, the normalizing
constant of the fields cannot be computed using standard analytical or
numerical methods. Furthermore, such unobserved fields cannot be integrated out
and the likelihood evaluation is a doubly intractable problem. This is a
central obstacle to picking the model that best fits the observed data. We introduce a
new approximate version of the Bayesian Information Criterion. We partition the
lattice into continuous rectangular blocks and we approximate the probability
measure of the hidden Gibbs field by the product of some Gibbs distributions
over the blocks. On that basis, we estimate the likelihood and derive the Block
Likelihood Information Criterion (BLIC) that answers model choice questions
such as the selection of the dependency structure or the number of latent
states. We study the performance of BLIC for those questions. In addition, we
present a comparison with ABC algorithms to point out that the novel criterion
offers a better trade-off between time efficiency and reliable results.