
    Distributed Parameter Estimation in Probabilistic Graphical Models

    This paper presents foundational theoretical results on distributed parameter estimation for undirected probabilistic graphical models. It introduces a general condition on composite likelihood decompositions of these models that guarantees the global consistency of distributed estimators, provided the local estimators are consistent.

    Bayesian model selection for exponential random graph models via adjusted pseudolikelihoods

    Models with intractable likelihood functions arise in areas including network analysis and spatial statistics, especially those involving Gibbs random fields. Posterior parameter estimation in these settings is termed a doubly-intractable problem because both the likelihood function and the posterior distribution are intractable. The comparison of Bayesian models is often based on the statistical evidence, the integral of the un-normalised posterior distribution over the model parameters, which is rarely available in closed form. For doubly-intractable models, estimating the evidence adds another layer of difficulty. Consequently, selecting the model that best describes an observed network from a collection of exponential random graph models is a daunting task. Pseudolikelihoods offer a tractable approximation to the likelihood but should be treated with caution because they can lead to unreasonable inference. This paper specifies a method to adjust pseudolikelihoods in order to obtain a reasonable, yet tractable, approximation to the likelihood. This allows implementation of widely used computational methods for evidence estimation and pursuit of Bayesian model selection of exponential random graph models for the analysis of social networks. Empirical comparisons to existing methods show that our procedure yields similar evidence estimates, but at a lower computational cost.

    Targeting Bayes factors with direct-path non-equilibrium thermodynamic integration

    Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates. In the present article, we propose a thermodynamic integration scheme that directly targets the log Bayes factor. The method is based on a modified annealing path between the posterior distributions of the two models compared, which systematically avoids the high-variance prior regime. We combine this scheme with the concept of non-equilibrium TI to minimise discretisation errors from numerical integration. Results obtained on Bayesian regression models applied to standard benchmark data, and a complex hierarchical model applied to biopathway inference, demonstrate a significant reduction in estimator variance over state-of-the-art TI methods.
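
    For orientation, the standard prior-to-posterior TI estimator mentioned in the first sentence of this abstract can be sketched as follows. This is not the direct-path scheme the paper proposes. It is a minimal illustration under assumed ingredients: a toy conjugate model (prior N(0, 1), one observation y with likelihood N(theta, 1)) chosen so that every power posterior can be sampled exactly, a power-law temperature ladder with exponent 5, and trapezoidal integration. All names (Y_OBS, sample_power_posterior, and so on) are hypothetical.

```python
import math
import random

Y_OBS = 1.5  # hypothetical single observation for the toy model


def log_likelihood(theta, y=Y_OBS):
    # Log density of y under N(theta, 1).
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (y - theta) ** 2


def sample_power_posterior(t, rng, y=Y_OBS):
    # Prior N(0, 1) times likelihood**t is N(t*y/(1+t), 1/(1+t)),
    # so the toy model allows exact draws at every temperature t.
    var = 1.0 / (1.0 + t)
    return rng.gauss(t * y * var, math.sqrt(var))


def thermodynamic_integration(n_temps=30, n_samples=2000, seed=0):
    rng = random.Random(seed)
    # Assumed ladder: temperatures concentrated near the prior (t = 0).
    temps = [(i / (n_temps - 1)) ** 5 for i in range(n_temps)]
    means = []
    for t in temps:
        total = 0.0
        for _ in range(n_samples):
            total += log_likelihood(sample_power_posterior(t, rng))
        # Monte Carlo estimate of E_t[log L(theta)].
        means.append(total / n_samples)
    # Trapezoidal rule over the ladder approximates the log evidence.
    return sum(
        0.5 * (temps[i + 1] - temps[i]) * (means[i] + means[i + 1])
        for i in range(n_temps - 1)
    )


def exact_log_evidence(y=Y_OBS):
    # Marginally y ~ N(0, 2) in this toy model.
    return -0.5 * math.log(4.0 * math.pi) - y * y / 4.0
```

    Comparing thermodynamic_integration() with exact_log_evidence() gives a quick check of the estimator on a case where the answer is known in closed form.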

    Accelerating delayed-acceptance Markov chain Monte Carlo algorithms

    Delayed-acceptance Markov chain Monte Carlo (DA-MCMC) samples from a probability distribution via a two-stage version of the Metropolis-Hastings algorithm, by combining the target distribution with a "surrogate" (i.e. an approximate and computationally cheaper version) of said distribution. DA-MCMC accelerates MCMC sampling in complex applications, while still targeting the exact distribution. We design a computationally faster, albeit approximate, DA-MCMC algorithm. We consider parameter inference in a Bayesian setting where a surrogate likelihood function is introduced in the delayed-acceptance scheme. When the evaluation of the likelihood function is computationally intensive, our scheme produces a 2-4 times speed-up, compared to standard DA-MCMC. However, the acceleration is highly problem-dependent. Inference results for the standard delayed-acceptance algorithm and our approximated version are similar, indicating that our algorithm can return reliable Bayesian inference. As a computationally intensive case study, we introduce a novel stochastic differential equation model for protein folding data. Comment: 40 pages, 21 figures, 10 tables
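
    The standard two-stage scheme described in the first sentence of this abstract can be sketched as below. This is the exact delayed-acceptance sampler with a symmetric random-walk proposal, not the accelerated approximate variant the paper introduces. The target (a standard normal), the surrogate (a wider normal), and all function names are assumptions made for illustration.

```python
import math
import random


def log_target(x):
    # Stand-in for an expensive log density: standard normal, unnormalised.
    return -0.5 * x * x


def log_surrogate(x):
    # Hypothetical cheap approximation: normal with variance 1.5.
    return -0.5 * x * x / 1.5


def da_mcmc(n_steps, step=2.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    lt_x = log_target(x)
    samples = []
    expensive_calls = 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        ls_diff = log_surrogate(y) - log_surrogate(x)
        # Stage 1: screen the proposal using only the cheap surrogate.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < ls_diff:
            lt_y = log_target(y)
            expensive_calls += 1
            # Stage 2: correct with the target-to-surrogate ratio, which
            # keeps the exact target as the stationary distribution.
            if math.log(1.0 - rng.random()) < (lt_y - lt_x) - ls_diff:
                x, lt_x = y, lt_y
        samples.append(x)
    return samples, expensive_calls
```

    The saving comes from stage 1: proposals rejected by the surrogate never trigger an evaluation of the expensive target, so expensive_calls stays below the number of iterations.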

    Linear and Parallel Learning of Markov Random Fields

    We introduce a new embarrassingly parallel parameter learning algorithm for Markov random fields with untied parameters which is efficient for a large class of practical models. Our algorithm parallelizes naturally over cliques and, for graphs of bounded degree, its complexity is linear in the number of cliques. Unlike its competitors, our algorithm is fully parallel and for log-linear models it is also data efficient, requiring only the local sufficient statistics of the data to estimate parameters.

    Hidden Gibbs random fields model selection using Block Likelihood Information Criterion

    Performing model selection between Gibbs random fields is a very challenging task. Indeed, due to the Markovian dependence structure, the normalizing constant of the fields cannot be computed using standard analytical or numerical methods. Furthermore, such unobserved fields cannot be integrated out, and the likelihood evaluation is a doubly intractable problem. This is a central obstacle to picking the model that best fits the observed data. We introduce a new approximate version of the Bayesian Information Criterion. We partition the lattice into continuous rectangular blocks and we approximate the probability measure of the hidden Gibbs field by the product of some Gibbs distributions over the blocks. On that basis, we estimate the likelihood and derive the Block Likelihood Information Criterion (BLIC), which answers model choice questions such as the selection of the dependency structure or the number of latent states. We study the performance of BLIC for those questions. In addition, we present a comparison with ABC algorithms to point out that the novel criterion offers a better trade-off between time efficiency and reliable results.