    Bayesian model comparison with un-normalised likelihoods

    Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalizing constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes factors (BFs) for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.
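
    The idea of random weight importance sampling can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a conjugate Gaussian model (prior θ ~ N(0, 1), likelihood y ~ N(θ, 1)) whose evidence is known exactly, and it mimics an intractable likelihood by multiplying the exact likelihood by mean-one log-normal noise. The names `normal_pdf` and `random_weight_evidence` and all default settings are illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def random_weight_evidence(y, n_samples=20000, noise_sd=0.5, seed=0):
    """Estimate the evidence p(y) by importance sampling from the prior.

    Each weight uses a noisy but unbiased likelihood estimate in place of
    the exact likelihood, so the average of the weights stays unbiased.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        theta = rng.gauss(0.0, 1.0)           # draw from the prior N(0, 1)
        exact_lik = normal_pdf(y, theta, 1.0)  # stand-in for an intractable term
        # Mean-one log-normal noise: E[exp(s*Z - s^2/2)] = 1 for Z ~ N(0, 1).
        noise = math.exp(noise_sd * rng.gauss(0.0, 1.0) - 0.5 * noise_sd ** 2)
        total += exact_lik * noise             # random (unbiased) weight
    return total / n_samples


if __name__ == "__main__":
    y_obs = 0.7
    print("estimate:", random_weight_evidence(y_obs))
    print("exact:   ", normal_pdf(y_obs, 0.0, 2.0))  # evidence is N(y; 0, 2)
```

    A Bayes factor estimate would then be the ratio of two such evidence estimates, one per model. The noise inflates the variance of the estimate without biasing it, which is the property the unbiased variants rely on.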

    Bayesian model selection for exponential random graph models via adjusted pseudolikelihoods

    Models with intractable likelihood functions arise in areas including network analysis and spatial statistics, especially those involving Gibbs random fields. Posterior parameter estimation in these settings is termed a doubly-intractable problem because both the likelihood function and the posterior distribution are intractable. The comparison of Bayesian models is often based on the statistical evidence, the integral of the un-normalised posterior distribution over the model parameters, which is rarely available in closed form. For doubly-intractable models, estimating the evidence adds another layer of difficulty. Consequently, the selection of the model that best describes an observed network among a collection of exponential random graph models for network analysis is a daunting task. Pseudolikelihoods offer a tractable approximation to the likelihood but should be treated with caution because they can lead to an unreasonable inference. This paper specifies a method to adjust pseudolikelihoods in order to obtain a reasonable, yet tractable, approximation to the likelihood. This allows implementation of widely used computational methods for evidence estimation and pursuit of Bayesian model selection of exponential random graph models for the analysis of social networks. Empirical comparisons to existing methods show that our procedure yields similar evidence estimates, but at a lower computational cost. Comment: Supplementary material attached. To view attachments, please download and extract the gzipped source file listed under "Other formats
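
    For orientation, the unadjusted pseudolikelihood that the abstract starts from can be sketched as follows. This is only the standard approximation, not the paper's adjustment: it assumes an undirected graph stored as a list of neighbour sets and a two-statistic model (edges and triangles), and the function names are hypothetical. Each dyad contributes a logistic term whose linear predictor is the parameter vector times the change statistics for toggling that dyad.

```python
import math
from itertools import combinations


def change_statistics(adj, i, j):
    """Change in (edge count, triangle count) when dyad (i, j) goes 0 -> 1."""
    common_neighbours = len(adj[i] & adj[j])  # each one closes a triangle
    return (1.0, float(common_neighbours))


def log_pseudolikelihood(adj, theta):
    """Sum over dyads of log p(y_ij | rest of the graph, theta)."""
    total = 0.0
    for i, j in combinations(range(len(adj)), 2):
        delta = change_statistics(adj, i, j)
        eta = sum(t * d for t, d in zip(theta, delta))  # log-odds of the edge
        if j in adj[i]:
            total += -math.log1p(math.exp(-eta))  # log sigmoid(eta)
        else:
            total += -math.log1p(math.exp(eta))   # log (1 - sigmoid(eta))
    return total


if __name__ == "__main__":
    # Hypothetical 4-node graph: a triangle on nodes 0, 1, 2 plus an isolate.
    graph = [{1, 2}, {0, 2}, {0, 1}, set()]
    print(log_pseudolikelihood(graph, (-0.5, 0.2)))
```

    With the triangle parameter set to zero, the dyads are independent and this expression coincides with the exact log-likelihood of the edge-only model; the approximation only becomes questionable once dependence terms are active.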

    Quantifying dimensionality: Bayesian cosmological model complexities

    We demonstrate a measure for the effective number of parameters constrained by a posterior distribution in the context of cosmology. In the same way that the mean of the Shannon information (i.e. the Kullback-Leibler divergence) provides a measure of the strength of constraint between prior and posterior, we show that the variance of the Shannon information gives a measure of dimensionality of constraint. We examine this quantity in a cosmological context, applying it to likelihoods derived from Cosmic Microwave Background, large scale structure and supernovae data. We show that this measure of Bayesian model dimensionality compares favourably both analytically and numerically in a cosmological context with the existing measure of model complexity used in the literature. Comment: 14 pages, 9 figures. v2: updates post peer-review. v3: typographical correction to equation 3
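
    The two moments of the Shannon information mentioned above can be checked numerically on a toy posterior. The sketch below assumes a d-dimensional unit Gaussian posterior and a flat prior of constant log-density (a hypothetical value, with truncation at the prior edges ignored), and it normalises the dimensionality as twice the variance so that a d-dimensional Gaussian returns d. The function names are illustrative.

```python
import math
import random


def gaussian_log_density(x):
    """Log-density of a unit-covariance Gaussian centred at the origin."""
    d = len(x)
    return -0.5 * sum(v * v for v in x) - 0.5 * d * math.log(2.0 * math.pi)


def information_moments(log_ratio_samples):
    """Return (mean, 2 * variance) of log(posterior / prior) samples.

    The mean is the Kullback-Leibler divergence; twice the variance is the
    dimensionality measure under the normalisation assumed here.
    """
    n = len(log_ratio_samples)
    mean = sum(log_ratio_samples) / n
    var = sum((v - mean) ** 2 for v in log_ratio_samples) / n
    return mean, 2.0 * var


def demo(dim=3, n_samples=50000, log_prior_density=-10.0, seed=1):
    """Draw posterior samples and evaluate both moments."""
    rng = random.Random(seed)
    log_ratios = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # posterior sample
        log_ratios.append(gaussian_log_density(x) - log_prior_density)
    return information_moments(log_ratios)


if __name__ == "__main__":
    kl, dimensionality = demo()
    print("KL divergence:", kl)
    print("dimensionality:", dimensionality)
```

    For this Gaussian case the log ratio equals a constant minus half a chi-squared variable with d degrees of freedom, whose variance is d/2, so the second returned value should sit close to the chosen `dim`.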

    Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys

    Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometric errors and parameter degeneracies, the redshift and type distributions can be recovered robustly thanks to the hierarchical nature of the model, which is not possible with common photometric redshift estimation techniques. As a result, redshift uncertainties can be fully propagated in cosmological analyses for the first time, fulfilling an essential requirement for the current and future generations of surveys. Comment: 10 pages, matches version accepted in MNRAS, including new appendix describing the effect of Bayesian shrinkage in a simplified settin

    A Profile Likelihood Analysis of the Constrained MSSM with Genetic Algorithms

    The Constrained Minimal Supersymmetric Standard Model (CMSSM) is one of the simplest and most widely-studied supersymmetric extensions to the standard model of particle physics. Nevertheless, current data do not sufficiently constrain the model parameters in a way completely independent of priors, statistical measures and scanning techniques. We present a new technique for scanning supersymmetric parameter spaces, optimised for frequentist profile likelihood analyses and based on Genetic Algorithms. We apply this technique to the CMSSM, taking into account existing collider and cosmological data in our global fit. We compare our method to the MultiNest algorithm, an efficient Bayesian technique, paying particular attention to the best-fit points and implications for particle masses at the LHC and dark matter searches. Our global best-fit point lies in the focus point region. We find many high-likelihood points in both the stau co-annihilation and focus point regions, including a previously neglected section of the co-annihilation region at large m_0. We show that there are many high-likelihood points in the CMSSM parameter space commonly missed by existing scanning techniques, especially at high masses. This has a significant influence on the derived confidence regions for parameters and observables, and can dramatically change the entire statistical inference of such scans. Comment: 47 pages, 8 figures; Fig. 8, Table 7 and more discussions added to Sec. 3.4.2 in response to referee's comments; accepted for publication in JHE

    Delayed acceptance ABC-SMC

    Approximate Bayesian computation (ABC) is now an established technique for statistical inference used in cases where the likelihood function is computationally expensive or not available. It relies on the use of a model that is specified in the form of a simulator, and approximates the likelihood at a parameter value θ by simulating auxiliary data sets x and evaluating the distance of x from the true data y. However, ABC is not computationally feasible in cases where using the simulator for each θ is very expensive. This paper investigates this situation in cases where a cheap, but approximate, simulator is available. The approach is to employ delayed acceptance Markov chain Monte Carlo (MCMC) within an ABC sequential Monte Carlo (SMC) sampler in order to, in a first stage of the kernel, use the cheap simulator to rule out parts of the parameter space that are not worth exploring, so that the "true" simulator is only run (in the second stage of the kernel) where there is a reasonable chance of accepting proposed values of θ. We show that this approach can be used quite automatically, with few tuning parameters. Applications to stochastic differential equation models and latent doubly intractable distributions are presented.
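
    The two-stage screening idea can be sketched with a plain ABC-MCMC chain; this is a simplified illustration in the spirit of delayed acceptance, not the paper's ABC-SMC sampler. Everything here is assumed for the example: a uniform prior on [-5, 5], a symmetric random-walk proposal, two toy simulators (`cheap_simulator` is slightly biased, `expensive_simulator` plays the role of the true model), and the tolerances. With these choices a proposal is accepted only if both simulated data sets land within tolerance, so the chain targets the ABC posterior defined by both conditions.

```python
import random


def cheap_simulator(theta, rng):
    """Hypothetical fast approximation: slightly biased version of the model."""
    return theta + 0.1 + rng.gauss(0.0, 1.0)


def expensive_simulator(theta, rng):
    """Hypothetical stand-in for the costly true simulator."""
    return theta + rng.gauss(0.0, 1.0)


def da_abc_mcmc(y_obs, n_iters=5000, eps_cheap=1.5, eps=0.5, step=1.0, seed=0):
    """Run a two-stage ABC-MCMC chain and count expensive simulator calls."""
    rng = random.Random(seed)
    theta = y_obs          # convenient starting point for this toy model
    chain = []
    expensive_calls = 0
    for _ in range(n_iters):
        proposal = theta + rng.gauss(0.0, step)  # symmetric random walk
        if abs(proposal) <= 5.0:                 # inside the uniform prior
            # Stage 1: screen the proposal with the cheap simulator.
            if abs(cheap_simulator(proposal, rng) - y_obs) < eps_cheap:
                # Stage 2: only now pay for the expensive simulator.
                expensive_calls += 1
                if abs(expensive_simulator(proposal, rng) - y_obs) < eps:
                    theta = proposal
        chain.append(theta)
    return chain, expensive_calls


if __name__ == "__main__":
    samples, calls = da_abc_mcmc(y_obs=1.0)
    print("posterior mean estimate:", sum(samples) / len(samples))
    print("expensive simulator calls:", calls, "of", len(samples), "iterations")
```

    The saving shows up as the gap between the number of iterations and the number of expensive calls; how large that gap is depends on how well the cheap simulator tracks the expensive one and on the first-stage tolerance.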