18 research outputs found

    Pseudo-Marginal Bayesian Inference for Gaussian Processes

    The main challenges that arise when adopting Gaussian Process priors in probabilistic modeling are how to carry out exact Bayesian inference and how to account for uncertainty on model parameters when making model-based predictions on out-of-sample data. Using probit regression as an illustrative working example, this paper presents a general and effective methodology based on the pseudo-marginal approach to Markov chain Monte Carlo that efficiently addresses both of these issues. The results presented in this paper show improvements over existing sampling methods in simulating from the posterior distribution over the parameters defining the covariance function of the Gaussian Process prior. This is particularly important as it offers a powerful tool for full Bayesian inference of Gaussian Process-based hierarchical statistical models in general. The results also demonstrate that Monte Carlo integration over all model parameters is feasible in this class of models, providing a superior quantification of uncertainty in predictions. Extensive comparisons with state-of-the-art probabilistic classifiers confirm this assertion. Comment: 14 pages, double column.
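    To make the mechanism concrete, here is a minimal sketch of a pseudo-marginal random-walk Metropolis step, assuming user-supplied callbacks `log_prior` and `log_lik_hat` (the log of a non-negative, unbiased estimate of the intractable likelihood); the names and the plain random-walk proposal are illustrative, not the paper's exact scheme:

```python
import numpy as np

def pseudo_marginal_mh(theta0, log_prior, log_lik_hat, n_iter=5000, step=0.1, rng=None):
    """Pseudo-marginal random-walk Metropolis. The intractable likelihood is
    replaced by a non-negative unbiased estimate (log_lik_hat returns its log);
    re-using the stored estimate for the current state is what keeps the chain
    targeting the exact posterior over theta."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    cur_ll = log_lik_hat(theta, rng)          # noisy estimate at the current state
    samples = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.shape)  # random-walk proposal
        prop_ll = log_lik_hat(prop, rng)      # fresh estimate at the proposal
        log_alpha = (prop_ll + log_prior(prop)) - (cur_ll + log_prior(theta))
        if np.log(rng.uniform()) < log_alpha:
            theta, cur_ll = prop, prop_ll     # accept: the estimate travels with the state
        samples.append(theta.copy())
    return np.array(samples)
```

    Note that the stored estimate must travel with the state: re-estimating the likelihood at the current state between iterations would change the invariant distribution.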

    Looking Good With Flickr Faves: Gaussian Processes for Finding Difference Makers in Personality Impressions

    Flickr allows its users to generate galleries of "faves", i.e., pictures that they have tagged as favourite. According to recent studies, the faves are predictive of the personality traits that people attribute to Flickr users. This article investigates the phenomenon and shows that faves allow one to predict whether a Flickr user is perceived to be above the median or not with respect to each of the Big-Five traits (accuracy up to 79%, depending on the trait). The classifier, based on Gaussian Processes with a new kernel designed for this work, makes it possible to identify the visual characteristics of faves that best account for the prediction outcome.

    Optimal scaling for the pseudo-marginal random walk Metropolis: insensitivity to the noise generating mechanism

    We examine the optimal scaling and the efficiency of the pseudo-marginal random walk Metropolis algorithm using a recently derived result on the limiting efficiency as the dimension d → ∞. We prove that the optimal scaling for a given target varies by less than 20% across a wide range of distributions for the noise in the estimate of the target, and that any scaling within 20% of the optimal one will be at least 70% efficient. We demonstrate that this phenomenon occurs even outside the range of distributions for which we rigorously prove it. We then conduct a simulation study on an example with d = 10 where importance sampling is used to estimate the target density; we also examine results available from an existing simulation study with d = 5 where a particle filter was used. Our key conclusions are found to hold in these examples as well. Comment: New version: simulation study now on a real statistical example (confusing typos corrected).
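    One practical consequence, sketched below under stated assumptions: because any scale within 20% of optimal retains at least 70% of peak efficiency, a crude Robbins-Monro recursion suffices for tuning. The `run_chain` helper is hypothetical; it is assumed to run a short pseudo-marginal chain at a given proposal scale and return the observed acceptance rate:

```python
import numpy as np

def tune_rwm_scale(run_chain, scale0=1.0, target_accept=0.07, rounds=20, iters=2000):
    """Crude Robbins-Monro tuning of the random-walk scale toward a target
    acceptance rate. For noiseless RWM the classical target is ~0.234; with a
    noisy pseudo-marginal target the optimum is much lower (roughly 0.07 under
    Gaussian noise in the log-likelihood estimate). The paper's insensitivity
    result means this tuning does not need to be precise."""
    log_scale = np.log(scale0)
    for r in range(1, rounds + 1):
        accept_rate = run_chain(np.exp(log_scale), iters)  # observed acceptance at this scale
        log_scale += (accept_rate - target_accept) / r     # decreasing Robbins-Monro gain
    return np.exp(log_scale)
```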

    Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)

    In applications of Gaussian processes where quantification of uncertainty is of primary interest, it is necessary to accurately characterize the posterior distribution over covariance parameters. This paper proposes an adaptation of the Stochastic Gradient Langevin Dynamics algorithm to draw samples from the posterior distribution over covariance parameters with negligible bias and without the need to compute the marginal likelihood. In Gaussian process regression, this has the enormous advantage that stochastic gradients can be computed by solving linear systems only. A novel unbiased linear systems solver based on parallelizable covariance matrix-vector products is developed to accelerate the unbiased estimation of gradients. The results demonstrate that scalable and exact (in a Monte Carlo sense) quantification of uncertainty in Gaussian processes is possible without imposing any special structure on the covariance or reducing the number of input vectors. Comment: 10 pages; paper accepted at ICML 2015.
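    A sketch of why linear solves suffice: in GP regression the gradient of the log marginal likelihood with respect to a covariance parameter θ is (1/2)(aᵀ(∂K/∂θ)a − tr(K⁻¹ ∂K/∂θ)) with a = K⁻¹y, and the trace term admits an unbiased Hutchinson estimate from Rademacher probe vectors. The sketch below uses plain conjugate gradients for the solves, whereas the paper's ULISSE solver removes the residual bias that CG truncation introduces:

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def stochastic_grad_log_marglik(K, dK, y, n_probes=4, rng=None):
    """Estimate d/dtheta log p(y | theta) = 0.5 * (a^T dK a - tr(K^{-1} dK)),
    a = K^{-1} y, for a GP with covariance K and derivative dK = dK/dtheta.
    The trace is estimated with Rademacher probes; solves use CG, so the
    estimate is unbiased only up to the CG tolerance (ULISSE addresses this)."""
    rng = np.random.default_rng() if rng is None else rng
    n = y.shape[0]
    Kop = LinearOperator((n, n), matvec=lambda v: K @ v)  # only matrix-vector products needed
    a, _ = cg(Kop, y)                        # a = K^{-1} y
    data_term = a @ (dK @ a)
    trace_est = 0.0
    for _ in range(n_probes):
        r = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe vector
        s, _ = cg(Kop, dK @ r)               # s = K^{-1} dK r
        trace_est += r @ s                   # E[r^T K^{-1} dK r] = tr(K^{-1} dK)
    return 0.5 * (data_term - trace_est / n_probes)
```

    A stochastic gradient of this form can then drive a Stochastic Gradient Langevin Dynamics update, theta += 0.5 * eps * (grad_log_prior + grad_log_marglik) + sqrt(eps) * gaussian_noise.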

    Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families

    We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities where classical HMC is not an option due to intractable gradients, KMC adaptively learns the target's gradient structure by fitting an exponential family model in a Reproducing Kernel Hilbert Space. Computational costs are reduced by two novel efficient approximations to this gradient. While being asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and offers substantial mixing improvements over state-of-the-art gradient-free samplers. We support our claims with experimental studies on both toy and real-world applications, including Approximate Bayesian Computation and exact-approximate MCMC. Comment: 20 pages, 7 figures.
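    The exactness mechanism can be sketched independently of the kernel fit: the leapfrog integrator may use any surrogate gradient field, here an abstract `grad_surrogate` standing in for the score of the fitted kernel exponential family, because the resulting proposal is still volume-preserving and reversible; the Metropolis correction then uses the exact log target:

```python
import numpy as np

def surrogate_gradient_hmc_step(x, log_target, grad_surrogate, eps=0.1, n_leap=20, rng=None):
    """One HMC-style transition in which leapfrog uses a surrogate gradient
    (in KMC, the score of a kernel exponential-family fit to past samples)
    while the accept/reject step evaluates the exact log target, so the
    chain remains valid even when the surrogate is poor."""
    rng = np.random.default_rng() if rng is None else rng
    p = rng.standard_normal(x.shape)                 # fresh Gaussian momentum
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * eps * grad_surrogate(x_new)       # half step on momentum
    for _ in range(n_leap - 1):
        x_new += eps * p_new                         # full step on position
        p_new += eps * grad_surrogate(x_new)         # full step on momentum
    x_new += eps * p_new
    p_new += 0.5 * eps * grad_surrogate(x_new)       # final half step
    # Metropolis correction with the exact target and Gaussian kinetic energy
    log_alpha = (log_target(x_new) - 0.5 * p_new @ p_new) - (log_target(x) - 0.5 * p @ p)
    return (x_new, True) if np.log(rng.uniform()) < log_alpha else (x, False)
```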

    Physiological Gaussian Process Priors for the Hemodynamics in fMRI Analysis

    Background: Inference from fMRI data faces the challenge that the hemodynamic system relating neural activity to the observed BOLD fMRI signal is unknown. New Method: We propose a new Bayesian model for task fMRI data with the following features: (i) joint estimation of brain activity and the underlying hemodynamics, (ii) the hemodynamics is modeled nonparametrically with a Gaussian process (GP) prior guided by physiological information, and (iii) the predicted BOLD is not necessarily generated by a linear time-invariant (LTI) system. We place a GP prior directly on the predicted BOLD response, rather than on the hemodynamic response function as in previous literature. This allows us to incorporate physiological information via the GP prior mean in a flexible way, and simultaneously gives us the nonparametric flexibility of the GP. Results: Results on simulated data show that the proposed model is able to discriminate between active and non-active voxels even when the GP prior deviates from the true hemodynamics. Our model finds time-varying dynamics when applied to real fMRI data. Comparison with Existing Method(s): The proposed model is better at detecting activity in simulated data than standard models, without inflating the false positive rate. When applied to real fMRI data, our GP model in several cases finds brain activity where previously proposed LTI models do not. Conclusions: We have proposed a new non-linear model for the hemodynamics in task fMRI that is able to detect active voxels and gives the opportunity to ask new kinds of questions related to hemodynamics. Comment: 18 pages, 14 figures.
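    A minimal sketch of the modelling idea only (inference is not shown): the GP prior sits directly on the predicted BOLD time course, with its mean given by a physiologically informed prediction, supplied here as a hypothetical `mean_bold` array (e.g. a canonical HRF convolved with the stimulus), and a squared-exponential covariance supplying the nonparametric flexibility:

```python
import numpy as np

def gp_prior_bold_draws(t, mean_bold, lengthscale=2.0, variance=0.1, n_draws=3, rng=None):
    """Draw sample paths from a GP prior placed directly on the predicted BOLD
    response: mean = physiologically informed prediction, covariance =
    squared exponential. Illustrates the prior, not the paper's estimator."""
    rng = np.random.default_rng() if rng is None else rng
    diff = t[:, None] - t[None, :]
    K = variance * np.exp(-0.5 * (diff / lengthscale) ** 2)
    L = np.linalg.cholesky(K + 1e-8 * np.eye(len(t)))   # jitter for stability
    z = rng.standard_normal((len(t), n_draws))
    return mean_bold[None, :] + (L @ z).T               # one draw per row
```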

    Analysis of the Gibbs sampler for hierarchical inverse problems

    Many inverse problems arising in applications come from continuum models where the unknown parameter is a field. In practice the unknown field is discretized, resulting in a problem in R^N, with an understanding that refining the discretization, that is increasing N, will often be desirable. In the context of Bayesian inversion this situation suggests the importance of two issues: (i) defining hyper-parameters in such a way that they are interpretable in the continuum limit N → ∞ and so that their values may be compared between different discretization levels; (ii) understanding the efficiency of algorithms for probing the posterior distribution, as a function of large N. Here we address these two issues in the context of linear inverse problems subject to additive Gaussian noise within a hierarchical modelling framework based on a Gaussian prior for the unknown field and an inverse-gamma prior for a hyper-parameter, namely the amplitude of the prior variance. The structure of the model is such that the Gibbs sampler can be easily implemented for probing the posterior distribution. Subscribing to the dogma that one should think infinite-dimensionally before implementing in finite dimensions, we present function-space intuition and provide rigorous theory showing that as N increases, the component of the Gibbs sampler for sampling the amplitude of the prior variance becomes increasingly slow. We discuss a reparametrization of the prior variance that is robust with respect to the increase in dimension; we give numerical experiments which show that our reparametrization prevents the slowing down. Our intuition on the behaviour of the prior hyper-parameter, with and without reparametrization, is sufficiently general to include a broad class of nonlinear inverse problems as well as other families of hyper-priors. Comment: to appear, SIAM/ASA Journal on Uncertainty Quantification.
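    A sketch of the centered Gibbs sampler being analysed, for the model y = A u + eta, eta ~ N(0, Gamma), u | delta ~ N(0, delta * C), delta ~ InvGamma(alpha, beta); both conditionals are conjugate, and it is the delta-update below whose mixing degrades as N = dim(u) grows, motivating a reparametrization such as v = u / sqrt(delta):

```python
import numpy as np

def centered_gibbs(y, A, Gamma, C, alpha, beta, n_iter=1000, rng=None):
    """Gibbs sampler alternating u | delta, y (Gaussian) with
    delta | u ~ InvGamma(alpha + N/2, beta + u^T C^{-1} u / 2).
    Dense linear algebra throughout: a sketch, not an efficient implementation."""
    rng = np.random.default_rng() if rng is None else rng
    N = C.shape[0]
    Gi, Ci = np.linalg.inv(Gamma), np.linalg.inv(C)
    delta, us, deltas = 1.0, [], []
    for _ in range(n_iter):
        # u | delta, y: Gaussian with precision A^T Gamma^{-1} A + (delta C)^{-1}
        cov = np.linalg.inv(A.T @ Gi @ A + Ci / delta)
        u = rng.multivariate_normal(cov @ (A.T @ Gi @ y), cov)
        # delta | u: conjugate inverse-gamma update
        shape = alpha + 0.5 * N
        rate = beta + 0.5 * u @ Ci @ u
        delta = rate / rng.gamma(shape)   # InvGamma draw via reciprocal of a Gamma
        us.append(u)
        deltas.append(delta)
    return np.array(us), np.array(deltas)
```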