Pseudo-Marginal Bayesian Inference for Gaussian Processes
The main challenges that arise when adopting Gaussian Process priors in
probabilistic modeling are how to carry out exact Bayesian inference and how to
account for uncertainty on model parameters when making model-based predictions
on out-of-sample data. Using probit regression as an illustrative working
example, this paper presents a general and effective methodology based on the
pseudo-marginal approach to Markov chain Monte Carlo that efficiently addresses
both of these issues. The results presented in this paper show improvements
over existing sampling methods to simulate from the posterior distribution over
the parameters defining the covariance function of the Gaussian Process prior.
This is particularly important as it offers a powerful tool to carry out full
Bayesian inference of Gaussian Process based hierarchic statistical models in
general. The results also demonstrate that Monte Carlo based integration of all
model parameters is actually feasible in this class of models providing a
superior quantification of uncertainty in predictions. Extensive comparisons
with respect to state-of-the-art probabilistic classifiers confirm this
assertion. Comment: 14 pages, double column
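The pseudo-marginal idea named in the abstract above can be illustrated with a minimal sketch. It is not the paper's probit-regression sampler: the toy latent-variable model, the importance-sampling estimator, and the names `pseudo_marginal_mh`, `make_toy_estimator` and `log_prior` are assumptions chosen for illustration. The one essential ingredient is that the intractable marginal likelihood is replaced by a non-negative unbiased estimate, and that the estimate attached to the current state is reused rather than recomputed.

```python
import numpy as np


def pseudo_marginal_mh(log_prior, lik_estimate, theta0, n_iter, step, rng):
    """Random-walk pseudo-marginal Metropolis-Hastings (illustrative sketch).

    lik_estimate(theta, rng) must return a non-negative, unbiased estimate
    of the marginal likelihood p(y | theta).
    """
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    # The estimate for the current state is stored and reused, never
    # refreshed; this keeps the exact posterior as the invariant law.
    log_post = log_prior(theta) + np.log(lik_estimate(theta, rng))
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)
        log_post_prop = log_prior(prop) + np.log(lik_estimate(prop, rng))
        # Symmetric proposal, so the ratio involves only the noisy targets.
        if np.log(rng.uniform()) < log_post_prop - log_post:
            theta, log_post = prop, log_post_prop
        chain[i] = theta
    return chain


def make_toy_estimator(y, n_samples):
    """Toy model (assumption): z ~ N(theta, 1), y | z ~ N(z, 1)."""
    def lik_estimate(theta, rng):
        # Draw latent z from its prior and average p(y | z): an unbiased
        # Monte Carlo estimate of p(y | theta) = N(y; theta, 2).
        z = theta[0] + rng.standard_normal(n_samples)
        return np.mean(np.exp(-0.5 * (y - z) ** 2) / np.sqrt(2.0 * np.pi))
    return lik_estimate


def log_prior(theta):
    # theta ~ N(0, 10^2), up to an additive constant.
    return -0.5 * (theta[0] / 10.0) ** 2
```

For this toy model the exact posterior is Gaussian, so a run such as `pseudo_marginal_mh(log_prior, make_toy_estimator(1.0, 20), 0.0, 20000, 2.0, np.random.default_rng(0))` can be checked against the closed form.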
Looking Good With Flickr Faves: Gaussian Processes for Finding Difference Makers in Personality Impressions
Flickr allows its users to generate galleries of "faves", i.e., pictures that they have tagged as favourite. According to recent studies, the faves are predictive of the personality traits that people attribute to Flickr users. This article investigates the phenomenon and shows that faves allow one to predict whether a Flickr user is perceived to be above median or not with respect to each of the Big-Five Traits (accuracy up to 79% depending on the trait). The classifier, based on Gaussian Processes with a new kernel designed for this work, allows one to identify the visual characteristics of faves that better account for the prediction outcome.
Optimal scaling for the pseudo-marginal random walk Metropolis: insensitivity to the noise generating mechanism
We examine the optimal scaling and the efficiency of the pseudo-marginal
random walk Metropolis algorithm using a recently-derived result on the
limiting efficiency as the dimension tends to infinity. We prove that the
optimal scaling for a given target varies by less than across a wide
range of distributions for the noise in the estimate of the target, and that
any scaling that is within of the optimal one will be at least
efficient. We demonstrate that this phenomenon occurs even outside the range of
distributions for which we rigorously prove it. We then conduct a simulation
study on an example with where importance sampling is used to estimate
the target density; we also examine results available from an existing
simulation study with and where a particle filter was used. Our key
conclusions are found to hold in these examples also. Comment: New version: simulation study now on a real statistical example
(confusing typos corrected)
Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)
In applications of Gaussian processes where quantification of uncertainty is
of primary interest, it is necessary to accurately characterize the posterior
distribution over covariance parameters. This paper proposes an adaptation of
the Stochastic Gradient Langevin Dynamics algorithm to draw samples from the
posterior distribution over covariance parameters with negligible bias and
without the need to compute the marginal likelihood. In Gaussian process
regression, this has the enormous advantage that stochastic gradients can be
computed by solving linear systems only. A novel unbiased linear systems solver
based on parallelizable covariance matrix-vector products is developed to
accelerate the unbiased estimation of gradients. The results demonstrate the
possibility to enable scalable and exact (in a Monte Carlo sense)
quantification of uncertainty in Gaussian processes without imposing any
special structure on the covariance or reducing the number of input vectors. Comment: 10 pages - paper accepted at ICML 201
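The remark above that stochastic gradients can be computed by solving linear systems only can be sketched as follows. This is an illustration under stated assumptions, not the ULISSE solver: a squared-exponential kernel with a single lengthscale is assumed, the trace term of the gradient is replaced by a Hutchinson estimate with Rademacher probe vectors, and `np.linalg.solve` stands in for whatever iterative solver would be used at scale. All function names are hypothetical.

```python
import numpy as np


def rbf_kernel(X, lengthscale, noise):
    """Return K = exp(-|x - x'|^2 / (2 l^2)) + noise * I and dK/dl."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    E = np.exp(-0.5 * sq / lengthscale**2)
    K = E + noise * np.eye(len(X))
    dK = E * sq / lengthscale**3
    return K, dK


def log_marginal_likelihood(K, y):
    """log N(y; 0, K) up to an additive constant."""
    return -0.5 * y @ np.linalg.solve(K, y) - 0.5 * np.linalg.slogdet(K)[1]


def exact_grad(K, dK, y):
    """d/dl log N(y; 0, K) = 0.5 a^T dK a - 0.5 tr(K^{-1} dK), a = K^{-1} y."""
    alpha = np.linalg.solve(K, y)
    return 0.5 * alpha @ dK @ alpha - 0.5 * np.trace(np.linalg.solve(K, dK))


def stochastic_grad(K, dK, y, n_probes, rng):
    """Unbiased gradient estimate that needs linear solves only."""
    alpha = np.linalg.solve(K, y)
    # Rademacher probes r satisfy E[r^T M r] = tr(M) for any square M.
    R = rng.choice([-1.0, 1.0], size=(len(y), n_probes))
    trace_est = np.mean(np.sum(R * np.linalg.solve(K, dK @ R), axis=0))
    return 0.5 * alpha @ dK @ alpha - 0.5 * trace_est
```

Averaging `stochastic_grad` over many probes should approach `exact_grad`; with a handful of probes it gives the kind of noisy but unbiased gradient that a Langevin-type update can consume.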
Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families
We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive
MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities
where classical HMC is not an option due to intractable gradients, KMC
adaptively learns the target's gradient structure by fitting an exponential
family model in a Reproducing Kernel Hilbert Space. Computational costs are
reduced by two novel efficient approximations to this gradient. While being
asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and
offers substantial mixing improvements over state-of-the-art gradient free
samplers. We support our claims with experimental studies on both toy and
real-world applications, including Approximate Bayesian Computation and
exact-approximate MCMC. Comment: 20 pages, 7 figures
Physiological Gaussian Process Priors for the Hemodynamics in fMRI Analysis
Background: Inference from fMRI data faces the challenge that the hemodynamic
system that relates neural activity to the observed BOLD fMRI signal is
unknown.
New Method: We propose a new Bayesian model for task fMRI data with the
following features: (i) joint estimation of brain activity and the underlying
hemodynamics, (ii) the hemodynamics is modeled nonparametrically with a
Gaussian process (GP) prior guided by physiological information and (iii) the
predicted BOLD is not necessarily generated by a linear time-invariant (LTI)
system. We place a GP prior directly on the predicted BOLD response, rather
than on the hemodynamic response function as in previous literature. This
allows us to incorporate physiological information via the GP prior mean in a
flexible way, and simultaneously gives us the nonparametric flexibility of the
GP.
Results: Results on simulated data show that the proposed model is able to
discriminate between active and non-active voxels even when the GP prior
deviates from the true hemodynamics. Our model finds time varying dynamics when
applied to real fMRI data.
Comparison with Existing Method(s): The proposed model is better at detecting
activity in simulated data than standard models, without inflating the false
positive rate. When applied to real fMRI data, our GP model in several cases
finds brain activity where previously proposed LTI models do not.
Conclusions: We have proposed a new non-linear model for the hemodynamics in
task fMRI, that is able to detect active voxels, and gives the opportunity to
ask new kinds of questions related to hemodynamics. Comment: 18 pages, 14 figures
Analysis of the Gibbs sampler for hierarchical inverse problems
Many inverse problems arising in applications come from continuum models
where the unknown parameter is a field. In practice the unknown field is
discretized, resulting in a finite-dimensional problem, with an understanding
that refining the discretization, that is increasing the dimension, will often be
desirable. In the context of Bayesian inversion this situation suggests the
importance of two issues: (i) defining hyper-parameters in such a way that they
are interpretable in the continuum limit and so that their
values may be compared between different discretization levels; (ii)
understanding the efficiency of algorithms for probing the posterior
distribution, as a function of the dimension when it is large. Here we address these two issues in
the context of linear inverse problems subject to additive Gaussian noise
within a hierarchical modelling framework based on a Gaussian prior for the
unknown field and an inverse-gamma prior for a hyper-parameter, namely the
amplitude of the prior variance. The structure of the model is such that the
Gibbs sampler can be easily implemented for probing the posterior distribution.
Subscribing to the dogma that one should think infinite-dimensionally before
implementing in finite dimensions, we present function space intuition and
provide rigorous theory showing that as the dimension increases, the component of the
Gibbs sampler for sampling the amplitude of the prior variance becomes
increasingly slow. We discuss a reparametrization of the prior variance that
is robust with respect to the increase in dimension; we give numerical
experiments which exhibit that our reparametrization prevents the slowing down.
Our intuition on the behaviour of the prior hyper-parameter, with and without
reparametrization, is sufficiently general to include a broad class of
nonlinear inverse problems as well as other families of hyper-priors. Comment: to appear, SIAM/ASA Journal on Uncertainty Quantification
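The hierarchical model described in the abstract above (linear forward map, additive Gaussian noise, Gaussian prior on the field, inverse-gamma prior on the prior-variance amplitude) admits a two-block Gibbs sampler, sketched below in its plain, non-reparametrized form. The notation is an assumption made for this sketch: `A` is the forward map, `C0_inv` the inverse of the base prior covariance, `theta` the amplitude with prior InvGamma(`a`, `b`), and `noise_var` a known noise variance. It does not implement the reparametrization the paper proposes.

```python
import numpy as np


def gibbs_hierarchical(A, y, C0_inv, noise_var, a, b, n_iter, rng):
    """Gibbs sampler for y = A u + noise, u | theta ~ N(0, theta * C0),
    theta ~ InvGamma(a, b). Returns samples of u and theta."""
    N = A.shape[1]
    theta = 1.0  # arbitrary starting value
    us = np.empty((n_iter, N))
    thetas = np.empty(n_iter)
    AtA = A.T @ A / noise_var
    Aty = A.T @ y / noise_var
    for i in range(n_iter):
        # u | theta, y is Gaussian with precision AtA + C0_inv / theta.
        prec = AtA + C0_inv / theta
        L = np.linalg.cholesky(prec)  # prec = L L^T
        mean = np.linalg.solve(prec, Aty)
        # L^{-T} z has covariance prec^{-1} when z is standard normal.
        u = mean + np.linalg.solve(L.T, rng.standard_normal(N))
        # theta | u ~ InvGamma(a + N/2, b + u^T C0_inv u / 2) by conjugacy;
        # sampled as the reciprocal of a Gamma draw with scale 1 / rate.
        rate = b + 0.5 * u @ C0_inv @ u
        theta = 1.0 / rng.gamma(a + 0.5 * N, 1.0 / rate)
        us[i], thetas[i] = u, theta
    return us, thetas
```

The shape parameter of the `theta` update grows like N/2, so as the discretization is refined each conditional draw of `theta` is pinned ever more tightly to the current `u`, which is consistent with the slowdown the abstract reports for this component.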