Approximate Inference for Nonstationary Heteroscedastic Gaussian Process Regression
This paper presents a novel approach for approximate integration over the
uncertainty of noise and signal variances in Gaussian process (GP) regression.
Our efficient and straightforward approach can also be applied to integration
over input dependent noise variance (heteroscedasticity) and input dependent
signal variance (nonstationarity) by setting independent GP priors for the
noise and signal variances. We use expectation propagation (EP) for inference
and compare results to Markov chain Monte Carlo in two simulated data sets and
three empirical examples. The results show that EP produces results comparable
to MCMC with less computational burden.
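As a rough sketch of the model class described above (not the paper's EP inference), the following snippet places independent GP priors on the log noise variance and log signal variance and draws one nonstationary, heteroscedastic sample; all kernels, lengthscales, and means are illustrative assumptions.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
jitter = 1e-8 * np.eye(len(x))

# Independent GP priors over the log variances (lengthscales assumed):
log_sig2 = rng.multivariate_normal(np.zeros_like(x), rbf(x, x, 3.0, 0.5) + jitter)
log_noise2 = rng.multivariate_normal(-2.0 * np.ones_like(x), rbf(x, x, 3.0, 0.5) + jitter)

# Nonstationary signal covariance: k(x, x') scaled by sigma(x) * sigma(x').
sig = np.exp(0.5 * log_sig2)
K = sig[:, None] * rbf(x, x, 1.0) * sig[None, :]

f = rng.multivariate_normal(np.zeros_like(x), K + jitter)
y = f + rng.normal(0.0, np.exp(0.5 * log_noise2))   # heteroscedastic noise
```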
A Mutually-Dependent Hadamard Kernel for Modelling Latent Variable Couplings
We introduce a novel kernel that models input-dependent couplings across
multiple latent processes. The pairwise joint kernel measures covariance along
inputs and across different latent signals in a mutually-dependent fashion. A
latent correlation Gaussian process (LCGP) model combines these non-stationary
latent components into multiple outputs by an input-dependent mixing matrix.
Probit classification and support for multiple observation sets are derived via
variational Bayesian inference. Results on several datasets indicate that the
LCGP model can recover the correlations between latent signals while
simultaneously achieving state-of-the-art performance. We highlight the latent
covariances with an EEG classification dataset where latent brain processes and
their couplings simultaneously emerge from the model.
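A loose sketch of the kernel idea, assuming a simple rank-one form for the input-dependent coupling (the paper's exact construction and the LCGP mixing matrix are not reproduced here): the joint kernel over (input, latent-index) pairs is the Hadamard product of an input kernel and a latent-coupling kernel, which stays positive semidefinite by the Schur product theorem.

```python
import numpy as np

def rbf(x1, x2, ell=1.0):
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def coupling(i, j, x_row, x_col):
    """rho_{ij}(x, x') = g_i(x) * g_j(x'): a toy input-dependent,
    positive-semidefinite coupling between latent signals 0 and 1."""
    g = lambda idx, s: np.where(idx == 0, np.cos(0.5 * s), np.sin(0.5 * s))
    return g(i, x_row) * g(j, x_col)

def hadamard_kernel(x, q):
    """K[(x_n, q_n), (x_m, q_m)] = k_x(x_n, x_m) * rho_{q_n q_m}(x_n, x_m)."""
    Kx = rbf(x, x)
    rho = coupling(q[:, None], q[None, :], x[:, None], x[None, :])
    return Kx * rho   # Hadamard (elementwise) product of two kernels

x = np.repeat(np.linspace(0.0, 5.0, 20), 2)   # each input observed for ...
q = np.tile(np.array([0, 1]), 20)             # ... two latent signals
K = hadamard_kernel(x, q)                     # (40, 40) joint covariance
```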
Large-scale Heteroscedastic Regression via Gaussian Process
Heteroscedastic regression, which accounts for varying noise levels across
observations, has many applications in fields such as machine learning and
statistics. Here
we focus on the heteroscedastic Gaussian process (HGP) regression which
integrates the latent function and the noise function together in a unified
non-parametric Bayesian framework. Despite its remarkable performance, HGP
suffers from cubic time complexity, which severely limits its application
to big data. To improve the scalability, we first develop a variational sparse
inference algorithm, named VSHGP, to handle large-scale datasets. Furthermore,
two variants are developed to improve the scalability and capability of VSHGP.
The first is stochastic VSHGP (SVSHGP) which derives a factorized evidence
lower bound, thus enabling efficient stochastic variational inference. The
second is distributed VSHGP (DVSHGP) which (i) follows the Bayesian committee
machine formalism to distribute computations over multiple local VSHGP experts
with many inducing points; and (ii) adopts hybrid parameters for experts to
guard against over-fitting and capture local variability. The superiority of
DVSHGP and SVSHGP over existing scalable heteroscedastic/homoscedastic GPs
is then extensively verified on various datasets.
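The Bayesian committee machine (BCM) aggregation underlying DVSHGP can be illustrated in isolation. A minimal sketch, assuming each local expert has already produced a Gaussian predictive (mu_k, var_k) at a test point; the expert values and prior variance below are toy numbers, not trained VSHGP experts.

```python
import numpy as np

def bcm_combine(mus, variances, prior_var):
    """Combine M Gaussian expert predictions (mu_k, var_k) at one test point.

    The combined precision sums the expert precisions and subtracts the
    prior precision (M - 1) times, since each expert already includes it.
    """
    M = len(mus)
    precision = np.sum(1.0 / variances) - (M - 1) / prior_var
    var = 1.0 / precision
    mu = var * np.sum(mus / variances)
    return mu, var

# Three toy experts agreeing around 1.0 with different confidences:
mu, var = bcm_combine(np.array([0.9, 1.1, 1.0]),
                      np.array([0.10, 0.20, 0.15]),
                      prior_var=1.0)
```

Subtracting (M - 1) prior precisions is the hallmark of the BCM: each expert's posterior already contains one copy of the prior, so the extra copies must be removed when the experts are multiplied together.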
Laplace Approximation for Divisive Gaussian Processes for Nonstationary Regression
The standard Gaussian process (GP) regression is usually formulated under stationary hypotheses: the noise power is considered constant throughout the input space, and the covariance of the prior distribution is typically modeled as depending only on the difference between input samples. These assumptions can be too restrictive and unrealistic for many real-world problems. Although nonstationarity can be achieved using specific covariance functions, these require prior knowledge of the kind of nonstationarity, which is not available for most applications. In this paper we propose to use the Laplace approximation for inference in a divisive GP model for nonstationary regression, including heteroscedastic noise cases. The log-concavity of the likelihood ensures a unimodal posterior, so the Laplace approximation converges to a unique maximum. The characteristics of the likelihood also yield posterior approximations that are accurate when compared to expectation propagation (EP) and to the asymptotically exact posterior provided by a Markov chain Monte Carlo implementation with elliptical slice sampling (ESS), but at a reduced computational load with respect to both EP and ESS.
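A minimal sketch of the Laplace step for a GP with a log-concave likelihood (a Poisson likelihood stands in below; the divisive-GP likelihood itself is not reproduced): Newton iterations converge to the unique posterior mode, and the negative Hessian there defines the Gaussian approximation.

```python
import numpy as np

def laplace_mode(K, grad_ll, neg_hess_ll, n_iter=20):
    """Mode of p(f | y) ∝ p(y | f) N(f | 0, K) by Newton's method.

    grad_ll(f)     : gradient of log p(y | f)          -> (N,)
    neg_hess_ll(f) : W, the negative diagonal Hessian  -> (N,)
    Log-concavity guarantees W >= 0 and a unique maximum.
    """
    f = np.zeros(K.shape[0])
    I = np.eye(K.shape[0])
    for _ in range(n_iter):
        W = neg_hess_ll(f)
        sW = np.sqrt(W)
        b = W * f + grad_ll(f)
        # Stable Newton step via (K^{-1} + W)^{-1} b = K(b - sW A^{-1} sW K b):
        A = I + sW[:, None] * K * sW[None, :]
        f = K @ (b - sW * np.linalg.solve(A, sW * (K @ b)))
    W = neg_hess_ll(f)
    cov = np.linalg.inv(np.linalg.inv(K) + np.diag(W))  # Laplace covariance
    return f, cov

# Poisson counts as a stand-in log-concave likelihood:
rng = np.random.default_rng(1)
x = np.linspace(0.0, 4.0, 30)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2) + 1e-6 * np.eye(30)
y = rng.poisson(2.0, size=30).astype(float)
f_hat, cov = laplace_mode(K, lambda f: y - np.exp(f), lambda f: np.exp(f))
```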
Modulating Scalable Gaussian Processes for Expressive Statistical Learning
For a learning task, a Gaussian process (GP) learns the statistical
relationship between inputs and outputs, offering not only the prediction mean
but also the associated variability. The vanilla GP, however, struggles to
learn complicated distributions exhibiting, e.g., heteroscedastic noise,
multi-modality, and non-stationarity from massive data, due to its Gaussian
marginals and cubic complexity. In response, this
article studies new scalable GP paradigms including the non-stationary
heteroscedastic GP, the mixture of GPs and the latent GP, which introduce
additional latent variables to modulate the outputs or inputs in order to learn
richer, non-Gaussian statistical representations. We further resort to different
variational inference strategies to arrive at analytical or tighter evidence
lower bounds (ELBOs) of the marginal likelihood for efficient and effective
model training. Extensive numerical experiments against state-of-the-art GP and
neural network (NN) counterparts on various tasks verify the superiority of
these scalable modulated GPs, especially the scalable latent GP, for learning
diverse data distributions.
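As a toy illustration of how latent modulation produces non-Gaussian predictions (a stand-in, not the article's trained scalable models), a mixture of experts with input-dependent gates yields a multi-modal, heteroscedastic predictive density:

```python
import numpy as np

def mixture_predictive(y_grid, x, experts, gates):
    """p(y | x) = sum_k pi_k(x) N(y | mu_k(x), s2_k(x))."""
    pis = np.array([g(x) for g in gates])
    pis = pis / pis.sum()                  # input-dependent mixing weights
    dens = np.zeros_like(y_grid)
    for pi_k, expert in zip(pis, experts):
        mu, s2 = expert(x)
        dens += pi_k * np.exp(-0.5 * (y_grid - mu) ** 2 / s2) / np.sqrt(2 * np.pi * s2)
    return dens

# Two toy experts: one with input-dependent (heteroscedastic) variance.
experts = [lambda x: (np.sin(x), 0.05 + 0.1 * x ** 2),
           lambda x: (-np.sin(x), 0.05)]
gates = [lambda x: np.exp(-x), lambda x: 1.0 - np.exp(-x)]
density = mixture_predictive(np.linspace(-3, 3, 200), 1.0, experts, gates)
```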