Semi-described and semi-supervised learning with Gaussian processes
Propagating input uncertainty through non-linear Gaussian process (GP) mappings is intractable. This hinders the task of training GPs using uncertain and partially observed inputs. In this paper we refer to this task as "semi-described learning". We then introduce a GP framework that solves both the semi-described and the semi-supervised learning problems (where missing values occur in the outputs). Auto-regressive state space simulation is also recognised as a special case of semi-described learning. To achieve our goal we develop variational methods for handling semi-described inputs in GPs, and couple them with algorithms that allow for imputing the missing values while treating the uncertainty in a principled, Bayesian manner. Extensive experiments on simulated and real-world data study the problems of iterative forecasting and regression/classification with missing values. The results suggest that the principled propagation of uncertainty stemming from our framework can significantly improve performance in these tasks.
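As a rough illustration of why uncertain inputs matter, the sketch below pushes a Gaussian-distributed input through a fixed non-linear function by Monte Carlo sampling. It is not the variational method of the paper: the sine function merely stands in for a non-linear GP mapping (it is chosen because its expectation under a Gaussian has a closed form to check against), and the names `propagate_mc`, `mu` and `sigma` are assumptions made for this example.

```python
import math
import random

def propagate_mc(f, mu, sigma, n_samples=20000, seed=0):
    """Estimate the mean and variance of f(x) for x ~ N(mu, sigma^2) by sampling."""
    rng = random.Random(seed)
    outputs = [f(rng.gauss(mu, sigma)) for _ in range(n_samples)]
    mean = sum(outputs) / n_samples
    var = sum((o - mean) ** 2 for o in outputs) / (n_samples - 1)
    return mean, var

mu, sigma = 1.0, 0.5

# Output moments when the input uncertainty is propagated through the mapping.
mc_mean, mc_var = propagate_mc(math.sin, mu, sigma)

# Plugging in the input mean ignores the uncertainty entirely.
plug_in = math.sin(mu)

# Closed form for this particular stand-in: E[sin(x)] = sin(mu) * exp(-sigma^2 / 2).
exact_mean = math.sin(mu) * math.exp(-0.5 * sigma ** 2)
```

Here the plug-in value overestimates the true output mean, and it carries no variance at all, which is the kind of information a principled treatment of input uncertainty is meant to retain.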
Decomposing feature-level variation with Covariate Gaussian Process Latent Variable Models
The interpretation of complex high-dimensional data typically requires the
use of dimensionality reduction techniques to extract explanatory
low-dimensional representations. However, in many real-world problems these
representations may not be sufficient to aid interpretation on their own, and
it would be desirable to interpret the model in terms of the original features
themselves. Our goal is to characterise how feature-level variation depends on
latent low-dimensional representations, external covariates, and non-linear
interactions between the two. In this paper, we propose to achieve this through
a structured kernel decomposition in a hybrid Gaussian Process model which we
call the Covariate Gaussian Process Latent Variable Model (c-GPLVM). We
demonstrate the utility of our model on simulated examples and applications in
disease progression modelling from high-dimensional gene expression data in the
presence of additional phenotypes. In each setting we show how the c-GPLVM can
extract low-dimensional structures from high-dimensional data sets whilst
allowing a breakdown of feature-level variability that is not present in other
commonly used dimensionality reduction approaches.
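One way to picture a structured kernel decomposition is as a sum of a latent term, a covariate term, and an interaction term. The sketch below assumes RBF kernels for both parts and a product kernel for the interaction; this additive form and the names `rbf` and `decomposed_kernel` are illustrative assumptions, not necessarily the exact construction used in the c-GPLVM.

```python
import math

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two points given as sequences of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

def decomposed_kernel(z1, c1, z2, c2):
    """Covariance between two samples, split into additive components.

    z1, z2: latent coordinates; c1, c2: external covariates.
    """
    k_latent = rbf(z1, z2)           # variation explained by the latent space alone
    k_covariate = rbf(c1, c2)        # variation explained by the covariate alone
    k_interaction = k_latent * k_covariate  # non-linear interaction of the two
    return {
        "latent": k_latent,
        "covariate": k_covariate,
        "interaction": k_interaction,
        "total": k_latent + k_covariate + k_interaction,
    }

# Two hypothetical samples: 2-D latent coordinates and a 1-D covariate each.
parts = decomposed_kernel([0.0, 1.0], [0.3], [0.5, 1.0], [0.9])
same = decomposed_kernel([0.0, 1.0], [0.3], [0.0, 1.0], [0.3])
```

Because the total covariance is a sum, the contribution of each component to a feature's variation can be inspected separately, which is the kind of feature-level breakdown the abstract describes.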
DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding
The human face exhibits an inherent hierarchy in its representations (i.e.,
holistic facial expressions can be encoded via a set of facial action units
(AUs) and their intensity). Variational (deep) auto-encoders (VAE) have shown
great results in unsupervised extraction of hierarchical latent representations
from large amounts of image data, while being robust to noise and other
undesired artifacts. Potentially, this makes VAEs a suitable approach for
learning facial features for AU intensity estimation. Yet, most existing
VAE-based methods apply classifiers learned separately from the encoded
features. By contrast, the non-parametric (probabilistic) approaches, such as
Gaussian Processes (GPs), typically outperform their parametric counterparts,
but cannot deal easily with large amounts of data. To this end, we propose a
novel VAE semi-parametric modeling framework, named DeepCoder, which combines
the modeling power of parametric (convolutional) and nonparametric (ordinal
GPs) VAEs, for joint learning of (1) latent representations at multiple levels
in a task hierarchy, and (2) classification of multiple ordinal outputs. We
show on benchmark datasets for AU intensity estimation that the proposed
DeepCoder outperforms the state-of-the-art approaches, and related VAEs and
deep learning models. Comment: ICCV 2017 - accepted
Weakly-supervised Multi-output Regression via Correlated Gaussian Processes
Multi-output regression seeks to infer multiple latent functions using data
from multiple groups/sources while accounting for potential between-group
similarities. In this paper, we consider multi-output regression under a
weakly-supervised setting where a subset of data points from multiple groups
are unlabeled. We use dependent Gaussian processes for multiple outputs
constructed by convolutions with shared latent processes. We introduce
hyperpriors for the multinomial probabilities of the unobserved labels and
optimize the hyperparameters, which we show improves estimation. We derive two
variational bounds: (i) a modified variational bound for fast and stable
convergence in model inference, and (ii) a scalable variational bound that is
amenable to stochastic optimization. We use experiments on synthetic and
real-world data to show that the proposed model outperforms state-of-the-art
models with more accurate estimation of multiple latent functions and
unobserved labels.
Continual Multi-task Gaussian Processes
We address the problem of continual learning in multi-task Gaussian process
(GP) models for handling sequential input-output observations. Our approach
extends the existing prior-posterior recursion of online Bayesian inference,
i.e., past posterior discoveries become future prior beliefs, to the infinite
functional space setting of GPs. For scalability, we introduce
variational inference together with a sparse approximation based on inducing
inputs. As a consequence, we obtain tractable continual lower bounds where two
novel Kullback-Leibler (KL) divergences intervene in a natural way. The key
technical property of our method is the recursive reconstruction of conditional
GP priors conditioned on the variational parameters learned so far. To achieve
this goal, we introduce a novel factorization of past variational
distributions, where the predictive GP equation propagates the posterior
uncertainty forward. We then demonstrate that it is possible to derive GP
models over many types of sequential observations, either discrete or
continuous and amenable to stochastic optimization. The continual inference
approach is also applicable to scenarios where potential multi-channel or
heterogeneous observations might appear. Extensive experiments demonstrate that
the method is fully scalable, shows reliable performance, and is robust to
uncertainty error propagation across a range of synthetic and real-world
datasets.
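The prior-posterior recursion mentioned above can be sketched in its simplest parametric form: a conjugate Gaussian model for an unknown mean with known noise variance, where each batch's posterior becomes the prior for the next batch. This is only the finite-dimensional analogue, not the paper's GP construction; the function `gaussian_update` and the data values are made up for illustration.

```python
def gaussian_update(prior_mean, prior_var, batch, noise_var):
    """Conjugate update for an unknown mean observed with known noise variance."""
    n = len(batch)
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(batch) / noise_var)
    return post_mean, post_var

# Hypothetical observations arriving sequentially in three batches.
batches = [[1.2, 0.8], [1.1], [0.9, 1.0, 1.3]]
noise_var = 0.25

# Continual pass: the posterior after each batch is reused as the next prior.
mean, var = 0.0, 10.0
for batch in batches:
    mean, var = gaussian_update(mean, var, batch, noise_var)

# Reference: a single update with all data seen at once.
all_data = [y for batch in batches for y in batch]
batch_mean, batch_var = gaussian_update(0.0, 10.0, all_data, noise_var)
```

In this conjugate case the sequential and all-at-once posteriors coincide exactly, so no past data needs to be stored. In the GP setting described in the abstract, the recursion is instead carried through variational distributions over inducing inputs, and the result is a lower bound rather than an exact posterior.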