38,003 research outputs found
Experimental Evaluation of Latent Variable Models for Dimensionality Reduction
We use electropalatographic (EPG) data as a test bed for dimensionality reduction methods based on latent variable modelling, in which an underlying lower-dimensional representation is inferred directly from the data. Several models (and mixtures of them) are investigated, including factor analysis and the generative topographic mapping. Experiments indicate that nonlinear latent variable modelling reveals low-dimensional structure in the data that is inaccessible to the linear models investigated.
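As a rough illustration of the linear baseline this abstract mentions, here is a minimal factor-analysis sketch using scikit-learn. The EPG recordings are replaced by a random placeholder array, and the 62-contact frame width is an assumption about the palate hardware, not a detail taken from the paper.

```python
# Minimal factor-analysis sketch: infer a low-dimensional latent
# representation directly from the data, as the linear baseline.
# The array below is a random stand-in for EPG frames; the
# 62-contact frame width is an assumption, not from the paper.
import numpy as np
from sklearn.decomposition import FactorAnalysis

X = np.random.rand(200, 62)          # placeholder: 200 frames, 62 contacts
fa = FactorAnalysis(n_components=2)  # latent dimension Q = 2
Z = fa.fit_transform(X)              # posterior mean of the latent variables
print(Z.shape)                       # (200, 2)
```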
Generalised Gaussian Process Latent Variable Models (GPLVM) with Stochastic Variational Inference
Gaussian process latent variable models (GPLVM) are a flexible and non-linear
approach to dimensionality reduction, extending classical Gaussian processes to
an unsupervised learning context. The Bayesian incarnation of the GPLVM [Titsias
and Lawrence, 2010] uses a variational framework, where the posterior over
latent variables is approximated by a well-behaved variational family, a
factorised Gaussian yielding a tractable lower bound. However, the
non-factorisability of the lower bound prevents truly scalable inference. In
this work, we study the doubly stochastic formulation of the Bayesian GPLVM
model amenable to minibatch training. We show how this framework is
compatible with different latent variable formulations and perform experiments
to compare a suite of models. Further, we demonstrate how we can train in the
presence of massively missing data and obtain high-fidelity reconstructions. We
demonstrate the model's performance by benchmarking against the canonical
sparse GPLVM for high-dimensional data examples.

Comment: AISTATS 202
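To make the doubly stochastic estimator concrete (stochastic over minibatches of the data and, via reparameterisation, over the latent variables), here is a self-contained PyTorch sketch. For brevity it swaps the GP decoder for a linear map, so it shows the structure of the estimator rather than the Bayesian GPLVM itself; all names and sizes are illustrative.

```python
# Doubly stochastic ELBO sketch: stochastic in the data (minibatches)
# and in the latents (single-sample Monte Carlo via reparameterisation).
# The GP decoder is replaced by a linear map for brevity; the paper's
# model would put a sparse GP here instead.
import math
import torch

N, D, Q, B = 1000, 20, 2, 64       # data size, observed dim, latent dim, batch
Y = torch.randn(N, D)              # placeholder data

# Factorised Gaussian posterior q(x_n) = N(mu_n, diag(exp(2 * log_s_n)))
mu = torch.nn.Parameter(0.1 * torch.randn(N, Q))
log_s = torch.nn.Parameter(torch.zeros(N, Q))
W = torch.nn.Parameter(0.1 * torch.randn(Q, D))   # stand-in decoder
log_noise = torch.nn.Parameter(torch.zeros(()))

opt = torch.optim.Adam([mu, log_s, W, log_noise], lr=1e-2)
for step in range(2000):
    idx = torch.randint(0, N, (B,))                     # minibatch indices
    x = mu[idx] + torch.randn(B, Q) * log_s[idx].exp()  # reparameterised draw
    noise = log_noise.exp()                             # observation variance
    # Monte Carlo expected log-likelihood, rescaled from batch to full data
    ell = (N / B) * (-0.5 * (((Y[idx] - x @ W) ** 2).sum(-1) / noise
                             + D * (noise.log() + math.log(2 * math.pi)))).sum()
    # KL(q(x_n) || N(0, I)) over the minibatch, same rescaling
    kl = (N / B) * 0.5 * (mu[idx] ** 2 + (2 * log_s[idx]).exp()
                          - 2 * log_s[idx] - 1).sum()
    loss = kl - ell                                     # negative ELBO estimate
    opt.zero_grad(); loss.backward(); opt.step()
```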
Distributed Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models
Gaussian processes (GPs) are a powerful tool for probabilistic inference over
functions. They have been applied to both regression and non-linear
dimensionality reduction, and offer desirable properties such as uncertainty
estimates, robustness to over-fitting, and principled ways for tuning
hyper-parameters. However, the scalability of these models to big datasets
remains an active topic of research. We introduce a novel re-parametrisation of
variational inference for sparse GP regression and latent variable models that
allows for an efficient distributed algorithm. This is done by exploiting the
decoupling of the data given the inducing points to re-formulate the evidence
lower bound in a Map-Reduce setting. We show that the inference scales well
with data and computational resources, while preserving a balanced distribution
of the load among the nodes. We further demonstrate the utility in scaling
Gaussian processes to big data. We show that GP performance improves with
increasing amounts of data in regression (on flight data with 2 million
records) and latent variable modelling (on MNIST). The results show that GPs
perform better than many common models often used for big data.

Comment: 9 pages, 8 figures
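The decoupling this abstract exploits can be seen in a short numpy sketch: given shared inducing points, each node computes per-chunk sufficient statistics independently (the map step), the statistics are summed (the reduce step), and the collapsed variational bound of Titsias [2009] is assembled from the sums. The RBF kernel, the sizes, and the serial loop standing in for a cluster are all illustrative placeholders.

```python
# Map-reduce sketch of the collapsed sparse-GP evidence lower bound:
# per-chunk statistics are computed independently ("map") and summed
# ("reduce"); only M x M inducing-point algebra remains at the end.
# The serial loop stands in for distributed workers.
import numpy as np

def rbf(A, B, var=1.0, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / ls**2)

def map_stats(Xc, yc, Z):
    Kzx = rbf(Z, Xc)
    return (Kzx @ Kzx.T,      # sum_n k(Z, x_n) k(x_n, Z)
            Kzx @ yc,         # sum_n k(Z, x_n) y_n
            yc @ yc,          # sum_n y_n^2
            float(len(yc)))   # sum_n k(x_n, x_n) for a unit-variance RBF

N, M = 500, 20
X, Z = np.random.randn(N, 1), np.random.randn(M, 1)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(N)
sigma2 = 0.1

# "map" over data chunks, then "reduce" by element-wise summation
stats = [map_stats(X[i:i + 100], y[i:i + 100], Z) for i in range(0, N, 100)]
A = sum(s[0] for s in stats); b = sum(s[1] for s in stats)
yy = sum(s[2] for s in stats); trK = sum(s[3] for s in stats)

# Collapsed bound (Titsias, 2009) assembled from the reduced statistics
Kzz = rbf(Z, Z) + 1e-6 * np.eye(M)
Bm = Kzz + A / sigma2
_, logdetB = np.linalg.slogdet(Bm)
_, logdetKzz = np.linalg.slogdet(Kzz)
quad = b @ np.linalg.solve(Bm, b) / sigma2**2
elbo = (-0.5 * N * np.log(2 * np.pi * sigma2)
        - 0.5 * (logdetB - logdetKzz)
        - 0.5 * yy / sigma2 + 0.5 * quad
        - 0.5 * (trK - np.trace(np.linalg.solve(Kzz, A))) / sigma2)
print(elbo)
```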