Deep Gaussian Processes
In this paper we introduce deep Gaussian process (GP) models. Deep GPs are a
deep belief network based on Gaussian process mappings. The data is modeled as
the output of a multivariate GP. The inputs to that Gaussian process are then
governed by another GP. A single layer model is equivalent to a standard GP or
the GP latent variable model (GP-LVM). We perform inference in the model by
approximate variational marginalization. This results in a strict lower bound
on the marginal likelihood of the model which we use for model selection
(number of layers and nodes per layer). Deep belief networks are typically
applied to relatively large data sets using stochastic gradient descent for
optimization. Our fully Bayesian treatment allows for the application of deep
models even when data is scarce. Model selection by our variational bound shows
that a five layer hierarchy is justified even when modelling a digit data set
containing only 150 examples.
Comment: 9 pages, 8 figures. Appearing in Proceedings of the 16th
International Conference on Artificial Intelligence and Statistics (AISTATS)
201
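The layered construction described in this abstract (a GP whose inputs are themselves the output of another GP) can be illustrated by drawing a sample from the prior. The following is a minimal sketch, not the paper's implementation: the RBF kernel, the jitter value, the layer widths and the function names (`rbf_kernel`, `sample_gp_layer`, `sample_deep_gp`) are all assumptions chosen for illustration, and the variational inference the abstract refers to is not shown.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix for the rows of X (assumed kernel choice)."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)

def sample_gp_layer(X, out_dim, rng, jitter=1e-6):
    """Draw out_dim independent zero-mean GP functions evaluated at the rows of X."""
    n = X.shape[0]
    K = rbf_kernel(X) + jitter * np.eye(n)  # jitter keeps the Cholesky stable
    L = np.linalg.cholesky(K)
    return L @ rng.standard_normal((n, out_dim))

def sample_deep_gp(X, layer_dims, rng):
    """Compose GP layers: the output of each layer is the input of the next."""
    H = X
    for dim in layer_dims:
        H = sample_gp_layer(H, dim, rng)
    return H

rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 50)[:, None]   # top-level inputs
Y = sample_deep_gp(X, [2, 2, 1], rng)     # two hidden layers, one output layer
```

With `layer_dims=[1]` the sketch reduces to a draw from a standard GP prior, which mirrors the abstract's remark that a single-layer model is equivalent to a standard GP.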
Manifold Relevance Determination
In this paper we present a fully Bayesian latent variable model which
exploits conditional nonlinear (in)dependence structures to learn an efficient
latent representation. The latent space is factorized to represent shared and
private information from multiple views of the data. In contrast to previous
approaches, we introduce a relaxation to the discrete segmentation and allow
for a "softly" shared latent space. Further, Bayesian techniques allow us to
automatically estimate the dimensionality of the latent spaces. The model is
capable of capturing structure underlying extremely high dimensional spaces.
This is illustrated by modelling unprocessed images with tens of thousands of
pixels. This also allows us to directly generate novel images from the trained
model by sampling from the discovered latent spaces. We also demonstrate the
model by prediction of human pose in an ambiguous setting. Our Bayesian
framework allows us to perform disambiguation in a principled manner by
including latent space priors which incorporate the dynamic nature of the data.
Comment: ICML201
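One way to picture a "softly" shared latent space is to give each view its own vector of relevance weights over the latent dimensions, and to read off shared and private dimensions from which weights are non-negligible. The sketch below assumes that such per-view weights are already available; the weights, the threshold and the function name `soft_partition` are hypothetical, and the thresholding is only an after-the-fact summary, not part of the model described in the abstract.

```python
import numpy as np

def soft_partition(relevance, threshold=0.05):
    """Summarise per-view relevance weights of shape (n_views, n_latent_dims).

    A dimension counts as active for a view when its weight, normalised by the
    largest weight of that view, exceeds the (assumed) threshold.
    """
    w = np.asarray(relevance, dtype=float)
    normalised = w / w.max(axis=1, keepdims=True)
    active = normalised > threshold
    n_active = active.sum(axis=0)
    shared = np.flatnonzero(n_active >= 2)      # used by two or more views
    private = {v: np.flatnonzero(active[v] & (n_active == 1))
               for v in range(w.shape[0])}      # used by exactly one view
    unused = np.flatnonzero(n_active == 0)      # switched off by every view
    return shared, private, unused

# Hypothetical weights for two views over four latent dimensions.
weights = np.array([[0.9, 0.5, 0.001, 0.0],
                    [0.8, 0.002, 0.6, 0.0]])
shared, private, unused = soft_partition(weights)
```

In this toy example dimension 0 is shared, dimensions 1 and 2 are private to the first and second view respectively, and dimension 3 is unused, which is how the effective dimensionality could be read off.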
Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa
This paper presents a generic Bayesian framework that enables any deep
learning model to actively learn from targeted crowds. Our framework inherits
from recent advances in Bayesian deep learning, and extends existing work by
considering the targeted crowdsourcing approach, where multiple annotators with
unknown expertise contribute an uncontrolled (and often limited) number of
annotations. Our framework leverages the low-rank structure in annotations to
learn individual annotator expertise, which then helps to infer the true labels
from noisy and sparse annotations. It provides a unified Bayesian model to
simultaneously infer the true labels and train the deep learning model in order
to reach an optimal learning efficacy. Finally, our framework exploits the
uncertainty of the deep learning model during prediction as well as the
annotators' estimated expertise to minimize the number of required annotations
and annotators for optimally training the deep learning model.
We evaluate the effectiveness of our framework for intent classification in
Alexa (Amazon's personal assistant), using both synthetic and real-world
datasets. Experiments show that our framework can accurately learn annotator
expertise, infer true labels, and effectively reduce the number of annotations
in model training as compared to state-of-the-art approaches. We further
discuss the potential of our proposed framework in bridging machine learning
and crowdsourcing towards improved human-in-the-loop systems.
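The idea of using the model's predictive uncertainty to decide which examples to send for annotation can be sketched with a simple acquisition rule. The abstract does not state which rule the framework uses, so predictive entropy is an assumption here, as are the function names `predictive_entropy` and `select_queries` and the toy probabilities; annotator expertise is not modelled in this sketch.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of each row of class probabilities (higher means more uncertain)."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)

def select_queries(probs, budget):
    """Indices of the `budget` most uncertain examples, most uncertain first."""
    scores = predictive_entropy(probs)
    return np.argsort(-scores)[:budget]

# Hypothetical intent-classifier outputs for three unlabeled utterances.
probs = np.array([[0.98, 0.01, 0.01],
                  [0.40, 0.35, 0.25],
                  [0.70, 0.20, 0.10]])
picked = select_queries(probs, budget=2)
```

Here the second and third utterances would be queried first, since the classifier is already confident about the first one.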
Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis
Factor analysis aims to determine latent factors, or traits, which summarize
a given data set. Inter-battery factor analysis extends this notion to multiple
views of the data. In this paper we show how a nonlinear, nonparametric version
of these models can be recovered through the Gaussian process latent variable
model. This gives us a flexible formalism for multi-view learning where the
latent variables can be used both for exploratory purposes and for learning
representations that enable efficient inference for ambiguous estimation tasks.
Learning is performed in a Bayesian manner through the formulation of a
variational compression scheme which gives a rigorous lower bound on the log
likelihood. Our Bayesian framework provides strong regularization during
training, allowing the structure of the latent space to be determined
efficiently and automatically. We demonstrate this by producing the first (to
our knowledge) published results of learning from dozens of views, even when
data is scarce. We further show experimental results on several different types
of multi-view data sets and for different kinds of tasks, including exploratory
data analysis, generation, ambiguity modelling through latent priors and
classification.
Comment: 49 pages including appendi
Factorized Topic Models
In this paper we present a modification to a latent topic model, which makes
the model exploit supervision to produce a factorized representation of the
observed data. The structured parameterization separates variance that
is shared between classes from variance that is private to each class by the
introduction of a new prior over the topic space. The approach allows for
more efficient inference and provides an intuitive interpretation of the data
in terms of an informative signal together with structured noise. The
factorized representation is shown to enhance inference performance for image,
text, and video classification.
Comment: ICLR 201