Generative Models For Deep Learning with Very Scarce Data
This paper addresses a data-scarcity scenario in which deep learning
techniques tend to fail. We compare two well-established techniques,
Restricted Boltzmann Machines (RBMs) and Variational Auto-encoders (VAEs), as
generative models used to enlarge the training set in a classification
framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms
for generating new samples. We show that this methodology improves
generalization compared with other state-of-the-art techniques, e.g.
semi-supervised learning with ladder networks. Furthermore, we show that the
RBM is better than the VAE at generating new samples for training a classifier
with good generalization capabilities.
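As a rough illustration of the MCMC step mentioned in this abstract, the sketch below runs block Gibbs sampling in a small binary RBM, starting each chain from a training example. It is not the paper's implementation: the weights are random rather than trained, and every name and size (`gibbs_chain`, `n_vis`, `n_hid`, the number of steps) is an assumption chosen for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, b_hid, rng):
    # p(h_j = 1 | v) = sigmoid(b_hid[j] + sum_i v[i] * W[i][j])
    h = []
    for j in range(len(b_hid)):
        activation = b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        h.append(1 if rng.random() < sigmoid(activation) else 0)
    return h


def sample_visible(h, W, b_vis, rng):
    # p(v_i = 1 | h) = sigmoid(b_vis[i] + sum_j W[i][j] * h[j])
    v = []
    for i in range(len(b_vis)):
        activation = b_vis[i] + sum(W[i][j] * h[j] for j in range(len(h)))
        v.append(1 if rng.random() < sigmoid(activation) else 0)
    return v


def gibbs_chain(v0, W, b_vis, b_hid, n_steps, rng):
    # Alternate between sampling hidden and visible units for n_steps rounds.
    v = list(v0)
    for _ in range(n_steps):
        h = sample_hidden(v, W, b_hid, rng)
        v = sample_visible(h, W, b_vis, rng)
    return v


rng = random.Random(0)
n_vis, n_hid = 6, 3  # hypothetical sizes
# Assumption: random weights stand in for a trained RBM.
W = [[rng.gauss(0.0, 0.1) for _ in range(n_hid)] for _ in range(n_vis)]
b_vis = [0.0] * n_vis
b_hid = [0.0] * n_hid

training_example = [1, 0, 1, 1, 0, 0]
synthetic = [gibbs_chain(training_example, W, b_vis, b_hid, 5, rng)
             for _ in range(4)]
```

With a trained RBM, samples such as `synthetic` could be appended to the scarce training set before fitting the classifier. How many Gibbs steps to run per sample is a tuning choice that this sketch does not address.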
Learning recurrent representations for hierarchical behavior modeling
We propose a framework for detecting action patterns from motion sequences
and modeling the sensory-motor relationship of animals, using a generative
recurrent neural network. The network has a discriminative part (classifying
actions) and a generative part (predicting motion), whose recurrent cells are
laterally connected, allowing higher levels of the network to represent
high-level phenomena. We test our framework on two types of data, fruit fly behavior
and online handwriting. Our results show that 1) taking advantage of unlabeled
sequences, by predicting future motion, significantly improves action detection
performance when training labels are scarce, 2) the network learns to represent
high-level phenomena such as writer identity and fly gender, without
supervision, and 3) simulated motion trajectories, generated by treating motion
prediction as input to the network, look realistic and may be used to
qualitatively evaluate whether the model has learnt generative control rules.
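The third result above, simulating trajectories by feeding each motion prediction back in as the next input, can be sketched as a closed-loop rollout. The cell below is a toy scalar stand-in, not the network from the paper. `toy_step`, its coefficients, and `simulate` are hypothetical names introduced only for illustration.

```python
import math


def toy_step(hidden, motion):
    """Hypothetical stand-in for one step of a trained recurrent cell.

    Returns the updated hidden state and a prediction of the next motion.
    """
    new_hidden = math.tanh(0.5 * hidden + 0.8 * motion)
    predicted_motion = 0.9 * new_hidden
    return new_hidden, predicted_motion


def simulate(step_fn, first_motion, n_steps, hidden=0.0):
    # Closed loop: each prediction becomes the input of the next step.
    trajectory = [first_motion]
    motion = first_motion
    for _ in range(n_steps):
        hidden, motion = step_fn(hidden, motion)
        trajectory.append(motion)
    return trajectory


trajectory = simulate(toy_step, first_motion=1.0, n_steps=5)
```

In a real setting `step_fn` would wrap the trained generative part of the network, and the simulated `trajectory` would be inspected visually, as the abstract suggests.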
Deep Gaussian Processes
In this paper we introduce deep Gaussian process (GP) models. A deep GP is a
deep belief network based on Gaussian process mappings. The data is modeled as
the output of a multivariate GP. The inputs to that Gaussian process are then
governed by another GP. A single layer model is equivalent to a standard GP or
the GP latent variable model (GP-LVM). We perform inference in the model by
approximate variational marginalization. This results in a strict lower bound
on the marginal likelihood of the model which we use for model selection
(number of layers and nodes per layer). Deep belief networks are typically
applied to relatively large data sets using stochastic gradient descent for
optimization. Our fully Bayesian treatment allows for the application of deep
models even when data is scarce. Model selection by our variational bound shows
that a five layer hierarchy is justified even when modelling a digit data set
containing only 150 examples.
Comment: 9 pages, 8 figures. Appearing in Proceedings of the 16th
International Conference on Artificial Intelligence and Statistics (AISTATS)
201
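The layered construction described in this abstract, where the inputs to one GP are themselves the output of another GP, can be illustrated by drawing a sample from a two-layer prior. This is only a sketch under strong assumptions: one-dimensional layers, a zero-mean GP with an RBF kernel, and no variational inference at all. The helper names (`rbf_kernel`, `cholesky`, `sample_gp`) and all hyperparameters are hypothetical.

```python
import math
import random


def rbf_kernel(xs, lengthscale=1.0, jitter=1e-6):
    # Squared-exponential covariance matrix; jitter on the diagonal keeps it
    # numerically positive definite.
    n = len(xs)
    return [[math.exp(-0.5 * ((xs[i] - xs[j]) / lengthscale) ** 2)
             + (jitter if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def cholesky(A):
    # Lower-triangular L with L * L^T = A (Cholesky-Banachiewicz).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def sample_gp(inputs, rng, lengthscale=1.0):
    # Draw f ~ N(0, K(inputs, inputs)) as f = L z with z standard normal.
    L = cholesky(rbf_kernel(inputs, lengthscale))
    z = [rng.gauss(0.0, 1.0) for _ in inputs]
    return [sum(L[i][k] * z[k] for k in range(i + 1))
            for i in range(len(inputs))]


rng = random.Random(0)
x = [-2.0 + 0.5 * i for i in range(9)]  # top-level inputs
hidden = sample_gp(x, rng)              # first GP layer
output = sample_gp(hidden, rng)         # second GP takes the first as input
```

Dropping the second call to `sample_gp` leaves an ordinary single-layer GP sample, which matches the abstract's remark that a single-layer model reduces to a standard GP.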