Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net
We propose a novel method to directly learn a stochastic transition operator
whose repeated application provides generated samples. Traditional undirected
graphical models approach this problem indirectly by learning a Markov chain
model whose stationary distribution obeys detailed balance with respect to a
parameterized energy function. The energy function is then modified so the
model and data distributions match, with no guarantee on the number of steps
required for the Markov chain to converge. Moreover, the detailed balance
condition is highly restrictive: energy based models corresponding to neural
networks must have symmetric weights, unlike biological neural circuits. In
contrast, we develop a method for directly learning arbitrarily parameterized
transition operators capable of expressing non-equilibrium stationary
distributions that violate detailed balance, thereby enabling us to learn more
biologically plausible asymmetric neural networks and more general non-energy
based dynamical systems. The proposed training objective, which we derive via
principled variational methods, encourages the transition operator to "walk
back" multi-step trajectories that start at data points, returning as quickly as
possible to the original data points. We present a series of experimental
results illustrating the soundness of the proposed approach, Variational
Walkback (VW), on the MNIST, CIFAR-10, SVHN and CelebA datasets, demonstrating
superior samples compared to earlier attempts to learn a transition operator.
We also show that although each rapid training trajectory is limited to a
finite but variable number of steps, our transition operator continues to
generate good samples well past the length of such trajectories, thereby
demonstrating the match of its non-equilibrium stationary distribution to the
data distribution.
Source Code: http://github.com/anirudh9119/walkback_nips17
Comment: To appear at NIPS 2017
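The core idea of sampling by repeated application of a learned stochastic transition operator can be illustrated with a toy sketch. Here a linear drift toward a single data mode at 0 stands in for the learned network; the `transition` function, its `step` and `noise_std` parameters, and the one-dimensional setting are all illustrative assumptions, not the paper's actual operator:

```python
import numpy as np

rng = np.random.default_rng(0)

def transition(x, step=0.5, noise_std=0.1):
    """Toy stochastic transition operator: x' = x + drift(x) + noise.

    A linear drift toward the data mode at 0 stands in for a learned
    (possibly asymmetric, non-energy-based) neural network.
    """
    return x - step * x + noise_std * rng.standard_normal(x.shape)

# Generate samples by repeated application, starting from pure noise.
x = rng.standard_normal(1000)
for _ in range(50):
    x = transition(x)

# The chain's stationary distribution concentrates near the data mode,
# and it keeps producing good samples however long the chain runs.
print(x.mean(), x.std())
```

Note that nothing in this chain enforces detailed balance: the drift is an arbitrary function of the state, which is precisely the extra freedom the paper argues for over energy-based models with symmetric weights.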
Learning to sample from noise with deep generative models
Machine learning, and specifically deep learning, has made significant breakthroughs in recent
years on a wide variety of tasks. One well-known application of deep learning is computer vision, where tasks such as detection and classification are now considered nearly solved by the community.
However, training state-of-the-art models for such tasks requires labels associated
with the data we want to classify. A more general goal is, similarly to animal brains, to
design algorithms that can extract meaningful features from unlabeled data.
Unsupervised learning is one of the research directions that tries to solve this problem.
In this thesis, I present a new way to train a neural network as a generative model capable of
generating quality samples (a task akin to imagining). I explain how, by starting from noise,
it is possible to obtain samples that are close to the training data. This iterative procedure
is called Infusion training and is a novel approach to learning the transition operator of a
generative Markov chain.
In the first chapter, I present some background on machine learning and probabilistic
models. The second chapter presents the generative models that inspired this work. The third
and last chapter presents and investigates our novel approach to learning a generative model
with Infusion training.
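The shape of such an infusion trajectory can be sketched in a few lines. The linear mixing rule and the coefficient `alpha` below are illustrative assumptions, not the thesis's exact procedure; the point is only that each step of the chain, starting from noise, mixes in a little of the target sample so the trajectory drifts toward the data:

```python
import numpy as np

rng = np.random.default_rng(0)

def infusion_chain(target, steps=20, alpha=0.1, noise_std=0.05):
    """Illustrative infusion trajectory: start from pure noise and, at
    each step, infuse a small fraction alpha of the target sample.

    The resulting noisy states can serve as training inputs for a
    transition operator that must denoise them toward the data.
    """
    x = rng.standard_normal(target.shape)  # start from pure noise
    trajectory = [x]
    for _ in range(steps):
        x = (1 - alpha) * x + alpha * target \
            + noise_std * rng.standard_normal(target.shape)
        trajectory.append(x)
    return trajectory

target = np.ones(4)       # hypothetical training example
traj = infusion_chain(target)

# Later states lie closer to the target than the initial noise did.
print(np.linalg.norm(traj[0] - target), np.linalg.norm(traj[-1] - target))
```

The design choice here is that the infusion rate controls how strongly trajectories are biased toward the data: a larger `alpha` reaches the target faster but gives the operator easier, less diverse training states.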