Learning Hierarchical Features from Generative Models
Deep neural networks have been shown to be very successful at learning
feature hierarchies in supervised learning tasks. Generative models, on the
other hand, have benefited less from hierarchical models with multiple layers
of latent variables. In this paper, we prove that hierarchical latent variable
models do not take advantage of the hierarchical structure when trained with
existing variational methods, and provide some limitations on the kind of
features existing models can learn. Finally, we propose an alternative
architecture that does not suffer from these limitations. Our model is able to
learn highly interpretable and disentangled hierarchical features on several
natural image datasets with no task-specific regularization or prior knowledge.
Comment: ICML'201
Deconstructing the Ladder Network Architecture
The manual labeling of data is and will remain a costly endeavor. For this
reason, semi-supervised learning remains a topic of practical importance. The
recently proposed Ladder Network is one such approach that has proven to be
very successful. In addition to the supervised objective, the Ladder Network
also adds an unsupervised objective corresponding to the reconstruction costs
of a stack of denoising autoencoders. Although the empirical results are
impressive, the Ladder Network has many components intertwined, whose
contributions are not obvious in such a complex architecture. In order to help
elucidate and disentangle the different ingredients in the Ladder Network
recipe, this paper presents an extensive experimental investigation of variants
of the Ladder Network in which we replace or remove individual components to
gain more insight into their relative importance. We find that all of the
components are necessary for achieving optimal performance, but they do not
contribute equally. For semi-supervised tasks, we conclude that the most
important contribution is made by the lateral connection, followed by the
application of noise, and finally the choice of what we refer to as the
`combinator function' in the decoder path. We also find that as the number of
labeled training examples increases, the lateral connections and reconstruction
criterion become less important, with most of the improvement in generalization
being due to the injection of noise in each layer. Furthermore, we present a
new type of combinator function that outperforms the original design in both
fully- and semi-supervised tasks, reducing record test error rates on
Permutation-Invariant MNIST to 0.57% for the supervised setting, and to 0.97%
and 1.0% for semi-supervised settings with 1000 and 100 labeled examples
respectively.
Comment: Proceedings of the 33rd International Conference on Machine
Learning, New York, NY, USA, 201
Adversarial Autoencoders
In this paper, we propose the "adversarial autoencoder" (AAE), which is a
probabilistic autoencoder that uses the recently proposed generative
adversarial networks (GAN) to perform variational inference by matching the
aggregated posterior of the hidden code vector of the autoencoder with an
arbitrary prior distribution. Matching the aggregated posterior to the prior
ensures that generating from any part of prior space results in meaningful
samples. As a result, the decoder of the adversarial autoencoder learns a deep
generative model that maps the imposed prior to the data distribution. We show
how the adversarial autoencoder can be used in applications such as
semi-supervised classification, disentangling style and content of images,
unsupervised clustering, dimensionality reduction and data visualization. We
performed experiments on MNIST, Street View House Numbers and Toronto Face
datasets and show that adversarial autoencoders achieve competitive results in
generative modeling and semi-supervised classification tasks.
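The adversarial matching of the aggregated posterior to the prior, as described in the Adversarial Autoencoders abstract, can be illustrated with standard GAN losses applied to latent codes. This is a minimal sketch, not the authors' implementation. The function name `latent_adversarial_losses`, its inputs (discriminator probabilities for prior samples and for encoder codes), and the non-saturating encoder loss are all assumptions made for illustration.

```python
import math

def latent_adversarial_losses(d_on_prior, d_on_codes):
    """Return (discriminator_loss, encoder_loss) for one batch.

    d_on_prior: discriminator outputs in (0, 1) for samples drawn from the prior.
    d_on_codes: discriminator outputs in (0, 1) for codes produced by the encoder.
    Both argument names are hypothetical; they are not taken from the paper.
    """
    # Discriminator: treat prior samples as "real" and encoder codes as "fake".
    real_term = -sum(math.log(p) for p in d_on_prior) / len(d_on_prior)
    fake_term = -sum(math.log(1.0 - p) for p in d_on_codes) / len(d_on_codes)
    discriminator_loss = real_term + fake_term
    # Encoder (acting as the generator): make its codes look like prior samples.
    # The non-saturating form used here is an assumption.
    encoder_loss = -sum(math.log(p) for p in d_on_codes) / len(d_on_codes)
    return discriminator_loss, encoder_loss

# A discriminator that outputs 0.5 everywhere cannot tell codes from the prior.
d_loss, e_loss = latent_adversarial_losses([0.5, 0.5], [0.5, 0.5])
```

In a full model these two losses would be minimized in alternation with the usual reconstruction loss of the autoencoder.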
Semi-Supervised Learning with Ladder Networks
We combine supervised learning with unsupervised learning in deep neural
networks. The proposed model is trained to simultaneously minimize the sum of
supervised and unsupervised cost functions by backpropagation, avoiding the
need for layer-wise pre-training. Our work builds on the Ladder network
proposed by Valpola (2015), which we extend by combining the model with
supervision. We show that the resulting model reaches state-of-the-art
performance in semi-supervised MNIST and CIFAR-10 classification, in addition
to permutation-invariant MNIST classification with all labels.
Comment: Revised denoising function, updated results, fixed typo
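The idea of minimizing the sum of a supervised and an unsupervised cost can be sketched as below. This is an illustrative toy, assuming a cross-entropy supervised term and a per-layer weighted squared-error denoising term. The names `ladder_style_cost` and `layer_weights` are hypothetical, and the function only combines quantities that a real network would have to produce.

```python
import math

def cross_entropy(probs, label):
    # Negative log-probability assigned to the correct class.
    return -math.log(probs[label])

def squared_error(clean, recon):
    # Mean squared difference between clean activations and their reconstruction.
    return sum((c - r) ** 2 for c, r in zip(clean, recon)) / len(clean)

def ladder_style_cost(probs, label, clean_layers, recon_layers, layer_weights):
    """Supervised cost plus weighted per-layer denoising costs.

    label may be None for an unlabeled example, in which case only the
    unsupervised (reconstruction) part contributes.
    """
    supervised = cross_entropy(probs, label) if label is not None else 0.0
    unsupervised = sum(
        w * squared_error(c, r)
        for w, c, r in zip(layer_weights, clean_layers, recon_layers)
    )
    return supervised + unsupervised

# Labeled example with a perfect reconstruction: only the supervised term remains.
labeled = ladder_style_cost([0.5, 0.5], 0, [[1.0, 0.0]], [[1.0, 0.0]], [1.0])
# Unlabeled example: only the weighted reconstruction term contributes.
unlabeled = ladder_style_cost([0.5, 0.5], None, [[1.0, 0.0]], [[0.0, 0.0]], [2.0])
```

Because both terms feed a single scalar, one backpropagation pass can update all layers, which is what removes the need for layer-wise pre-training.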
Semi-Amortized Variational Autoencoders
Amortized variational inference (AVI) replaces instance-specific local
inference with a global inference network. While AVI has enabled efficient
training of deep generative models such as variational autoencoders (VAE),
recent empirical work suggests that inference networks can produce suboptimal
variational parameters. We propose a hybrid approach: use AVI to initialize
the variational parameters, then run stochastic variational inference (SVI) to
refine them. Crucially, the local SVI procedure is itself differentiable, so
the inference network and generative model can be trained end-to-end with
gradient-based optimization. This semi-amortized approach enables the use of
rich generative models without experiencing the posterior-collapse phenomenon
common in training VAEs for problems like text generation. Experiments show
this approach outperforms strong autoregressive and variational baselines on
standard text and image datasets.
Comment: ICML 201
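The initialize-then-refine step of the semi-amortized approach can be sketched on a toy model. Everything below is an assumption chosen for illustration: a one-dimensional model with z ~ N(0, 1) and x | z ~ N(z, 1), a Gaussian variational family, exact ELBO gradients in place of stochastic ones, and a deliberately imperfect, hypothetical `amortized_init` standing in for an inference network. The sketch does not differentiate through the refinement steps, which is the part the abstract highlights as crucial for end-to-end training.

```python
import math

def elbo(x, mu, log_s):
    # ELBO for q(z) = N(mu, s^2) under z ~ N(0, 1), x | z ~ N(z, 1).
    s2 = math.exp(2.0 * log_s)
    expected_log_lik = -0.5 * math.log(2.0 * math.pi) - 0.5 * ((x - mu) ** 2 + s2)
    kl = 0.5 * (mu ** 2 + s2 - 1.0 - 2.0 * log_s)
    return expected_log_lik - kl

def elbo_grad(x, mu, log_s):
    # Closed-form gradients of the ELBO above with respect to mu and log_s.
    s2 = math.exp(2.0 * log_s)
    return x - 2.0 * mu, 1.0 - 2.0 * s2

def amortized_init(x):
    # Hypothetical stand-in for an inference network with an amortization gap.
    return 0.3 * x, 0.0

def refine(x, mu, log_s, steps=20, lr=0.1):
    # Instance-specific gradient ascent on the ELBO, starting from the AVI guess.
    for _ in range(steps):
        g_mu, g_log_s = elbo_grad(x, mu, log_s)
        mu += lr * g_mu
        log_s += lr * g_log_s
    return mu, log_s

x = 2.0
mu0, log_s0 = amortized_init(x)
mu1, log_s1 = refine(x, mu0, log_s0)
```

For this toy model the exact posterior is N(x/2, 1/2), so the refined parameters should move from the amortized guess toward a mean of 1.0 and a variance of 0.5 when x = 2.0.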
Recent Advances in Autoencoder-Based Representation Learning
Learning useful representations with little or no supervision is a key
challenge in artificial intelligence. We provide an in-depth review of recent
advances in representation learning with a focus on autoencoder-based models.
To organize these results we make use of meta-priors believed useful for
downstream tasks, such as disentanglement and hierarchical organization of
features. In particular, we uncover three main mechanisms to enforce such
properties, namely (i) regularizing the (approximate or aggregate) posterior
distribution, (ii) factorizing the encoding and decoding distribution, or (iii)
introducing a structured prior distribution. While there are some promising
results, implicit or explicit supervision remains a key enabler and all current
methods use strong inductive biases and modeling assumptions. Finally, we
provide an analysis of autoencoder-based representation learning through the
lens of rate-distortion theory and identify a clear tradeoff between the amount
of prior knowledge available about the downstream tasks, and how useful the
representation is for this task.
Comment: Presented at the third workshop on Bayesian Deep Learning (NeurIPS
2018)
PixelGAN Autoencoders
In this paper, we describe the "PixelGAN autoencoder", a generative
autoencoder in which the generative path is a convolutional autoregressive
neural network on pixels (PixelCNN) that is conditioned on a latent code, and
the recognition path uses a generative adversarial network (GAN) to impose a
prior distribution on the latent code. We show that different priors result in
different decompositions of information between the latent code and the
autoregressive decoder. For example, by imposing a Gaussian distribution as the
prior, we can achieve a global vs. local decomposition, or by imposing a
categorical distribution as the prior, we can disentangle the style and content
information of images in an unsupervised fashion. We further show how the
PixelGAN autoencoder with a categorical prior can be directly used in
semi-supervised settings and achieve competitive semi-supervised classification
results on the MNIST, SVHN and NORB datasets.
Learning to Generate with Memory
Memory units have been widely used to enrich the capabilities of deep
networks in capturing long-term dependencies in reasoning and prediction tasks,
but little investigation exists on deep generative models (DGMs) which are good
at inferring high-level invariant representations from unlabeled data. This
paper presents a deep generative model with a possibly large external memory
and an attention mechanism to capture the local detail information that is
often lost in the bottom-up abstraction process in representation learning. By
adopting a smooth attention model, the whole network is trained end-to-end by
optimizing a variational bound of data likelihood via auto-encoding variational
Bayesian methods, where an asymmetric recognition network is learnt jointly to
infer high-level invariant representations. The asymmetric architecture can
reduce the competition between bottom-up invariant feature extraction and
top-down generation of instance details. Our experiments on several datasets
demonstrate that memory can significantly boost the performance of DGMs and
even achieve state-of-the-art results on various tasks, including density
estimation, image generation, and missing value imputation.
MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders
Variational Autoencoder (VAE), a simple and effective deep generative model,
has led to a number of impressive empirical successes and spawned many advanced
variants and theoretical investigations. However, recent studies demonstrate
that, when equipped with expressive generative distributions (a.k.a. decoders),
VAE suffers from learning uninformative latent representations, a phenomenon
known as KL vanishing, in which the VAE collapses into an
unconditional generative model. In this work, we introduce mutual
posterior-divergence regularization, a novel regularization that is able to
control the geometry of the latent space to accomplish meaningful
representation learning, while achieving comparable or superior capability of
density estimation. Experiments on three image benchmark datasets demonstrate
that, when equipped with powerful decoders, our model performs well both on
density estimation and representation learning.
Comment: Published at ICLR-2019. 12 pages contents + 4 pages appendix, 5
figures
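The KL vanishing phenomenon mentioned in this abstract can be made concrete with the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior. The sketch below is a diagnostic illustration only and is not the mutual posterior-divergence regularizer proposed in the paper. The helper names and the `threshold` value are assumptions.

```python
import math

def gaussian_kl_per_dim(mu, log_var):
    # KL( N(mu, exp(log_var)) || N(0, 1) ), computed separately per latent dimension.
    return [0.5 * (m * m + math.exp(lv) - 1.0 - lv) for m, lv in zip(mu, log_var)]

def collapsed_dims(mu, log_var, threshold=0.01):
    # Indices whose posterior is nearly identical to the prior.
    # The threshold is a hypothetical cutoff chosen for illustration.
    kls = gaussian_kl_per_dim(mu, log_var)
    return [i for i, kl in enumerate(kls) if kl < threshold]

# Dimension 0 matches the prior exactly; dimension 1 carries information about x.
kls = gaussian_kl_per_dim([0.0, 1.0], [0.0, 0.0])
collapsed = collapsed_dims([0.0, 1.0], [0.0, 0.0])
```

When every dimension is collapsed for every input, the latent code carries no information about the observation and the decoder acts as an unconditional generative model.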
Item Recommendation with Variational Autoencoders and Heterogenous Priors
In recent years, Variational Autoencoders (VAEs) have been shown to be highly
effective in both standard collaborative filtering applications and extensions
such as incorporation of implicit feedback. We extend VAEs to collaborative
filtering with side information, for instance when ratings are combined with
explicit text feedback from the user. Instead of using a user-agnostic standard
Gaussian prior, we incorporate user-dependent priors in the latent VAE space to
encode users' preferences as functions of the review text. Taking into account
both the rating and the text information to represent users in this multimodal
latent space promises to improve recommendation quality. Our proposed model
is shown to outperform the existing VAE models for collaborative filtering (up
to 29.41% relative improvement in ranking metric) along with other baselines
that incorporate both user ratings and text for item recommendation.
Comment: Accepted for the 3rd Workshop on Deep Learning for Recommender
Systems (DLRS 2018), held in conjunction with the 12th ACM Conference on
Recommender Systems (RecSys 2018) in Vancouver, Canada
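Replacing a user-agnostic standard Gaussian prior with a user-dependent one changes the KL term of the VAE objective. The sketch below assumes diagonal Gaussian priors and posteriors, which the abstract does not confirm. The function name is hypothetical, and in the paper the prior parameters would be derived from review text rather than given by hand.

```python
import math

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    # KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians, summed over dimensions.
    total = 0.0
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        total += 0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
    return total

# Posterior for one user, centered away from the origin.
mu_q, var_q = [1.0], [1.0]
# User-agnostic standard Gaussian prior: the posterior pays a KL penalty.
kl_standard = kl_diag_gaussians(mu_q, var_q, [0.0], [1.0])
# Hypothetical user-dependent prior centered on the user's preferences: no penalty.
kl_user = kl_diag_gaussians(mu_q, var_q, [1.0], [1.0])
```

A prior centered near the user's posterior lets the latent code encode user preferences without being pulled toward the origin.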