6 research outputs found
Learnable Explicit Density for Continuous Latent Space and Variational Inference
In this paper, we study two aspects of the variational autoencoder (VAE): the
prior distribution over the latent variables and its corresponding posterior.
First, we decompose the learning of VAEs into layerwise density estimation, and
argue that having a flexible prior is beneficial to both sample generation and
inference. Second, we analyze the family of inverse autoregressive flows
(inverse AF) and show that, with further improvement, inverse AF can serve as a
universal approximator of any complicated posterior. Our analysis results in a
unified approach to parameterizing a VAE, without the need to restrict
ourselves to factorial Gaussians in the latent space.
Comment: 2 figures, 5 pages, submitted to ICML Principled Approaches to Deep Learning workshop
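The inverse AF construction discussed above can be illustrated with a minimal NumPy sketch (an illustrative toy, not the paper's implementation; the linear conditioners `W_mu` and `W_ls` are hypothetical stand-ins for the autoregressive networks): the forward transform is computed in parallel, while inverting it requires solving one dimension at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4
# Strictly lower-triangular weights so mu_i and log_sigma_i depend
# only on dimensions z_{<i} (the autoregressive property).
W_mu = np.tril(rng.normal(size=(D, D)) * 0.3, k=-1)
W_ls = np.tril(rng.normal(size=(D, D)) * 0.1, k=-1)

def iaf_forward(z):
    """One inverse-AF step: x_i = z_i * sigma_i(z_{<i}) + mu_i(z_{<i}).

    All dimensions are computed in parallel; the log-determinant of
    the Jacobian is simply the sum of the log-scales.
    """
    mu = W_mu @ z
    log_sigma = W_ls @ z
    x = z * np.exp(log_sigma) + mu
    return x, log_sigma.sum()

def iaf_inverse(x):
    """The inverse is sequential: recover z dimension by dimension."""
    z = np.zeros_like(x)
    for i in range(len(x)):
        mu_i = W_mu[i] @ z       # uses only the already-recovered z_{<i}
        ls_i = W_ls[i] @ z
        z[i] = (x[i] - mu_i) * np.exp(-ls_i)
    return z

z = rng.normal(size=D)
x, log_det = iaf_forward(z)
z_rec = iaf_inverse(x)
```

The asymmetry shown here (parallel forward, sequential inverse) is why inverse AF is well suited to the posterior side of a VAE, where fast sampling matters more than fast density evaluation of external points.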
To Regularize or Not To Regularize? The Bias Variance Trade-off in Regularized AEs
Regularized Auto-Encoders (RAEs) form a rich class of neural generative
models. They effectively model the joint distribution between the data and the
latent space using an Encoder-Decoder combination, with regularization imposed
in terms of a prior over the latent space. Despite advantages such as training
stability, the performance of AE-based models has not matched that of other
generative models such as Generative Adversarial Networks (GANs). Motivated by
this, we examine the effect of the
latent prior on the generation quality of deterministic AE models in this
paper. Specifically, we consider the class of RAEs with deterministic
Encoder-Decoder pairs, Wasserstein Auto-Encoders (WAEs), and show that a prior
distribution fixed a priori, oblivious to the dimensionality of the 'true'
latent space, renders the optimization problem considered infeasible. Further,
we show that, in the finite-data regime, despite
knowing the correct latent dimensionality, there exists a bias-variance
trade-off with any arbitrary prior imposition. As a remedy to both the issues
mentioned above, we introduce an additional state space in the form of flexibly
learnable latent priors, in the optimization objective of the WAEs. We
implicitly learn the distribution of the latent prior jointly with the AE
training, which not only makes the learning objective feasible but also
facilitates operation on different points of the bias-variance curve. We show
the efficacy of our model, called FlexAE, through several experiments on
multiple datasets, and demonstrate that it is the new state-of-the-art among
AE-based generative models.
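The idea of learning the latent prior jointly, rather than fixing it a priori, can be sketched in a toy setting (an illustrative stand-in, not FlexAE itself; the `latents` array and the diagonal-Gaussian prior family are assumptions): fit the prior's parameters to the encoder's codes by gradient descent on their negative log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical latent codes produced by a trained deterministic encoder;
# note their distribution is far from the standard unit Gaussian.
latents = rng.normal(loc=2.0, scale=0.5, size=(1000, 2))

# Learnable diagonal-Gaussian prior parameters, updated alongside training.
mu = np.zeros(2)
log_std = np.zeros(2)

lr = 0.1
for _ in range(500):
    std = np.exp(log_std)
    # Analytic gradients of the mean negative log-likelihood
    # 0.5 * ((z - mu)^2 / std^2 + 2*log_std + log(2*pi)):
    g_mu = ((mu - latents) / std**2).mean(axis=0)
    g_ls = (1.0 - ((latents - mu) / std) ** 2).mean(axis=0)
    mu -= lr * g_mu
    log_std -= lr * g_ls
```

At convergence the prior matches the empirical mean and spread of the codes, instead of forcing the codes toward an arbitrary fixed distribution; this is the flexibility that lets a learned prior trade off bias against variance.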
Improving Sequential Latent Variable Models with Autoregressive Flows
We propose an approach for improving sequence modeling based on
autoregressive normalizing flows. Each autoregressive transform, acting across
time, serves as a moving frame of reference, removing temporal correlations,
and simplifying the modeling of higher-level dynamics. This technique provides
a simple, general-purpose method for improving sequence modeling, with
connections to existing and classical techniques. We demonstrate the proposed
approach both with standalone flow-based models and as a component within
sequential latent variable models. Results are presented on three benchmark
video datasets, where autoregressive flow-based dynamics improve log-likelihood
performance over baseline models. Finally, we illustrate the decorrelation and
improved generalization properties of using flow-based dynamics.
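The "moving frame of reference" idea can be demonstrated in a few lines (an illustrative toy, not the paper's model; the AR(1) data and the known shift coefficient are assumptions): an affine autoregressive transform across time subtracts a prediction from the past, leaving a nearly decorrelated residual sequence for higher-level modeling.

```python
import numpy as np

rng = np.random.default_rng(2)
T, a = 5000, 0.9
# An AR(1) sequence with strong temporal correlation.
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal()

def ar_flow(x, shift_coef):
    """Affine autoregressive transform across time:
    eps_t = x_t - shift_coef * x_{t-1} (identity at t = 0)."""
    eps = x.copy()
    eps[1:] = x[1:] - shift_coef * x[:-1]
    return eps

def lag1_corr(s):
    """Lag-1 autocorrelation of a sequence."""
    return np.corrcoef(s[:-1], s[1:])[0, 1]

eps = ar_flow(x, a)
```

The raw sequence has lag-1 autocorrelation near 0.9, while the transformed residuals are close to white noise: the flow has removed the temporal correlations, which is exactly what simplifies the dynamics a downstream latent variable model must capture.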
On the Necessity and Effectiveness of Learning the Prior of Variational Auto-Encoder
Using powerful posterior distributions is a popular approach to achieving
better variational inference. However, recent works have shown that the
aggregated posterior may fail to match the unit Gaussian prior, so learning the
prior becomes an alternative way to improve the lower bound. In this paper, for
the first time in the literature, we prove the necessity and effectiveness of
learning the prior when the aggregated posterior does not match the unit
Gaussian prior, analyze why this situation may happen, and propose a hypothesis that
learning the prior may improve reconstruction loss, all of which are supported
by our extensive experimental results. We show that, using a learned Real NVP
prior and just one latent variable in a VAE, we can achieve test NLL comparable
to that of very deep state-of-the-art hierarchical VAEs, outperforming many
previous works with complex hierarchical VAE architectures.
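For reference, the building block of a Real NVP prior is the affine coupling layer; a minimal NumPy sketch (toy conditioners `W_s` and `W_t` are hypothetical stand-ins for trained networks) shows why both its inverse and its Jacobian log-determinant are cheap to compute.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 4  # latent dimension, split into two halves
W_s = rng.normal(size=(2, 2)) * 0.5  # log-scale conditioner
W_t = rng.normal(size=(2, 2)) * 0.5  # shift conditioner

def coupling_forward(z):
    """Real NVP affine coupling: first half passes through unchanged;
    second half is scaled and shifted as a function of the first half."""
    z1, z2 = z[:2], z[2:]
    s = np.tanh(W_s @ z1)            # bounded log-scale for stability
    t = W_t @ z1
    x = np.concatenate([z1, z2 * np.exp(s) + t])
    return x, s.sum()                # log|det J| = sum of log-scales

def coupling_inverse(x):
    """Exact inverse: recompute s, t from the untouched first half."""
    x1, x2 = x[:2], x[2:]
    s = np.tanh(W_s @ x1)
    t = W_t @ x1
    return np.concatenate([x1, (x2 - t) * np.exp(-s)])

z = rng.normal(size=D)
x, log_det = coupling_forward(z)
```

Stacking such layers (permuting which half is conditioned on between layers) yields a flexible, exactly invertible density, which is what makes it usable as a learned prior with tractable log-likelihood.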
A Tutorial on Deep Latent Variable Models of Natural Language
There has been much recent, exciting work on combining the complementary
strengths of latent variable models and deep learning. Latent variable modeling
makes it easy to explicitly specify model constraints through conditional
independence properties, while deep learning makes it possible to parameterize
these conditional likelihoods with powerful function approximators. While these
"deep latent variable" models provide a rich, flexible framework for modeling
many real-world phenomena, difficulties exist: deep parameterizations of
conditional likelihoods usually make posterior inference intractable, and
latent variable objectives often complicate backpropagation by introducing
points of non-differentiability. This tutorial explores these issues in depth
through the lens of variational inference.
Comment: EMNLP 2018 Tutorial
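A standard tool for the differentiability issue mentioned above is the reparameterization trick; a minimal sketch (a generic illustration, not tied to the tutorial's notation): sampling z ~ N(mu, sigma^2) is rewritten as a deterministic function of the parameters and parameter-free noise, so gradients can flow through mu and sigma.

```python
import numpy as np

rng = np.random.default_rng(4)

def reparameterize(mu, sigma, eps):
    """z ~ N(mu, sigma^2) as a differentiable function of (mu, sigma)."""
    return mu + sigma * eps

mu, sigma = 1.5, 0.7
eps = rng.normal(size=100_000)       # parameter-free noise
z = reparameterize(mu, sigma, eps)

# Pathwise gradient of E[z^2] w.r.t. mu: since dz/dmu = 1, the Monte
# Carlo estimate is mean(2*z), whose true value is 2*mu = 3.0.
grad_mu_estimate = (2 * z).mean()
```

Because the randomness is isolated in eps, backpropagation through the sampling step becomes ordinary differentiation, which is what makes amortized variational inference in deep latent variable models trainable end to end.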
Predictive Coding, Variational Autoencoders, and Biological Connections
This paper reviews predictive coding, from theoretical neuroscience, and
variational autoencoders, from machine learning, identifying the common origin
and mathematical framework underlying both areas. As each area is prominent
within its respective field, more firmly connecting these areas could prove
useful in the dialogue between neuroscience and machine learning. After
reviewing each area, we discuss two possible correspondences implied by this
perspective: cortical pyramidal dendrites as analogous to (non-linear) deep
networks and lateral inhibition as analogous to normalizing flows. These
connections may provide new directions for further investigations in each
field.
Comment: NeurIPS NeuroAI Workshop, NAISys, Neural Computation