A Surprisingly Effective Fix for Deep Latent Variable Modeling of Text
When trained effectively, the Variational Autoencoder (VAE) is both a
powerful language model and an effective representation learning framework. In
practice, however, VAEs are trained with the evidence lower bound (ELBO) as a
surrogate objective to the intractable marginal data likelihood. This approach
to training yields unstable results, frequently leading to a disastrous local
optimum known as posterior collapse. In this paper, we investigate a simple fix
for posterior collapse which yields surprisingly effective results. The
combination of two known heuristics, previously considered only in isolation,
substantially improves held-out likelihood, reconstruction, and latent
representation learning when compared with previous state-of-the-art methods.
More interestingly, while our experiments demonstrate superiority on these
principal evaluations, our method obtains a worse ELBO. We use these results to
argue that the typical surrogate objective for VAEs may not be sufficient or
necessarily appropriate for balancing the goals of representation learning and
data distribution modeling.
Comment: EMNLP 2019 short paper. The first two authors contributed equally.
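For reference, the surrogate objective referred to here is the standard evidence lower bound, and posterior collapse is the degenerate solution in which the KL term is driven to (near) zero so the latent code carries no information about the input. In standard notation (not quoted from the paper),

\log p_\theta(x) \;\ge\; \mathrm{ELBO}(\theta, \phi; x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right] \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right),

and collapse corresponds to \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right) \to 0, i.e. q_\phi(z \mid x) \approx p(z) for every input x.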
Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians
Convolutional neural nets (CNNs) have demonstrated remarkable performance in
recent history. Such approaches tend to work in a unidirectional bottom-up
feed-forward fashion. However, practical experience and biological evidence
tell us that feedback plays a crucial role, particularly for detailed spatial
understanding tasks. This work explores bidirectional architectures that also
reason with top-down feedback: neural units are influenced by both lower and
higher-level units.
We do so by treating units as rectified latent variables in a quadratic
energy function, which can be seen as a hierarchical Rectified Gaussian model
(RGs). We show that RGs can be optimized with a quadratic program (QP), which
can in turn be optimized with a recurrent neural network (with rectified linear
units). This allows RGs to be trained with GPU-optimized gradient descent. From
a theoretical perspective, RGs help establish a connection between CNNs and
hierarchical probabilistic models. From a practical perspective, RGs are well
suited for detailed spatial tasks that can benefit from top-down reasoning. We
illustrate them on the challenging task of keypoint localization under
occlusions, where local bottom-up evidence may be misleading. We demonstrate
state-of-the-art results on challenging benchmarks.
Comment: To appear in CVPR 2016.
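To make the QP connection concrete, a generic sketch (the form below only assumes a nonnegativity-constrained quadratic objective; the exact energy follows the paper): with the rectified latent variables collected into a vector h \ge 0,

E(h) \;=\; \tfrac{1}{2}\, h^\top A h \;-\; b^\top h \quad \text{subject to } h \ge 0,
\qquad
h^{(t+1)} \;=\; \max\!\big(0,\; h^{(t)} - \eta\,(A h^{(t)} - b)\big) \;=\; \mathrm{ReLU}\!\big(h^{(t)} - \eta\,(A h^{(t)} - b)\big),

where A couples lower- and higher-level units, b collects the input-dependent terms, and \eta is a step size. The projected-gradient update on the right is exactly a recurrent layer of rectified linear units, which is what allows the model to be unrolled and trained with GPU-optimized gradient descent.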
Grammar Variational Autoencoder
Deep generative models have been wildly successful at learning coherent
latent representations for continuous data such as video and audio. However,
generative modeling of discrete data such as arithmetic expressions and
molecular structures still poses significant challenges. Crucially,
state-of-the-art methods often produce outputs that are not valid. We make the
key observation that frequently, discrete data can be represented as a parse
tree from a context-free grammar. We propose a variational autoencoder which
encodes and decodes directly to and from these parse trees, ensuring the
generated outputs are always valid. Surprisingly, we show that not only does
our model more often generate valid outputs, it also learns a more coherent
latent space in which nearby points decode to similar discrete outputs. We
demonstrate the effectiveness of our learned models by showing their improved
performance in Bayesian optimization for symbolic regression and molecular
synthesis.
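As a minimal sketch of the validity mechanism described above (in Python, with a toy arithmetic grammar and helper names that are illustrative assumptions rather than the authors' code): the decoder produces scores over grammar productions, and at each step a mask restricts the choice to productions whose left-hand side matches the non-terminal on top of a parse stack, so every completed derivation is syntactically valid.

import numpy as np

# Toy context-free grammar: each production is (left-hand side, right-hand side).
GRAMMAR = [
    ("S", ["S", "+", "T"]),
    ("S", ["T"]),
    ("T", ["(", "S", ")"]),
    ("T", ["x"]),
]
NONTERMINALS = {"S", "T"}

def decode(logits_fn, max_steps=50):
    """Expand non-terminals left to right, masking out invalid productions."""
    stack, output = ["S"], []           # start symbol on the parse stack
    for _ in range(max_steps):
        if not stack:
            break
        symbol = stack.pop()
        if symbol not in NONTERMINALS:  # terminals are emitted directly
            output.append(symbol)
            continue
        logits = logits_fn()            # decoder scores, one per production
        valid = np.array([lhs == symbol for lhs, _ in GRAMMAR])
        masked = np.where(valid, logits, -np.inf)
        rule = int(np.argmax(masked))   # greedy choice among valid rules only
        stack.extend(reversed(GRAMMAR[rule][1]))  # expand leftmost symbol first
    return "".join(output)

print(decode(lambda: np.random.randn(len(GRAMMAR))))

In the actual model the per-step scores would come from the decoder network conditioned on the latent vector, and the chosen productions form the one-hot sequence that represents the expression or molecule; with the random scores above, the sketch may simply hit the step cap and return a partial string.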