4 research outputs found
Riemannian Normalizing Flow on Variational Wasserstein Autoencoder for Text Modeling
Recurrent Variational Autoencoders have been widely used for language modeling
and text generation tasks. These models often face a difficult optimization
problem known as the Kullback-Leibler (KL) term vanishing issue, where the
posterior easily collapses to the prior and the model ignores the latent
codes in generative tasks. To address this problem, we introduce an improved
Wasserstein Variational Autoencoder (WAE) with Riemannian Normalizing Flow
(RNF) for text modeling. The RNF transforms a latent variable into a space that
respects the geometric characteristics of the input space, which makes it
impossible for the posterior to collapse to the non-informative prior. The
Wasserstein objective
minimizes the distance between the marginal distribution and the prior directly
and therefore does not force the posterior to match the prior. Empirical
experiments show that our model avoids KL vanishing over a range of datasets
and performs better on tasks such as language modeling, likelihood
approximation, and text generation. Through a series of experiments and
analyses of the latent space, we show that our model learns latent
distributions that respect the latent-space geometry and generates sentences
that are more diverse.
Comment: NAACL 2019 (oral)
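For reference, the contrast drawn here can be written in standard VAE/WAE
notation (generic symbols, not taken from the paper). The ELBO penalizes a
per-example KL term,

    \mathcal{L}_{\mathrm{VAE}} = \mathbb{E}_{q_\phi(z \mid x)}[\log p_\theta(x \mid z)] - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big),

which pushes every posterior q_\phi(z \mid x) toward the prior, whereas the
Wasserstein objective only matches the aggregated (marginal) posterior
q_Z(z) = \mathbb{E}_{p_{\mathrm{data}}(x)}[q_\phi(z \mid x)] to the prior:

    \mathcal{L}_{\mathrm{WAE}} = \mathbb{E}_{p_{\mathrm{data}}(x)} \mathbb{E}_{q_\phi(z \mid x)}[c(x, G(z))] + \lambda \, D(q_Z, p(z)).

Individual posteriors therefore remain free to stay informative about x.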
Conditional Flow Variational Autoencoders for Structured Sequence Prediction
Prediction of future states of the environment and interacting agents is a
key competence required for autonomous agents to operate successfully in the
real world. Prior work for structured sequence prediction based on latent
variable models imposes a uni-modal standard Gaussian prior on the latent
variables. This induces a strong model bias which makes it challenging to fully
capture the multi-modality of the distribution of the future states. In this
work, we introduce Conditional Flow Variational Autoencoders (CF-VAE) using our
novel conditional normalizing-flow-based prior to capture complex multi-modal
conditional distributions for effective structured sequence prediction.
Moreover, we propose two novel regularization schemes that stabilize training,
mitigate posterior collapse, and yield a better fit to the target data
distribution. Our experiments on three multi-modal structured
sequence prediction datasets -- MNIST Sequences, Stanford Drone and HighD --
show that the proposed method obtains state-of-the-art results across
different evaluation metrics.
Comment: To appear at Bayesian Deep Learning and Machine Learning for
Autonomous Driving @NeurIPS 2019
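The abstract does not spell out the prior's architecture, so the following is
only an illustrative sketch of the general idea of a conditional
normalizing-flow prior: an affine coupling layer whose scale and shift
networks also see the condition c (PyTorch; all names are ours, not the
authors').

    import torch
    import torch.nn as nn

    class ConditionalAffineCoupling(nn.Module):
        """Affine coupling layer conditioned on a context vector c."""
        def __init__(self, dim, cond_dim, hidden=64):
            super().__init__()
            self.d = dim // 2
            self.net = nn.Sequential(
                nn.Linear(self.d + cond_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, 2 * (dim - self.d)),
            )

        def _scale_shift(self, z1, c):
            s, t = self.net(torch.cat([z1, c], dim=-1)).chunk(2, dim=-1)
            return torch.tanh(s), t  # bounded log-scales for stability

        def forward(self, eps, c):  # base sample -> latent, with log|det J|
            z1, z2 = eps[:, :self.d], eps[:, self.d:]
            s, t = self._scale_shift(z1, c)
            return torch.cat([z1, z2 * torch.exp(s) + t], dim=-1), s.sum(-1)

        def inverse(self, z, c):  # latent -> base, with log|det J^{-1}|
            z1, z2 = z[:, :self.d], z[:, self.d:]
            s, t = self._scale_shift(z1, c)
            return torch.cat([z1, (z2 - t) * torch.exp(-s)], dim=-1), -s.sum(-1)

    def flow_prior_log_prob(layers, z, c):
        """log p(z | c) by change of variables: pull z back to the N(0, I) base."""
        log_det = torch.zeros(z.shape[0])
        for layer in reversed(layers):
            z, ld = layer.inverse(z, c)
            log_det = log_det + ld
        base = torch.distributions.Normal(0.0, 1.0).log_prob(z).sum(-1)
        return base + log_det

In a conditional VAE, such a flow would replace the fixed N(0, I) prior, with
flow_prior_log_prob supplying the prior term of the objective.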
Normalizing Flows on Tori and Spheres
Normalizing flows are a powerful tool for building expressive distributions
in high dimensions. So far, most of the literature has concentrated on learning
flows on Euclidean spaces. Some problems, however, such as those involving
angles, are defined on spaces with more complex geometries, such as tori or
spheres. In this paper, we propose and compare expressive and numerically
stable flows on such spaces. Our flows are built recursively on the dimension
of the space, starting from flows on circles, closed intervals or spheres.
Comment: Accepted to the International Conference on Machine Learning (ICML)
2020
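As a concrete instance of the circle case, one numerically stable building
block is a Möbius transformation of the unit circle, whose angular derivative
is the Poisson kernel. The sketch below is our own illustration and covers
only this single ingredient, not the paper's full recursive construction:

    import numpy as np

    def mobius_circle_flow(theta, omega):
        """Push angles on [0, 2pi) through a Mobius map of the unit circle.

        omega is a complex parameter with |omega| < 1.
        """
        z = np.exp(1j * theta)
        w = (z - omega) / (1 - np.conj(omega) * z)  # unit-disk automorphism
        theta_out = np.angle(w) % (2 * np.pi)
        # on |z| = 1 the derivative d(theta')/d(theta) is the Poisson kernel
        log_det = np.log(1 - np.abs(omega) ** 2) \
                  - 2 * np.log(np.abs(1 - np.conj(omega) * z))
        return theta_out, log_det

    # sanity check: a uniform density on the circle stays normalized
    theta = np.linspace(0, 2 * np.pi, 10000, endpoint=False)
    _, log_det = mobius_circle_flow(theta, 0.3 + 0.2j)
    print(np.exp(log_det).mean())  # ~ 1.0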
Neural Manifold Ordinary Differential Equations
To better conform to data geometry, recent deep generative modelling
techniques adapt Euclidean constructions to non-Euclidean spaces. In this
paper, we study normalizing flows on manifolds. Previous work has developed
flow models for specific cases; however, these advancements hand-craft layers
on a manifold-by-manifold basis, restricting generality and inducing cumbersome
design constraints. We overcome these issues by introducing Neural Manifold
Ordinary Differential Equations, a manifold generalization of Neural ODEs,
which enables the construction of Manifold Continuous Normalizing Flows
(MCNFs). MCNFs require only local geometry (therefore generalizing to arbitrary
manifolds) and compute probabilities with continuous change of variables
(allowing for a simple and expressive flow construction). We find that
leveraging continuous manifold dynamics produces a marked improvement for both
density estimation and downstream tasks.
Comment: Submitted to NeurIPS 2020
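For context, the continuous change of variables referred to here is the
Neural ODE identity (standard Euclidean notation; per the abstract, the
manifold version adapts it using only local geometry):

    \frac{dz(t)}{dt} = f_\theta(z(t), t), \qquad \frac{d \log p(z(t))}{dt} = -\mathrm{tr}\!\left(\frac{\partial f_\theta}{\partial z(t)}\right),

so that \log p(z(t_1)) = \log p(z(t_0)) - \int_{t_0}^{t_1} \mathrm{tr}\!\left(\frac{\partial f_\theta}{\partial z(t)}\right) dt.
No discrete-layer Jacobian determinant is needed, which is what keeps the
flow construction simple and expressive.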