Variance Loss in Variational Autoencoders
In this article, we highlight what appears to be a major issue of Variational
Autoencoders, evidenced by extensive experimentation with different network
architectures and datasets: the variance of generated data is significantly
lower than that of training data. Since generative models are usually evaluated
with metrics such as the Frechet Inception Distance (FID) that compare the
distributions of (features of) real versus generated images, the variance loss
typically results in degraded scores. This problem is particularly relevant in
a two stage setting, where we use a second VAE to sample in the latent space of
the first VAE. The reduced variance creates a mismatch between the actual
distribution of latent variables and the one generated by the second VAE, which
hinders the beneficial effects of the second stage. By renormalizing the output
of the second VAE towards the expected spherical normal distribution, we obtain
a marked improvement in the quality of generated samples, which is also
reflected in the FID.
Comment: Article accepted at the Sixth International Conference on Machine
Learning, Optimization, and Data Science. July 19-23, 2020 - Certosa di
Pontignano, Siena, Ital
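The renormalization step described in the abstract above can be illustrated with a minimal sketch. It assumes the second-stage samples are available as a NumPy array of shape (n_samples, latent_dim) and that "renormalizing towards the spherical normal" means standardizing each latent dimension to zero mean and unit variance; the paper may use a different scheme (for example a single global scale factor). The function name `renormalize_latents` and the synthetic data are hypothetical.

```python
import numpy as np

def renormalize_latents(z, eps=1e-8):
    """Standardize each latent dimension to zero mean and unit variance.

    Assumption: per-dimension standardization is one plausible reading of
    "renormalizing towards the spherical normal"; it is not necessarily
    the exact procedure used by the authors.
    """
    z = np.asarray(z, dtype=float)
    mean = z.mean(axis=0, keepdims=True)
    std = z.std(axis=0, keepdims=True)
    return (z - mean) / (std + eps)

# Synthetic stand-in for second-stage VAE samples whose variance is too low.
rng = np.random.default_rng(0)
z_second_stage = 0.6 * rng.standard_normal((10_000, 16))

# After renormalization the samples match the unit variance that the
# first-stage decoder expects from its prior.
z_fixed = renormalize_latents(z_second_stage)
```

The rescaled samples `z_fixed` would then be passed to the first-stage decoder in place of the raw second-stage output.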
Diffusion Variational Autoencoders
A standard Variational Autoencoder, with a Euclidean latent space, is
structurally incapable of capturing topological properties of certain datasets.
To remove topological obstructions, we introduce Diffusion Variational
Autoencoders with arbitrary manifolds as a latent space. A Diffusion
Variational Autoencoder uses transition kernels of Brownian motion on the
manifold. In particular, it uses properties of the Brownian motion to implement
the reparametrization trick and fast approximations to the KL divergence. We
show that the Diffusion Variational Autoencoder is capable of capturing
topological properties of synthetic datasets. Additionally, we train MNIST on
spheres, tori, projective spaces, SO(3), and a torus embedded in R3. Although a
natural dataset like MNIST does not have latent variables with a clear-cut
topological structure, training it on a manifold can still highlight
topological and geometrical properties.Comment: 10 pages, 8 figures Added an appendix with derivation of asymptotic
expansion of KL divergence for heat kernel on arbitrary Riemannian manifolds,
and an appendix with new experiments on binarized MNIST. Added a previously
missing factor in the asymptotic expansion of the heat kernel and corrected a
coefficient in the asymptotic expansion of the KL divergence; further minor edit
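To make the idea of a Brownian-motion reparametrization on a manifold concrete, here is a minimal sketch for the unit sphere. It assumes Brownian motion is approximated by a random walk: take a small Gaussian step in the ambient Euclidean space, then project back onto the sphere. The function name `sphere_random_walk`, the step count, and the step scaling are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sphere_random_walk(z0, t, n_steps, rng):
    """Approximate a sample of Brownian motion on the unit sphere.

    z0      : array of shape (..., d), starting points (projected to the sphere)
    t       : total diffusion time (plays the role of the variance parameter)
    n_steps : number of random-walk steps (assumed hyperparameter)

    The noise is drawn independently of z0 and t, so the output is a
    deterministic function of (z0, t, noise). This is the property the
    reparametrization trick relies on.
    """
    z = np.asarray(z0, dtype=float)
    z = z / np.linalg.norm(z, axis=-1, keepdims=True)
    step_std = np.sqrt(t / n_steps)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z + step_std * noise                            # ambient-space step
        z = z / np.linalg.norm(z, axis=-1, keepdims=True)   # project back
    return z

rng = np.random.default_rng(0)
start = np.tile(np.array([0.0, 0.0, 1.0]), (5, 1))  # five walkers at the north pole
samples = sphere_random_walk(start, t=0.1, n_steps=10, rng=rng)
```

In an automatic-differentiation framework the same computation would let gradients flow through `z0` and `t`, with the noise treated as a fixed input.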