Variational Autoencoders (VAEs) represent the given data in a low-dimensional
latent space, which is generally assumed to be Euclidean. This assumption
naturally leads to the common choice of a standard Gaussian prior over
continuous latent variables. Recent work has, however, shown that this prior
has a detrimental effect on model capacity, leading to subpar performance. We
argue that the Euclidean assumption lies at the heart of this failure mode.
To counter this, we assume a Riemannian structure over the latent space, which
constitutes a more principled geometric view of the latent codes, and replace
the standard Gaussian prior with a Riemannian Brownian motion prior. We propose
an efficient inference scheme that does not rely on the unknown normalizing
factor of this prior. Finally, we demonstrate that this prior significantly
increases model capacity using only one additional scalar parameter.
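
For intuition, here is a minimal sketch (not the authors' implementation) of how a Riemannian Brownian motion prior over latent codes can be simulated in coordinates via an Euler-Maruyama discretization. It assumes a metric function `metric_fn` (e.g. a decoder pullback metric G(z) = J(z)^T J(z)) and, for brevity, omits the Christoffel drift term of the full manifold SDE; all names and parameters are illustrative.

```python
import numpy as np

def brownian_motion_step(z, metric_fn, dt, rng):
    """One Euler-Maruyama step of Brownian motion on a Riemannian manifold.

    In coordinates, the step draws z' ~ N(z, dt * G(z)^{-1}): noise is scaled
    by the inverse metric, so steps shrink where the manifold is stretched.
    (The Christoffel drift term of the exact SDE is omitted for brevity.)
    """
    G = metric_fn(z)
    # Cholesky factor L with L @ L.T = G^{-1}, so L @ noise has covariance G^{-1}.
    L = np.linalg.cholesky(np.linalg.inv(G))
    noise = rng.standard_normal(z.shape[0])
    return z + np.sqrt(dt) * L @ noise

def sample_brownian_prior(z0, metric_fn, t=1.0, n_steps=100, seed=0):
    """Simulate Brownian motion started at z0 for total time t.

    The endpoint is an (approximate) draw from the Riemannian Brownian
    motion prior; t plays the role of the single scalar variance parameter.
    """
    rng = np.random.default_rng(seed)
    z = z0.copy()
    dt = t / n_steps
    for _ in range(n_steps):
        z = brownian_motion_step(z, metric_fn, dt, rng)
    return z

# Toy example: a 2D latent space whose metric stretches the second dimension.
metric = lambda z: np.diag([1.0, 4.0 + z[1] ** 2])
z_sample = sample_brownian_prior(np.zeros(2), metric, t=1.0, n_steps=200)
print(z_sample)
```

Note that the density of this prior is the manifold's heat kernel, which generally has no closed form; this is why an inference scheme that avoids its unknown normalizing factor, as proposed above, is needed.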