846 research outputs found
Auxiliary Guided Autoregressive Variational Autoencoders
Generative modeling of high-dimensional data is a key problem in machine
learning. Successful approaches include latent variable models and
autoregressive models. The complementary strengths of these approaches, to
model global and local image statistics respectively, suggest hybrid models
that encode global image structure into latent variables while autoregressively
modeling low level detail. Previous approaches to such hybrid models restrict
the capacity of the autoregressive decoder to prevent degenerate models that
ignore the latent variables and only rely on autoregressive modeling. Our
contribution is a training procedure relying on an auxiliary loss function that
controls which information is captured by the latent variables and what is left
to the autoregressive decoder. Our approach can leverage arbitrarily powerful
autoregressive decoders, achieves state-of-the art quantitative performance
among models with latent variables, and generates qualitatively convincing
samples.Comment: Published as a conference paper at ECML-PKDD 201
The Variational Homoencoder: Learning to learn high capacity generative models from few examples
Hierarchical Bayesian methods can unify many related tasks (e.g. k-shot
classification, conditional and unconditional generation) as inference within a
single generative model. However, when this generative model is expressed as a
powerful neural network such as a PixelCNN, we show that existing learning
techniques typically fail to effectively use latent variables. To address this,
we develop a modification of the Variational Autoencoder in which encoded
observations are decoded to new elements from the same class. This technique,
which we call a Variational Homoencoder (VHE), produces a hierarchical latent
variable model which better utilises latent variables. We use the VHE framework
to learn a hierarchical PixelCNN on the Omniglot dataset, which outperforms all
existing models on test set likelihood and achieves strong performance on
one-shot generation and classification tasks. We additionally validate the VHE
on natural images from the YouTube Faces database. Finally, we develop
extensions of the model that apply to richer dataset structures such as
factorial and hierarchical categories.Comment: UAI 2018 oral presentatio
Neural Discrete Representation Learning
Learning useful representations without supervision remains a key challenge
in machine learning. In this paper, we propose a simple yet powerful generative
model that learns such discrete representations. Our model, the Vector
Quantised-Variational AutoEncoder (VQ-VAE), differs from VAEs in two key ways:
the encoder network outputs discrete, rather than continuous, codes; and the
prior is learnt rather than static. In order to learn a discrete latent
representation, we incorporate ideas from vector quantisation (VQ). Using the
VQ method allows the model to circumvent issues of "posterior collapse" --
where the latents are ignored when they are paired with a powerful
autoregressive decoder -- typically observed in the VAE framework. Pairing
these representations with an autoregressive prior, the model can generate high
quality images, videos, and speech as well as doing high quality speaker
conversion and unsupervised learning of phonemes, providing further evidence of
the utility of the learnt representations
Spatial Variational Auto-Encoding via Matrix-Variate Normal Distributions
The key idea of variational auto-encoders (VAEs) resembles that of
traditional auto-encoder models in which spatial information is supposed to be
explicitly encoded in the latent space. However, the latent variables in VAEs
are vectors, which can be interpreted as multiple feature maps of size 1x1.
Such representations can only convey spatial information implicitly when
coupled with powerful decoders. In this work, we propose spatial VAEs that use
feature maps of larger size as latent variables to explicitly capture spatial
information. This is achieved by allowing the latent variables to be sampled
from matrix-variate normal (MVN) distributions whose parameters are computed
from the encoder network. To increase dependencies among locations on latent
feature maps and reduce the number of parameters, we further propose spatial
VAEs via low-rank MVN distributions. Experimental results show that the
proposed spatial VAEs outperform original VAEs in capturing rich structural and
spatial information.Comment: Accepted by SDM2019. Code is publicly available at
https://github.com/divelab/sva
- …