Hierarchical Bayesian methods can unify many related tasks (e.g. k-shot
classification, conditional and unconditional generation) as inference within a
single generative model. However, when this generative model is expressed as a
powerful neural network such as a PixelCNN, we show that existing learning
techniques typically fail to effectively use latent variables. To address this,
we develop a modification of the Variational Autoencoder in which encoded
observations are decoded to new elements from the same class. This technique,
which we call a Variational Homoencoder (VHE), produces a hierarchical latent
variable model which better utilises latent variables. We use the VHE framework
to learn a hierarchical PixelCNN on the Omniglot dataset, which outperforms all
existing models on test set likelihood and achieves strong performance on
one-shot generation and classification tasks. We additionally validate the VHE
on natural images from the YouTube Faces database. Finally, we develop
extensions of the model that apply to richer dataset structures such as
factorial and hierarchical categories.Comment: UAI 2018 oral presentatio