Search CORE

1,560 research outputs found

HyperVAE: A Minimum Description Length Variational Hyper-Encoding Network

Author: Dam Hieu-Chi
Gupta Sunil
Nguyen Phuoc
Rana Santu
Tran Truyen
Venkatesh Svetha
Publication venue
Publication date: 18/05/2020
Field of study

We propose a framework called HyperVAE for encoding distributions of distributions. When a target distribution is modeled by a VAE, its neural network parameters \theta is drawn from a distribution p(\theta) which is modeled by a hyper-level VAE. We propose a variational inference using Gaussian mixture models to implicitly encode the parameters \theta into a low dimensional Gaussian distribution. Given a target distribution, we predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution q(\theta). HyperVAE can encode the parameters \theta in full in contrast to common hyper-networks practices, which generate only the scale and bias vectors as target-network parameters. Thus HyperVAE preserves much more information about the model for each task in the latent space. We discuss HyperVAE using the minimum description length (MDL) principle and show that it helps HyperVAE to generalize. We evaluate HyperVAE in density estimation tasks, outlier detection and discovery of novel design classes, demonstrating its efficacy

arXiv.org e-Print Archive

Recommended from our members

Advances in Compression using Probabilistic Models

Author: Havasi Marton
Publication venue: University of Cambridge
Publication date: 16/12/2021
Field of study

The increasing demand for data transmission and storage necessitate the use of efficient compression methods. Compression algorithms work by mapping data to a more compact representation from which the original data can be recovered. To operate efficiently, they need to capture the characteristics of the data distribution, which can be difficult, especially for high-dimensional data. One emerging solution lies in applying probabilistic machine learning to capture the data distribution in an unsupervised manner. Once a probabilistic model for the data is defined, variational inference can be used to infer its parameters from data. Variational inference is closely related to the optimal compression size, as stated by Hinton's bits-back argument: the evidence lower bound, the objective optimized by variational inference, corresponds to a lower bound on the optimal compression size of the average datapoint. However, current compression methods rely on variational inference merely as a heuristic, and they do not approach its postulated efficiency. In this thesis, we present principled and practical algorithms that get closer to this limit. After discussing our approach, we demonstrate its efficacy in image compression and model compression. First, we focus on image compression, where we use a variational autoencoder to learn a mapping between the images and their unobserved, latent representations. We propose a stochastic coding scheme to encode the latent representation, from which the original image can be approximately reconstructed. Next, we look at the compression of deep learning models. We use variational inference to approximate the posterior distribution of the weights in a neural network, and apply our stochastic coding scheme to encode a weight configuration. Finally, we investigate a connection between variational inference and our compression algorithm. We show that a technique we used for compression can improve variational inference by generating samples from a highly flexible posterior approximation, without significantly increasing the computational costs

Apollo (Cambridge)