HyperVAE: A Minimum Description Length Variational Hyper-Encoding Network
We propose a framework called HyperVAE for encoding distributions of
distributions. When a target distribution is modeled by a VAE, its neural
network parameters \theta are drawn from a distribution p(\theta), which is
modeled by a hyper-level VAE. We propose a variational inference method using
Gaussian mixture models to implicitly encode the parameters \theta into a
low-dimensional Gaussian distribution. Given a target distribution, we predict
the posterior distribution of the latent code, then use a matrix-network decoder
to generate a posterior distribution q(\theta). HyperVAE can encode the
parameters \theta in full, in contrast to common hyper-network practice, which
generates only the scale and bias vectors as target-network parameters. Thus
HyperVAE preserves much more information about the model for each task in the
latent space. We discuss HyperVAE using the minimum description length (MDL)
principle and show that it helps HyperVAE to generalize. We evaluate HyperVAE on
density estimation, outlier detection, and discovery of novel design classes,
demonstrating its efficacy.
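The contrast drawn in the abstract, between generating only scale and bias vectors and generating the parameters in full, can be illustrated with a small sketch. This is not the HyperVAE architecture or its matrix-network decoder: the linear maps below are random, untrained, and hypothetical (A_full, A_scale, A_bias and W_shared are names invented here), and serve only to show how much of a layer each scheme lets the latent code control.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, d_in, d_out = 4, 8, 3

# Hypothetical, untrained linear hyper-decoders (illustration only).
A_full = rng.normal(size=(d_out * d_in, latent_dim))  # latent code -> every weight entry
A_scale = rng.normal(size=(d_out, latent_dim))        # latent code -> per-unit scale
A_bias = rng.normal(size=(d_out, latent_dim))         # latent code -> per-unit bias
W_shared = rng.normal(size=(d_out, d_in))             # weights shared across all tasks


def full_weights(z):
    """Decode the latent code into a complete weight matrix."""
    return (A_full @ z).reshape(d_out, d_in)


def full_layer(z, x):
    """Layer whose whole weight matrix depends on the task code z."""
    return full_weights(z) @ x


def scale_bias_layer(z, x):
    """Layer where z only rescales and shifts the shared weights' output."""
    return (A_scale @ z) * (W_shared @ x) + (A_bias @ z)


z = rng.normal(size=latent_dim)
x = rng.normal(size=d_in)
generated_full = full_weights(z).size  # number of values the code determines
generated_scale_bias = 2 * d_out       # one scale and one bias per output unit
```

In this toy setting the full scheme lets the code determine d_out * d_in values per layer, while the scale-and-bias scheme determines only 2 * d_out; the remaining weights are shared and carry no task-specific information.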
Advances in Compression using Probabilistic Models
The increasing demand for data transmission and storage necessitates the use of efficient compression methods. Compression algorithms work by mapping data to a more compact representation from which the original data can be recovered. To operate efficiently, they need to capture the characteristics of the data distribution, which can be difficult, especially for high-dimensional data.
One emerging solution lies in applying probabilistic machine learning to capture the data distribution in an unsupervised manner. Once a probabilistic model for the data is defined, variational inference can be used to infer its parameters from data. Variational inference is closely related to the optimal compression size, as stated by Hinton's bits-back argument: the evidence lower bound, the objective optimized by variational inference, corresponds to a lower bound on the optimal compression size of the average datapoint. However, current compression methods rely on variational inference merely as a heuristic, and they do not approach its postulated efficiency. In this thesis, we present principled and practical algorithms that get closer to this limit. After discussing our approach, we demonstrate its efficacy in image compression and model compression.
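The link between the evidence lower bound and code length can be checked numerically on a toy model. The sketch below is not taken from the thesis; it assumes a one-dimensional linear-Gaussian model (z ~ N(0, 1), x | z ~ N(z, 1)) chosen because its log evidence has a closed form, and the function names are hypothetical. The negative ELBO, converted to bits, is read here as a code length per datapoint up to the discretization of x.

```python
import math


def log_evidence(x):
    """Exact log p(x) for z ~ N(0, 1), x | z ~ N(z, 1), so x ~ N(0, 2)."""
    return -0.5 * math.log(2 * math.pi * 2.0) - x * x / 4.0


def elbo(x, m, s2):
    """ELBO for the approximate posterior q(z | x) = N(m, s2)."""
    # Expected log-likelihood E_q[log N(x; z, 1)].
    expected_ll = -0.5 * math.log(2 * math.pi) - 0.5 * ((x - m) ** 2 + s2)
    # KL(N(m, s2) || N(0, 1)) in closed form.
    kl = 0.5 * (s2 + m * m - 1.0 - math.log(s2))
    return expected_ll - kl


def nats_to_bits(nats):
    return nats / math.log(2.0)


x = 1.3
ideal_bits = nats_to_bits(-log_evidence(x))      # code length under the true model
exact_q_bits = nats_to_bits(-elbo(x, x / 2.0, 0.5))  # q equals the true posterior
crude_q_bits = nats_to_bits(-elbo(x, 0.0, 1.0))      # q equals the prior
```

With the exact posterior N(x/2, 1/2) the ELBO equals the log evidence, so the two code lengths coincide; with a cruder q the ELBO-based code length is longer by exactly the KL divergence between q and the true posterior.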
First, we focus on image compression, where we use a variational autoencoder to learn a mapping between the images and their unobserved, latent representations. We propose a stochastic coding scheme to encode the latent representation, from which the original image can be approximately reconstructed. Next, we look at the compression of deep learning models. We use variational inference to approximate the posterior distribution of the weights in a neural network, and apply our stochastic coding scheme to encode a weight configuration. Finally, we investigate a connection between variational inference and our compression algorithm. We show that a technique we used for compression can improve variational inference by generating samples from a highly flexible posterior approximation, without significantly increasing the computational costs.