Extending Stan for Deep Probabilistic Programming
Stan is a popular declarative probabilistic programming language with a
high-level syntax for expressing graphical models and beyond. Stan differs by
nature from generative probabilistic programming languages like Church,
Anglican, or Pyro. This paper presents a comprehensive compilation scheme to
compile any Stan model to a generative language and proves its correctness.
This sheds a clearer light on the relative expressiveness of different kinds of
probabilistic languages and opens the door to combining their mutual strengths.
Specifically, we use our compilation scheme to build a compiler from Stan to
Pyro and extend Stan with support for explicit variational inference guides and
deep probabilistic models. That way, users familiar with Stan get access to new
features without having to learn a fundamentally new language. Overall, our
paper clarifies the relationship between declarative and generative
probabilistic programming languages and is a step towards making deep
probabilistic programming easier.
VAE with a VampPrior
Many different methods to train deep generative models have been introduced
in the past. In this paper, we propose to extend the variational auto-encoder
(VAE) framework with a new type of prior which we call "Variational Mixture of
Posteriors" prior, or VampPrior for short. The VampPrior consists of a mixture
distribution (e.g., a mixture of Gaussians) with components given by
variational posteriors conditioned on learnable pseudo-inputs. We further
extend this prior to a two-layer hierarchical model and show that this
architecture, with a coupled prior and posterior, learns significantly better
models. The model also avoids the usual local optima issues related to useless
latent dimensions that plague VAEs. We provide empirical studies on six
datasets, namely, static and binary MNIST, OMNIGLOT, Caltech 101 Silhouettes,
Frey Faces and Histopathology patches, and show that applying the hierarchical
VampPrior delivers state-of-the-art results on all datasets in the unsupervised
permutation-invariant setting and results that are the best or comparable to
SOTA methods for the approach with convolutional networks.
Comment: 16 pages, final version, AISTATS 201
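As a rough illustration of the prior described in this abstract, the VampPrior density can be written as p(z) = (1/K) Σ_k q(z | u_k), where the u_k are the learnable pseudo-inputs. The sketch below evaluates that log-density in plain Python. The `toy_encoder` function, the pseudo-input values, and all other names are hypothetical stand-ins chosen for the example, not the authors' implementation; in the paper the encoder is a trained network and the pseudo-inputs are learned.

```python
import math

def gaussian_log_density(z, mean, log_var):
    """Log density of a diagonal Gaussian evaluated at the point z."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi) + lv + (zi - mu) ** 2 / math.exp(lv))
        for zi, mu, lv in zip(z, mean, log_var)
    )

def toy_encoder(x):
    """Hypothetical stand-in for a trained encoder: (mean, log_var) of q(z | x)."""
    mean = [0.5 * x[0], -0.5 * x[1]]
    log_var = [0.1 * x[0], 0.1 * x[1]]
    return mean, log_var

def vamp_prior_log_density(z, pseudo_inputs, encoder=toy_encoder):
    """log p(z) = log((1/K) * sum_k q(z | u_k)), via a stable log-sum-exp."""
    log_q = [gaussian_log_density(z, *encoder(u)) for u in pseudo_inputs]
    m = max(log_q)
    return m + math.log(sum(math.exp(v - m) for v in log_q)) - math.log(len(log_q))

# Assumed values for illustration only; real pseudo-inputs are trained parameters.
pseudo_inputs = [[1.0, 2.0], [-1.0, 0.5], [0.0, -2.0]]
z = [0.3, -0.7]
log_p = vamp_prior_log_density(z, pseudo_inputs)
```

Because the mixture components reuse the encoder, the prior and the variational posterior share parameters, which is the coupling the abstract refers to.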
Resampled Priors for Variational Autoencoders
We propose Learned Accept/Reject Sampling (LARS), a method for constructing
richer priors using rejection sampling with a learned acceptance function. This
work is motivated by recent analyses of the VAE objective, which pointed out
that commonly used simple priors can lead to underfitting. As the distribution
induced by LARS involves an intractable normalizing constant, we show how to
estimate it and its gradients efficiently. We demonstrate that LARS priors
improve VAE performance on several standard datasets both when they are learned
jointly with the rest of the model and when they are fitted to a pretrained
model. Finally, we show that LARS can be combined with existing methods for
defining flexible priors for an additional boost in performance.
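The basic mechanism in this abstract, rejection sampling from a simple proposal with an acceptance function, can be sketched as follows. This is a minimal illustration under stated assumptions: the proposal is a standard normal, `acceptance` is a fixed hypothetical function rather than a learned network, and the normalizing constant Z = E[a(z)] is estimated by plain Monte Carlo. It does not reproduce the estimators or gradient computations of the paper.

```python
import math
import random

def acceptance(z):
    """Hypothetical acceptance function a(z) in (0, 1); LARS would learn this."""
    return 1.0 / (1.0 + math.exp(-4.0 * z))  # favours positive z

def sample_resampled_prior(rng, max_tries=10_000):
    """Draw z from the proposal N(0, 1) and keep it with probability a(z)."""
    for _ in range(max_tries):
        z = rng.gauss(0.0, 1.0)
        if rng.random() < acceptance(z):
            return z
    raise RuntimeError("no proposal accepted within max_tries")

def estimate_normalizer(rng, num_samples=20_000):
    """Monte Carlo estimate of Z = E_proposal[a(z)]."""
    total = sum(acceptance(rng.gauss(0.0, 1.0)) for _ in range(num_samples))
    return total / num_samples

def log_density(z, normalizer):
    """log p(z) = log proposal(z) + log a(z) - log Z for the resampled prior."""
    log_proposal = -0.5 * (math.log(2.0 * math.pi) + z * z)
    return log_proposal + math.log(acceptance(z)) - math.log(normalizer)

rng = random.Random(0)
normalizer = estimate_normalizer(rng)
samples = [sample_resampled_prior(rng) for _ in range(2_000)]
```

With this particular acceptance function, a(z) + a(-z) = 1 and the proposal is symmetric, so the exact normalizer is 0.5, which gives a simple check on the Monte Carlo estimate.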
Normalizing Flow with Variational Latent Representation
Normalizing flow (NF) has gained popularity over traditional maximum
likelihood based methods due to its strong capability to model complex data
distributions. However, the standard approach, which maps the observed data to
a normal distribution, has difficulty in handling data distributions with
multiple relatively isolated modes. To overcome this issue, we propose a new
framework based on variational latent representation to improve the practical
performance of NF. The idea is to replace the standard normal latent variable
with a more general latent representation, jointly learned via Variational
Bayes. For example, by taking the latent representation as a discrete sequence,
our framework can learn a Transformer model that generates the latent sequence
and an NF model that generates continuous data distribution conditioned on the
sequence. The resulting method is significantly more powerful than the standard
normalizing flow approach for generating data distributions with multiple
modes. Extensive experiments have shown the advantages of NF with variational
latent representation.
Comment: 24 pages, 7 figures