Variational Sequential Monte Carlo
Many recent advances in large scale probabilistic inference rely on
variational methods. The success of variational approaches depends on (i)
formulating a flexible parametric family of distributions, and (ii) optimizing
the parameters to find the member of this family that most closely approximates
the exact posterior. In this paper we present a new approximating family of
distributions, the variational sequential Monte Carlo (VSMC) family, and show
how to optimize it in variational inference. VSMC melds variational inference
(VI) and sequential Monte Carlo (SMC), providing practitioners with flexible,
accurate, and powerful Bayesian inference. The VSMC family can approximate the
posterior arbitrarily well while still allowing efficient optimization of its
parameters. We demonstrate its
utility on state space models, stochastic volatility models for financial data,
and deep Markov models of brain neural circuits.
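
To make the objective concrete, here is a minimal NumPy sketch of the quantity VSMC optimizes: the log of the normalizing-constant estimate produced by a particle filter with a parameterized proposal, whose expectation lower-bounds the log marginal likelihood. The toy linear-Gaussian model, the proposal form, and all names below are illustrative assumptions, not the paper's reference implementation.

```python
# Minimal sketch of the VSMC objective on a toy 1-D linear-Gaussian state
# space model.  Model, proposal parameterization, and names are assumptions.
import numpy as np

def norm_logpdf(v, mean, std):
    return -0.5 * np.log(2 * np.pi * std**2) - 0.5 * ((v - mean) / std) ** 2

def logsumexp(a):
    m = a.max()
    return m + np.log(np.exp(a - m).sum())

def vsmc_objective(x, lam, num_particles=100, rng=None):
    """Run a particle filter with a Gaussian proposal parameterized by
    lam = (shift, log_scale) and return log Z_hat; E[log Z_hat] is the
    VSMC lower bound on log p(x) that is maximized over lam."""
    rng = np.random.default_rng() if rng is None else rng
    shift, log_scale = lam
    prop_std = np.exp(log_scale)
    trans_std, obs_std = 1.0, 0.5          # fixed toy model parameters

    z = np.zeros(num_particles)            # particles for z_{t-1}
    log_z_hat = 0.0
    for x_t in x:
        prior_mean = 0.9 * z               # transition: z_t ~ N(0.9 z_{t-1}, 1)
        prop_mean = prior_mean + shift * (x_t - prior_mean)
        z = rng.normal(prop_mean, prop_std, size=num_particles)
        # Importance weights: p(z_t | z_{t-1}) p(x_t | z_t) / q(z_t | z_{t-1}, x_t)
        log_w = (norm_logpdf(z, prior_mean, trans_std)
                 + norm_logpdf(x_t, z, obs_std)
                 - norm_logpdf(z, prop_mean, prop_std))
        log_z_hat += logsumexp(log_w) - np.log(num_particles)
        # Multinomial resampling before the next step
        w = np.exp(log_w - logsumexp(log_w))
        z = z[rng.choice(num_particles, size=num_particles, p=w)]
    return log_z_hat

# Example: evaluate the bound for one proposal setting on simulated data.
x_obs = np.random.default_rng(1).normal(size=25)
print(vsmc_objective(x_obs, lam=(0.5, np.log(0.8))))
```

In practice the bound would be maximized over the proposal parameters with stochastic gradients (e.g., by reparameterizing the proposal samples); this sketch only evaluates the objective for a fixed setting.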
Learning with MISELBO: The Mixture Cookbook
Mixture models in variational inference (VI) are an active area of research.
Recent works have established their connection to multiple importance sampling
(MIS) through the MISELBO and advanced the use of ensemble approximations for
large-scale problems. However, as we show here, learning the ensemble
components independently can lead to suboptimal diversity. Hence, we study the
effect of instead using MISELBO as an objective function for learning mixtures,
and we propose the first ever mixture of variational approximations for a
normalizing flow-based hierarchical variational autoencoder (VAE) with
VampPrior and a PixelCNN decoder network. Two major insights led to the
construction of this novel composite model. First, mixture models have the
potential to be off-the-shelf tools for practitioners to obtain more flexible
posterior approximations in VAEs. Therefore, we make them more accessible by
demonstrating how to apply them to four popular architectures. Second, when
MISELBO is the objective function, the mixture components cooperate to cover
the target distribution while maximizing their diversity. We
explain this cooperative behavior by drawing a novel connection between VI and
adaptive importance sampling. Finally, we demonstrate the superiority of the
Mixture VAEs' learned feature representations on both image and single-cell
transcriptome data, and obtain state-of-the-art results among VAE architectures
in terms of negative log-likelihood on the MNIST and FashionMNIST datasets.
Code available here: https://github.com/Lagergren-Lab/MixtureVAEs
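
For reference, with a uniform mixture of S variational components q_s(z|x), the multiple-importance-sampling bound referred to above takes the following form (a sketch in our own notation, not copied from the paper):

```latex
% MISELBO for a uniform mixture q(z|x) = (1/S) \sum_s q_s(z|x); the shared
% mixture density in the denominator is what couples the components.
\mathcal{L}_{\mathrm{MIS}}(x)
  = \frac{1}{S}\sum_{s=1}^{S}
    \mathbb{E}_{q_s(z \mid x)}\!\left[
      \log \frac{p(x, z)}{\frac{1}{S}\sum_{j=1}^{S} q_j(z \mid x)}
    \right]
  \;\le\; \log p(x)
```

Because each sample is reweighted by the full mixture density rather than by its own component alone, overlapping components lower the bound, which is one way to read the cooperative, diversity-seeking behavior the abstract describes.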
Adaptive Density Estimation for Generative Models
Unsupervised learning of generative models has seen tremendous progress over
recent years, in particular due to generative adversarial networks (GANs),
variational autoencoders, and flow-based models. GANs have dramatically
improved sample quality, but suffer from two drawbacks: (i) they mode-drop,
i.e., they do not cover the full support of the training data, and (ii) they do not
allow for likelihood evaluations on held-out data. In contrast,
likelihood-based training encourages models to cover the full support of the
training data, but yields poorer samples. These mutual shortcomings can in
principle be addressed by training generative latent variable models in a
hybrid adversarial-likelihood manner. However, we show that commonly made
parametric assumptions create a conflict between the two objectives, making
successful hybrid models non-trivial. As a solution, we propose to use deep invertible
transformations in the latent variable decoder. This approach allows for
likelihood computations in image space, is more efficient than fully invertible
models, and can take full advantage of adversarial training. We show that our
model significantly improves over existing hybrid models: offering GAN-like
samples, Inception Score (IS) and Fréchet Inception Distance (FID) scores that
are competitive with fully adversarial models, and improved likelihood scores.
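
To illustrate the mechanism the abstract describes, the sketch below shows how an invertible head on the decoder makes image-space likelihoods available via the change-of-variables formula, and how that likelihood term can be combined with an adversarial term on generated samples. The coupling layer, the Gaussian feature-space decoder, the loss weights, and all names are illustrative assumptions rather than the paper's architecture.

```python
# Sketch: hybrid adversarial-likelihood training with an invertible decoder
# head.  Everything here (dimensions, networks, weights) is a toy assumption.
import numpy as np

D = 8                                   # toy "image" dimensionality
rng = np.random.default_rng(0)
W_s = 0.1 * rng.standard_normal((D // 2, D // 2))
W_t = 0.1 * rng.standard_normal((D // 2, D // 2))
s_net = lambda h: np.tanh(h @ W_s)      # scale network of the coupling layer
t_net = lambda h: h @ W_t               # shift network of the coupling layer

def coupling_forward(u):
    """Invertible affine coupling: x = [u1, u2 * exp(s(u1)) + t(u1)]."""
    u1, u2 = u[: D // 2], u[D // 2 :]
    return np.concatenate([u1, u2 * np.exp(s_net(u1)) + t_net(u1)])

def coupling_inverse(x):
    """Inverse map and log|det df^{-1}/dx| for the change of variables."""
    x1, x2 = x[: D // 2], x[D // 2 :]
    scale = s_net(x1)
    u = np.concatenate([x1, (x2 - t_net(x1)) * np.exp(-scale)])
    return u, -scale.sum()

def log_likelihood(x, z, decoder_mean):
    """log p(x|z) = log N(f^{-1}(x); decoder_mean(z), I) + log|det df^{-1}/dx|,
    i.e. a likelihood in image space despite a non-invertible decoder body."""
    u, log_det = coupling_inverse(x)
    mu = decoder_mean(z)
    return -0.5 * np.sum((u - mu) ** 2) - 0.5 * D * np.log(2 * np.pi) + log_det

def hybrid_loss(x, z_post, decoder_mean, discriminator, nll_w=1.0, adv_w=1.0):
    """Combine the negative likelihood of real data with a non-saturating
    adversarial loss on a freshly generated sample (KL term omitted here)."""
    nll = -log_likelihood(x, z_post, decoder_mean)
    z_prior = rng.standard_normal(D)                 # sample from the prior
    u_fake = decoder_mean(z_prior) + rng.standard_normal(D)
    x_fake = coupling_forward(u_fake)
    adv = -np.log(discriminator(x_fake) + 1e-8)
    return nll_w * nll + adv_w * adv

# Toy usage with a linear decoder mean and a logistic "discriminator".
decoder_mean = lambda z: 0.5 * z
discriminator = lambda x: 1.0 / (1.0 + np.exp(-x.sum()))
print(hybrid_loss(rng.standard_normal(D), rng.standard_normal(D),
                  decoder_mean, discriminator))
```

The key design point is that only the final map needs to be invertible: the decoder body can stay an arbitrary network, while the coupling head supplies an exact image-space density for the likelihood term and still lets generated samples be scored adversarially.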
- …