Flexible and accurate inference and learning for deep generative models
We introduce a new approach to learning in hierarchical latent-variable
generative models called the "distributed distributional code Helmholtz
machine", which emphasises flexibility and accuracy in the inferential process.
In common with the original Helmholtz machine and later variational autoencoder
algorithms (but unlike adversarial methods) our approach learns an explicit
inference or "recognition" model to approximate the posterior distribution over
the latent variables. Unlike in these earlier methods, the posterior
representation is not limited to a narrow tractable parameterised form (nor is
it represented by samples). To train the generative and recognition models we
develop an extended wake-sleep algorithm inspired by the original Helmholtz
Machine. This makes it possible to learn hierarchical latent models with both
discrete and continuous variables, where an accurate posterior representation
is essential. We demonstrate that the new algorithm outperforms current
state-of-the-art methods on synthetic, natural image patch, and MNIST data
sets.
Unbiased estimators for the variance of MMD estimators
The maximum mean discrepancy (MMD) is a kernel-based distance between
probability distributions useful in many applications (Gretton et al. 2012),
bearing a simple estimator with pleasing computational and statistical
properties. Being able to efficiently estimate the variance of this estimator
is very helpful to various problems in two-sample testing. Towards this end,
Bounliphone et al. (2016) used the theory of U-statistics to derive estimators
for the variance of an MMD estimator, and differences between two such
estimators. Their estimator, however, drops lower-order terms, and is
unnecessarily biased. We show in this note - extending and correcting work of
Sutherland et al. (2017) - that we can find a truly unbiased estimator for the
actual variance of both the squared MMD estimator and the difference of two
correlated squared MMD estimators, at essentially no additional computational
cost.
Comment: Fixes and extends the appendices of arXiv:1611.04488 and
arXiv:1511.0458
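The object whose variance is at stake above is the standard unbiased U-statistic estimator of the squared MMD from Gretton et al. (2012). A minimal NumPy sketch, assuming a Gaussian (RBF) kernel; the function names here are illustrative, not the note's actual notation:

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian kernel
    # k(u, v) = exp(-gamma * ||u - v||^2).
    d2 = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * d2)

def mmd2_unbiased(x, y, gamma=1.0):
    # Unbiased U-statistic estimate of the squared MMD between the
    # samples x (m points) and y (n points).
    m, n = len(x), len(y)
    kxx = rbf_kernel(x, x, gamma)
    kyy = rbf_kernel(y, y, gamma)
    kxy = rbf_kernel(x, y, gamma)
    # Excluding the diagonal (i = j) terms is what makes the within-sample
    # averages unbiased.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()
```

For two samples from the same distribution the estimate hovers near zero (and can be slightly negative, since it is unbiased rather than nonnegative); under a clear mean shift it is well separated from zero.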
Informative Features for Model Comparison
Given two candidate models, and a set of target observations, we address the
problem of measuring the relative goodness of fit of the two models. We propose
two new statistical tests which are nonparametric, computationally efficient
(runtime complexity is linear in the sample size), and interpretable. As a
unique advantage, our tests can produce a set of examples (informative
features) indicating the regions in the data domain where one model fits
significantly better than the other. In a real-world problem of comparing GAN
models, the test power of our new test matches that of the state-of-the-art
test of relative goodness of fit, while being one order of magnitude faster.
Comment: Accepted to NIPS 201
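The tests proposed here use feature-based statistics, but the basic trick behind linear runtime in the sample size is shared with the streaming MMD estimator of Gretton et al. (2012): average a kernel statistic over disjoint pairs of points rather than over all pairs. A minimal NumPy sketch of that general idea, not of this paper's actual test statistic:

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    # Row-wise Gaussian kernel between matched points of a and b.
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

def mmd2_linear(x, y, gamma=1.0):
    # Linear-time MMD^2 estimate: pair up consecutive samples, so each
    # sample enters O(1) kernel evaluations instead of O(n).
    n = (min(len(x), len(y)) // 2) * 2
    x1, x2 = x[0:n:2], x[1:n:2]
    y1, y2 = y[0:n:2], y[1:n:2]
    h = (rbf(x1, x2, gamma) + rbf(y1, y2, gamma)
         - rbf(x1, y2, gamma) - rbf(x2, y1, gamma))
    return h.mean()
```

The price of linear runtime is higher variance than the full U-statistic, which is why such tests typically need more samples to match the power of quadratic-time competitors.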
Evaluating Disentanglement in Generative Models Without Knowledge of Latent Factors
Probabilistic generative models provide a flexible and systematic framework
for learning the underlying geometry of data. However, model selection in this
setting is challenging, particularly when selecting for ill-defined qualities
such as disentanglement or interpretability. In this work, we address this gap
by introducing a method for ranking generative models based on the training
dynamics exhibited during learning. Inspired by recent theoretical
characterizations of disentanglement, our method does not require supervision
of the underlying latent factors. We evaluate our approach by demonstrating the
need for disentanglement metrics which do not require labels, i.e., the
underlying generative factors. We additionally demonstrate that our approach
correlates with baseline supervised methods for evaluating disentanglement.
Finally, we show that our method can be used as an unsupervised indicator for
downstream performance on reinforcement learning and fairness-classification
problems.