Flexible and accurate inference and learning for deep generative models
We introduce a new approach to learning in hierarchical latent-variable
generative models called the "distributed distributional code Helmholtz
machine", which emphasises flexibility and accuracy in the inferential process.
In common with the original Helmholtz machine and later variational autoencoder
algorithms (but unlike adversarial methods) our approach learns an explicit
inference or "recognition" model to approximate the posterior distribution over
the latent variables. Unlike in these earlier methods, the posterior
representation is not limited to a narrow tractable parameterised form (nor is
it represented by samples). To train the generative and recognition models we
develop an extended wake-sleep algorithm inspired by the original Helmholtz
Machine. This makes it possible to learn hierarchical latent models with both
discrete and continuous variables, where an accurate posterior representation
is essential. We demonstrate that the new algorithm outperforms current
state-of-the-art methods on synthetic, natural image patch, and MNIST data
sets.
Unbiased estimators for the variance of MMD estimators
The maximum mean discrepancy (MMD) is a kernel-based distance between
probability distributions useful in many applications (Gretton et al. 2012),
bearing a simple estimator with pleasing computational and statistical
properties. Being able to efficiently estimate the variance of this estimator
is very helpful to various problems in two-sample testing. Towards this end,
Bounliphone et al. (2016) used the theory of U-statistics to derive estimators
for the variance of an MMD estimator, and differences between two such
estimators. Their estimator, however, drops lower-order terms, and is
unnecessarily biased. We show in this note - extending and correcting work of
Sutherland et al. (2017) - that we can find a truly unbiased estimator for the
actual variance of both the squared MMD estimator and the difference of two
correlated squared MMD estimators, at essentially no additional computational
cost.
Comment: Fixes and extends the appendices of arXiv:1611.04488 and
arXiv:1511.0458
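The object whose variance is at stake above is the standard unbiased U-statistic estimator of the squared MMD from Gretton et al. (2012). A minimal NumPy sketch, assuming a Gaussian (RBF) kernel; the function names here are illustrative, not the note's actual notation:

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian kernel
    # k(u, v) = exp(-gamma * ||u - v||^2).
    d2 = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * d2)

def mmd2_unbiased(x, y, gamma=1.0):
    # Unbiased U-statistic estimate of the squared MMD between the
    # samples x (m points) and y (n points).
    m, n = len(x), len(y)
    kxx = rbf_kernel(x, x, gamma)
    kyy = rbf_kernel(y, y, gamma)
    kxy = rbf_kernel(x, y, gamma)
    # Excluding the diagonal (i = j) terms is what makes the within-sample
    # averages unbiased.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()
```

For two samples from the same distribution the estimate hovers near zero (and can be slightly negative, since it is unbiased rather than nonnegative); under a clear mean shift it is well separated from zero.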
Informative Features for Model Comparison
Given two candidate models, and a set of target observations, we address the
problem of measuring the relative goodness of fit of the two models. We propose
two new statistical tests which are nonparametric, computationally efficient
(runtime complexity is linear in the sample size), and interpretable. As a
unique advantage, our tests can produce a set of examples (informative
features) indicating the regions in the data domain where one model fits
significantly better than the other. In a real-world problem of comparing GAN
models, the test power of our new test matches that of the state-of-the-art
test of relative goodness of fit, while being one order of magnitude faster.
Comment: Accepted to NIPS 201
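The tests proposed here use feature-based statistics, but the basic trick behind linear runtime in the sample size is shared with the streaming MMD estimator of Gretton et al. (2012): average a kernel statistic over disjoint pairs of points rather than over all pairs. A minimal NumPy sketch of that general idea, not of this paper's actual test statistic:

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    # Row-wise Gaussian kernel between matched points of a and b.
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

def mmd2_linear(x, y, gamma=1.0):
    # Linear-time MMD^2 estimate: pair up consecutive samples, so each
    # sample enters O(1) kernel evaluations instead of O(n).
    n = (min(len(x), len(y)) // 2) * 2
    x1, x2 = x[0:n:2], x[1:n:2]
    y1, y2 = y[0:n:2], y[1:n:2]
    h = (rbf(x1, x2, gamma) + rbf(y1, y2, gamma)
         - rbf(x1, y2, gamma) - rbf(x2, y1, gamma))
    return h.mean()
```

The price of linear runtime is higher variance than the full U-statistic, which is why such tests typically need more samples to match the power of quadratic-time competitors.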
Evaluating Disentanglement in Generative Models Without Knowledge of Latent Factors
Probabilistic generative models provide a flexible and systematic framework
for learning the underlying geometry of data. However, model selection in this
setting is challenging, particularly when selecting for ill-defined qualities
such as disentanglement or interpretability. In this work, we address this gap
by introducing a method for ranking generative models based on the training
dynamics exhibited during learning. Inspired by recent theoretical
characterizations of disentanglement, our method does not require supervision
of the underlying latent factors. We evaluate our approach by demonstrating the
need for disentanglement metrics which do not require labels, i.e., the
underlying generative factors. We additionally demonstrate that our approach
correlates with baseline supervised methods for evaluating disentanglement.
Finally, we show that our method can be used as an unsupervised indicator for
downstream performance on reinforcement learning and fairness-classification
problems.