Revisiting Precision and Recall Definition for Generative Model Evaluation
In this article we revisit the definition of Precision-Recall (PR) curves for
generative models proposed by Sajjadi et al. (arXiv:1806.00035). Rather than
providing a scalar for generative quality, PR curves distinguish mode collapse
(poor recall) from poor sample quality (poor precision). We first generalize their
formulation to arbitrary measures, hence removing any restriction to finite
support. We also expose a bridge between PR curves and type I and type II error
rates of likelihood ratio classifiers on the task of discriminating between
samples of the two distributions. Building upon this new perspective, we
propose a novel algorithm to approximate precision-recall curves that shares
some interesting methodological properties with the hypothesis testing
technique from Lopez-Paz et al. (arXiv:1610.06545). We demonstrate the advantages
of the proposed formulation over the original approach on controlled
multi-modal datasets.

Comment: ICML 2019
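As a rough illustration of the threshold view discussed above, the sketch below computes precision-recall pairs for two discrete distributions using the finite-support parameterisation of Sajjadi et al. that this article generalizes; the function name, the toy distributions, and the tan-spaced threshold grid are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pr_curve(p, q, n_thresholds=1001):
    """Precision-recall pairs (alpha(lambda), beta(lambda)) for a target
    distribution p and a model distribution q on a shared finite support:
        alpha(lambda) = sum_i min(lambda * p_i, q_i)   (precision)
        beta(lambda)  = sum_i min(p_i, q_i / lambda)   (recall)
    with lambda swept over (0, infinity)."""
    p = np.asarray(p, dtype=float); p = p / p.sum()
    q = np.asarray(q, dtype=float); q = q / q.sum()
    lambdas = np.tan(np.linspace(1e-6, np.pi / 2 - 1e-6, n_thresholds))
    precision = np.array([np.minimum(lam * p, q).sum() for lam in lambdas])
    recall = np.array([np.minimum(p, q / lam).sum() for lam in lambdas])
    return precision, recall

# Toy example: a generator collapsed onto one of two equally likely modes
# keeps precision high but caps recall at roughly 0.5.
target = [0.5, 0.5]
collapsed = [1.0, 0.0]
prec, rec = pr_curve(target, collapsed)
print(prec.max(), rec.max())   # approximately 1.0 and 0.5
```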
Flexible and accurate inference and learning for deep generative models
We introduce a new approach to learning in hierarchical latent-variable
generative models called the "distributed distributional code Helmholtz
machine", which emphasises flexibility and accuracy in the inferential process.
In common with the original Helmholtz machine and later variational autoencoder
algorithms (but unlike adversarial methods), our approach learns an explicit
inference or "recognition" model to approximate the posterior distribution over
the latent variables. Unlike in these earlier methods, the posterior
representation is not limited to a narrow tractable parameterised form (nor is
it represented by samples). To train the generative and recognition models we
develop an extended wake-sleep algorithm inspired by the original Helmholtz
Machine. This makes it possible to learn hierarchical latent models with both
discrete and continuous variables, where an accurate posterior representation
is essential. We demonstrate that the new algorithm outperforms current
state-of-the-art methods on synthetic, natural image patch, and MNIST data
sets.
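For orientation, here is a minimal sketch of the classic two-phase wake-sleep procedure that the extended algorithm above builds on, with a single layer of binary latent variables and Bernoulli visible units; the dimensions, learning rate, and parameterisation are illustrative assumptions, and this is not the distributed distributional code variant the paper describes.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

D, H, lr = 8, 4, 0.05                 # visible size, latent size, step size (illustrative)
W_gen = rng.normal(0, 0.1, (H, D))    # generative weights for p(x | z)
b_gen = np.zeros(D)
W_rec = rng.normal(0, 0.1, (D, H))    # recognition weights for q(z | x)
b_rec = np.zeros(H)
prior = np.zeros(H)                   # logits of the prior p(z)

def wake_sleep_step(x):
    global W_gen, b_gen, W_rec, b_rec, prior
    # Wake phase: infer z ~ q(z | x) on real data, fit the generative model to (x, z).
    qz = sigmoid(x @ W_rec + b_rec)
    z = (rng.random(H) < qz).astype(float)
    px = sigmoid(z @ W_gen + b_gen)
    W_gen += lr * np.outer(z, x - px)          # gradient of log p(x | z)
    b_gen += lr * (x - px)
    prior += lr * (z - sigmoid(prior))          # gradient of log p(z)
    # Sleep phase: dream (z, x) ~ p, fit the recognition model to the dream.
    z_d = (rng.random(H) < sigmoid(prior)).astype(float)
    x_d = (rng.random(D) < sigmoid(z_d @ W_gen + b_gen)).astype(float)
    qz_d = sigmoid(x_d @ W_rec + b_rec)
    W_rec += lr * np.outer(x_d, z_d - qz_d)     # gradient of log q(z | x)
    b_rec += lr * (z_d - qz_d)

for _ in range(1000):                           # smoke test on random binary data
    wake_sleep_step((rng.random(D) < 0.5).astype(float))
```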
Informative Features for Model Comparison
Given two candidate models and a set of target observations, we address the
problem of measuring the relative goodness of fit of the two models. We propose
two new statistical tests which are nonparametric, computationally efficient
(runtime complexity is linear in the sample size), and interpretable. As a
unique advantage, our tests can produce a set of examples (informative
features) indicating the regions in the data domain where one model fits
significantly better than the other. In a real-world problem of comparing GAN
models, the power of our new test matches that of the state-of-the-art test of
relative goodness of fit, while being one order of magnitude faster.

Comment: Accepted to NIPS 2018
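To make the linear-time, nonparametric flavour concrete, the sketch below compares two candidate sample sets against reference data with a linear-time kernel MMD estimate (in the spirit of Gretton et al.); this is a stand-in baseline rather than the paper's proposed tests, it does not produce the informative features described above, and the kernel bandwidth, sample sizes, and toy distributions are illustrative assumptions.

```python
import numpy as np

def gauss_k(a, b, sigma=1.0):
    # Gaussian kernel evaluated pairwise on matched rows of a and b.
    d2 = ((a - b) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def linear_time_mmd2(x, y, sigma=1.0):
    """O(n) MMD^2 estimate between samples x and y (n x d arrays),
    averaging the kernel statistic over disjoint pairs."""
    n = (min(len(x), len(y)) // 2) * 2
    x1, x2 = x[0:n:2], x[1:n:2]
    y1, y2 = y[0:n:2], y[1:n:2]
    h = (gauss_k(x1, x2, sigma) + gauss_k(y1, y2, sigma)
         - gauss_k(x1, y2, sigma) - gauss_k(x2, y1, sigma))
    return h.mean()

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, (2000, 2))       # reference observations
model_p = rng.normal(0.1, 1.0, (2000, 2))    # candidate model P (close to the data)
model_q = rng.normal(1.0, 1.0, (2000, 2))    # candidate model Q (further away)

# Relative fit: P is preferred when MMD^2(P, data) - MMD^2(Q, data) < 0.
diff = linear_time_mmd2(model_p, data) - linear_time_mmd2(model_q, data)
print("P fits better" if diff < 0 else "Q fits better", diff)
```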