NorMatch: Matching Normalizing Flows with Discriminative Classifiers for Semi-Supervised Learning
Semi-Supervised Learning (SSL) aims to learn a model from a tiny labeled set
and massive amounts of unlabeled data. To better exploit the unlabeled data, the
latest SSL methods use pseudo-labels predicted by a single discriminative
classifier. However, the generated pseudo-labels inevitably suffer from
confirmation bias and noise, which greatly affect model performance. In this
work we introduce NorMatch, a new framework for SSL.
Firstly, we introduce a new uncertainty estimation scheme based on normalizing
flows, used as an auxiliary classifier, to enforce highly certain pseudo-labels
and thereby boost the discriminative classifier. Secondly, we introduce a
threshold-free sample weighting strategy to better exploit both high- and
low-confidence pseudo-labels. Furthermore, we utilize normalizing flows to
model, in an unsupervised fashion, the distribution of the unlabeled data. This
modelling can further improve the performance of the generative classifier via
unlabeled data, and thus implicitly contributes to training a better
discriminative classifier. We demonstrate, through numerical and visual
results, that NorMatch achieves state-of-the-art performance on several
datasets.
Comment: Accepted to Transactions on Machine Learning Research
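The abstract gives no implementation details; the following Python sketch
illustrates one way the threshold-free, flow-based weighting it describes
could look. The flow_log_probs interface (per-class log-densities from the
auxiliary normalizing flow) and all names are hypothetical, not NorMatch's
actual code.

    import torch
    import torch.nn.functional as F

    def flow_weighted_pseudo_label_loss(logits_weak, logits_strong, flow_log_probs):
        # logits_weak / logits_strong: discriminative-classifier logits on weakly
        # and strongly augmented views of an unlabeled batch, shape [B, C].
        # flow_log_probs: hypothetical per-class log-densities log p(x | y)
        # from the auxiliary normalizing flow, shape [B, C].
        with torch.no_grad():
            pseudo = logits_weak.argmax(dim=-1)              # hard pseudo-labels
            # Certainty of the generative classifier in each pseudo-label,
            # used as a soft per-sample weight instead of a fixed threshold.
            flow_posterior = flow_log_probs.softmax(dim=-1)  # assumes a uniform class prior
            weights = flow_posterior.gather(1, pseudo.unsqueeze(1)).squeeze(1)
        per_sample = F.cross_entropy(logits_strong, pseudo, reduction="none")
        return (weights * per_sample).mean()

Because every unlabeled sample contributes with a weight in (0, 1),
low-confidence pseudo-labels are down-weighted rather than discarded.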
Auxiliary Deep Generative Models
Deep generative models parameterized by neural networks have recently
achieved state-of-the-art performance in unsupervised and semi-supervised
learning. We extend deep generative models with auxiliary variables, which
improve the variational approximation. The auxiliary variables leave the
generative model unchanged but make the variational distribution more
expressive. Inspired by the structure of the auxiliary variable, we also
propose a model with two stochastic layers and skip connections. Our findings
suggest that more expressive and properly specified deep generative models
converge faster and to better results. We show state-of-the-art performance within
semi-supervised learning on the MNIST, SVHN and NORB datasets.
Comment: Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016, JMLR: Workshop and Conference Proceedings volume 48
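In the unsupervised case, the auxiliary construction described above can be
written out as follows; this is a sketch consistent with the abstract, with
$a$ the auxiliary variable and $z$ the latent. The generative model is
extended to $p(x,z,a) = p(a \mid x,z)\,p(x \mid z)\,p(z)$, so marginalizing
$a$ leaves $p(x,z)$ unchanged, while the variational posterior factorizes as
$q(a,z \mid x) = q(a \mid x)\,q(z \mid a,x)$, giving the bound

$$\log p(x) \;\geq\; \mathbb{E}_{q(a \mid x)\, q(z \mid a,x)} \left[ \log \frac{p(a \mid x,z)\, p(x \mid z)\, p(z)}{q(a \mid x)\, q(z \mid a,x)} \right].$$

The extra expressiveness comes from $q(z \mid a,x)$: marginally, $q(z \mid x)$
becomes a mixture over $a$ rather than a single factorized Gaussian.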
Hybrid Models with Deep and Invertible Features
We propose a neural hybrid model consisting of a linear model defined on a
set of features computed by a deep, invertible transformation (i.e. a
normalizing flow). An attractive property of our model is that both
p(features), the density of the features, and p(targets | features), the
predictive distribution, can be computed exactly in a single feed-forward pass.
We show that our hybrid model, despite the invertibility constraints, achieves
accuracy similar to that of purely predictive models. Moreover, the generative
component remains a good model of the input features despite the hybrid
optimization objective. This offers additional capabilities such as
out-of-distribution detection and semi-supervised learning. The
availability of the exact joint density p(targets, features) also allows us to
compute many quantities readily, making our hybrid model a useful building
block for downstream applications of probabilistic deep learning.
Comment: ICML 2019
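As a concrete illustration of the single-pass computation claimed above, here
is a minimal Python sketch. It assumes a flow object that returns the features
and the log-determinant of its Jacobian; this interface and all names are
hypothetical, not the paper's released code.

    import math
    import torch
    import torch.nn as nn

    class HybridSketch(nn.Module):
        def __init__(self, flow, num_features, num_classes):
            super().__init__()
            self.flow = flow                  # assumed interface: x -> (z, log|det J|)
            self.head = nn.Linear(num_features, num_classes)

        def forward(self, x):
            z, logdet = self.flow(x)          # one feed-forward pass
            # log p(x) by change of variables under a standard-normal base density
            d = z.size(1)
            log_pz = -0.5 * (z ** 2).sum(dim=1) - 0.5 * d * math.log(2 * math.pi)
            log_px = log_pz + logdet
            # predictive distribution p(targets | features) from the linear head
            log_py = self.head(z).log_softmax(dim=-1)
            # exact joint over targets and inputs: log p(y, x) = log p(y | z) + log p(x)
            return log_py, log_px

An out-of-distribution score then falls out for free: inputs with low log_px
can be flagged before the prediction log_py is trusted.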
Semi-Supervised Generation with Cluster-aware Generative Models
Deep generative models trained with large amounts of unlabelled data have
proven to be powerful within the domain of unsupervised learning. Many
real-life data sets contain a small number of labelled data points, which are
typically disregarded when training generative models. We propose the
Cluster-aware Generative Model, that uses unlabelled information to infer a
latent representation that models the natural clustering of the data, and
additional labelled data points to refine this clustering. The generative
performance of the model improves significantly when labelled information is
exploited, obtaining a log-likelihood of -79.38 nats on permutation-invariant
MNIST, while also achieving competitive semi-supervised classification
accuracies. The model can also be trained fully unsupervised, and still
improves the log-likelihood with respect to related methods.
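One way to read the clustering assumption above is as a mixture prior over the
latent space; the following generic formulation is consistent with the
abstract, not a transcription of the paper's exact architecture:

$$p(x) \;=\; \sum_{y} p(y) \int p(x \mid z)\, p(z \mid y)\, dz,$$

where $y$ indexes clusters: labelled points fix $y$ to their class, and
unlabelled points marginalize over it, so a handful of labels is enough to
align the inferred clusters with the class structure.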
- …