231,067 research outputs found
Boosted Generative Models
We propose a novel approach for using unsupervised boosting to create an
ensemble of generative models, where models are trained in sequence to correct
earlier mistakes. Our meta-algorithmic framework can leverage any existing base
learner that permits likelihood evaluation, including recent deep expressive
models. Further, our approach allows the ensemble to include discriminative
models trained to distinguish real data from model-generated data. We show
theoretical conditions under which incorporating a new model in the ensemble
will improve the fit and empirically demonstrate the effectiveness of our
black-box boosting algorithms on density estimation, classification, and sample
generation on benchmark datasets for a wide range of generative models.Comment: AAAI 201
Auxiliary Deep Generative Models
Deep generative models parameterized by neural networks have recently
achieved state-of-the-art performance in unsupervised and semi-supervised
learning. We extend deep generative models with auxiliary variables which
improves the variational approximation. The auxiliary variables leave the
generative model unchanged but make the variational distribution more
expressive. Inspired by the structure of the auxiliary variable we also propose
a model with two stochastic layers and skip connections. Our findings suggest
that more expressive and properly specified deep generative models converge
faster with better results. We show state-of-the-art performance within
semi-supervised learning on MNIST, SVHN and NORB datasets.Comment: Proceedings of the 33rd International Conference on Machine Learning,
New York, NY, USA, 2016, JMLR: Workshop and Conference Proceedings volume 48,
Proceedings of the 33rd International Conference on Machine Learning, New
York, NY, USA, 201
Sliced Wasserstein Generative Models
In generative modeling, the Wasserstein distance (WD) has emerged as a useful
metric to measure the discrepancy between generated and real data
distributions. Unfortunately, it is challenging to approximate the WD of
high-dimensional distributions. In contrast, the sliced Wasserstein distance
(SWD) factorizes high-dimensional distributions into their multiple
one-dimensional marginal distributions and is thus easier to approximate. In
this paper, we introduce novel approximations of the primal and dual SWD.
Instead of using a large number of random projections, as it is done by
conventional SWD approximation methods, we propose to approximate SWDs with a
small number of parameterized orthogonal projections in an end-to-end deep
learning fashion. As concrete applications of our SWD approximations, we design
two types of differentiable SWD blocks to equip modern generative
frameworks---Auto-Encoders (AE) and Generative Adversarial Networks (GAN). In
the experiments, we not only show the superiority of the proposed generative
models on standard image synthesis benchmarks, but also demonstrate the
state-of-the-art performance on challenging high resolution image and video
generation in an unsupervised manner.Comment: This paper is accepted by CVPR 2019, accidentally uploaded as a new
submission (arXiv:1904.05408, which has been withdrawn). The code is
available at this https URL https:// github.com/musikisomorphie/swd.gi
Generative and Discriminative Text Classification with Recurrent Neural Networks
We empirically characterize the performance of discriminative and generative
LSTM models for text classification. We find that although RNN-based generative
models are more powerful than their bag-of-words ancestors (e.g., they account
for conditional dependencies across words in a document), they have higher
asymptotic error rates than discriminatively trained RNN models. However we
also find that generative models approach their asymptotic error rate more
rapidly than their discriminative counterparts---the same pattern that Ng &
Jordan (2001) proved holds for linear classification models that make more
naive conditional independence assumptions. Building on this finding, we
hypothesize that RNN-based generative classification models will be more robust
to shifts in the data distribution. This hypothesis is confirmed in a series of
experiments in zero-shot and continual learning settings that show that
generative models substantially outperform discriminative models
- …