
    Unsupervised feature learning with discriminative encoder

    In recent years, deep discriminative models have achieved extraordinary performance on supervised learning tasks, significantly outperforming their generative counterparts. However, their success relies on the presence of a large amount of labeled data. How can one use the same discriminative models for learning useful features in the absence of labels? We address this question in this paper by jointly modeling the distribution of data and latent features in a manner that explicitly assigns zero probability to unobserved data. Rather than maximizing the marginal probability of observed data, we maximize the joint probability of the data and the latent features using a two-step EM-like procedure. To prevent the model from overfitting to our initial selection of latent features, we use adversarial regularization. Depending on the task, we allow the latent features to be one-hot or real-valued vectors and define a suitable prior on the features. For instance, one-hot features correspond to class labels and are used directly for unsupervised and semi-supervised classification, whereas real-valued feature vectors are fed as input to simple classifiers for auxiliary supervised discrimination tasks. The proposed model, which we dub the discriminative encoder (or DisCoder), is flexible in the type of latent features that it can capture. The proposed model achieves state-of-the-art performance on several challenging tasks.
    Comment: 10 pages, 4 figures, International Conference on Data Mining, 201
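    To make the two-step, EM-like procedure above more concrete, here is a minimal sketch in PyTorch, assuming one-hot latent features, a hard assignment in the first step, and a cross-entropy surrogate for the joint probability in the second step. It is not the authors' DisCoder implementation: the adversarial regularization and the prior on the features are omitted, and the architecture and names (Encoder, em_like_step) are hypothetical.

```python
# Hypothetical sketch of a two-step, EM-like loop for learning one-hot latent
# features with a discriminative encoder. NOT the authors' DisCoder code;
# architecture, loss surrogate, and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps inputs to logits over K one-hot latent features (pseudo-classes)."""
    def __init__(self, in_dim: int, num_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, num_features),
        )

    def forward(self, x):
        return self.net(x)

def em_like_step(encoder, optimizer, x):
    # E-like step: pick the most likely latent feature for each point under
    # the current encoder (hard, one-hot assignment).
    with torch.no_grad():
        assignments = encoder(x).argmax(dim=1)

    # M-like step: update the encoder to raise a surrogate of the joint
    # probability of the data and the frozen assignments (cross-entropy here).
    optimizer.zero_grad()
    loss = F.cross_entropy(encoder(x), assignments)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage:
encoder = Encoder(in_dim=784, num_features=10)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
x = torch.randn(128, 784)
for _ in range(5):
    em_like_step(encoder, opt, x)
```

    Without the omitted adversarial regularization, a loop like this can latch onto its initial assignments, which is the failure mode the paper's regularizer is meant to address.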

    Stabilizing Training of Generative Adversarial Networks through Regularization

    Deep generative models based on Generative Adversarial Networks (GANs) have demonstrated impressive sample quality, but in order to work they require a careful choice of architecture, parameter initialization, and hyper-parameters. This fragility is in part due to a dimensional mismatch or non-overlapping support between the model distribution and the data distribution, causing their density ratio and the associated f-divergence to be undefined. We overcome this fundamental limitation and propose a new regularization approach with low computational cost that yields a stable GAN training procedure. We demonstrate the effectiveness of this regularizer across several architectures trained on common benchmark image generation tasks. Our regularization turns GAN models into reliable building blocks for deep learning.
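    As an illustration of how a low-cost regularizer of this kind can be wired into standard GAN training, the sketch below adds a gradient-norm penalty on the discriminator to the usual non-saturating loss. It does not reproduce the exact form or weighting of the paper's regularizer; the coefficient gamma and the helper names are assumptions.

```python
# Hedged sketch: a gradient-norm penalty on the discriminator, in the spirit of
# the regularizer described above. Coefficient and names are assumptions.
import torch
import torch.nn.functional as F

def gradient_norm_penalty(discriminator, x):
    """E[||grad_x D(x)||^2] over a batch, with a graph kept for double backward."""
    x = x.clone().requires_grad_(True)
    grads = torch.autograd.grad(
        outputs=discriminator(x).sum(), inputs=x, create_graph=True
    )[0]
    return grads.pow(2).flatten(start_dim=1).sum(dim=1).mean()

def regularized_d_loss(discriminator, x_real, x_fake, gamma=2.0):
    # Standard non-saturating discriminator loss ...
    loss_real = F.softplus(-discriminator(x_real)).mean()
    loss_fake = F.softplus(discriminator(x_fake)).mean()
    # ... plus the gradient-norm penalty on real and (detached) fake samples.
    penalty = (gradient_norm_penalty(discriminator, x_real)
               + gradient_norm_penalty(discriminator, x_fake.detach()))
    return loss_real + loss_fake + 0.5 * gamma * penalty

# Toy usage: any module mapping inputs to scalar logits works as discriminator.
d = torch.nn.Sequential(torch.nn.Linear(784, 128), torch.nn.ReLU(),
                        torch.nn.Linear(128, 1))
x_real, x_fake = torch.randn(64, 784), torch.randn(64, 784)
regularized_d_loss(d, x_real, x_fake).backward()
```

    In this sketch the generator update is unchanged; only the discriminator objective picks up the penalty term.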

    Entropy-based aggregate posterior alignment techniques for deterministic autoencoders and implications for adversarial examples

    We present results obtained in the context of generative neural models, specifically autoencoders, utilizing standard results from coding theory. The methods are fairly elementary in principle, yet, combined with the ubiquitous practice of Batch Normalization in these models, they yield excellent results when compared with rival autoencoding architectures. In particular, we resolve a split that arises when comparing two different types of autoencoding models: VAEs versus regularized deterministic autoencoders, often simply called RAEs (Regularized Autoencoders). The latter offer superior performance but lose guarantees on their latent space. Moreover, a wide variety of regularizers is applied in RAEs to obtain this performance, ranging from L2 regularization to spectral normalization. We, on the other hand, show that a simple entropy-like term suffices to kill two birds with one stone: it offers good performance while keeping a well-behaved latent space. The primary thrust of the thesis consists of a paper presented at UAI 2020 on these matters, titled “Batch norm with entropic regularization turns deterministic autoencoders into generative models”. This was joint work with Abdullah Rashwan, who was at the time a postdoctoral associate with us at Waterloo and is now at Google, and my supervisor, Pascal Poupart. It constitutes chapter 2. Extensions relating to batch norm’s interplay with adversarial examples are in chapter 3. An overall overview is presented in chapter 1, which also serves as an introduction.
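    As a rough illustration of the batch-norm-plus-entropy idea summarized above, the sketch below batch-normalizes the latent code of a deterministic autoencoder and adds a Gaussian log-determinant surrogate for the entropy term, so that after training one can sample codes from a standard Gaussian and decode them. The estimator, architecture, weighting (beta), and names are assumptions for illustration and do not reproduce the UAI 2020 paper's formulation.

```python
# Illustrative sketch (not the thesis code): a deterministic autoencoder whose
# latent codes pass through BatchNorm and are nudged toward a standard Gaussian
# aggregate posterior by an entropy-like term. Surrogate and weighting assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BNAutoencoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
            # BatchNorm pins the per-dimension mean/variance of the codes.
            nn.BatchNorm1d(latent_dim, affine=False),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def loss_fn(x, x_hat, z, beta=0.1):
    recon = F.mse_loss(x_hat, x)
    # Entropy-like term (an assumed Gaussian surrogate): with the first two
    # moments pinned by BatchNorm, maximizing 0.5 * logdet(Cov(z)) pushes the
    # aggregate code distribution toward an isotropic Gaussian.
    cov = torch.cov(z.T) + 1e-5 * torch.eye(z.shape[1])
    entropy_surrogate = 0.5 * torch.logdet(cov)
    return recon - beta * entropy_surrogate

# Toy usage; after training, generation is just decoding standard Gaussian draws.
model = BNAutoencoder()
x = torch.randn(256, 784)
x_hat, z = model(x)
loss = loss_fn(x, x_hat, z)
samples = model.decoder(torch.randn(16, 32))
```

    The design rationale behind such a surrogate is that, with mean and variance fixed by Batch Normalization, the Gaussian is the maximum-entropy distribution, so an entropy-style term pulls the aggregate code distribution toward the same prior used for sampling.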