    Generative Modeling of Convolutional Neural Networks

    Convolutional neural networks (CNNs) have proven to be a powerful tool for discriminative learning. Recently, researchers have also started to show interest in the generative aspects of CNNs in order to gain a deeper understanding of what they have learned and how to further improve them. This paper investigates generative modeling of CNNs. The main contributions include: (1) We construct a generative model for the CNN in the form of exponential tilting of a reference distribution. (2) We propose a generative gradient for pre-training CNNs by a non-parametric importance sampling scheme, which is fundamentally different from the commonly used discriminative gradient, and yet has the same computational architecture and cost as the latter. (3) We propose a generative visualization method for CNNs by sampling from an explicit parametric image distribution. The proposed visualization method can directly draw synthetic samples for any given node in a trained CNN by the Hamiltonian Monte Carlo (HMC) algorithm, without resorting to any extra hold-out images. Experiments on the challenging ImageNet benchmark show that the proposed generative gradient pre-training consistently helps improve the performance of CNNs, and the proposed generative visualization method generates meaningful and varied samples of synthetic images from a large-scale deep CNN.
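
    The visualization idea in (3) can be illustrated with a short, hedged sketch: sampling from a tilted density p(x) proportional to exp(f_node(x)) q(x), where q is a Gaussian reference distribution and f_node is the score of the node being visualized, using HMC with a leapfrog integrator. The network choice (ResNet-18), step size, and number of leapfrog steps below are illustrative assumptions, not the paper's settings.

        import torch
        import torchvision.models as models

        # Frozen pre-trained CNN; f_node(x) is the logit of the node to visualize.
        cnn = models.resnet18(weights="IMAGENET1K_V1").eval()
        for p in cnn.parameters():
            p.requires_grad_(False)

        def log_density(x, node, sigma=1.0):
            # Exponential tilting of a Gaussian reference q(x):
            # log p(x) = f_node(x) - ||x||^2 / (2 sigma^2) + const.
            return cnn(x)[:, node].sum() - (x ** 2).sum() / (2 * sigma ** 2)

        def leapfrog(x, p, node, step, n):
            # Leapfrog integrator for the Hamiltonian dynamics.
            x = x.detach().clone().requires_grad_(True)
            grad = torch.autograd.grad(log_density(x, node), x)[0]
            p = p + 0.5 * step * grad
            for i in range(n):
                x = (x.detach() + step * p).requires_grad_(True)
                grad = torch.autograd.grad(log_density(x, node), x)[0]
                p = p + (step if i < n - 1 else 0.5 * step) * grad
            return x.detach(), p

        def hmc_step(x, node, step=1e-3, n_leapfrog=10):
            p0 = torch.randn_like(x)                      # resample momentum
            with torch.no_grad():
                h0 = -log_density(x, node) + 0.5 * (p0 ** 2).sum()
            x_new, p_new = leapfrog(x, p0, node, step, n_leapfrog)
            with torch.no_grad():
                h1 = -log_density(x_new, node) + 0.5 * (p_new ** 2).sum()
            accept = torch.rand(()) < torch.exp(h0 - h1)  # Metropolis correction
            return x_new if accept else x

        x = torch.randn(1, 3, 224, 224)    # start from the Gaussian reference
        for _ in range(20):                # a few HMC steps for illustration
            x = hmc_step(x, node=0)        # synthesize for class logit 0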

    Language Modeling with Generative Adversarial Networks

    Generative Adversarial Networks (GANs) have shown promise in image generation; however, they have been hard to train for language generation. GANs were originally designed to output differentiable values, so generating discrete language tokens is challenging for them and causes high levels of instability in training. Consequently, past work has resorted either to pre-training with maximum likelihood or to training GANs without pre-training using a WGAN objective with a gradient penalty. In this study, we present a comparison of those approaches. Furthermore, we present experimental results indicating better training and convergence of Wasserstein GANs (WGANs) when a weaker regularization term enforces the Lipschitz constraint.
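
    A minimal PyTorch sketch of the gradient-penalty regularizer such comparisons revolve around is shown below; the critic architecture and data are placeholders, and the abstract's "weaker regularization" corresponds to a smaller penalty coefficient (lam here).

        import torch
        import torch.nn as nn

        def gradient_penalty(critic, real, fake, lam=1.0):
            # Penalize deviation of the critic's gradient norm from 1 on points
            # interpolated between real and generated samples (the standard
            # two-sided WGAN-GP form); smaller lam enforces the Lipschitz
            # constraint more weakly.
            eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
            x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
            grads = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)[0]
            grad_norm = grads.flatten(1).norm(2, dim=1)
            return lam * ((grad_norm - 1) ** 2).mean()

        # Toy usage with a placeholder critic.
        critic = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
        real, fake = torch.randn(8, 32), torch.randn(8, 32)
        d_loss = critic(fake).mean() - critic(real).mean() + gradient_penalty(critic, real, fake)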

    Enhanced Experience Replay Generation for Efficient Reinforcement Learning

    Applying deep reinforcement learning (RL) to real systems suffers from slow data sampling. We propose an enhanced generative adversarial network (EGAN) to initialize an RL agent in order to achieve faster learning. The EGAN utilizes the relation between states and actions to enhance the quality of data samples generated by a GAN. Pre-training the agent with the EGAN shows a steeper learning curve, with a 20% improvement in training time at the beginning of learning compared to no pre-training, and an improvement of about 5%, with smaller variance, compared to training with a plain GAN. For real-time systems with sparse and slow data sampling, the EGAN could be used to speed up the early phases of the training process.
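
    As a rough sketch of where such synthetic experience enters the pipeline, the snippet below pre-fills a replay buffer from a GAN generator before any real interaction. The generator is an untrained placeholder, the transition layout (state, action, next state, reward) and dimensions are assumptions, and the EGAN-specific enhancer that exploits the state-action relation is omitted.

        import torch
        import torch.nn as nn

        state_dim, action_dim = 8, 2
        transition_dim = 2 * state_dim + action_dim + 1   # (s, a, s', r)

        # Placeholder GAN generator producing synthetic transitions.
        gan_generator = nn.Sequential(nn.Linear(32, transition_dim))

        def prefill_replay_buffer(n_samples):
            # Initialize the agent's replay buffer with synthetic experience
            # before any (slow) interaction with the real system.
            with torch.no_grad():
                synthetic = gan_generator(torch.randn(n_samples, 32))
            s, a, s_next, r = torch.split(
                synthetic, [state_dim, action_dim, state_dim, 1], dim=1)
            return list(zip(s, a, s_next, r))

        buffer = prefill_replay_buffer(1024)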

    High-Quality Face Image SR Using Conditional Generative Adversarial Networks

    We propose a novel single-image face super-resolution method, named the Face Conditional Generative Adversarial Network (FCGAN), based on boundary equilibrium generative adversarial networks. Without using any facial prior information, our method can generate a high-resolution face image from a low-resolution one. Compared with existing studies, both our training and testing phases form an end-to-end pipeline with little pre-/post-processing. To enhance the convergence speed and strengthen feature propagation, skip-layer connections are further employed in the generative and discriminative networks. Extensive experiments demonstrate that our model achieves competitive performance compared with state-of-the-art models. Comment: 9 pages, 4 figures
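
    For illustration, a skip-layer connection of the kind mentioned above can be written as a residual block in which a layer's input is added back to its output; the exact wiring FCGAN uses in its generator and discriminator may differ, so the block below is only a generic sketch.

        import torch
        import torch.nn as nn

        class SkipBlock(nn.Module):
            # A convolutional block with a skip-layer connection: the block's
            # input is added back to its output, easing feature propagation.
            def __init__(self, channels):
                super().__init__()
                self.conv = nn.Sequential(
                    nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(channels, channels, 3, padding=1),
                )

            def forward(self, x):
                return torch.relu(x + self.conv(x))

        y = SkipBlock(16)(torch.randn(1, 16, 32, 32))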

    Face Identity Disentanglement via Latent Space Mapping

    Learning disentangled representations of data is a fundamental problem in artificial intelligence. Specifically, disentangled latent representations allow generative models to control and compose the disentangled factors in the synthesis process. Current methods, however, require extensive supervision and training, or instead noticeably compromise quality. In this paper, we present a method that learns how to represent data in a disentangled way, with minimal supervision, using only available pre-trained networks. Our key insight is to decouple the processes of disentanglement and synthesis by employing a leading pre-trained unconditional image generator, such as StyleGAN. By learning to map into its latent space, we leverage both its state-of-the-art generative quality and its rich and expressive latent space, without the burden of training it. We demonstrate our approach on the complex and high-dimensional domain of human heads. We evaluate our method qualitatively and quantitatively, and exhibit its success with de-identification operations and with temporal identity coherency in image sequences. Through this extensive experimentation, we show that our method successfully disentangles identity from other facial attributes, surpassing existing methods, even though they require more training and supervision. Comment: 17 pages, 10 figures
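
    The decoupling described above, a frozen generator plus a trainable mapping into its latent space, can be sketched as follows. The "generator" here is a toy linear stand-in rather than StyleGAN, and only a pixel reconstruction loss is used; the actual method employs separate identity and attribute encoders with perceptual and adversarial losses.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        latent_dim, image_shape = 512, (3, 64, 64)
        image_dim = 3 * 64 * 64

        # Frozen stand-in "generator" G; the paper uses a pre-trained StyleGAN.
        G = nn.Sequential(nn.Linear(latent_dim, image_dim), nn.Tanh()).eval()
        for p in G.parameters():
            p.requires_grad_(False)

        # Encoder E maps images into the generator's latent space; only E is trained.
        E = nn.Sequential(nn.Flatten(), nn.Linear(image_dim, latent_dim))
        opt = torch.optim.Adam(E.parameters(), lr=1e-4)

        def training_step(images):
            w = E(images)                                   # map into latent space
            recon = G(w).view(images.size(0), *image_shape) # frozen synthesis
            loss = F.mse_loss(recon, images)                # toy pixel loss
            opt.zero_grad()
            loss.backward()
            opt.step()
            return loss.item()

        print(training_step(torch.rand(4, *image_shape)))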

    Image Generation From Small Datasets via Batch Statistics Adaptation

    Thanks to the recent development of deep generative models, it is becoming possible to generate high-quality images with both fidelity and diversity. However, the training of such generative models requires a large dataset. To reduce the amount of data required, we propose a new method for transferring prior knowledge of a pre-trained generator, which was trained on a large dataset, to a small dataset in a different domain. Using such prior knowledge, the model can generate images leveraging common sense that cannot be acquired from a small dataset. In this work, we propose a novel method focusing on the batch-statistics parameters, scale and shift, of the hidden layers in the generator. By training only these parameters in a supervised manner, we achieve stable training of the generator, and our method can generate higher-quality images than previous methods without collapsing, even when the dataset is small (about 100 images). Our results show that the diversity of the filters acquired in the pre-trained generator is important for the performance on the target domain. Our method makes it possible to add a new class or domain to a pre-trained generator without disturbing the performance on the original domain. Comment: ICCV 201
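
    A minimal sketch of the core trick, freezing a pre-trained generator and exposing only the scale and shift parameters of its normalization layers to the optimizer, is shown below. The tiny fully connected generator is a placeholder; the paper adapts a large pre-trained image generator and trains these parameters with a supervised objective.

        import torch
        import torch.nn as nn

        # Toy stand-in for a pre-trained generator with normalization layers.
        generator = nn.Sequential(
            nn.Linear(128, 256), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Linear(256, 784), nn.Tanh(),
        )

        # Freeze everything, then re-enable only the scale (weight) and shift
        # (bias) of the BatchNorm layers for training on the small dataset.
        for p in generator.parameters():
            p.requires_grad_(False)
        trainable = []
        for m in generator.modules():
            if isinstance(m, nn.BatchNorm1d):
                m.weight.requires_grad_(True)
                m.bias.requires_grad_(True)
                trainable += [m.weight, m.bias]

        opt = torch.optim.Adam(trainable, lr=1e-3)
        print(sum(p.numel() for p in trainable), "trainable parameters")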

    Generative Adversarial Networks with Decoder-Encoder Output Noise

    In recent years, research on image generation methods has been developing fast. The auto-encoding variational Bayes method (VAE) was proposed in 2013; it uses variational inference to learn a latent space from the image database and then generates images using the decoder. Generative adversarial networks (GANs) then emerged as a promising framework, using adversarial training to improve the generative ability of the generator. However, the images generated by GANs are generally blurry. Deep convolutional generative adversarial networks (DCGANs) were then proposed to improve the quality of generated images. Since the input noise vectors are randomly sampled from a Gaussian distribution, the generator has to map from an entire normal distribution to the images. This makes DCGANs unable to reflect the inherent structure of the training data. In this paper, we propose a novel deep model, called generative adversarial networks with decoder-encoder output noise (DE-GANs), which takes advantage of both adversarial training and variational Bayesian inference to improve the performance of image generation. DE-GANs use a pre-trained decoder-encoder architecture to map random Gaussian noise vectors to informative ones and pass them to the generator of the adversarial network. Since the decoder-encoder architecture is trained on the same images as the generator, the output vectors can carry the intrinsic distribution information of the original images. Moreover, the loss function of DE-GANs differs from that of GANs and DCGANs: a hidden-space loss function is added to the adversarial loss to enhance the robustness of the model. Extensive empirical results show that DE-GANs can accelerate the convergence of the adversarial training process and improve the quality of the generated images.
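
    The input pipeline the abstract describes, a frozen decoder-encoder in front of the generator, can be sketched as below with placeholder fully connected networks; the hidden-space loss term and the adversarial training loop are omitted, and all dimensions are assumptions.

        import torch
        import torch.nn as nn

        latent = 100

        # Pre-trained decoder-encoder (frozen): maps random Gaussian noise to a
        # vector that lies closer to the code distribution of the training images.
        decoder = nn.Sequential(nn.Linear(latent, 784), nn.Tanh())
        encoder = nn.Sequential(nn.Linear(784, latent))
        for p in list(decoder.parameters()) + list(encoder.parameters()):
            p.requires_grad_(False)

        generator = nn.Sequential(nn.Linear(latent, 784), nn.Tanh())

        def generator_input(batch_size):
            # Random noise -> decoder -> encoder -> "informative" noise vector.
            z = torch.randn(batch_size, latent)
            return encoder(decoder(z))

        fake = generator(generator_input(16))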

    Language Generation with Recurrent Generative Adversarial Networks without Pre-training

    Generative Adversarial Networks (GANs) have recently shown great promise in image generation. Training GANs for language generation has proven to be more difficult, because of the non-differentiable nature of generating text with recurrent neural networks. Consequently, past work has either resorted to pre-training with maximum likelihood or used convolutional networks for generation. In this work, we show that recurrent neural networks can be trained to generate text with GANs from scratch using curriculum learning, by slowly teaching the model to generate sequences of increasing and variable length. We empirically show that our approach vastly improves the quality of generated sequences compared to a convolutional baseline. Comment: Presented at the 1st Workshop on Learning to Generate Natural Language at ICML 201
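
    The curriculum itself reduces to a simple length schedule; the sketch below shows one plausible version, where the length cap grows slowly with the training step and the actual batch length is sampled below the cap ("increasing and variable"). The stage size and maximum length are illustrative, not the paper's values.

        import random

        def curriculum_length(step, start_len=1, max_len=32, steps_per_stage=2000):
            # Slowly raise the maximum sequence length as training progresses,
            # and sample a variable length below the cap so short sequences
            # keep appearing.
            cap = min(max_len, start_len + step // steps_per_stage)
            return random.randint(start_len, cap)

        # Example: the sequence length used at a few training steps.
        for step in [0, 2000, 10000, 80000]:
            print(step, curriculum_length(step))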

    Learning Implicit Text Generation via Feature Matching

    The generative feature matching network (GFMN) is an approach for training implicit generative models of images by performing moment matching on features from pre-trained neural networks. In this paper, we present new GFMN formulations that are effective for sequential data. Our experimental results show the effectiveness of the proposed method, SeqGFMN, for three distinct generation tasks in English: unconditional text generation, class-conditional text generation, and unsupervised text style transfer. SeqGFMN is stable to train and outperforms various adversarial approaches for text generation and text style transfer. Comment: ACL 202
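
    The feature-matching objective at the heart of GFMN can be sketched in a few lines: a generator is trained to match the first and second moments of features produced by a frozen, pre-trained extractor, with no discriminator. The networks and dimensions below are toy placeholders; the real method uses features from large pre-trained image or sentence encoders and additional machinery for sequences.

        import torch
        import torch.nn as nn

        # Frozen feature extractor (placeholder for a pre-trained network).
        feature_net = nn.Sequential(nn.Linear(784, 256), nn.ReLU())
        for p in feature_net.parameters():
            p.requires_grad_(False)

        generator = nn.Sequential(nn.Linear(64, 784), nn.Tanh())
        opt = torch.optim.Adam(generator.parameters(), lr=1e-4)

        def feature_matching_step(real):
            fake = generator(torch.randn(real.size(0), 64))
            f_real, f_fake = feature_net(real), feature_net(fake)
            # Match first and second moments of the features instead of
            # training an adversarial discriminator.
            loss = ((f_real.mean(0) - f_fake.mean(0)) ** 2).mean() \
                 + ((f_real.var(0) - f_fake.var(0)) ** 2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
            return loss.item()

        print(feature_matching_step(torch.rand(32, 784)))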

    Virtual Conditional Generative Adversarial Networks

    When trained on multimodal image datasets, normal Generative Adversarial Networks (GANs) are usually outperformed by class-conditional GANs and ensemble GANs, but conditional GANs are restricted to labeled datasets and ensemble GANs lack efficiency. We propose a novel GAN variant called the virtual conditional GAN (vcGAN), which is not only an ensemble GAN with multiple generative paths that adds almost zero network parameters, but also a conditional GAN that can be trained on unlabeled datasets without explicit clustering steps or objectives other than the adversarial loss. Inside the vcGAN's generator, a learnable "analog-to-digital converter (ADC)" module maps a slice of the input multivariate Gaussian noise to discrete/digital noise (a virtual label), according to which a selector selects the corresponding generative path to produce the sample. All the generative paths share the same decoder network, while in each path the decoder is fed a concatenation of a different pre-computed amplified one-hot vector and the input Gaussian noise. We conducted extensive experiments on several balanced and imbalanced image datasets to demonstrate that vcGAN converges faster and achieves improved Fréchet Inception Distance (FID). In addition, we show that, as a training byproduct, the ADC in vcGAN learns the categorical probability of each mode and that each generative path generates samples of a specific mode, which enables class-conditional sampling. Code is available at https://github.com/annonnymmouss/vcgan
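
    The generator's input path can be sketched as below: a slice of the Gaussian noise is discretized into a one-hot virtual label, amplified, and concatenated with the remaining noise before a shared decoder. The dimensions, the hard argmax discretization, and the flattened-image decoder are illustrative assumptions; the actual vcGAN trains the ADC jointly with the adversarial loss.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        noise_dim, adc_dim, n_paths, amp = 100, 8, 4, 10.0

        # Learnable "ADC": maps a slice of the Gaussian noise to a path index.
        adc = nn.Linear(adc_dim, n_paths)
        # All generative paths share this decoder; the path is selected only via
        # the amplified one-hot "virtual label" concatenated to the rest of the noise.
        decoder = nn.Sequential(nn.Linear(noise_dim - adc_dim + n_paths, 784), nn.Tanh())

        def generate(batch_size):
            z = torch.randn(batch_size, noise_dim)
            z_adc, z_rest = z[:, :adc_dim], z[:, adc_dim:]
            # Hard, non-differentiable path choice, shown only for illustration.
            path = adc(z_adc).argmax(dim=1)
            virtual_label = F.one_hot(path, n_paths).float()
            return decoder(torch.cat([amp * virtual_label, z_rest], dim=1))

        samples = generate(16)   # (16, 784) flattened images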