Multi-Generator Generative Adversarial Nets
We propose a new approach to train the Generative Adversarial Nets (GANs)
with a mixture of generators to overcome the mode collapsing problem. The main
intuition is to employ multiple generators, instead of using a single one as in
the original GAN. The idea is simple, yet proven to be extremely effective at
covering diverse data modes, easily overcoming mode collapse and delivering
state-of-the-art results. A minimax formulation can be established among a
classifier, a discriminator, and a set of generators, in a spirit similar to
the original GAN. Generators create samples that are intended to come from the same
distribution as the training data, whilst the discriminator determines whether
samples are true data or generated by generators, and the classifier specifies
which generator a sample comes from. The distinguishing feature is that
internal samples are created from multiple generators, and then one of them
will be randomly selected as final output similar to the mechanism of a
probabilistic mixture model. We term our method Mixture GAN (MGAN). We develop
theoretical analysis to prove that, at the equilibrium, the Jensen-Shannon
divergence (JSD) between the mixture of generators' distributions and the
empirical data distribution is minimal, whilst the JSD among generators'
distributions is maximal, hence effectively avoiding the mode collapse. By
utilizing parameter sharing, our proposed model adds minimal computational cost
to the standard GAN, and thus can also efficiently scale to large-scale
datasets. We conduct extensive experiments on synthetic 2D data and natural
image databases (CIFAR-10, STL-10 and ImageNet) to demonstrate the superior
performance of our MGAN in achieving state-of-the-art Inception scores over
the latest baselines, generating diverse, appealing, recognizable objects at
different resolutions, and having individual generators specialize in
capturing different types of objects.
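The mixture mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's architecture: each "generator" is stood in for by a simple function mapping latent noise to a scalar sample, and the three mode locations and uniform mixture weights are assumptions for the example.

```python
import random

# Toy stand-ins for MGAN's multiple generators: each one maps latent
# noise z to a sample centred on a different data mode.
def make_generator(shift):
    return lambda z: shift + 0.1 * z

generators = [make_generator(s) for s in (-2.0, 0.0, 2.0)]

def mgan_sample(rng=random):
    """Draw one sample the MGAN way: first pick a generator uniformly at
    random (the probabilistic-mixture step), then push Gaussian noise
    through it. Returns (sample, generator_index); the index is what
    MGAN's classifier is trained to recover."""
    k = rng.randrange(len(generators))
    z = rng.gauss(0.0, 1.0)
    return generators[k](z), k

sample, k = mgan_sample()
```

At equilibrium, the mixture of the generators' distributions should cover the data distribution while the classifier's objective pushes the individual generators toward distinct modes, which is the intuition behind the JSD result stated in the abstract.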
Multi-Task Generative Adversarial Nets with Shared Memory for Cross-Domain Coordination Control
Generating sequential decision processes from large amounts of measured
process data is a promising research direction for collaborative factory
automation: online and offline process data can be used directly to design
flexible decision-making policies and to evaluate their performance. The key
challenges for the sequential decision process are to generate
decision-making policies online, directly from data, and to transfer
knowledge across task domains. Most multi-task policy-generation algorithms
fail to learn a sufficient cross-task sharing structure for discrete-time
nonlinear systems in practical applications. This paper proposes multi-task
generative adversarial nets with shared memory for cross-domain coordination
control, which can generate sequential decision policies directly from the
raw sensory input of all tasks and evaluate the performance of system actions
online in discrete-time nonlinear systems. Experiments were undertaken on a
professional flexible manufacturing testbed deployed within a smart factory
of Weichai Power in China. Results on three groups of discrete-time nonlinear
control tasks show that our proposed model can effectively improve the
performance of a task with the help of other related tasks.
Dual Discriminator Generative Adversarial Nets
We propose in this paper a novel approach to tackle the problem of mode
collapse encountered in generative adversarial network (GAN). Our idea is
intuitive but proven to be very effective, especially in addressing some key
limitations of GAN. In essence, it combines the Kullback-Leibler (KL) and
reverse KL divergences into a unified objective function, thus it exploits the
complementary statistical properties from these divergences to effectively
diversify the estimated density in capturing multi-modes. We term our method
dual discriminator generative adversarial nets (D2GAN) which, unlike GAN, has
two discriminators; together with a generator, these form an analogous
minimax game, wherein one discriminator rewards high scores for samples from
the data distribution whilst the other, conversely, favors data from the
generator, and the generator produces data to fool both discriminators. We
develop theoretical analysis to show that, given the maximal discriminators,
optimizing the generator of D2GAN reduces to minimizing both KL and reverse KL
divergences between data distribution and the distribution induced from the
data generated by the generator, hence effectively avoiding the mode collapsing
problem. We conduct extensive experiments on synthetic and real-world
large-scale datasets (MNIST, CIFAR-10, STL-10, ImageNet), where we have made
our best effort to compare our D2GAN with the latest state-of-the-art GAN
variants in comprehensive qualitative and quantitative evaluations. The
experimental results demonstrate the competitive and superior performance of
our approach in generating good-quality and diverse samples over baselines,
and the capability of our method to scale up to the ImageNet database.
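The three-player objective described above can be made concrete with a small numeric sketch. Here D1 and D2 are toy positive-valued scorers on scalar "samples", and the weighting hyperparameters alpha and beta, the scorer shapes, and the sample values are all illustrative assumptions, not the paper's experimental setup.

```python
import math

def d2gan_objective(d1, d2, real, fake, alpha=1.0, beta=1.0):
    """Value of the D2GAN-style minimax objective on the given samples:
    D1 is rewarded (via log D1) for high scores on real data, D2 (via
    log D2) for high scores on generated data, and each discriminator
    penalizes the other's territory; the generator tries to fool both."""
    term_real = sum(alpha * math.log(d1(x)) - d2(x) for x in real) / len(real)
    term_fake = sum(beta * math.log(d2(x)) - d1(x) for x in fake) / len(fake)
    return term_real + term_fake

# Toy scorers: d1 peaks near the "data" at 1.0, d2 near the "fakes" at -1.0.
d1 = lambda x: math.exp(-(x - 1.0) ** 2)
d2 = lambda x: math.exp(-(x + 1.0) ** 2)

value = d2gan_objective(d1, d2, real=[1.0, 0.9], fake=[-1.0, -1.1])
```

The abstract's theoretical result says that, with both discriminators at their optima, minimizing this objective over the generator minimizes the sum of the KL and reverse-KL divergences between the data and model distributions.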
Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching
Community-based question answering (CQA) websites represent an important
source of information. As a result, the problem of matching the most valuable
answers to their corresponding questions has become an increasingly popular
research topic. We frame this task as a binary (relevant/irrelevant)
classification problem, and present an adversarial training framework to
alleviate the label imbalance issue. We employ a generative model to iteratively
sample a subset of challenging negative samples to fool our classification
model. Both models are alternately optimized using the REINFORCE algorithm. The
proposed method is completely different from previous ones, where negative
samples in training set are directly used or uniformly down-sampled. Further,
we propose using Multi-scale Matching, which explicitly inspects the
correlation between words and n-grams at different levels of granularity. We
evaluate the proposed method on the SemEval 2016 and SemEval 2017 datasets
and achieve state-of-the-art or comparable performance.
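The adversarial negative-sampling step described above can be sketched as follows. A "generative" scorer rates each candidate negative answer, hard negatives are drawn from the induced softmax, and the classifier's loss on the drawn sample would serve as the REINFORCE reward. The scores, the reward value, and the two-model split shown here are illustrative assumptions, not the paper's exact architecture.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of raw scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_negative(scores, rng=random):
    """Sample one candidate index with probability proportional to
    exp(score): the generator proposing a challenging negative answer."""
    probs = softmax(scores)
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i, p
    return len(probs) - 1, probs[-1]

# REINFORCE treats the classifier's loss on the sampled negative as the
# reward R; the score-function estimator weights grad log p(idx) by R.
scores = [0.1, 2.0, -1.0]          # generator's ratings of 3 candidates
idx, prob = sample_negative(scores)
reward = 0.8                       # placeholder classifier loss (assumed)
grad_logp_weight = reward          # scalar multiplying grad log p(idx)
```

Because the sampling step is discrete and non-differentiable, the REINFORCE estimator is what lets gradients reach the generative scorer, which is why the two models are optimized alternately.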
Segmentation Guided Image-to-Image Translation with Adversarial Networks
Recently image-to-image translation has received increasing attention, which
aims to map images in one domain to another specific one. Existing methods
mainly solve this task via a deep generative model, and focus on exploring the
relationship between different domains. However, these methods neglect to
utilize higher-level and instance-specific information to guide the training
process, leading to many unrealistic generated images of low quality.
Existing methods also lack spatial controllability during translation. To
address these challenges, we propose a novel Segmentation Guided
Generative Adversarial Networks (SGGAN), which leverages semantic segmentation
to further boost the generation performance and provide spatial mapping. In
particular, a segmentor network is designed to impose semantic information on
the generated images. Experimental results on the multi-domain face image
translation task empirically demonstrate our model's capability for spatial
modification and its superiority in image quality over several
state-of-the-art methods.
Comment: Accepted for publication in the 2019 14th IEEE International
Conference on Automatic Face & Gesture Recognition (FG 2019).
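One way the segmentation guidance described above could enter the generator's training signal is as a pixel-wise cross-entropy term between the segmentor's output on a generated image and a target semantic mask, weighted against the usual adversarial term. This is a hedged sketch: the tiny four-pixel "image", the segmentor stub, and the weight `lambda_seg` are all illustrative assumptions, not SGGAN's actual loss values.

```python
import math

def pixel_ce(pred_probs, target_labels):
    """Mean cross-entropy between per-pixel class probabilities and the
    integer labels of the target mask the segmentor should recover."""
    total = 0.0
    for probs, label in zip(pred_probs, target_labels):
        total += -math.log(probs[label] + 1e-12)
    return total / len(target_labels)

# Segmentor stub: per-pixel probabilities over 2 classes for a
# generated 4-pixel "image", plus the target semantic mask.
seg_output = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8]]
target_mask = [0, 0, 1, 1]

adv_loss = 0.5      # placeholder adversarial term (assumed)
lambda_seg = 10.0   # guidance weight (assumed)
g_loss = adv_loss + lambda_seg * pixel_ce(seg_output, target_mask)
```

The segmentation term gives the generator a dense, spatially localized training signal, which is what provides the spatial controllability the abstract claims.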
3D-A-Nets: 3D Deep Dense Descriptor for Volumetric Shapes with Adversarial Networks
Recently, researchers have been shifting their focus from hand-crafted 3D
shape descriptors towards learned ones, to better address the challenging
deformation and structural variation inherently present in 3D objects. 3D
geometric data are often transformed to 3D Voxel grids with regular format in
order to be better fed to a deep neural net architecture. However, the
computational intractability of direct application of 3D convolutional nets to
3D volumetric data severely limits the efficiency (i.e., slow processing) and
effectiveness (i.e., unsatisfactory accuracy) of processing 3D geometric data. In
this paper, powered with a novel design of adversarial networks (3D-A-Nets), we
have developed a novel 3D deep dense shape descriptor (3D-DDSD) to address the
challenging issues of efficient and effective 3D volumetric data processing. We
developed a new definition of the 2D multilayer dense representation (MDR) of
3D volumetric data to extract concise but geometrically informative shape
descriptions, and a novel design of adversarial networks that jointly trains
a convolutional neural network (CNN), a recurrent neural network (RNN), and
an adversarial discriminator. More specifically, the generator network
produces 3D shape features that encourage the clustering of samples from the
same category under the correct class label, whereas the discriminator
network discourages the clustering by assigning them misleading adversarial
class labels. By addressing the challenges posed by the computational
inefficiency of directly applying CNNs to 3D volumetric data, 3D-A-Nets can
learn a high-quality 3D-DDSD, which demonstrates superior performance on 3D
shape classification and retrieval over other state-of-the-art techniques by
a great margin.
Comment: 8 pages, 8 figures.
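The motivation for a 2D multilayer representation can be sketched with a toy projection. To be clear about assumptions: the abstract does not specify how the MDR is computed, so the rule below (splitting the depth axis into bands and summing occupancy within each band) is purely an illustration of collapsing a 3D voxel grid into a small stack of 2D maps that cheap 2D convolutions can process; the 4x4x4 grid and band count are likewise illustrative.

```python
def mdr_layers(voxels, bands=2):
    """voxels: nested list [z][y][x] of 0/1 occupancy. Splits the z-axis
    into `bands` depth bands and sums occupancy within each band,
    yielding `bands` 2-D maps (a simple multilayer dense projection)."""
    depth = len(voxels)
    step = depth // bands
    layers = []
    for b in range(bands):
        chunk = voxels[b * step:(b + 1) * step]
        ny, nx = len(chunk[0]), len(chunk[0][0])
        layers.append([[sum(sl[y][x] for sl in chunk) for x in range(nx)]
                       for y in range(ny)])
    return layers

# A 4x4x4 grid with a filled 2x2x2 corner block.
grid = [[[1 if z < 2 and y < 2 and x < 2 else 0 for x in range(4)]
         for y in range(4)] for z in range(4)]
layers = mdr_layers(grid, bands=2)
```

Whatever the exact projection, the payoff is the same: a few dense 2D maps replace a sparse 3D volume, sidestepping the cost of 3D convolutions that the abstract identifies as the bottleneck.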
Tensorizing Generative Adversarial Nets
Generative Adversarial Network (GAN) and its variants exhibit
state-of-the-art performance in the class of generative models. To capture
higher-dimensional distributions, the common learning procedure requires high
computational complexity and a large number of parameters. The problem with
employing such a massive framework arises when deploying it on a platform with
limited computational power, such as mobile phones. In this paper, we present a
new generative adversarial framework by representing each layer as a tensor
structure connected by multilinear operations, aiming to reduce the number of
model parameters by a large factor while preserving the generative performance
and sample quality. To learn the model, we employ an efficient algorithm that
alternately optimizes both the discriminator and the generator. Experimental
outcomes demonstrate that our model can achieve a high compression rate for
model parameters compared to the original GAN on the MNIST dataset.
Comment: 4 pages, 3 figures.
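The parameter-saving idea above can be made concrete with a parameter count. As a simpler stand-in for the paper's tensor structure, the sketch below uses plain low-rank factorization: replace one dense weight matrix W (m x n) by a product of two thin factors A (m x r) and B (r x n). The layer sizes and rank are illustrative assumptions.

```python
def dense_params(m, n):
    # Parameters in a full m x n weight matrix.
    return m * n

def factored_params(m, n, r):
    # Parameters when W is replaced by A (m x r) times B (r x n).
    return m * r + r * n

m, n, r = 1024, 1024, 8
full = dense_params(m, n)        # full dense layer
slim = factored_params(m, n, r)  # factored layer
compression = full / slim        # ratio of parameter counts
```

Tensor decompositions such as the multilinear layer structure in the paper generalize this idea to higher-order factorizations, trading a small rank-dependent loss in expressiveness for a large reduction in parameters.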
Student's t-Generative Adversarial Networks
Generative Adversarial Networks (GANs) perform well in image generation, but
they need large amounts of data to train the entire framework and often
produce nonsensical results. We propose a new method based on the conditional
GAN, which equips the latent noise with a mixture of Student's
t-distributions and an attention mechanism, in addition to class information.
Student's t-distribution has long tails that can provide more diversity to the
latent noise. Meanwhile, the discriminator in our model performs two tasks
simultaneously: judging whether the images come from the true data
distribution, and identifying the class of each generated image. The
parameters of the mixture model can be learned along with those of the GAN.
Moreover, we mathematically prove that any multivariate Student's
t-distribution can be obtained by a linear transformation of a standard
multivariate Student's t-distribution. Experiments comparing the proposed
method with a typical GAN, DeliGAN, and DCGAN indicate that our method
performs well at generating diverse and legible objects with limited data.
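The latent-noise construction described above can be sketched as follows. A Student's t sample with nu degrees of freedom is a standard normal divided by the square root of a chi-square over nu, which the Python standard library supports via `gammavariate`. The component locations, scales, degrees of freedom, and mixture weights below are illustrative assumptions, not the paper's learned values.

```python
import math
import random

def student_t(nu, rng=random):
    """One draw from a univariate Student's t with nu degrees of
    freedom: standard normal over sqrt(chi-square(nu) / nu)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.o.f.
    return z / math.sqrt(chi2 / nu)

# Mixture components: (location, scale, degrees of freedom).
components = [(-1.0, 0.5, 3.0), (1.0, 0.5, 3.0)]
weights = [0.5, 0.5]

def mixture_t_noise(rng=random):
    """Pick a component by its weight, then draw a shifted and scaled t
    sample; the heavy tails inject extra diversity into the latent code."""
    (loc, scale, nu), = rng.choices(components, weights=weights, k=1)
    return loc + scale * student_t(nu, rng)

z = mixture_t_noise()
```

The shift-and-scale step is a special case of the linear-transformation property the abstract proves: a standard t draw can be mapped to any member of the location-scale family.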
Multi-View Image Generation from a Single-View
This paper addresses a challenging problem -- how to generate multi-view
cloth images from only a single view input. To generate realistic-looking
images with different views from the input, we propose a new image generation
model termed VariGANs that combines the strengths of the variational inference
and the Generative Adversarial Networks (GANs). Our proposed VariGANs model
generates the target image in a coarse-to-fine manner instead of in a single
pass, which suffers from severe artifacts. It first performs variational inference to
model global appearance of the object (e.g., shape and color) and produce a
coarse image with a different view. Conditioned on the generated low resolution
images, it then performs adversarial learning to fill in details and generate
images whose details are consistent with the input. Extensive experiments
conducted on two clothing datasets, MVC and DeepFashion, have demonstrated that
images of a novel view generated by our model are more plausible than those
generated by existing approaches, in terms of more consistent global appearance
as well as richer and sharper details.
Medical Image Generation using Generative Adversarial Networks
Generative adversarial networks (GANs) are an unsupervised deep learning
approach in the computer vision community that has gained significant
attention over the last few years for identifying the internal structure of
multimodal medical imaging data. The adversarial network simultaneously
generates realistic medical images and corresponding annotations, which has
proven useful in many settings such as image augmentation, image
registration, medical image generation, image reconstruction, and
image-to-image translation. These properties have attracted the attention of
researchers in the field of medical image analysis, and we are witnessing
rapid adoption in many novel and traditional applications. This chapter
surveys state-of-the-art progress in GAN-based clinical applications in
medical image generation and cross-modality synthesis. The GAN frameworks
that have gained popularity in the interpretation of medical images, such as
the Deep Convolutional GAN (DCGAN), Laplacian GAN (LAPGAN), pix2pix,
CycleGAN, and the unsupervised image-to-image translation model (UNIT), and
that continue to improve their performance by incorporating additional hybrid
architectures, are discussed. Further, some recent applications of these
frameworks for image reconstruction and synthesis, as well as future research
directions in the area, are covered.
Comment: 19 pages, 3 figures, 5 tables.