Improved Training of Generative Adversarial Networks Using Representative Features
Despite the success of generative adversarial networks (GANs) for image
generation, the trade-off between visual quality and image diversity remains a
significant issue. This paper achieves both aims simultaneously by improving
the stability of training GANs. The key idea of the proposed approach is to
implicitly regularize the discriminator using representative features. Focusing
on the fact that standard GAN minimizes reverse Kullback-Leibler (KL)
divergence, we transfer the representative feature, which is extracted from the
data distribution using a pre-trained autoencoder (AE), to the discriminator of
standard GANs. Because the AE learns to minimize forward KL divergence, our GAN
training with representative features is influenced by both reverse and forward
KL divergence. Consequently, extensive evaluations verify that the proposed
approach improves both the visual quality and the diversity of state-of-the-art
GANs. Comment: Accepted at ICML 201
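The feature-transfer idea above can be sketched in a few lines. The encoder weights, layer sizes, and plain feature concatenation below are illustrative assumptions, not the paper's architecture; a real implementation would use a trained convolutional autoencoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen stand-in for the pre-trained autoencoder's encoder (weights are
# random here purely for illustration; in the paper they come from AE training).
W_ae = rng.normal(size=(16, 64)) * 0.1

def ae_features(x):
    """Representative features from the frozen AE encoder (never updated)."""
    return np.tanh(x @ W_ae.T)

# Trainable part of the discriminator.
W_d = rng.normal(size=(16, 64)) * 0.1
w_out = rng.normal(size=(32,)) * 0.1

def discriminator_logit(x):
    # Concatenating trainable features with frozen representative features
    # implicitly regularizes the discriminator, as the abstract describes.
    h = np.concatenate([np.tanh(x @ W_d.T), ae_features(x)], axis=-1)
    return h @ w_out

x = rng.normal(size=(4, 64))        # a toy batch of 4 samples
logits = discriminator_logit(x)     # one logit per sample
```

Freezing the AE branch is the key point: the discriminator cannot distort those features to win the adversarial game.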
Label-Removed Generative Adversarial Networks Incorporating with K-Means
Generative Adversarial Networks (GANs) have achieved great success in
generating realistic images. Most of these are conditional models, although
acquisition of class labels is expensive and time-consuming in practice. To
reduce the dependence on labeled data, we propose an unconditional generative
adversarial model, called K-Means-GAN (KM-GAN), which incorporates the idea of
updating centers in K-Means into GANs. Specifically, we redesign the framework
of GANs by applying K-Means to the features extracted from the discriminator.
With the labels obtained from K-Means, we propose new objective functions from
the perspective of deep metric learning (DML). Distinct from previous works,
KM-GAN treats the discriminator as a feature extractor rather than a
classifier, while the use of K-Means makes the discriminator's features more
representative. Experiments conducted on various datasets, such as MNIST,
Fashion-10, CIFAR-10, and CelebA, show that the quality of samples generated by
KM-GAN is comparable to that of some conditional generative adversarial
models.
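The K-Means step that KM-GAN applies to discriminator features can be sketched as follows. The feature vectors, dimensions, and initialization are toy assumptions, and KM-GAN's actual metric-learning objectives built on the resulting labels are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans_step(feats, centers):
    """One K-Means assignment and center-update step on feature vectors."""
    # Assign each feature to its nearest center.
    d = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=-1)
    labels = d.argmin(axis=1)
    # Move each center to the mean of its assigned features.
    new_centers = centers.copy()
    for k in range(len(centers)):
        if (labels == k).any():
            new_centers[k] = feats[labels == k].mean(axis=0)
    return labels, new_centers

# Toy "discriminator features" drawn from two well-separated clusters.
feats = np.concatenate([rng.normal(0, 1, (20, 8)), rng.normal(5, 1, (20, 8))])
centers = feats[[0, -1]].copy()
for _ in range(5):
    labels, centers = kmeans_step(feats, centers)
# `labels` now serve as pseudo-labels for the DML-style objectives.
```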
Improved visible to IR image transformation using synthetic data augmentation with cycle-consistent adversarial networks
Infrared (IR) images are essential to improve the visibility of dark or
camouflaged objects. Object recognition and segmentation based on a neural
network using IR images provide more accuracy and insight than color visible
images. But the bottleneck is the amount of relevant IR images for training. It
is difficult to collect real-world IR images for special purposes, including
space exploration, military and fire-fighting applications. To solve this
problem, we created color visible and IR images using a Unity-based 3D game
editor. These synthetically generated color visible and IR images were used to
train cycle consistent adversarial networks (CycleGAN) to convert visible
images to IR images. CycleGAN has the advantage that it does not require
precisely matching visible and IR pairs for transformation training. In this
study, we discovered that additional synthetic data can help improve CycleGAN
performance. Neural network training using real data (N = 20) performed more
accurate transformations than training using real (N = 10) and synthetic (N =
10) data combinations. The result indicates that the synthetic data cannot
exceed the quality of the real data. Neural network training using real (N =
10) and synthetic (N = 100) data combinations showed almost the same
performance as training using real data (N = 20). At least 10 times more
synthetic data than real data is required to achieve the same performance. In
summary, CycleGAN combined with synthetic data improves the visible-to-IR
image conversion performance. Comment: 8 pages, 6 figures, SPI
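The cycle-consistency property that removes the need for matched visible/IR pairs can be illustrated with a toy invertible "generator" pair; the linear maps below are hypothetical stand-ins for CycleGAN's CNN generators.

```python
import numpy as np

# Toy stand-ins for the two generators: G (visible -> IR) and F (IR -> visible).
# Real CycleGAN uses CNNs; an invertible linear pair suffices to show the loss.
G = np.array([[0.0, 1.0], [1.0, 0.0]])
F = np.linalg.inv(G)

def cycle_consistency_loss(x_visible):
    """L1 cycle loss ||F(G(x)) - x||_1: no paired IR image is required."""
    reconstructed = (x_visible @ G.T) @ F.T
    return np.abs(reconstructed - x_visible).mean()

x = np.array([[0.2, 0.7], [0.9, 0.1]])  # two toy "visible" pixels
loss = cycle_consistency_loss(x)        # ~0 when F perfectly inverts G
```

Training drives this loss down for both domains, which is what lets unpaired (including synthetic) images contribute.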
Self-Attention Generative Adversarial Networks
In this paper, we propose the Self-Attention Generative Adversarial Network
(SAGAN) which allows attention-driven, long-range dependency modeling for image
generation tasks. Traditional convolutional GANs generate high-resolution
details as a function of only spatially local points in lower-resolution
feature maps. In SAGAN, details can be generated using cues from all feature
locations. Moreover, the discriminator can check that highly detailed features
in distant portions of the image are consistent with each other. Furthermore,
recent work has shown that generator conditioning affects GAN performance.
Leveraging this insight, we apply spectral normalization to the GAN generator
and find that this improves training dynamics. The proposed SAGAN achieves
state-of-the-art results, boosting the best published Inception score from 36.8
to 52.52 and reducing Frechet Inception distance from 27.62 to 18.65 on the
challenging ImageNet dataset. Visualization of the attention layers shows that
the generator leverages neighborhoods that correspond to object shapes rather
than local regions of fixed shape.
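The attention mechanism that lets every location use cues from all feature locations can be sketched as follows. The projection matrices and the 4x4 map are toy assumptions, and SAGAN's learned residual gate on the attention output is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention(feats, Wq, Wk, Wv):
    """Attend over all positions of a flattened (H*W, C) feature map.

    Each output position aggregates cues from every position, not just a
    local convolutional neighborhood."""
    q, k, v = feats @ Wq, feats @ Wk, feats @ Wv
    logits = q @ k.T / np.sqrt(k.shape[1])
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)   # softmax over positions
    return attn @ v, attn

C = 8
feats = rng.normal(size=(16, C))              # a 4x4 feature map, flattened
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
out, attn = self_attention(feats, Wq, Wk, Wv)
```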
Adversarial Feature Sampling Learning for Efficient Visual Tracking
The tracking-by-detection framework usually consists of two stages: drawing
samples around the target object in the first stage and classifying each sample
as the target object or background in the second stage. Current popular
trackers based on tracking-by-detection framework typically draw samples in the
raw image as the inputs of deep convolution networks in the first stage, which
usually results in high computational burden and low running speed. In this
paper, we propose a new visual tracking method using sampling deep
convolutional features to address this problem. Only one cropped image around
the target object is fed into the designed deep convolutional network, and
samples are drawn on its feature maps by spatial bilinear resampling. In
addition, a generative adversarial network is integrated into
our network framework to augment positive samples and improve the tracking
performance. Extensive experiments on benchmark datasets demonstrate that the
proposed method achieves performance comparable to state-of-the-art trackers
and effectively accelerates tracking-by-detection trackers based on raw-image
samples.
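Spatial bilinear resampling, i.e. reading a feature at a fractional location instead of re-running the network on each cropped patch, can be sketched as follows; the single-channel 4x4 map is a toy assumption.

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Bilinearly interpolate a feature map at a fractional (y, x) location,
    so candidate samples can be read off feature maps rather than computed
    by a forward pass per cropped image patch."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, fmap.shape[0] - 1), min(x0 + 1, fmap.shape[1] - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * fmap[y0, x0] + (1 - wy) * wx * fmap[y0, x1]
            + wy * (1 - wx) * fmap[y1, x0] + wy * wx * fmap[y1, x1])

fmap = np.arange(16.0).reshape(4, 4)    # toy single-channel feature map
val = bilinear_sample(fmap, 1.5, 2.5)   # average of the four neighbors
```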
Non-Adversarial Image Synthesis with Generative Latent Nearest Neighbors
Unconditional image generation has recently been dominated by generative
adversarial networks (GANs). GAN methods train a generator which regresses
images from random noise vectors, as well as a discriminator that attempts to
differentiate between the generated images and a training set of real images.
GANs have shown impressive results at generating realistic-looking images.
Despite their success, GANs suffer from critical drawbacks, including unstable
training and mode dropping. These weaknesses have motivated research into
alternatives, including variational auto-encoders (VAEs), latent embedding
learning methods (e.g., GLO), and nearest-neighbor-based implicit maximum
likelihood estimation (IMLE). At the moment, however, GANs still
significantly outperform the alternative methods for image generation. In this
work, we present a novel method - Generative Latent Nearest Neighbors (GLANN) -
for training generative models without adversarial training. GLANN combines the
strengths of IMLE and GLO in a way that overcomes the main drawbacks of each
method. Consequently, GLANN generates images of far higher quality than either
GLO or IMLE. Our method does not suffer from the mode collapse that plagues GAN
training and is much more stable. Qualitative results show that GLANN
outperforms a baseline consisting of 800 GANs and VAEs on commonly used
datasets. Our models are also shown to be effective for training truly
non-adversarial unsupervised image translation.
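The IMLE side of GLANN, matching every real sample to its nearest generation so that no mode can be dropped, can be sketched as follows; the dimensions and random "generations" are toy assumptions, and GLO's latent embedding step is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def imle_matches(real, generated):
    """For every real sample, the index of its nearest generated sample."""
    d = np.linalg.norm(real[:, None, :] - generated[None, :, :], axis=-1)
    return d.argmin(axis=1)

real = rng.normal(size=(5, 3))
generated = rng.normal(size=(50, 3))
idx = imle_matches(real, generated)
# An IMLE update would pull generated[idx] toward `real`, so every real
# sample keeps a nearby generation and no mode can be dropped.
```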
Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery
Obtaining models that capture imaging markers relevant for disease
progression and treatment monitoring is challenging. Models are typically based
on large amounts of data with annotated examples of known markers aiming at
automating detection. High annotation effort and the limitation to a vocabulary
of known markers limit the power of such approaches. Here, we perform
unsupervised learning to identify anomalies in imaging data as candidates for
markers. We propose AnoGAN, a deep convolutional generative adversarial network
to learn a manifold of normal anatomical variability, accompanying a novel
anomaly scoring scheme based on the mapping from image space to a latent space.
Applied to new data, the model labels anomalies, and scores image patches
indicating their fit into the learned distribution. Results on optical
coherence tomography images of the retina demonstrate that the approach
correctly identifies anomalous images, such as images containing retinal fluid
or hyperreflective foci. Comment: To be published in the proceedings of the
International Conference on Information Processing in Medical Imaging (IPMI),
201
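The mapping from image space to latent space and the resulting anomaly score can be sketched as follows. The tiny tanh generator and the residual-only score are simplifying assumptions; AnoGAN's full score also adds a discriminator-feature term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen generator mapping a 4-d latent code to a 32-d "image".
W_g = rng.normal(size=(32, 4)) * 0.1

def G(z):
    return np.tanh(z @ W_g.T)

def anomaly_score(x, steps=300, lr=0.5):
    """Map x to latent space by gradient descent on the residual, then score
    x by the remaining reconstruction error."""
    z = np.zeros(4)
    for _ in range(steps):
        g = G(z)
        # Gradient of ||G(z) - x||^2 w.r.t. z through the tanh generator.
        z -= lr * (2 * ((g - x) * (1 - g ** 2)) @ W_g)
    return np.sum((G(z) - x) ** 2)      # large residual -> likely anomaly

normal = G(rng.normal(size=4))          # lies on the learned manifold
anomaly = rng.uniform(-1, 1, size=32)   # off-manifold input
```

An image the generator can reproduce scores near zero; an image outside the learned manifold of normal anatomy keeps a large residual.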
Unpaired Photo-to-Caricature Translation on Faces in the Wild
Recently, much progress has been made in image-to-image translation owing to
the success of conditional Generative Adversarial Networks (cGANs), and
unpaired methods based on a cycle-consistency loss, such as DualGAN, CycleGAN,
and DiscoGAN, have become popular. However, it is still very challenging for
translation tasks with the requirement of high-level visual information
conversion, such as photo-to-caricature translation that requires satire,
exaggeration, lifelikeness and artistry. We present an approach for learning to
translate faces in the wild from the source photo domain to the target
caricature domain with different styles, which can also be used for other
high-level image-to-image translation tasks. In order to capture global
structure with local statistics during translation, we design a dual-pathway
model with one coarse discriminator and one fine discriminator. For the
generator, we add an extra perceptual loss alongside the adversarial and
cycle-consistency losses to achieve representation learning for the two
domains. The style can also be learned from an auxiliary noise input.
Experiments on photo-to-caricature translation of faces in the wild show
a considerable performance gain of our proposed method over state-of-the-art
translation methods, as well as its potential for real applications. Comment:
28 pages, 11 figure
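The combined generator objective described above can be sketched as follows; the least-squares adversarial form and the loss weights are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def generator_loss(d_coarse_fake, d_fine_fake, x, x_cycled,
                   feat_real, feat_fake, lam_cyc=10.0, lam_perc=1.0):
    """Adversarial terms from the coarse and fine discriminators, an L1
    cycle-consistency term, and a perceptual term on feature embeddings."""
    adv = np.mean((d_coarse_fake - 1) ** 2) + np.mean((d_fine_fake - 1) ** 2)
    cyc = np.mean(np.abs(x_cycled - x))            # cycle-consistency loss
    perc = np.mean((feat_real - feat_fake) ** 2)   # perceptual distance
    return adv + lam_cyc * cyc + lam_perc * perc

# A generator that fools both discriminators (outputs 1), reconstructs its
# input exactly, and matches perceptual features incurs zero loss.
loss = generator_loss(np.ones(2), np.ones(2), np.zeros(3), np.zeros(3),
                      np.zeros(4), np.zeros(4))
```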
An empirical study on evaluation metrics of generative adversarial networks
Evaluating generative adversarial networks (GANs) is inherently challenging.
In this paper, we revisit several representative sample-based evaluation
metrics for GANs, and address the problem of how to evaluate the evaluation
metrics. We start with a few necessary conditions for metrics to produce
meaningful scores, such as distinguishing real from generated samples,
identifying mode dropping and mode collapsing, and detecting overfitting. With
a series of carefully designed experiments, we comprehensively investigate
existing sample-based metrics and identify their strengths and limitations in
practical settings. Based on these results, we observe that kernel Maximum Mean
Discrepancy (MMD) and the 1-Nearest-Neighbor (1-NN) two-sample test seem to
satisfy most of the desirable properties, provided that the distances between
samples are computed in a suitable feature space. Our experiments also unveil
interesting properties about the behavior of several popular GAN models, such
as whether they are memorizing training samples, and how far they are from
learning the target distribution. Comment: arXiv admin note: text overlap with arXiv:1802.03446 by other author
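A minimal kernel MMD estimator, one of the two metrics the study singles out, can be sketched as follows. The Gaussian kernel bandwidth and the raw 2-d "samples" are toy assumptions; the paper computes distances in a suitable learned feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared kernel MMD with a Gaussian kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

real = rng.normal(0.0, 1.0, (200, 2))
good = rng.normal(0.0, 1.0, (200, 2))   # "generator" that matches the data
bad = rng.normal(3.0, 1.0, (200, 2))    # "generator" that missed the mode
# A useful metric should score `good` lower (better) than `bad`.
```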
Quantum-assisted associative adversarial network: Applying quantum annealing in deep learning
We present an algorithm for learning a latent variable generative model via
generative adversarial learning where the canonical uniform noise input is
replaced by samples from a graphical model. This graphical model is learned by
a Boltzmann machine which learns low-dimensional feature representation of data
extracted by the discriminator. A quantum annealer, the D-Wave 2000Q, is used
to sample from this model. This algorithm joins a growing family of algorithms
that use a quantum annealing subroutine in deep learning, and provides a
framework to test the advantages of quantum-assisted learning in GANs. Fully
connected, symmetric bipartite and Chimera graph topologies are compared on a
reduced stochastically binarized MNIST dataset, for both classical and quantum
annealing sampling methods. The quantum-assisted associative adversarial
network successfully learns a generative model of the MNIST dataset for all
topologies, and is also applied to the LSUN dataset bedrooms class for the
Chimera topology. Evaluated using the Fréchet inception distance and inception
score, the quantum and classical versions of the algorithm are found to have
equivalent performance for learning an implicit generative model of the MNIST
dataset.
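The classical counterpart of the sampling subroutine, drawing structured noise for the generator from a small Boltzmann machine by Gibbs sampling instead of a quantum annealer, can be sketched as follows; the RBM here is untrained, with random illustrative weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def rbm_sample(W, b, c, steps=50):
    """One binary sample from a small restricted Boltzmann machine via Gibbs
    sampling -- a classical stand-in for the annealing subroutine."""
    v = (rng.random(b.shape) < 0.5).astype(float)
    for _ in range(steps):
        h = (rng.random(c.shape) < sigmoid(v @ W + c)).astype(float)
        v = (rng.random(b.shape) < sigmoid(h @ W.T + b)).astype(float)
    return v

W = rng.normal(size=(8, 4)) * 0.5       # untrained, illustrative weights
b, c = np.zeros(8), np.zeros(4)
z = rbm_sample(W, b, c)                 # structured noise for the generator
```

Replacing the canonical uniform noise input with such samples is what makes the noise distribution itself learnable.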