Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation
Image-to-image translation has made much progress by embracing
Generative Adversarial Networks (GANs). However, it is still very challenging
for translation tasks that require high quality, especially high resolution
and photorealism. In this paper, we present Discriminative Region Proposal
Adversarial Networks (DRPAN) for high-quality image-to-image translation. We
decompose the image-to-image translation procedure into three iterated
steps: the first generates an image with global structure but some local
artifacts (via a GAN); the second uses our DRPnet to propose the most fake
region of the generated image; and the third performs "image inpainting" on
the most fake region through a reviser to obtain a more realistic result, so
that the system (DRPAN) can be gradually optimized to synthesize images with
more attention on the local parts with the most artifacts. Experiments on a variety of
image-to-image translation tasks and datasets validate that our method
outperforms state-of-the-art methods in producing high-quality translation
results in terms of both human perceptual studies and automatic quantitative measures.
Comment: ECCV 201
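The second step above (proposing the most fake region) can be illustrated with a small sketch. It assumes a patch-level discriminator has already produced a 2D map of realness scores (higher means more real) and simply slides a fixed-size window to find the lowest-scoring area. The function name `most_fake_region`, the window size, and the score values are hypothetical; the abstract does not specify how DRPnet is implemented.

```python
def most_fake_region(score_map, win_h, win_w):
    """Return the top-left corner and mean score of the least realistic window.

    score_map is assumed to be a 2D list of realness scores in [0, 1].
    """
    rows, cols = len(score_map), len(score_map[0])
    best_pos, best_total = None, float("inf")
    for r in range(rows - win_h + 1):
        for c in range(cols - win_w + 1):
            # Sum the realness scores inside the current window.
            total = sum(
                score_map[r + i][c + j]
                for i in range(win_h)
                for j in range(win_w)
            )
            if total < best_total:
                best_pos, best_total = (r, c), total
    return best_pos, best_total / (win_h * win_w)


# Hypothetical score map with a low-scoring (fake-looking) area in the centre.
score_map = [
    [0.9, 0.8, 0.9, 0.9],
    [0.9, 0.2, 0.1, 0.8],
    [0.8, 0.3, 0.2, 0.9],
    [0.9, 0.9, 0.8, 0.9],
]
position, mean_score = most_fake_region(score_map, 2, 2)
```

In a full pipeline the selected window would be cropped and passed to the reviser; that part is not sketched here.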
Compatibility Family Learning for Item Recommendation and Generation
Compatibility between items, such as clothes and shoes, is a major factor
in customers' purchasing decisions. However, learning "compatibility" is
challenging due to (1) broader notions of compatibility than those of
similarity, (2) the asymmetric nature of compatibility, and (3) the small
set of compatible and incompatible items that is observed. We propose an end-to-end
trainable system to embed each item into a latent vector and project a query
item into K compatible prototypes in the same space. These prototypes reflect
the broad notions of compatibility. We refer to both the embedding and
prototypes as "Compatibility Family". In our learned space, we introduce a
novel Projected Compatibility Distance (PCD) function which is differentiable
and ensures diversity by aiming for at least one prototype to be close to a
compatible item, whereas none of the prototypes are close to an incompatible
item. We evaluate our system on a toy dataset, two Amazon product datasets, and
the Polyvore outfit dataset. Our method consistently achieves state-of-the-art
performance. Finally, we show that we can visualize the candidate compatible
prototypes using a Metric-regularized Conditional Generative Adversarial
Network (MrCGAN), where the input is a projected prototype and the output is a
generated image of a compatible item. We ask human evaluators to judge the
relative compatibility between our generated images and images generated by
CGANs conditioned directly on query items. Our generated images are
significantly preferred, receiving roughly twice as many votes as the others.
Comment: 9 pages, accepted to AAAI 201
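The idea of a distance that is small when at least one prototype is close to an item can be sketched as a soft minimum over per-prototype distances. This is an illustration under assumptions: the abstract does not give the PCD formula, so the softmax weighting over negative squared distances, the function names, and the toy vectors below are all hypothetical choices.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def projected_compatibility_distance(prototypes, item):
    """Soft minimum of the squared distances from each prototype to the item.

    Assumed form: a softmax over negative distances weights each distance,
    so the closest prototype dominates while the result stays differentiable.
    """
    dists = [squared_distance(p, item) for p in prototypes]
    shift = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - shift)) for d in dists]
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, dists)) / total


# Two hypothetical prototypes projected from one query item.
prototypes = [[0.0, 0.0], [3.0, 0.0]]
compatible = [3.0, 0.1]       # close to the second prototype
incompatible = [10.0, 10.0]   # far from every prototype
pcd_pos = projected_compatibility_distance(prototypes, compatible)
pcd_neg = projected_compatibility_distance(prototypes, incompatible)
```

A training loss would then push `pcd_pos` down and `pcd_neg` up; the exact loss is not stated in the abstract and is left out.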
SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
Synthesizing realistic images from human-drawn sketches is a challenging
problem in computer graphics and vision. Existing approaches either need exact
edge maps, or rely on retrieval of existing photographs. In this work, we
propose a novel Generative Adversarial Network (GAN) approach that synthesizes
plausible images from 50 categories including motorcycles, horses and couches.
We demonstrate a data augmentation technique for sketches which is fully
automatic, and we show that the augmented data is helpful to our task. We
introduce a new network building block suitable for both the generator and
discriminator which improves the information flow by injecting the input image
at multiple scales. Compared to state-of-the-art image translation methods, our
approach generates more realistic images and achieves significantly higher
Inception Scores.
Comment: Accepted to CVPR 201
SGAN: An Alternative Training of Generative Adversarial Networks
Generative Adversarial Networks (GANs) have demonstrated impressive
performance for data synthesis, and are now used in a wide range of computer
vision tasks. In spite of this success, they have gained a reputation for being
difficult to train, which results in a time-consuming and human-involved
development process.
We consider an alternative training process, named SGAN, in which several
adversarial "local" pairs of networks are trained independently so that a
"global" supervising pair of networks can be trained against them. The goal is
to train the global pair with the corresponding ensemble opponent for improved
performance in terms of mode coverage. This approach aims at increasing the
chances that learning will not stop for the global pair, preventing both
networks from being trapped in an unsatisfactory local minimum or from facing
the oscillations often observed in practice. To guarantee the latter, the global pair never affects
the local ones.
The rules of SGAN training are thus as follows: the global generator and
discriminator are trained using the local discriminators and generators,
respectively, whereas the local networks are trained with their fixed local
opponent.
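The rules above can be sketched as one training step over placeholder networks. This is a structural illustration only: the `Net` class, its `train_against` method, and the network names are hypothetical stand-ins that record who is trained against whom, and no real gradient update is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Net:
    name: str
    opponents_seen: list = field(default_factory=list)

    def train_against(self, opponent):
        # Placeholder for one update of this network against a frozen opponent.
        self.opponents_seen.append(opponent.name)


def sgan_step(local_pairs, global_g, global_d):
    # Local networks train only against their own fixed local opponent.
    for g, d in local_pairs:
        g.train_against(d)
        d.train_against(g)
    # The global generator trains against the local discriminators, and the
    # global discriminator against the local generators. Only the global
    # networks are updated here, so the local pairs are never affected.
    for g, d in local_pairs:
        global_g.train_against(d)
        global_d.train_against(g)


local_pairs = [(Net(f"G{i}"), Net(f"D{i}")) for i in range(3)]
global_g, global_d = Net("G_global"), Net("D_global")
sgan_step(local_pairs, global_g, global_d)
```

After one step, `global_g.opponents_seen` lists the three local discriminators, while each local network has seen only its own partner.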
Experimental results on both toy and real-world problems demonstrate that
this approach outperforms standard training in mitigating mode collapse and
in stability while converging, and that it, surprisingly, increases the
convergence speed as well.