One-Sided Unsupervised Domain Mapping
In unsupervised domain mapping, the learner is given two unmatched datasets
$A$ and $B$. The goal is to learn a mapping $G_{AB}$ that translates a sample
in $A$ to the analog sample in $B$. Recent approaches have shown that when
learning simultaneously both $G_{AB}$ and the inverse mapping $G_{BA}$,
convincing mappings are obtained. In this work, we present a method of learning
$G_{AB}$ without learning $G_{BA}$. This is done by learning a mapping that
maintains the distance between a pair of samples. Moreover, good mappings are
obtained, even by maintaining the distance between different parts of the same
sample before and after mapping. We present experimental results showing that
the new method not only allows for one-sided mapping learning, but also leads
to preferable numerical results over the existing circularity-based constraint.
Our entire code is made publicly available at
https://github.com/sagiebenaim/DistanceGAN
Comment: to be published in NIPS 2017
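A minimal PyTorch-style sketch of the distance-preservation idea described above: the distance between two source samples should match the distance between their mapped versions. The generator name G, the L1 metric, and the omission of per-domain distance normalization are assumptions of this sketch, not details taken from the abstract.

```python
import torch

def pair_distance_loss(G, x1, x2):
    # The distance between two source samples should be preserved
    # after mapping (the L1 metric is an assumption of this sketch).
    d_src = torch.mean(torch.abs(x1 - x2))
    d_tgt = torch.mean(torch.abs(G(x1) - G(x2)))
    return torch.abs(d_src - d_tgt)

def self_distance_loss(G, x):
    # Variant mentioned in the abstract: preserve the distance
    # between two parts (here, halves) of the same sample.
    left, right = x.chunk(2, dim=-1)
    g_left, g_right = G(x).chunk(2, dim=-1)
    d_src = torch.mean(torch.abs(left - right))
    d_tgt = torch.mean(torch.abs(g_left - g_right))
    return torch.abs(d_src - d_tgt)
```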
Unsupervised Shadow Removal Using Target Consistency Generative Adversarial Network
Unsupervised shadow removal aims to learn a non-linear function that maps the
original image from the shadow domain to the non-shadow domain in the absence of paired
shadow and non-shadow data. In this paper, we develop a simple yet efficient
target-consistency generative adversarial network (TC-GAN) for the shadow
removal task in an unsupervised manner. Compared with the bidirectional
mapping in cycle-consistency GAN based methods for shadow removal, TC-GAN tries
to learn a one-sided mapping to cast shadow images into shadow-free ones. With
the proposed target-consistency constraint, the correlations between shadow
images and the output shadow-free images are strictly confined. Extensive
comparison experiments show that TC-GAN outperforms the
state-of-the-art unsupervised shadow removal methods by 14.9% in terms of FID
and 31.5% in terms of KID. It is rather remarkable that TC-GAN achieves
comparable performance with supervised shadow removal methods.
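The abstract does not spell out the target-consistency constraint; one plausible reading is an idempotence term, i.e., a generated shadow-free image should already lie in the target domain and so be left unchanged by the mapping. The sketch below encodes that reading only; the names G and x_shadow are placeholders.

```python
import torch

def target_consistency_loss(G, x_shadow):
    # One plausible reading (an assumption, not the paper's exact
    # definition): a shadow-free output should map to itself.
    y = G(x_shadow)                          # translated, shadow-free image
    return torch.mean(torch.abs(G(y) - y))   # penalize any further change
```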
A Theory of Output-Side Unsupervised Domain Adaptation
When learning a mapping from an input space to an output space, the
assumption that the sample distribution of the training data is the same as
that of the test data is often violated. Unsupervised domain shift methods
adapt the learned function in order to correct for this shift. Previous work
has focused on utilizing unlabeled samples from the target distribution. We
consider the complementary problem in which the unlabeled samples are given
post mapping, i.e., we are given the outputs of the mapping of unknown samples
from the shifted domain. Two other variants are also studied: the two-sided
version, in which unlabeled samples are given from both the input and the output
spaces, and the Domain Transfer problem, which was recently formalized. In all
cases, we derive generalization bounds that employ discrepancy terms.
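The abstract only names discrepancy terms without stating the bounds; the generic discrepancy-style bound below is included purely to show the shape such results usually take, and all symbols here are assumptions rather than the paper's notation.

```latex
% Illustrative discrepancy-style generalization bound (not the paper's):
% for a hypothesis h, source distribution D_S, and target distribution D_T,
\[
  \epsilon_{T}(h) \;\le\; \epsilon_{S}(h) \;+\; \mathrm{disc}(D_S, D_T) \;+\; \lambda ,
\]
% where disc(.,.) is a discrepancy distance between the two distributions
% and \lambda is the error of the best hypothesis on both domains jointly.
```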
Attention-GAN for Object Transfiguration in Wild Images
This paper studies the object transfiguration problem in wild images. The
generative network in classical GANs for object transfiguration often
undertakes a dual responsibility: to detect the objects of interest and to
convert the object from the source domain to the target domain. In contrast, we
decompose the generative network into two separate networks, each of which is
dedicated to one particular sub-task. The attention network predicts
spatial attention maps of images, and the transformation network focuses on
translating objects. Attention maps produced by the attention network are
encouraged to be sparse, so that major attention can be paid to the objects of
interest. Attention maps should remain constant both before and after object
transfiguration. In addition, learning the attention network can receive
additional supervision when segmentation annotations of images are available.
Experimental results demonstrate the necessity of investigating attention in
object transfiguration, and that the proposed algorithm can learn accurate
attention to improve the quality of generated images.
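A sketch of the two-network decomposition: the attention network gates which pixels the transformation network is allowed to change. The convex blending formula is a common choice for such architectures and an assumption here, as are the module names.

```python
import torch

def attention_compose(attn_net, transform_net, x):
    # Predict a spatial attention map in [0, 1]; sparsity would be
    # encouraged by an extra regularizer on `a` (omitted here).
    a = torch.sigmoid(attn_net(x))     # shape (N, 1, H, W), assumed
    t = transform_net(x)               # translated image
    # Only attended regions change; the background is copied from x.
    return a * t + (1.0 - a) * x
```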
Sem-GAN: Semantically-Consistent Image-to-Image Translation
Unpaired image-to-image translation is the problem of mapping an image in the
source domain to one in the target domain, without requiring corresponding
image pairs. To ensure the translated images are realistically plausible,
recent works, such as Cycle-GAN, demand that this mapping be invertible. While
this requirement demonstrates promising results when the domains are unimodal,
its performance is unpredictable in a multi-modal scenario, such as an image
segmentation task. This is because invertibility does not necessarily enforce
semantic correctness. To this end, we present a semantically-consistent GAN
framework, dubbed Sem-GAN, in which the semantics are defined by the class
identities of image segments in the source domain as produced by a semantic
segmentation algorithm. Our proposed framework includes consistency constraints
on the translation task that, together with the GAN loss and the
cycle-constraints, enforce that translated images inherit the
appearances of the target domain while (approximately) maintaining their
identities from the source domain. We present experiments on several
image-to-image translation tasks and demonstrate that Sem-GAN improves the
quality of the translated images significantly, sometimes by more than 20% on
the FCN score. Further, we show that semantic segmentation models trained with
synthetic images translated via Sem-GAN lead to significantly better
segmentation results than other variants.
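A hedged sketch of a semantic-consistency term: a frozen segmentation network labels the source image, and the translated image is pushed to receive the same per-pixel labels. Using the source prediction as a pseudo-label and sharing one segmentation network across domains are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def semantic_consistency_loss(seg_net, G, x):
    with torch.no_grad():
        labels = seg_net(x).argmax(dim=1)   # per-pixel class ids, (N, H, W)
    logits = seg_net(G(x))                  # segmentation of translated image
    # Translated pixels should keep their source-domain class identity.
    return F.cross_entropy(logits, labels)
```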
One-Shot Unsupervised Cross Domain Translation
Given a single image x from domain A and a set of images from domain B, our
task is to generate the analog of x in B. We argue that this task could be a
key AI capability that underlies the ability of cognitive agents to act in the
world, and we present empirical evidence that existing unsupervised domain
translation methods fail on this task. Our method follows a two-step process.
First, a variational autoencoder for domain B is trained. Then, given the new
sample x, we create a variational autoencoder for domain A by adapting the
layers that are close to the image in order to directly fit x, and only
indirectly adapt the other layers. Our experiments indicate that the new method
does as well, when trained on one sample x, as the existing domain transfer
methods, when these enjoy a multitude of training samples from domain A. Our
code is made publicly available at
https://github.com/sagiebenaim/OneShotTranslation
Comment: Published at NIPS 2018
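A sketch of the selective-adaptation step described above: when fitting the single sample x, gradients flow only through the layers closest to the image, while the remaining (shared) layers are frozen and adapted only indirectly. Treating the encoder as a flat list of layers, and the parameter n_adapt, are simplifications of this sketch.

```python
import torch

def freeze_all_but_lowest(encoder, n_adapt):
    # Enable gradients only for the first n_adapt layers (those
    # closest to the image); freeze the rest so they change only
    # indirectly through the adapted layers.
    for i, layer in enumerate(encoder.children()):
        for p in layer.parameters():
            p.requires_grad = i < n_adapt
```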
Estimating the Success of Unsupervised Image to Image Translation
While in supervised learning, the validation error is an unbiased estimator
of the generalization (test) error and complexity-based generalization bounds
are abundant, no such bounds exist for learning a mapping in an unsupervised
way. As a result, when training GANs and specifically when using GANs for
learning to map between domains in a completely unsupervised way, one is forced
to select the hyperparameters and the stopping epoch by subjectively examining
multiple options. We propose a novel bound for predicting the success of
unsupervised cross domain mapping methods, which is motivated by the recently
proposed Simplicity Principle. The bound can be applied either in expectation,
for comparing hyperparameters and selecting a stopping criterion, or per
sample, in order to predict the success of a specific cross-domain translation.
The utility of the bound is demonstrated in an extensive set of experiments
employing multiple recent algorithms. Our code is available at
https://github.com/sagiebenaim/gan_bound
Comment: The first and second authors contributed equally
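Purely as an illustration of the stopping-criterion use case: if the bound can be evaluated once per epoch, model selection reduces to picking the epoch with the smallest estimated bound. The list bound_per_epoch is a placeholder; the abstract does not define how the bound is computed.

```python
def select_stopping_epoch(bound_per_epoch):
    # bound_per_epoch: list of floats, one (estimated) bound value
    # per training epoch; its computation is a placeholder here.
    return min(range(len(bound_per_epoch)), key=bound_per_epoch.__getitem__)
```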
Batch weight for domain adaptation with mass shift
Unsupervised domain transfer is the task of transferring or translating
samples from a source distribution to a different target distribution. Current
solutions for unsupervised domain transfer often operate on data in which the
modes of the distribution are well matched, for instance, where the classes
have the same frequencies in the source and target distributions. However,
these models do not
perform well when the modes are not well-matched, as would be the case when
samples are drawn independently from two different, but related, domains. This
mode imbalance is problematic as generative adversarial networks (GANs), a
successful approach in this setting, are sensitive to mode frequency, which
results in a mismatch of semantics between source samples and generated samples
of the target distribution. We propose a principled method of re-weighting
training samples to correct for such mass shift between the transferred
distributions, which we call batch-weight. We also provide a rigorous
probabilistic setting for domain transfer and a new, simplified objective for
training transfer networks, an alternative to the complex, multi-component loss
functions used in current state-of-the-art image-to-image translation
models. The new objective stems from the discrimination of joint distributions
and enforces cycle-consistency in an abstract, high-level, rather than
pixel-wise, sense. Lastly, we experimentally show the effectiveness of the
proposed methods in several image-to-image translation tasks.
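A hedged sketch of the re-weighting idea using standard importance weights: each sample is weighted by the ratio of its class frequency in the target to that in the source. Known class labels and counts are assumed here for illustration, even though the paper's setting is unsupervised and its exact weighting scheme may differ.

```python
import torch

def batch_weights(src_counts, tgt_counts, labels):
    # Per-class importance weights: target frequency / source frequency.
    src = src_counts.float() / src_counts.sum()
    tgt = tgt_counts.float() / tgt_counts.sum()
    w = tgt / src.clamp(min=1e-8)
    return w[labels]   # one weight per sample in the batch
```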
Expression Conditional GAN for Facial Expression-to-Expression Translation
In this paper, we focus on the facial expression translation task and propose
a novel Expression Conditional GAN (ECGAN) which can learn the mapping from one
image domain to another one based on an additional expression attribute. The
proposed ECGAN is a generic framework and is applicable to different expression
generation tasks in which a specific facial expression can be easily controlled
by the conditional attribute label. In addition, we introduce a novel face mask
loss to reduce the influence of background changes. Moreover, we propose an entire
framework for facial expression generation and recognition in the wild, which
consists of two modules, i.e., generation and recognition. Finally, we evaluate
our framework on several public face datasets in which the subjects have
different races, illumination, occlusion, pose, color, content and background
conditions. Even though these datasets are very diverse, both the qualitative
and quantitative results demonstrate that our approach is able to generate
facial expressions accurately and robustly.
Comment: 5 pages, 5 figures, accepted to ICIP 2019
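One natural form of a face mask loss that reduces the influence of background changes is to penalize differences between input and output outside the face region; the formulation below is an assumption of this sketch, not the paper's exact definition.

```python
import torch

def face_mask_loss(x, y, face_mask):
    # face_mask: 1 on the face region, 0 on the background (assumed).
    background = 1.0 - face_mask
    # Penalize any change the generator makes outside the face.
    return torch.mean(background * torch.abs(y - x))
```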
Visual Analogies between Atari Games for Studying Transfer Learning in RL
In this work, we ask the following question: Can visual analogies, learned in
an unsupervised way, be used in order to transfer knowledge between pairs of
games and even play one game using an agent trained for another game? We
attempt to answer this research question by creating visual analogies between a
pair of games: a source game and a target game. For example, given a video
frame in the target game, we map it to an analogous state in the source game
and then attempt to play using a trained policy learned for the source game. We
demonstrate convincing visual mapping between four pairs of games (eight
mappings), which are used to evaluate three transfer learning approaches
- …
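The transfer loop described in the abstract above is simple to state: map each target-game frame to its analog in the source game, then act with the policy trained on the source game. Both callables below are placeholders for the learned analogy mapping and the trained agent.

```python
def act_in_target_game(frame_target, analogy_G, source_policy):
    # Visual analogy: target-game frame -> analogous source-game frame.
    frame_source = analogy_G(frame_target)
    # Reuse the source-game policy on the mapped frame.
    return source_policy(frame_source)
```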