15,329 research outputs found
LatentKeypointGAN: Controlling GANs via Latent Keypoints
Generative adversarial networks (GANs) have attained photo-realistic quality
in image generation. However, how to best control the image content remains an
open challenge. We introduce LatentKeypointGAN, a two-stage GAN which is
trained end-to-end on the classical GAN objective with internal conditioning on
a set of spatial keypoints. These keypoints have associated appearance embeddings
that respectively control the position and style of the generated objects and
their parts. A major difficulty that we address with suitable network
architectures and training schemes is disentangling the image into spatial and
appearance factors without domain knowledge and supervision signals. We
demonstrate that LatentKeypointGAN provides an interpretable latent space that
can be used to re-arrange the generated images by re-positioning and exchanging
keypoint embeddings, such as generating portraits by combining the eyes, nose,
and mouth from different images. In addition, the explicit generation of
keypoints and matching images enables a new, GAN-based method for unsupervised
keypoint detection.
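The part-wise control described above can be illustrated with a toy data structure (the names, shapes, and helper functions here are hypothetical illustrations, not the paper's implementation): each keypoint carries a 2-D position and an appearance embedding, and exchanging one part's embedding between two portraits transfers that part's style while each portrait's layout stays fixed.

```python
import random

def make_portrait(rng):
    """Toy stand-in for the latent keypoint representation: each face part
    gets a 2-D position and a 4-D appearance embedding."""
    parts = ["eyes", "nose", "mouth"]
    return {p: {"pos": [rng.random(), rng.random()],
                "emb": [rng.gauss(0.0, 1.0) for _ in range(4)]}
            for p in parts}

def swap_part(dst, src, part):
    """Copy `src`'s appearance embedding for `part` into a copy of `dst`,
    leaving `dst`'s keypoint positions (the spatial layout) unchanged."""
    out = {k: dict(v) for k, v in dst.items()}  # shallow copy per part
    out[part]["emb"] = src[part]["emb"]
    return out

rng = random.Random(0)
a, b = make_portrait(rng), make_portrait(rng)
mixed = swap_part(a, b, "eyes")  # a's layout, b's eye appearance
```

In the actual model the generator would render an image from the edited keypoints and embeddings; the sketch only shows the re-combination step that makes the latent space interpretable.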
Recycle-GAN: Unsupervised Video Retargeting
We introduce a data-driven approach for unsupervised video retargeting that
translates content from one domain to another while preserving the style native
to a domain, i.e., if contents of John Oliver's speech were to be transferred
to Stephen Colbert, then the generated content/speech should be in Stephen
Colbert's style. Our approach combines both spatial and temporal information
along with adversarial losses for content translation and style preservation.
In this work, we first study the advantages of using spatiotemporal constraints
over spatial constraints for effective retargeting. We then demonstrate the
proposed approach for the problems where information in both space and time
matters such as face-to-face translation, flower-to-flower, wind and cloud
synthesis, sunrise and sunset.

Comment: ECCV 2018; please refer to the project webpage for videos:
http://www.cs.cmu.edu/~aayushb/Recycle-GA
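The spatiotemporal constraint at the heart of this approach can be sketched as a "recycle" consistency check: translate a frame to the other domain, step it forward in time there, translate it back, and compare with the true next frame. The linear maps below are hypothetical stand-ins for the learned generators and temporal predictor, chosen so the check passes exactly.

```python
# Hypothetical stand-ins for the learned mappings:
G_Y = lambda x: [2.0 * v for v in x]   # translate domain X -> Y
G_X = lambda y: [0.5 * v for v in y]   # translate domain Y -> X
P_Y = lambda y: [v + 2.0 for v in y]   # predict the next frame within Y

def recycle_loss(x_t, x_next):
    """||x_{t+1} - G_X(P_Y(G_Y(x_t)))||^2: translate a frame, step it
    forward in time in the other domain, translate back, and compare
    against the true next frame in the source domain."""
    recon = G_X(P_Y(G_Y(x_t)))
    return sum((a - b) ** 2 for a, b in zip(x_next, recon))

x_t = [1.0, 2.0]
x_next = [v + 1.0 for v in x_t]   # toy temporal dynamics in X: add 1 per step
loss = recycle_loss(x_t, x_next)  # 0.0 here, since the toy maps are consistent
```

Minimizing this loss jointly over the generators and predictors is what couples the spatial (translation) and temporal (prediction) information in training.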
Style Separation and Synthesis via Generative Adversarial Networks
Style synthesis has attracted great interest recently, while few works focus on
its dual problem, "style separation". In this paper, we propose the Style
Separation and Synthesis Generative Adversarial Network (S3-GAN) to
simultaneously implement style separation and style synthesis on object
photographs of specific categories. Based on the assumption that the object
photographs lie on a manifold, and the contents and styles are independent, we
employ S3-GAN to build mappings between the manifold and a latent vector space
for separating and synthesizing the contents and styles. The S3-GAN consists of
an encoder network, a generator network, and an adversarial network. The
encoder network performs style separation by mapping an object photograph to a
latent vector. Two halves of the latent vector represent the content and style,
respectively. The generator network performs style synthesis by taking a
concatenated vector as input. The concatenated vector contains the style half
vector of the style target image and the content half vector of the content
target image. After obtaining images from the generator network, an
adversarial network is imposed to encourage more photo-realistic outputs.
Experiments on CelebA and UT Zappos 50K datasets demonstrate that the S3-GAN
has the capacity of style separation and synthesis simultaneously, and could
capture various styles in a single model.
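The half-and-half latent layout described above can be sketched in a few lines (the vectors here are toy stand-ins for real encoder outputs, and the helper names are hypothetical): the encoder's latent vector is read as [content half | style half], and style transfer recombines halves from two images before feeding the generator.

```python
def split_latent(z):
    """Split a latent vector into its two halves: (content, style),
    following the paper's stated layout."""
    half = len(z) // 2
    return z[:half], z[half:]

def mix(content_target_z, style_target_z):
    """Build the generator input: the content half of the content-target
    image concatenated with the style half of the style-target image."""
    content, _ = split_latent(content_target_z)
    _, style = split_latent(style_target_z)
    return content + style

z_a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]        # content-target latent
z_b = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # style-target latent
z_mix = mix(z_a, z_b)  # [0.0, 1.0, 2.0, 3.0, 12.0, 13.0, 14.0, 15.0]
```

In the full model the generator decodes `z_mix` back onto the image manifold, and the adversarial network pushes the result toward photo-realism; the sketch only shows the vector bookkeeping that makes separation and synthesis share one latent space.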