GAGAN: Geometry-Aware Generative Adversarial Networks
Deep generative models learned through adversarial training have become
increasingly popular for their ability to generate naturalistic image textures.
However, beyond texture, the visual appearance of objects is
significantly influenced by their shape geometry, information that is not
taken into account by existing generative models. This paper introduces the
Geometry-Aware Generative Adversarial Networks (GAGAN) for incorporating
geometric information into the image generation process. Specifically, in GAGAN
the generator samples latent variables from the probability space of a
statistical shape model. By mapping the output of the generator to a canonical
coordinate frame through a differentiable geometric transformation, we enforce
the geometry of the objects and add an implicit connection from the prior to
the generated object. Experimental results on face generation indicate that the
GAGAN can generate realistic images of faces with arbitrary facial attributes
such as facial expression, pose, and morphology, and that are of better quality
than those produced by current GAN-based methods. Our method can be used to
augment any existing GAN architecture and improve the quality of the generated
images.
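As a rough illustration of what sampling from a statistical shape model can look like, the sketch below uses a linear, PCA-style point distribution model: shape parameters are drawn from a zero-mean Gaussian whose variances are the model's eigenvalues, and a landmark vector is rebuilt as the mean shape plus a weighted sum of components. The abstract does not specify the form of the shape model, so the linear form, the function names (`sample_shape_params`, `reconstruct_shape`), and the toy numbers are all assumptions rather than the authors' implementation.

```python
import random
from typing import List, Sequence


def sample_shape_params(eigenvalues: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one parameter per component from N(0, eigenvalue).

    Assumption: the shape prior is a diagonal Gaussian over PCA coefficients.
    """
    return [rng.gauss(0.0, ev ** 0.5) for ev in eigenvalues]


def reconstruct_shape(
    mean_shape: Sequence[float],
    components: Sequence[Sequence[float]],
    params: Sequence[float],
) -> List[float]:
    """Return mean_shape + sum_i params[i] * components[i] as a flat landmark vector."""
    shape = list(mean_shape)
    for component, weight in zip(components, params):
        for j, value in enumerate(component):
            shape[j] += weight * value
    return shape


# Hypothetical toy model: two 2-D landmarks flattened to (x1, y1, x2, y2).
rng = random.Random(0)
mean_shape = [0.0, 0.0, 1.0, 1.0]
components = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
eigenvalues = [4.0, 1.0]

params = sample_shape_params(eigenvalues, rng)
shape = reconstruct_shape(mean_shape, components, params)
```

In a setup like the one the abstract describes, a vector such as `params` would play the role of the geometric latent code given to the generator; how it is combined with other inputs is not stated in the abstract.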
Disentangling Factors of Variation by Mixing Them
We propose an approach to learn image representations that consist of
disentangled factors of variation without exploiting any manual labeling or
data domain knowledge. A factor of variation corresponds to an image attribute
that can be discerned consistently across a set of images, such as the pose or
color of objects. Our disentangled representation consists of a concatenation
of feature chunks, each chunk representing a factor of variation. It supports
applications such as transferring attributes from one image to another, by
simply mixing and unmixing feature chunks, and classification or retrieval
based on one or several attributes, by considering a user-specified subset of
feature chunks. We learn our representation without any labeling or knowledge
of the data domain, using an autoencoder architecture with two novel training
objectives: first, we propose an invariance objective that encourages the
encoding of each attribute and the decoding of each chunk to be invariant to
changes in other attributes and chunks, respectively; second, we include a
classification objective, which ensures that each chunk corresponds to a
consistently discernible attribute in the represented image, hence avoiding
degenerate feature mappings where some chunks are completely ignored. We
demonstrate the effectiveness of our approach on the MNIST, Sprites, and CelebA
datasets.
Comment: CVPR 201
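The chunk-based representation described in this abstract can be pictured with a small sketch: a feature vector is split into equal-sized chunks, attributes are transferred by swapping chunks between two vectors, and retrieval on selected attributes uses only a subset of chunks. The equal chunk size, the plain-list representation, and the helper names (`split_chunks`, `mix_chunks`, `select_chunks`) are assumptions made for illustration; the encoder and decoder that would produce and consume these vectors are not shown.

```python
from typing import Collection, List, Sequence


def split_chunks(feature: Sequence[float], num_chunks: int) -> List[List[float]]:
    """Split a flat feature vector into num_chunks equal-sized chunks."""
    if num_chunks <= 0 or len(feature) % num_chunks != 0:
        raise ValueError("feature length must be a positive multiple of num_chunks")
    size = len(feature) // num_chunks
    return [list(feature[i * size:(i + 1) * size]) for i in range(num_chunks)]


def mix_chunks(
    feat_a: Sequence[float],
    feat_b: Sequence[float],
    num_chunks: int,
    take_from_b: Collection[int],
) -> List[float]:
    """Copy feat_a, replacing the chunks listed in take_from_b with those of feat_b."""
    chunks_a = split_chunks(feat_a, num_chunks)
    chunks_b = split_chunks(feat_b, num_chunks)
    mixed = [
        chunks_b[i] if i in take_from_b else chunks_a[i]
        for i in range(num_chunks)
    ]
    return [value for chunk in mixed for value in chunk]


def select_chunks(
    feature: Sequence[float], num_chunks: int, keep: Sequence[int]
) -> List[float]:
    """Keep only the chunks listed in keep, e.g. to compare images on chosen attributes."""
    chunks = split_chunks(feature, num_chunks)
    return [value for i in keep for value in chunks[i]]


# Hypothetical features with three chunks of two values each.
feat_a = [1, 2, 3, 4, 5, 6]
feat_b = [10, 20, 30, 40, 50, 60]

# Transfer the attribute held by chunk 1 from image B to image A.
mixed = mix_chunks(feat_a, feat_b, num_chunks=3, take_from_b={1})

# Describe image A using only chunks 0 and 2.
subset = select_chunks(feat_a, num_chunks=3, keep=[0, 2])
```

Decoding `mixed` with the trained decoder would, under the abstract's claims, yield image A with one attribute taken from image B; which chunk corresponds to which attribute is learned and is not fixed in advance.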