Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model
We propose a novel end-to-end semi-supervised adversarial framework to
generate photorealistic face images of new identities with wide ranges of
expressions, poses, and illuminations conditioned by a 3D morphable model.
Previous adversarial style-transfer methods either supervise their networks
with large volumes of paired data or use unpaired data with a highly
under-constrained two-way generative framework in an unsupervised fashion. We
introduce pairwise adversarial supervision to constrain two-way domain
adaptation by a small number of paired real and synthetic images for training
along with the large volume of unpaired data. Extensive qualitative and
quantitative experiments are performed to validate our idea. Generated face
images of new identities exhibit pose, lighting, and expression diversity, and
qualitative results show that they are tightly constrained by the synthetic
input image while adding photorealism and retaining identity information. We combine
face images generated by the proposed method with the real data set to train
face recognition algorithms. We evaluate the model on two challenging data
sets: LFW and IJB-A. We observe that the images generated by our framework
consistently improve the performance of a deep face recognition network
trained on the Oxford VGG Face dataset and achieve results comparable to the
state of the art.
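The semi-supervised objective sketched in the abstract combines an adversarial term with paired supervision on the small paired set and a cycle-consistency constraint on the large unpaired set. The following is a minimal illustrative sketch of that loss weighting; the function names and the weights lambda_pair and lambda_cyc are assumptions for illustration, not taken from the paper:

```python
def paired_loss(fake, real):
    # L1 reconstruction on the small paired set (synthetic -> real pairs)
    return sum(abs(f - r) for f, r in zip(fake, real)) / len(fake)

def cycle_loss(x, x_reconstructed):
    # cycle-consistency on the large unpaired set (two-way mapping x -> y -> x)
    return sum(abs(a - b) for a, b in zip(x, x_reconstructed)) / len(x)

def total_generator_loss(adv, pair, cyc, lambda_pair=10.0, lambda_cyc=10.0):
    # adversarial term plus paired supervision plus cycle constraint;
    # the relative weights are illustrative hyperparameters
    return adv + lambda_pair * pair + lambda_cyc * cyc
```

In practice the paired term constrains the otherwise under-determined two-way mapping, while the unpaired cycle term lets training exploit much larger volumes of data.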
Pixel Sampling for Style Preserving Face Pose Editing
The existing auto-encoder based face pose editing methods primarily focus on
modeling the identity preserving ability during pose synthesis, but are less
able to preserve the image style properly, which refers to the color,
brightness, saturation, etc. In this paper, we take advantage of the well-known
frontal/profile optical illusion and present a novel two-stage approach to
solve the aforementioned dilemma, where the task of face pose manipulation is
cast into face inpainting. By selectively sampling pixels from the input face
and slightly adjusting their relative locations with the proposed "Pixel
Attention Sampling" module, the face editing result faithfully keeps the
identity information as well as the image style unchanged. By leveraging
high-dimensional embedding at the inpainting stage, finer details are
generated. Further, with the 3D facial landmarks as guidance, our method is
able to manipulate face pose in three degrees of freedom, i.e., yaw, pitch, and
roll, resulting in more flexible face pose editing than merely controlling the
yaw angle as usually achieved by the current state-of-the-art. Both the
qualitative and quantitative evaluations validate the superiority of the
proposed approach.
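The style-preserving property claimed above follows from the fact that output pixels are copied from the input rather than regenerated. A toy stand-in for the sampling idea, with illustrative names and nested lists in place of tensors (the paper's actual "Pixel Attention Sampling" module is learned and differentiable):

```python
def sample_pixels(image, offsets):
    """Resample a grayscale image by shifting each output pixel's source
    location by a small (dy, dx) offset, clamped to the image bounds.
    Because every output value is copied from the input, color, brightness,
    and saturation (the image style) are preserved by construction."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = offsets[y][x]
            sy = min(max(y + dy, 0), h - 1)  # clamp source row
            sx = min(max(x + dx, 0), w - 1)  # clamp source column
            out[y][x] = image[sy][sx]
    return out
```

The second stage described in the abstract would then inpaint the regions the sampled pixels cannot cover.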
GIF: Generative Interpretable Faces
Photo-realistic visualization and animation of expressive human faces have
been a long-standing challenge. 3D face modeling methods provide parametric
control but generate unrealistic images; generative 2D models like GANs
(Generative Adversarial Networks), on the other hand, output photo-realistic
face images but lack explicit control. Recent methods gain partial control, either
by attempting to disentangle different factors in an unsupervised manner, or by
adding control post hoc to a pre-trained model. Unconditional GANs, however,
may entangle factors that are hard to undo later. We condition our generative
model on pre-defined control parameters to encourage disentanglement in the
generation process. Specifically, we condition StyleGAN2 on FLAME, a generative
3D face model. While conditioning on FLAME parameters yields unsatisfactory
results, we find that conditioning on rendered FLAME geometry and photometric
details works well. This gives us a generative 2D face model named GIF
(Generative Interpretable Faces) that offers FLAME's parametric control. Here,
interpretable refers to the semantic meaning of different parameters. Given
FLAME parameters for shape, pose, expressions, parameters for appearance,
lighting, and an additional style vector, GIF outputs photo-realistic face
images. We perform an AMT based perceptual study to quantitatively and
qualitatively evaluate how well GIF follows its conditioning. The code, data,
and trained model are publicly available for research purposes at
http://gif.is.tue.mpg.de.
Comment: International Conference on 3D Vision (3DV) 202
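The conditioning scheme GIF settles on (rendered geometry and photometric detail rather than raw FLAME parameters) amounts to supplying extra image-space channels to the generator alongside the style vector. A minimal sketch under that assumption, with illustrative names and plain nested lists standing in for tensors:

```python
def assemble_condition(geometry, photometric):
    """Channel-wise concatenation of rendered conditioning signals.
    Each argument is a list of H x W channel grids; the combined stack
    would be fed to the generator together with a separate style vector.
    This is an illustrative stand-in, not the paper's implementation."""
    h, w = len(geometry[0]), len(geometry[0][0])
    for channel in geometry + photometric:
        # all conditioning renders must share one spatial resolution
        assert len(channel) == h and len(channel[0]) == w
    return geometry + photometric
```

Conditioning on renders rather than raw parameters gives the network a spatially aligned signal, which the abstract reports works where parameter conditioning did not.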