SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete Attribute
Manipulating latent code in generative adversarial networks (GANs) for facial
image synthesis mainly focuses on continuous attribute synthesis (e.g., age,
pose and emotion), while discrete attribute synthesis (like face mask and
eyeglasses) receives less attention. Directly applying existing works to facial
discrete attributes may cause inaccurate results. In this work, we propose an
innovative framework, dubbed SD-GAN, to tackle challenging facial discrete
attribute synthesis via semantic decomposition. Concretely, we explicitly
decompose the discrete attribute representation into two components, i.e., a
semantic prior basis and an offset latent representation. The semantic prior
basis provides an initial direction for manipulating the face representation
in the latent space, while the offset latent representation, produced by a
3D-aware semantic fusion network, adjusts the prior basis. The fusion network
also integrates 3D embeddings for better identity preservation and discrete
attribute synthesis. Combining the prior basis and the offset latent
representation enables our method to synthesize photo-realistic face images
with discrete attributes. Notably, we construct a large and valuable dataset,
MEGN (Face Mask and Eyeglasses images crawled from Google and Naver), to
address the lack of discrete attributes in existing datasets. Extensive
qualitative and quantitative experiments demonstrate the state-of-the-art
performance of our method. Our code is available at:
https://github.com/MontaEllis/SD-GAN.
Comment: 16 pages, 12 figures, Accepted by ACM MM202
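The latent decomposition the abstract describes, i.e. an edit direction formed from a fixed semantic prior basis plus a per-image offset, can be sketched roughly as follows. All names, dimensions, and the stubbed fusion network are illustrative placeholders, not the authors' implementation:

```python
import numpy as np

# Illustrative sketch of SD-GAN's semantic decomposition: a discrete-attribute
# edit direction is the sum of a fixed semantic prior basis and a per-sample
# offset predicted by a (stubbed) 3D-aware semantic fusion network.

rng = np.random.default_rng(0)
LATENT_DIM = 512

def semantic_prior_basis(dim=LATENT_DIM):
    """Fixed unit direction that initializes the attribute edit."""
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def fusion_network_offset(z, dim=LATENT_DIM):
    """Stub for the 3D-aware semantic fusion network: predicts an
    offset adapting the prior direction to this particular face."""
    W = rng.standard_normal((dim, dim)) * 0.01  # placeholder weights
    return W @ z

def edit_latent(z, strength=1.0):
    """Apply the composed edit direction to a latent code z."""
    direction = semantic_prior_basis() + fusion_network_offset(z)
    return z + strength * direction

z = rng.standard_normal(LATENT_DIM)
z_edited = edit_latent(z, strength=0.8)
```

In the actual method the edited latent is then decoded by the GAN generator; here the sketch stops at the latent-space manipulation.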
MoFaNeRF: Morphable Facial Neural Radiance Field
We propose a parametric model that maps free-view images into a vector space
of coded facial shape, expression and appearance with a neural radiance field,
namely Morphable Facial NeRF. Specifically, MoFaNeRF takes the coded facial
shape, expression and appearance along with space coordinate and view direction
as input to an MLP, and outputs the radiance of the space point for
photo-realistic image synthesis. Compared with conventional 3D morphable models
(3DMM), MoFaNeRF shows superiority in directly synthesizing photo-realistic
facial details even for eyes, mouths, and beards. Also, continuous face
morphing can be easily achieved by interpolating the input shape, expression
and appearance codes. By introducing identity-specific modulation and texture
encoder, our model synthesizes accurate photometric details and shows strong
representation ability. Our model performs strongly on multiple applications,
including image-based fitting, random generation, face rigging, face editing,
and novel view synthesis. Experiments show that our method achieves higher
representation ability than previous parametric models and competitive
performance in several applications. To the best of our knowledge, our work is
the first facial parametric model built upon a neural radiance field that can
be used in fitting, generation, and manipulation. The code and data are
available at https://github.com/zhuhao-nju/mofanerf.
Comment: accepted to ECCV2022; code available at
http://github.com/zhuhao-nju/mofaner
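The input/output interface described above, an MLP mapping coded shape, expression, and appearance plus a 3D point and view direction to radiance, can be sketched as below. The tiny random MLP and all dimensions are placeholders, not the authors' network:

```python
import numpy as np

# Rough sketch of the MoFaNeRF interface: concatenate shape/expression/
# appearance codes with a space coordinate and view direction, and map
# them through an MLP to RGB radiance and a volume density.

rng = np.random.default_rng(1)
D_SHAPE, D_EXPR, D_APP = 64, 32, 32  # placeholder code dimensions

def mofanerf_mlp(shape_code, expr_code, app_code, xyz, view_dir):
    x = np.concatenate([shape_code, expr_code, app_code, xyz, view_dir])
    h = np.tanh(rng.standard_normal((128, x.size)) @ x)  # hidden layer (random weights)
    out = rng.standard_normal((4, 128)) @ h              # rgb (3 dims) + density (1 dim)
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))                 # sigmoid -> values in (0, 1)
    density = np.logaddexp(0.0, out[3])                  # softplus -> non-negative
    return rgb, density

rgb, sigma = mofanerf_mlp(
    rng.standard_normal(D_SHAPE), rng.standard_normal(D_EXPR),
    rng.standard_normal(D_APP), rng.standard_normal(3), rng.standard_normal(3))
```

Continuous face morphing then amounts to linearly interpolating two sets of shape/expression/appearance codes before feeding them to the MLP.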
HeadGAN: one-shot neural head synthesis and editing
Recent attempts to solve the problem of head reenactment using a single reference image have shown promising results. However, most of them perform poorly in terms of photo-realism, fail to preserve identity, or do not fully transfer the driving pose and expression. We propose HeadGAN, a novel system that conditions synthesis on 3D face representations, which can be extracted from any driving video and adapted to the facial geometry of any reference image, disentangling identity from expression. We further improve mouth movements by utilising audio features as a complementary input. The 3D face representation also enables HeadGAN to serve as an efficient method for compression and reconstruction, and as a tool for expression and pose editing.
VITON: An Image-based Virtual Try-on Network
We present an image-based Virtual Try-On Network (VITON) without using 3D
information in any form, which seamlessly transfers a desired clothing item
onto the corresponding region of a person using a coarse-to-fine strategy.
Conditioned upon a new clothing-agnostic yet descriptive person representation,
our framework first generates a coarse synthesized image with the target
clothing item overlaid on that same person in the same pose. We further enhance
the initial blurry clothing area with a refinement network. The network is
trained to learn how much detail to utilize from the target clothing item, and
where to apply it on the person, in order to synthesize a photo-realistic image in
which the target item deforms naturally with clear visual patterns. Experiments
on our newly collected Zalando dataset demonstrate its promise in the
image-based virtual try-on task over state-of-the-art generative models.
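The coarse-to-fine refinement described above is commonly realized as a learned per-pixel composition between the warped clothing item and the coarse synthesis. The sketch below illustrates that composition only; the arrays stand in for network outputs and are not VITON's actual model:

```python
import numpy as np

# Illustrative composition step of a VITON-style refinement stage:
# a predicted mask decides, per pixel, how much detail to take from the
# warped target clothing versus the coarse synthesized image.

H, W = 4, 3
coarse = np.full((H, W, 3), 0.5)        # coarse synthesized person image
warped_cloth = np.full((H, W, 3), 0.9)  # clothing item warped onto the body region
mask = np.random.default_rng(2).random((H, W, 1))  # predicted composition mask in [0, 1)

# Broadcast the single-channel mask over RGB and blend the two images.
refined = mask * warped_cloth + (1.0 - mask) * coarse
```

Where the mask is close to 1, the refined output keeps the clothing's visual patterns; where it is close to 0, the coarse synthesis is preserved.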
Using Photorealistic Face Synthesis and Domain Adaptation to Improve Facial Expression Analysis
Synthesizing realistic faces across domains to train deep models has attracted
increasing attention in facial expression analysis, as it helps improve
expression recognition accuracy despite a small number of real training
images. However, learning from synthetic face images can be problematic due to
the distribution discrepancy between low-quality synthetic images and real
face images, and may not achieve the desired performance when the learned
model is applied to real-world scenarios. To this end, we propose a new
attribute-guided face image synthesis method that performs translation between
multiple image domains using a single model. In addition, we adopt the proposed
model to learn from synthetic faces by matching the feature distributions
between different domains while preserving each domain's characteristics. We
evaluate the effectiveness of the proposed approach on several face datasets on
generating realistic face images. We demonstrate that the expression
recognition performance can be enhanced by benefiting from our face synthesis
model. Moreover, we conduct experiments on a near-infrared dataset containing
facial expression videos of drivers to assess performance on in-the-wild data
for driver emotion recognition.
Comment: 8 pages, 8 figures, 5 tables, accepted by FG 2019. arXiv admin note:
substantial text overlap with arXiv:1905.0028
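"Matching the feature distributions between different domains" is often implemented as a penalty on the distance between batch feature statistics. The sketch below shows a first-moment version of such a criterion; the paper may use a different matching loss, and the feature arrays here are synthetic placeholders:

```python
import numpy as np

# Minimal sketch of cross-domain feature distribution matching: penalize
# the distance between the mean feature vectors of a synthetic-face batch
# and a real-face batch (a first-moment simplification of MMD-style losses).

rng = np.random.default_rng(3)
feat_synth = rng.standard_normal((32, 128))        # features of synthetic faces
feat_real = rng.standard_normal((32, 128)) + 0.5   # features of real faces (shifted domain)

def mean_feature_distance(a, b):
    """L2 distance between per-batch mean features."""
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))

loss = mean_feature_distance(feat_synth, feat_real)
```

Minimizing such a term during training pulls the synthetic-face feature distribution toward the real one while the task loss preserves each domain's characteristics.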
3D Face Synthesis Driven by Personality Impression
Synthesizing 3D faces that give certain personality impressions is commonly
needed in computer games, animations, and virtual world applications for
producing realistic virtual characters. In this paper, we propose a novel
approach to synthesize 3D faces based on personality impression for creating
virtual characters. Our approach consists of two major steps. In the first
step, we train classifiers using deep convolutional neural networks on a
dataset of images with personality impression annotations, which are capable of
predicting the personality impression of a face. In the second step, given a 3D
face and a desired personality impression type as user inputs, our approach
optimizes the facial details against the trained classifiers, so as to
synthesize a face which gives the desired personality impression. We
demonstrate our approach for synthesizing 3D faces giving desired personality
impressions on a variety of 3D face models. Perceptual studies show that the
perceived personality impressions of the synthesized faces agree with the
target personality impressions specified for synthesizing the faces. Please
refer to the supplementary materials for all results.
Comment: 8 pages; 6 figure
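The second step above, optimizing facial details against a trained classifier, can be sketched as simple gradient ascent on the classifier's score. The quadratic "classifier" below is a stand-in with a known optimum, not the paper's CNN, and all parameters are illustrative:

```python
import numpy as np

# Sketch of optimizing face parameters against a trained personality-
# impression classifier: adjust the parameters by gradient ascent so the
# desired impression score increases.

target = np.array([0.3, -0.7, 1.2])  # parameters at which the stub score peaks

def classifier_score(params):
    """Stub classifier score, maximized at `target`. A real system
    would backpropagate through a trained CNN instead."""
    return -np.sum((params - target) ** 2)

def optimize_face(params, lr=0.1, steps=200):
    """Gradient ascent on the (analytically differentiated) stub score."""
    for _ in range(steps):
        grad = -2.0 * (params - target)  # d(score)/d(params) for the stub
        params = params + lr * grad
    return params

face = optimize_face(np.zeros(3))  # converges toward `target`
```

With a real CNN classifier, the gradient would come from automatic differentiation rather than the closed form used here.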