Photo-Realistic Facial Details Synthesis from Single Image
We present a single-image 3D face synthesis technique that can handle
challenging facial expressions while recovering fine geometric details. Our
technique employs expression analysis for proxy face geometry generation and
combines supervised and unsupervised learning for facial detail synthesis. On
proxy generation, we conduct emotion prediction to determine a new
expression-informed proxy. On detail synthesis, we present a Deep Facial Detail
Net (DFDN) based on a Conditional Generative Adversarial Net (CGAN) that employs
both geometry and appearance loss functions. For geometry, we capture 366
high-quality 3D scans from 122 different subjects under 3 facial expressions.
For appearance, we use an additional 20K in-the-wild face images and apply
image-based rendering to accommodate lighting variations. Comprehensive
experiments demonstrate that our framework can produce high-quality 3D faces
with realistic details under challenging facial expressions.
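
As a rough illustration of how a CGAN-based detail network can combine a supervised geometry loss with an unsupervised appearance loss, consider the following PyTorch sketch. All module names, the loss weights, and the `render` hook are illustrative assumptions, not the authors' DFDN implementation:

```python
# Minimal sketch of a CGAN-style detail network trained with both a supervised
# geometry loss and an unsupervised appearance loss. All names and weights are
# illustrative assumptions, not the authors' DFDN code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetailGenerator(nn.Module):
    """Predicts a per-pixel displacement (detail) map from the face image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, img):
        return self.net(img)

class Discriminator(nn.Module):
    """Scores (image, displacement) pairs, as in a conditional GAN."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, img, disp):
        return self.net(torch.cat([img, disp], dim=1))

def generator_loss(G, D, img, gt_disp, render, lam_geo=10.0, lam_app=1.0):
    """Adversarial + geometry (vs. scan details) + appearance (re-rendering) terms."""
    disp = G(img)
    logits = D(img, disp)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    geo = F.l1_loss(disp, gt_disp)        # supervised: scan-derived ground truth
    app = F.l1_loss(render(disp), img)    # unsupervised: re-rendered vs. input
    return adv + lam_geo * geo + lam_app * app
```

Here `render` stands in for a differentiable image-based rendering step that re-lights the detailed geometry for comparison against the input photo.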
Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks
3D Morphable Model (3DMM) based methods have achieved great success in
recovering 3D face shapes from single-view images. However, the facial textures
recovered by such methods lack the fidelity exhibited in the input images.
Recent work demonstrates high-quality facial texture recovery with generative
networks trained from a large-scale database of high-resolution UV maps of face
textures, which is hard to prepare and not publicly available. In this paper,
we introduce a method to reconstruct 3D facial shapes with high-fidelity
textures from single-view images in-the-wild, without the need to capture a
large-scale face texture database. The main idea is to refine the initial
texture generated by a 3DMM based method with facial details from the input
image. To this end, we propose to use graph convolutional networks to
reconstruct the detailed colors for the mesh vertices instead of reconstructing
the UV map. Experiments show that our method can generate high-quality results
and outperforms state-of-the-art methods in both qualitative and quantitative
comparisons.

Comment: Accepted to CVPR 2020. The source code is available at
https://github.com/FuxiCV/3D-Face-GCN
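
A minimal sketch of the per-vertex color idea, assuming a plain normalized-adjacency graph convolution; the layer, feature dimensions, and toy mesh below are illustrative and are not taken from the linked repository:

```python
# Predicting per-vertex colors with graph convolutions over the face mesh,
# instead of regressing a UV texture map. Illustrative sketch only.
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x:   (V, in_dim) per-vertex features
        # adj: (V, V) row-normalized mesh adjacency (with self-loops)
        return self.linear(adj @ x)

class VertexColorDecoder(nn.Module):
    """Maps per-vertex features (e.g., sampled image features plus the initial
    3DMM colors) to refined per-vertex RGB values."""
    def __init__(self, feat_dim, hidden=128):
        super().__init__()
        self.gc1 = GraphConv(feat_dim, hidden)
        self.gc2 = GraphConv(hidden, 3)

    def forward(self, feats, adj):
        h = torch.relu(self.gc1(feats, adj))
        return torch.sigmoid(self.gc2(h, adj))  # RGB in [0, 1]

# Usage with a toy mesh: 4 vertices in a ring.
V, FEAT_DIM = 4, 16
edges = torch.tensor([[0, 1], [1, 2], [2, 3], [3, 0]])
adj = torch.eye(V)
adj[edges[:, 0], edges[:, 1]] = 1.0
adj[edges[:, 1], edges[:, 0]] = 1.0
adj = adj / adj.sum(dim=1, keepdim=True)      # row-normalize

decoder = VertexColorDecoder(FEAT_DIM)
colors = decoder(torch.randn(V, FEAT_DIM), adj)  # (4, 3) per-vertex RGB
```

Working on mesh vertices directly sidesteps the need for a large database of high-resolution UV texture maps, which is the paper's stated motivation.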
State-of-the-Art in 3D Face Reconstruction from a Single RGB Image
Since diverse and complex emotions must be expressed through different facial deformations and appearances, facial animation remains a serious, ongoing challenge for the computer animation industry. Face reconstruction techniques based on the 3D morphable face model and deep learning provide an effective solution: they reuse existing databases to create believable animations of new characters from images or videos in seconds, greatly reducing manual effort and time. In this paper, we review the databases and state-of-the-art methods for 3D face reconstruction from a single RGB image. First, we classify 3D reconstruction methods into three categories and review each of them: Shape-from-Shading (SFS), 3D Morphable Face Model (3DMM), and Deep Learning (DL) based 3D face reconstruction. Next, we introduce existing 2D and 3D facial databases. After that, we review 10 methods of deep learning-based 3D face reconstruction and evaluate four representative ones among them. Finally, we draw conclusions and discuss future research directions.
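
For reference, the linear 3DMM underlying the second category expresses a face shape as a mean shape plus identity and expression offsets spanned by PCA bases, S = S̄ + A_id·α + A_exp·β. The NumPy sketch below uses illustrative basis sizes (production models such as the Basel Face Model use far more vertices):

```python
# Linear 3D Morphable Model: shape = mean + identity offset + expression offset.
# Basis sizes here are illustrative placeholders.
import numpy as np

n_vertices = 1000
s_mean = np.zeros(3 * n_vertices)             # mean shape, flattened (x, y, z)
A_id   = np.random.randn(3 * n_vertices, 80)  # identity basis (80 components)
A_exp  = np.random.randn(3 * n_vertices, 64)  # expression basis (64 components)

alpha = np.random.randn(80) * 0.1             # identity coefficients
beta  = np.random.randn(64) * 0.1             # expression coefficients

S = s_mean + A_id @ alpha + A_exp @ beta      # reconstructed shape
vertices = S.reshape(n_vertices, 3)           # back to (V, 3) vertex positions
```

Reconstruction methods in this category fit the coefficients α and β (plus pose and lighting) so that the projected model matches the input image.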
Video-driven Neural Physically-based Facial Asset for Production
Production-level workflows for producing convincing 3D dynamic human faces
have long relied on an assortment of labor-intensive tools for geometry and
texture generation, motion capture and rigging, and expression synthesis.
Recent neural approaches automate individual components, but the corresponding
latent representations cannot provide artists with explicit controls as in
conventional tools. In this paper, we present a new learning-based,
video-driven approach for generating dynamic facial geometries with
high-quality physically-based assets. For data collection, we construct a
hybrid multiview-photometric capture stage, coupled with ultra-fast video
cameras to obtain raw 3D facial assets. We then set out to model the facial
expression, geometry, and physically-based textures using separate VAEs, imposing
a global MLP-based expression mapping across the latent spaces of the respective
networks to preserve the characteristics of each attribute.
We also model the delta information as wrinkle maps for the physically-based
textures, achieving high-quality 4K dynamic textures. We demonstrate our
approach in high-fidelity performer-specific facial capture and cross-identity
facial motion retargeting. In addition, our multi-VAE-based neural asset, together
with fast adaptation schemes, can be deployed to handle in-the-wild videos. We
further demonstrate the utility of our explicit facial disentangling strategy
through a variety of promising physically-based editing results with high
realism. Comprehensive experiments show that our technique provides higher
accuracy and visual fidelity than previous video-driven facial reconstruction
and animation methods.

Comment: For project page, see https://sites.google.com/view/npfa/
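
A minimal sketch of the multi-VAE idea with toy dimensions: separate VAEs encode expression, geometry, and texture, and a shared MLP maps the expression latent into each attribute's latent space so one expression code drives all assets. All names and sizes are assumptions, not the paper's implementation:

```python
# Multi-VAE sketch: an expression latent is mapped through MLPs into the latent
# spaces of the geometry and texture decoders. Dimensions are placeholders.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.enc = nn.Linear(in_dim, 2 * z_dim)  # outputs (mu, logvar)
        self.dec = nn.Linear(z_dim, in_dim)

    def encode(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    def decode(self, z):
        return self.dec(z)

z_dim = 32
expr_vae = VAE(in_dim=128, z_dim=z_dim)   # per-frame expression signal from video
geom_vae = VAE(in_dim=3000, z_dim=z_dim)  # facial geometry (flattened vertices)
tex_vae  = VAE(in_dim=4096, z_dim=z_dim)  # physically-based texture / wrinkle deltas

# Global MLP mapping: expression latent -> geometry and texture latents.
expr_to_geom = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
expr_to_tex  = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))

expr = torch.randn(1, 128)                       # expression feature for one frame
z_e = expr_vae.encode(expr)
geometry = geom_vae.decode(expr_to_geom(z_e))    # driven facial geometry
texture  = tex_vae.decode(expr_to_tex(z_e))      # driven dynamic texture (e.g., wrinkles)
```

Keeping the attributes in separate latent spaces, linked only through the mapping MLPs, is what gives artists the explicit per-attribute controls the abstract contrasts with monolithic latent representations.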