CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images
Powered by convolutional neural networks (CNNs), CNN-based face
reconstruction has recently shown promising performance in reconstructing
detailed face shape from 2D face images. The success of CNN-based methods
relies on a large amount of labeled data. The state-of-the-art synthesizes such
data using a coarse morphable face model, which however has difficulty
generating detailed photo-realistic images of faces (with wrinkles). This paper
presents a novel face data generation method. Specifically, we render a large
number of photo-realistic face images with different attributes based on
inverse rendering. Furthermore, we construct a fine-detailed face image dataset
by transferring different scales of details from one image to another. We also
construct a large number of video-type adjacent frame pairs by simulating the
distribution of real video data. With these nicely constructed datasets, we
propose a coarse-to-fine learning framework consisting of three convolutional
networks. The networks are trained for real-time detailed 3D face
reconstruction from monocular video as well as from a single image. Extensive
experimental results demonstrate that our framework can produce high-quality
reconstruction with much less computation time than the
state-of-the-art. Moreover, our method is robust to pose, expression and
lighting due to the diversity of data.
Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine
Intelligence, 201
3D face reconstruction with geometry details from a single image
3D face reconstruction from a single image is a classical and challenging problem with wide applications in many areas. Inspired by recent works in face animation from RGB-D or monocular video inputs, we develop a novel method for reconstructing 3D faces from unconstrained 2D images using a coarse-to-fine optimization strategy. First, a smooth coarse 3D face is generated from an example-based bilinear face model by aligning the projection of 3D face landmarks with 2D landmarks detected from the input image. Afterward, using local corrective deformation fields, the coarse 3D face is refined using photometric consistency constraints, resulting in a medium face shape. Finally, a shape-from-shading method is applied on the medium face to recover fine geometric details. Our method outperforms the state-of-the-art approaches in terms of accuracy and detail recovery, which is demonstrated in extensive experiments using real-world models and publicly available data sets.
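The landmark-alignment step in the abstract above can be illustrated with a much simpler problem. The sketch below is not the paper's bilinear-model fit: it assumes a fixed head rotation and an orthographic projection that drops the z coordinate, and solves in closed form for the scale and 2D translation that best align projected 3D landmarks with detected 2D landmarks. The function name `fit_scale_translation` and the sample coordinates are hypothetical.

```python
def fit_scale_translation(points3d, points2d):
    """Least-squares scale s and translation t so that s * (x, y) + t ~ (u, v).

    Assumptions (not from the abstract): rotation is fixed to identity and
    the camera is orthographic, so each 3D landmark projects to its (x, y).
    """
    n = len(points3d)
    if n == 0 or n != len(points2d):
        raise ValueError("need the same non-zero number of 3D and 2D landmarks")

    # Orthographic projection: keep x and y, drop z.
    proj = [(x, y) for x, y, _z in points3d]

    # Centroids of the projected model landmarks and of the image landmarks.
    mpx = sum(p[0] for p in proj) / n
    mpy = sum(p[1] for p in proj) / n
    mux = sum(u[0] for u in points2d) / n
    muy = sum(u[1] for u in points2d) / n

    # Closed-form least-squares scale on the centered point sets.
    num = sum((px - mpx) * (ux - mux) + (py - mpy) * (uy - muy)
              for (px, py), (ux, uy) in zip(proj, points2d))
    den = sum((px - mpx) ** 2 + (py - mpy) ** 2 for px, py in proj)
    if den == 0:
        raise ValueError("projected landmarks must not all coincide")
    s = num / den

    # The translation maps the scaled model centroid onto the image centroid.
    t = (mux - s * mpx, muy - s * mpy)
    return s, t


# Hypothetical landmarks: the 2D points equal 2 * (x, y) + (5, -3).
model = [(0, 0, 1), (1, 0, 2), (0, 1, 3), (1, 1, 0)]
image = [(5, -3), (7, -3), (5, -1), (7, -1)]
scale, shift = fit_scale_translation(model, image)
```

A full fit would also estimate rotation and the face-model coefficients, typically by alternating or joint nonlinear optimization; this sketch covers only the pose part under the stated simplifications.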
Long-range concealed object detection through active covert illumination
© 2015 SPIE. When capturing a scene for surveillance, the addition of rich 3D data can dramatically improve the accuracy of object detection or face recognition. Traditional 3D techniques, such as geometric stereo, only provide a coarse-grained reconstruction of the scene and are ill-suited to fine analysis. Photometric stereo is a well-established technique providing dense, high-resolution reconstructions, using active artificial illumination of an object from multiple directions to gather surface information. It is typically used indoors, at short range
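As background for the photometric stereo technique mentioned in the abstract above, here is a minimal per-pixel sketch of the classical formulation. It assumes a Lambertian surface, exactly three known, linearly independent light directions, and no shadows or specular highlights. None of these assumptions come from the abstract, and the function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 linear system m @ x = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def normal_and_albedo(lights, intensities):
    """Recover the unit surface normal and albedo at one pixel.

    Lambertian model: intensity_k = albedo * dot(light_k, normal).
    Solving lights @ g = intensities gives g = albedo * normal.
    """
    g = solve3(lights, intensities)
    albedo = math.sqrt(sum(x * x for x in g))
    if albedo == 0:
        raise ValueError("pixel is unlit under all three lights")
    normal = [x / albedo for x in g]
    return normal, albedo


# Hypothetical pixel: true normal (0, 0, 1), albedo 0.9, three unit lights.
lights = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8)]
intensities = [0.9, 0.72, 0.72]
normal, albedo = normal_and_albedo(lights, intensities)
```

With more than three lights the same system is usually solved by least squares, which also helps when some measurements are shadowed.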
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction
In this work, we present a multimodal solution to the problem of 4D face
reconstruction from monocular videos. 3D face reconstruction from 2D images is
an under-constrained problem due to the ambiguity of depth. State-of-the-art
methods try to solve this problem by leveraging visual information from a
single image or video, whereas 3D mesh animation approaches rely more on audio.
However, in most cases (e.g. AR/VR applications), videos include both visual
and speech information. We propose AVFace that incorporates both modalities and
accurately reconstructs the 4D facial and lip motion of any speaker, without
requiring any 3D ground truth for training. A coarse stage estimates the
per-frame parameters of a 3D morphable model, followed by a lip refinement, and
then a fine stage recovers facial geometric details. Due to the temporal audio
and video information captured by transformer-based modules, our method is
robust in cases when either modality is insufficient (e.g. face occlusions).
Extensive qualitative and quantitative evaluation demonstrates the superiority
of our method over the current state-of-the-art.
3D corrective nose reconstruction from a single image
There is a steadily growing range of applications that can benefit from facial reconstruction techniques, leading to an increasing demand for reconstruction of high-quality 3D face models. While it is an important expressive part of the human face, the nose has received less attention than other expressive regions in the face reconstruction literature. When applying existing reconstruction methods to facial images, the reconstructed nose models are often inconsistent with the desired shape and expression. In this paper, we propose a coarse-to-fine 3D nose reconstruction and correction pipeline to build a nose model from a single image, where 3D and 2D nose curve correspondences are adaptively updated and refined. We first correct the reconstruction result coarsely using constraints of 3D-2D sparse landmark correspondences, and then heuristically update a dense 3D-2D curve correspondence based on the coarsely corrected result. A final refinement step is performed to correct the shape based on the updated 3D-2D dense curve constraints. Experimental results show the advantages of our method for 3D nose reconstruction over existing methods.
Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks
3D Morphable Model (3DMM) based methods have achieved great success in
recovering 3D face shapes from single-view images. However, the facial textures
recovered by such methods lack the fidelity exhibited in the input images.
Recent work demonstrates high-quality facial texture recovery with generative
networks trained from a large-scale database of high-resolution UV maps of face
textures, which is hard to prepare and not publicly available. In this paper,
we introduce a method to reconstruct 3D facial shapes with high-fidelity
textures from single-view images in-the-wild, without the need to capture a
large-scale face texture database. The main idea is to refine the initial
texture generated by a 3DMM based method with facial details from the input
image. To this end, we propose to use graph convolutional networks to
reconstruct the detailed colors for the mesh vertices instead of reconstructing
the UV map. Experiments show that our method can generate high-quality results
and outperforms state-of-the-art methods in both qualitative and quantitative
comparisons.
Comment: Accepted to CVPR 2020. The source code is available at
https://github.com/FuxiCV/3D-Face-GCN
Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz
The reconstruction of dense 3D models of face geometry and appearance from a
single image is highly challenging and ill-posed. To constrain the problem,
many approaches rely on strong priors, such as parametric face models learned
from limited 3D scan data. However, prior models restrict generalization of the
true diversity in facial geometry, skin reflectance and illumination. To
alleviate this problem, we present the first approach that jointly learns 1) a
regressor for face shape, expression, reflectance and illumination on the basis
of 2) a concurrently learned parametric face model. Our multi-level face model
combines the advantage of 3D Morphable Models for regularization with the
out-of-space generalization of a learned corrective space. We train end-to-end
on in-the-wild images without dense annotations by fusing a convolutional
encoder with a differentiable expert-designed renderer and a self-supervised
training loss, both defined at multiple detail levels. Our approach compares
favorably to the state-of-the-art in terms of reconstruction quality, better
generalizes to real world faces, and runs at over 250 Hz.
Comment: CVPR 2018 (Oral). Project webpage:
https://gvv.mpi-inf.mpg.de/projects/FML