1,281 research outputs found
Automatic facial expression tracking for 4D range scans
This paper presents a fully automatic approach of spatio-temporal facial expression tracking for 4D range scans without any manual interventions (such as specifying landmarks). The approach consists of three steps: rigid registration, facial model reconstruction, and facial expression tracking. A Scaling Iterative Closest Points (SICP) algorithm is introduced to compute the optimal rigid registration between a template facial model and a range scan with consideration of the scale problem. A deformable model, physically based on thin shells, is proposed to faithfully reconstruct the facial surface and texture from that range data. And then the reconstructed facial model is used to track facial expressions presented in a sequence of range scans by the deformable model
Automatic 3D facial modelling with deformable models.
Facial modelling and animation has been an active research subject in computer graphics since the 1970s. Due to extremely complex biomechanical structures of human faces and peoples visual familiarity with human faces, modelling and animating realistic human faces is still one of greatest challenges in computer graphics. Since we are so familiar with human faces and very sensitive to unnatural subtle changes in human faces, it usually requires a tremendous amount of artistry and manual work to create a convincing facial model and animation. There is a clear need of developing automatic techniques for facial modelling in order to reduce manual labouring. In order to obtain a realistic facial model of an individual, it is now common to make use of 3D scanners to capture range scans from the individual and then fit a template to the range scans. However, most existing template-fitting methods require manually selected landmarks to warp the template to the range scans. It would be tedious to select landmarks by hand over a large set of range scans. Another way to reduce repeated work is synthesis by reusing existing data. One example is expression cloning, which copies facial expression from one face to another instead of creating them from scratch. This aim of this study is to develop a fully automatic framework for template-based facial modelling, facial expression transferring and facial expression tracking from range scans. In this thesis, the author developed an extension of the iterative closest points (ICP) algorithm, which is able to match a template with range scans in different scales, and a deformable model, which can be used to recover the shapes of range scans and to establish correspondences between facial models. With the registration method and the deformable model, the author proposed a fully automatic approach to reconstructing facial models and textures from range scans without re-quiring any manual interventions. In order to reuse existing data for facial modelling, the author formulated and solved the problem of facial expression transferring in the framework of discrete differential geometry. The author also applied his methods to face tracking for 4D range scans. The results demonstrated the robustness of the registration method and the capabilities of the deformable model. A number of possible directions for future work were pointed out
Capture, Learning, and Synthesis of 3D Speaking Styles
Audio-driven 3D facial animation has been widely explored, but achieving
realistic, human-like performance is still unsolved. This is due to the lack of
available 3D datasets, models, and standard evaluation metrics. To address
this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans
captured at 60 fps and synchronized audio from 12 speakers. We then train a
neural network on our dataset that factors identity from facial motion. The
learned model, VOCA (Voice Operated Character Animation) takes any speech
signal as input - even speech in languages other than English - and
realistically animates a wide range of adult faces. Conditioning on subject
labels during training allows the model to learn a variety of realistic
speaking styles. VOCA also provides animator controls to alter speaking style,
identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball
rotations) during animation. To our knowledge, VOCA is the only realistic 3D
facial animation model that is readily applicable to unseen subjects without
retargeting. This makes VOCA suitable for tasks like in-game video, virtual
reality avatars, or any scenario in which the speaker, speech, or language is
not known in advance. We make the dataset and model available for research
purposes at http://voca.is.tue.mpg.de.Comment: To appear in CVPR 201
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
To facilitate the analysis of human actions, interactions and emotions, we
compute a 3D model of human body pose, hand pose, and facial expression from a
single monocular image. To achieve this, we use thousands of 3D scans to train
a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with
fully articulated hands and an expressive face. Learning to regress the
parameters of SMPL-X directly from images is challenging without paired images
and 3D ground truth. Consequently, we follow the approach of SMPLify, which
estimates 2D features and then optimizes model parameters to fit the features.
We improve on SMPLify in several significant ways: (1) we detect 2D features
corresponding to the face, hands, and feet and fit the full SMPL-X model to
these; (2) we train a new neural network pose prior using a large MoCap
dataset; (3) we define a new interpenetration penalty that is both fast and
accurate; (4) we automatically detect gender and the appropriate body models
(male, female, or neutral); (5) our PyTorch implementation achieves a speedup
of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to
both controlled images and images in the wild. We evaluate 3D accuracy on a new
curated dataset comprising 100 images with pseudo ground-truth. This is a step
towards automatic expressive human capture from monocular RGB data. The models,
code, and data are available for research purposes at
https://smpl-x.is.tue.mpg.de.Comment: To appear in CVPR 201
HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust tracking of the face and torso of the source actor. We extensively
evaluate our approach and show significant improvements in enabling much
greater flexibility in creating realistic reenacted output videos.Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at
Siggraph'1
Applications of Face Analysis and Modeling in Media Production
Facial expressions play an important role in day-by-day communication as well as media production. This article surveys automatic facial analysis and modeling methods using computer vision techniques and their applications for media production. The authors give a brief overview of the psychology of face perception and then describe some of the applications of computer vision and pattern recognition applied to face recognition in media production. This article also covers the automatic generation of face models, which are used in movie and TV productions for special effects in order to manipulate people's faces or combine real actors with computer graphics
- …