Recovering facial shape using a statistical model of surface normal direction
In this paper, we show how a statistical model of facial shape can be embedded within a shape-from-shading algorithm. We describe how facial shape can be captured using a statistical model of variations in surface normal direction. To construct this model, we make use of the azimuthal equidistant projection to map the distribution of surface normals from the polar representation on a unit sphere to Cartesian points on a local tangent plane. The distribution of surface normal directions is captured using the covariance matrix for the projected point positions. The eigenvectors of the covariance matrix define the modes of shape variation in the fields of transformed surface normals. We show how this model can be trained using surface normal data acquired from range images and how to fit the model to intensity images of faces using constraints on the surface normal direction provided by Lambert's law. We demonstrate that the combination of a global statistical constraint and a local irradiance constraint yields an efficient and accurate approach to facial shape recovery that is capable of recovering fine local surface details. We assess the accuracy of the technique on a variety of images with ground truth, as well as on real-world images.
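As a rough illustration of the two ingredients described above (not the authors' implementation; the pole choice, field sizes and random training data are placeholders), the azimuthal equidistant projection of unit normals and the eigen-analysis of the projected fields can be sketched as:

```python
import numpy as np

def azimuthal_equidistant(normals, pole=np.array([0.0, 0.0, 1.0])):
    """Project unit normals from the sphere onto the tangent plane at `pole`.

    The planar distance from the origin equals the great-circle angle between
    each normal and the pole, so the projection is equidistant along radii.
    Assumes pole = +z, so the tangent plane is spanned by the x/y axes.
    """
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    cos_theta = np.clip(normals @ pole, -1.0, 1.0)
    theta = np.arccos(cos_theta)                      # angular distance to pole
    tangential = normals - cos_theta[:, None] * pole  # in-plane component
    norms = np.linalg.norm(tangential, axis=1, keepdims=True)
    direction = np.divide(tangential, norms, out=np.zeros_like(tangential),
                          where=norms > 1e-12)
    return theta[:, None] * direction[:, :2]          # 2-D plane coordinates

# Toy "training set": projected normal fields for 20 example faces of 50
# normals each, flattened to row vectors, followed by the covariance/eigen
# decomposition that yields the modes of shape variation.
rng = np.random.default_rng(0)
fields = rng.normal(scale=0.1, size=(20, 50, 3)) + np.array([0.0, 0.0, 1.0])
X = np.stack([azimuthal_equidistant(f).ravel() for f in fields])
cov = np.cov(X - X.mean(axis=0), rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalue order
modes = eigvecs[:, ::-1]                 # columns = modes, largest variance first
```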
2D-to-3D facial expression transfer
Automatically changing the expression and physical features of a face from an input image is a topic that has traditionally been tackled in the 2D domain. In this paper, we bring this problem to 3D and propose a framework that, given an input RGB video of a human face under a neutral expression, first computes its 3D shape and then performs a transfer to a new and potentially non-observed expression. For this purpose, we parameterize the rest shape (obtained from standard factorization approaches over the input video) using a triangular mesh, which is further clustered into larger macro-segments. The expression-transfer problem is then posed as a direct mapping between this shape and a source shape, such as the blend shapes of an off-the-shelf 3D dataset of human facial expressions. The mapping is resolved to be geometrically consistent between 3D models by requiring points in specific regions to map onto semantically equivalent regions. We validate the approach on several synthetic and real examples of input faces that differ largely from the source shapes, yielding very realistic expression transfers even in cases with topology changes, such as a synthetic video sequence of a single-eyed cyclops.
3D Face Modelling, Analysis and Synthesis
Human faces have always been of a special interest to researchers in the computer vision and graphics areas. There has been an explosion in the number of studies around accurately modelling, analysing and synthesising realistic faces for various applications. The importance of human faces emerges from the fact that they are invaluable means of effective communication, recognition, behaviour analysis, conveying emotions, etc. Therefore, addressing the automatic visual perception of human faces efficiently could open up many influential applications in various domains, e.g. virtual/augmented reality, computer-aided surgeries, security and surveillance, entertainment, and many more. However, the vast variability associated with the geometry and appearance of human faces captured in unconstrained videos and images renders their automatic analysis and understanding very challenging even today.
The primary objective of this thesis is to develop novel methodologies of 3D computer vision for human faces that go beyond the state of the art and achieve unprecedented quality and robustness. In more detail, this thesis advances the state of the art in 3D facial shape reconstruction and tracking, fine-grained 3D facial motion estimation, expression recognition and facial synthesis with the aid of 3D face modelling. We give special attention to the case where the input comes from monocular imagery captured under uncontrolled settings, a.k.a. in-the-wild data. This kind of data is available in abundance on the internet nowadays. Analysing such data pushes the boundaries of currently available computer vision algorithms and opens up many new crucial applications in industry. We define the four targeted vision problems of this thesis (3D facial reconstruction and tracking, fine-grained 3D facial motion estimation, expression recognition, and facial synthesis) as the four essential 3D-based systems for automatic facial behaviour understanding, and show how they rely on each other. Finally, to aid the research conducted in this thesis, we collect and annotate a large-scale video dataset of monocular facial performances. All of our proposed methods demonstrate very promising quantitative and qualitative results when compared to the state-of-the-art methods.
Analysis of 3D Face Reconstruction
This thesis investigates the long-standing problem of 3D reconstruction from a single 2D face image. Face reconstruction from a single 2D face image is an ill-posed problem involving estimation of the intrinsic and extrinsic camera parameters, light parameters, shape parameters and texture parameters. The proposed approach has many potential applications in law enforcement, surveillance, medicine, computer games and the entertainment industries. This problem is addressed using an analysis-by-synthesis framework by reconstructing a 3D face model from identity photographs. Identity photographs are a widely used medium for face identification and can be found on identity cards and passports.
The novel contribution of this thesis is a new technique for creating 3D face models from a single 2D face image. The proposed method uses improved dense 3D correspondence obtained using rigid and non-rigid registration techniques. Existing reconstruction methods use the optical flow method for establishing 3D correspondence. The resulting 3D face database is used to create a statistical shape model.
Existing reconstruction algorithms recover shape by optimizing over all the parameters simultaneously. The proposed algorithm simplifies the reconstruction problem by using a stepwise approach, thus reducing the dimension of the parameter space and simplifying the optimization problem. In the alignment step, a generic 3D face is aligned with the given 2D face image using anatomical landmarks. The texture is then warped onto the 3D model using the spatial alignment obtained previously. The 3D shape is then recovered by optimizing over the shape parameters while matching a texture-mapped model to the target image.
There are a number of advantages to this approach. First, it simplifies the optimization requirements and makes the optimization more robust. Second, there is no need to accurately recover the illumination parameters. Third, there is no need to recover the texture parameters via a texture-synthesis approach. Fourth, quantitative analysis is used to improve the quality of reconstruction by improving the cost function. Previous methods use qualitative measures, such as visual analysis and face recognition rates, for evaluating reconstruction accuracy.
The improvement in the performance of the cost function results from an improvement in the feature space comprising the landmark and intensity features. Previously, the feature space had not been evaluated with respect to reconstruction accuracy, leading to inaccurate assumptions about its behaviour.
The proposed approach simplifies the reconstruction problem by using only identity images, rather than placing effort on overcoming pose, illumination and expression (PIE) variations. This makes sense, as frontal face images under standard illumination conditions are widely available and can be utilized for accurate reconstruction. The reconstructed 3D models with texture can then be used to overcome the PIE variations.
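The landmark-based alignment step described in the abstract can be illustrated, purely as a sketch and not as the thesis's actual algorithm, by the classic least-squares similarity transform (Umeyama's method) between two corresponding 2D landmark sets, e.g. projected generic-model landmarks and detected image landmarks:

```python
import numpy as np

def similarity_procrustes(src, dst):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    such that dst_i ~= s * R @ src_i + t, for 2-D landmark sets."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)             # 2x2 cross-covariance
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
    D = np.diag([1.0, d])                        # guard against reflections
    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / len(src)      # mean squared deviation of src
    s = (S * np.diag(D)).sum() / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t
```

In the stepwise pipeline this transform would place the generic face over the photograph before the texture-warping and shape-optimization steps.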
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body model (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground truth. This is a step towards automatic expressive human capture from monocular RGB data. The models, code, and data are available for research purposes at https://smpl-x.is.tue.mpg.de. To appear in CVPR 2019.
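SMPLify-style fitting minimizes a data term on detected 2D features plus priors on the model parameters. A deliberately simplified linear analogue (all names and shapes here are hypothetical, and this is not the SMPL-X energy, which is nonlinear and optimized iteratively) makes that structure concrete: with a linear landmark model observed ~= mean + B @ beta and a Gaussian prior on beta, the fit has a closed form.

```python
import numpy as np

def fit_with_prior(B, mean, observed, prior_weight=1.0):
    """MAP estimate of coefficients beta for a linear landmark model
    observed ~= mean + B @ beta, with a Gaussian prior penalizing ||beta||^2.
    Solves the normal equations (B.T B + w I) beta = B.T (observed - mean)."""
    k = B.shape[1]
    lhs = B.T @ B + prior_weight * np.eye(k)
    rhs = B.T @ (observed - mean)
    return np.linalg.solve(lhs, rhs)

# Toy example: 2 basis vectors over 6 landmark coordinates (made-up data).
rng = np.random.default_rng(0)
B = rng.normal(size=(6, 2))
mean = rng.normal(size=6)
beta_true = np.array([0.5, -1.0])
observed = mean + B @ beta_true
beta = fit_with_prior(B, mean, observed, prior_weight=1e-8)
```

With a near-zero prior weight the recovered coefficients match the generating ones; increasing the weight shrinks the estimate toward zero, which is the role the pose and shape priors play in the real objective.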
GazeDirector: Fully articulated eye gaze redirection in video
We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting. Our method first tracks the eyes by fitting a multi-part eye region model to video frames using analysis-by-synthesis, thereby recovering eye region shape, texture, pose, and gaze simultaneously. It then redirects gaze by 1) warping the eyelids from the original image using a model-derived flow field, and 2) rendering and compositing synthesized 3D eyeballs onto the output image in a photorealistic manner. GazeDirector allows us to change where people are looking without person-specific training data, and with full articulation, i.e. we can precisely specify new gaze directions in 3D. Quantitatively, we evaluate both model-fitting and gaze synthesis, with experiments for gaze estimation and redirection on the Columbia gaze dataset. Qualitatively, we compare GazeDirector against recent work on gaze redirection, showing better results especially for large redirection angles. Finally, we demonstrate gaze redirection on YouTube videos by introducing new 3D gaze targets and by manipulating visual behavior.
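The first redirection step, warping the eyelids with a model-derived flow field, amounts to backward image warping. A minimal single-channel sketch (the flow-field convention below is an assumption; GazeDirector's actual pipeline is more involved):

```python
import numpy as np

def warp_with_flow(image, flow):
    """Backward-warp a single-channel image by a dense flow field:
    out[y, x] = image[y + flow[y, x, 1], x + flow[y, x, 0]],
    sampled bilinearly with border clamping."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, w - 1)   # source x-coordinates
    sy = np.clip(ys + flow[..., 1], 0, h - 1)   # source y-coordinates
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = sx - x0, sy - y0
    # Bilinear interpolation between the four neighbouring source pixels.
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bot = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

A zero flow field reproduces the input; a constant flow of (1, 0) samples each output pixel from one pixel to its right.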
Non-contact respiration monitoring using optical sensors
The main goal of this work is to classify approaches to non-contact respiration monitoring and to propose a structure for a monitoring system with rejection of facial-expression artifacts. All available techniques were divided into two main groups: methods based on reconstructing respiration from a 3D image of the subject, and methods based on 2D image processing. A structure for a respiration-monitoring system based on optical sensors, with the ability to remove facial-expression artifacts, was developed. The new approach improves respiration monitoring for subjects lying in a supine position and in a sitting position.
- β¦