14 research outputs found

    Recovering facial shape using a statistical model of surface normal direction

    In this paper, we show how a statistical model of facial shape can be embedded within a shape-from-shading algorithm. We describe how facial shape can be captured using a statistical model of variations in surface normal direction. To construct this model, we make use of the azimuthal equidistant projection to map the distribution of surface normals from the polar representation on a unit sphere to Cartesian points on a local tangent plane. The distribution of surface normal directions is captured using the covariance matrix for the projected point positions. The eigenvectors of the covariance matrix define the modes of shape-variation in the fields of transformed surface normals. We show how this model can be trained using surface normal data acquired from range images and how to fit the model to intensity images of faces using constraints on the surface normal direction provided by Lambert's law. We demonstrate that the combination of a global statistical constraint and local irradiance constraint yields an efficient and accurate approach to facial shape recovery and is capable of recovering fine local surface details. We assess the accuracy of the technique on a variety of images with ground truth, as well as on real-world images.
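The modelling step described above can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: normals are assumed to be unit 3-vectors, the tangent plane is taken at the north pole, and plain PCA is applied to the projected points.

```python
import numpy as np

def azimuthal_equidistant(normals):
    """Project unit surface normals onto the tangent plane at the north
    pole (0, 0, 1): distance from the plane origin equals the polar angle."""
    theta = np.arccos(np.clip(normals[:, 2], -1.0, 1.0))  # angle from pole
    phi = np.arctan2(normals[:, 1], normals[:, 0])        # azimuth
    return np.stack([theta * np.cos(phi), theta * np.sin(phi)], axis=1)

def normal_shape_modes(training_fields):
    """training_fields: (n_faces, n_pixels, 3) unit normals per face.
    Returns the mean projected field and the eigenvectors (modes) and
    eigenvalues of the covariance of the projected point positions."""
    projected = np.stack([azimuthal_equidistant(f).ravel()
                          for f in training_fields])
    mean = projected.mean(axis=0)
    cov = np.cov(projected, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # largest variance first
    return mean, eigvecs[:, order], eigvals[order]
```

A normal pointing straight along the pole projects to the plane origin, and a normal tilted by angle t projects at distance t, which is the defining property of the azimuthal equidistant projection.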

    Combining Dense Nonrigid Structure from Motion and 3D Morphable Models for Monocular 4D Face Reconstruction

    This is the author accepted manuscript. The final version is available from ACM via the DOI in this record. Monocular 4D face reconstruction is a challenging problem, especially when the input video is captured under unconstrained conditions, i.e. "in the wild". The majority of the state-of-the-art approaches build upon 3D Morphable Modelling (3DMM), which has been proven to be more robust than model-free approaches such as Shape from Shading (SfS) or Structure from Motion (SfM). While offering visually plausible shape reconstruction results that resemble real faces, 3DMMs adhere to the model space learned from exemplar faces during the training phase, often yielding facial reconstructions that are excessively smooth and look too similar even across captured faces with completely different facial characteristics. This is due to the fact that 3DMMs are typically used as hard constraints on the reconstructed 3D shape. To overcome these limitations, in this paper we propose to combine 3DMMs with Dense Nonrigid Structure from Motion (DNSM), which is much less robust but has the potential of reconstructing fine details and capturing the subject-specific facial characteristics of every input. We effectively combine the best of both worlds by introducing a novel dense variational framework, which we solve efficiently by designing a convex optimisation strategy. In contrast to previous methods, we incorporate the 3DMM as a soft constraint, penalizing both departure of reconstructed faces from the 3DMM subspace and variation of the identity component of the 3DMM over different frames of the input video. As demonstrated in qualitative and quantitative experiments, our method is robust, accurately estimates the 3D facial shape over time and outperforms other state-of-the-art methods of 4D face reconstruction.
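The soft-versus-hard constraint distinction can be illustrated with toy penalty terms. Everything below is an assumed illustration of the idea, not the paper's variational framework: `basis` is an orthonormal 3DMM basis, the first penalty measures departure from the subspace (rather than projecting onto it), and the second penalises variation of identity coefficients across frames.

```python
import numpy as np

def soft_3dmm_penalty(shape, mean, basis, lam=1.0):
    """Penalise the component of a reconstructed shape lying outside the
    3DMM subspace (soft constraint). A hard constraint would instead
    replace the shape by its projection. `basis` has orthonormal columns."""
    residual = shape - mean
    coeffs = basis.T @ residual          # component inside the subspace
    outside = residual - basis @ coeffs  # departure from the model space
    return lam * float(outside @ outside)

def identity_consistency(identity_coeffs):
    """Penalise variation of the identity coefficients across the frames
    of an input video (rows = frames), encouraging one identity per clip."""
    c = np.asarray(identity_coeffs, dtype=float)
    return float(((c - c.mean(axis=0)) ** 2).sum())
```

A shape lying inside the subspace incurs zero penalty but is still free to leave it when the data term (here omitted) demands fine detail, which is the stated advantage over a hard constraint.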

    Adaptive face modelling for reconstructing 3D face shapes from single 2D images

    Example-based statistical face models using principal component analysis (PCA) have been widely deployed for three-dimensional (3D) face reconstruction and face recognition. Two factors of common concern with such models are the size of the training dataset and the selection of examples in the training set. The representational power (RP) of an example-based model is its capability to depict a new 3D face for a given 2D face image. The RP of the model can be increased by increasing the number of training samples. In this contribution, a novel approach is proposed to increase the RP of the 3D face reconstruction model by deforming a set of examples in the training dataset. A PCA-based 3D face model is adapted for each new near-frontal input face image to reconstruct the 3D face shape. Further, an extended Tikhonov regularisation method has been
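Fitting a PCA-based model is commonly posed as regularised least squares. The abstract's extended Tikhonov formulation is not reproduced here; the following is a generic sketch in which `A` is a hypothetical linear map from model coefficients to observations (e.g. 2D landmark positions) and `lam` controls the shrinkage towards the mean face.

```python
import numpy as np

def fit_coefficients(A, b, lam=0.1):
    """Tikhonov-regularised least squares:
    minimise ||A c - b||^2 + lam * ||c||^2 over coefficients c,
    via the normal equations (A^T A + lam I) c = A^T b."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)
```

With `lam = 0` and a well-conditioned `A` this reduces to ordinary least squares; increasing `lam` shrinks the coefficients, trading data fidelity for plausibility under the statistical model.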

    Recovering 3D facial shape via coupled 2D/3D space learning


    Binary Pattern Analysis for 3D Facial Action Unit Detection

    In this paper we propose new binary pattern features for use in the problem of 3D facial action unit (AU) detection. Two representations of 3D facial geometries are employed, the depth map and the Azimuthal Projection Distance Image (APDI). To these the traditional Local Binary Pattern is applied, along with Local Phase Quantisation, Gabor filters and Monogenic filters, followed by the binary pattern feature extraction method. Feature vectors are formed for each feature type through concatenation of histograms formed from the resulting binary numbers. Feature selection is then performed using a two-stage GentleBoost approach. Finally, we apply Support Vector Machines as classifiers for detection of each AU. This system is tested in two ways. First, we perform 10-fold cross-validation on the Bosphorus database; then we perform cross-database testing by training on this database and testing on apex frames from the D3DFACS database, achieving promising results in both.
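The traditional Local Binary Pattern step named above can be sketched as follows. This is the standard 3x3, 8-neighbour variant with a 256-bin histogram, applied directly to a depth map or APDI image; it is a generic sketch, not the paper's exact pipeline.

```python
import numpy as np

def lbp_codes(image):
    """3x3 Local Binary Pattern: each interior pixel receives an 8-bit
    code from thresholding its 8 neighbours against the centre value."""
    h, w = image.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = image[1:-1, 1:-1]
    code = np.zeros(centre.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= ((neighbour >= centre) << bit).astype(np.uint8)
    return code

def lbp_histogram(image):
    """256-bin histogram of the LBP codes, ready for concatenation
    into a per-region feature vector."""
    return np.bincount(lbp_codes(image).ravel(), minlength=256)
```

Per-region histograms like this one are concatenated across image regions and feature types to form the final vectors fed to GentleBoost and the SVMs.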

    3D Reconstruction of 'In-the-Wild' Faces in Images and Videos

    This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. 3D Morphable Models (3DMMs) are powerful statistical models of 3D facial shape and texture, and are among the state-of-the-art methods for reconstructing facial shape from single images. With the advent of new 3D sensors, many 3D facial datasets have been collected containing both neutral as well as expressive faces. However, all these datasets are captured under controlled conditions. Thus, even though powerful 3D facial shape models can be learnt from such data, it is difficult to build statistical texture models that are sufficient to reconstruct faces captured in unconstrained conditions ('in-the-wild'). In this paper, we propose the first 'in-the-wild' 3DMM by combining a statistical model of facial identity and expression shape with an 'in-the-wild' texture model. We show that such an approach allows for the development of a greatly simplified fitting procedure for images and videos, as there is no need to optimise with regard to the illumination parameters. We have collected three new benchmarks that combine 'in-the-wild' images and video with ground truth 3D facial geometry, the first of their kind, and report extensive quantitative evaluations using them that demonstrate our method is state-of-the-art. Engineering and Physical Sciences Research Council (EPSRC).

    3D-reconstruction of human jaw from a single image : integration between statistical shape from shading and shape from shading.

    Object modeling is a fundamental problem in engineering, involving talents from computer-aided design, computational geometry, computer vision and advanced manufacturing. The process of object modeling takes three stages: sensing, representation, and analysis. Various sensors may be used to capture information about objects; optical cameras and laser scanners are common with rigid objects, while X-ray, CT and MRI are common with biological organs. These sensors may provide a direct or indirect inference about the object, requiring a geometric representation in the computer that is suitable for subsequent usage. Geometric representations that are compact, i.e., capture the main features of the objects with a minimal number of data points or vertices, fall into the domain of computational geometry. Once a compact object representation is in the computer, various analysis steps can be conducted, including recognition, coding, transmission, etc. The subject matter of this thesis is object reconstruction from a sequence of optical images. An approach to estimate the depth of the visible portion of the human teeth from intraoral cameras has been developed, extending the classical shape from shading (SFS) solution to non-Lambertian surfaces with known object illumination characteristics. To augment the visible portion, and in order to reconstruct the entire jaw without the use of CT, MRI or even X-rays, additional information is drawn from a database of human jaws. This database has been constructed from an adult population with variations in teeth size, degradation and alignments. The database contains both shape and albedo information for the population. Using this database, a novel statistical shape from shading (SSFS) approach has been created. To obtain accurate results from shape from shading and statistical shape from shading, the final step integrates the two approaches (SFS and SSFS) using the Iterative Closest Point (ICP) algorithm.
Keywords: computer vision, shading, 3D shape reconstruction, shape from shading, statistical shape from shading, Iterative Closest Point
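The final SFS/SSFS integration uses ICP, which alternates nearest-neighbour matching with an optimal rigid alignment. A minimal point-to-point sketch (nearest-neighbour correspondence plus a Kabsch solve) is below; the data layout and function names are assumptions, not the thesis implementation.

```python
import numpy as np

def icp_step(source, target):
    """One ICP iteration: match each source point to its nearest target
    point, then solve the best rigid transform (Kabsch algorithm)."""
    # nearest-neighbour correspondence (brute force for clarity)
    d = np.linalg.norm(source[:, None, :] - target[None, :, :], axis=2)
    matched = target[d.argmin(axis=1)]
    # optimal rotation/translation between the matched sets
    mu_s, mu_t = source.mean(axis=0), matched.mean(axis=0)
    H = (source - mu_s).T @ (matched - mu_t)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T  # proper rotation (det = +1)
    t = mu_t - R @ mu_s
    return R, t

def icp(source, target, iters=10):
    """Iteratively align a source point set (e.g. an SFS patch) to a
    target point set (e.g. an SSFS model surface)."""
    current = source.copy()
    for _ in range(iters):
        R, t = icp_step(current, target)
        current = current @ R.T + t
    return current
```

When the initial misalignment is small relative to the point spacing, the first correspondences are already correct and the alignment converges in one step; in practice SSFS provides exactly that kind of good initialisation for the SFS patch.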

    3D reconstruction for plastic surgery simulation based on statistical shape models

    This thesis has been accomplished in Crisalix in collaboration with the Universitat Pompeu Fabra within the program of Doctorats Industrials. Crisalix has the mission of enhancing the communication between professionals of plastic surgery and patients by providing a solution to the most common question during the surgery planning process: ``How will I look after the surgery?''. The solution proposed by Crisalix is based on 3D imaging technology. This technology generates a 3D reconstruction that accurately represents the area of the patient that is going to be operated on. Multiple simulations of the plastic procedure can then be created, representing the possible outcomes of the surgery. This thesis presents a framework capable of reconstructing 3D shapes of faces and breasts of plastic surgery patients from 2D images and 3D scans. The 3D reconstruction of an object is a challenging problem with many inherent ambiguities. Statistical model-based methods are a powerful approach to overcome some of these ambiguities. We follow the intuition of maximizing the use of available prior information by introducing it into statistical model-based methods to enhance their properties. First, we explore Active Shape Models (ASM), a well-known method for 2D shape alignment. However, it is challenging to keep prior information (e.g. a small set of given landmarks) unchanged once the statistical model constraints are applied. We propose a new weighted regularized projection into the parameter space which allows us to obtain shapes that simultaneously fulfil the imposed shape constraints and are plausible according to the statistical model. Second, we extend this methodology to 3D Morphable Models (3DMM), a widespread method for 3D reconstruction. However, existing methods present some limitations.
Some of them are based on computationally expensive non-linear optimizations that can get stuck in local minima. Another limitation is that not all methods provide enough resolution to accurately represent the anatomical details needed for this application. Given the medical use of the application, the accuracy and robustness of the method are important factors to take into consideration. We show how 3DMM initialization and 3DMM fitting can be improved using our weighted regularized projection. Finally, we present a framework capable of reconstructing 3D shapes of plastic surgery patients from two possible inputs: 2D images and 3D scans. Our method is used in different stages of the 3D reconstruction pipeline: shape alignment, 3DMM initialization and 3DMM fitting. The developed methods have been integrated in the production environment of Crisalix, proving their validity.
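The weighted regularized projection into parameter space might look like the sketch below. The thesis' exact weighting and regularizer are not given in this abstract, so this is one plausible least-squares reading: heavily weighted points (e.g. user-given landmarks) dominate the projection, while the regularizer keeps the coefficients plausible under the statistical model.

```python
import numpy as np

def weighted_regularised_projection(x, mean, U, weights, lam=1e-2):
    """Project a shape x into the parameter space of a statistical
    model (mean + U b) by solving
        min_b || W^(1/2) (U b + mean - x) ||^2 + lam * ||b||^2,
    where W = diag(weights) emphasises constrained points."""
    W = np.diag(weights)
    A = U.T @ W @ U + lam * np.eye(U.shape[1])
    b = np.linalg.solve(A, U.T @ W @ (x - mean))
    return mean + U @ b, b
```

With uniform weights and `lam = 0` this reduces to the ordinary orthogonal projection onto the model subspace; raising the weight of a given landmark pulls the reconstructed shape towards it at the expense of the unconstrained points.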

    Three-dimensional modeling of the human jaw/teeth using optics and statistics.

    Object modeling is a fundamental problem in engineering, involving talents from computer-aided design, computational geometry, computer vision and advanced manufacturing. The process of object modeling takes three stages: sensing, representation, and analysis. Various sensors may be used to capture information about objects; optical cameras and laser scanners are common with rigid objects, while X-ray, CT and MRI are common with biological organs. These sensors may provide a direct or an indirect inference about the object, requiring a geometric representation in the computer that is suitable for subsequent usage. Geometric representations that are compact, i.e., capture the main features of the objects with a minimal number of data points or vertices, fall into the domain of computational geometry. Once a compact object representation is in the computer, various analysis steps can be conducted, including recognition, coding, transmission, etc. The subject matter of this dissertation is object reconstruction from a sequence of optical images using shape from shading (SFS) and SFS with shape priors. The application domain is dentistry. Most SFS approaches focus on the computational part of the SFS problem, i.e. the numerical solution. As a result, the imaging model in most conventional SFS algorithms has been simplified under three simple, but restrictive assumptions: (1) the camera performs an orthographic projection of the scene, (2) the surface has a Lambertian reflectance and (3) the light source is a single point source at infinity. Unfortunately, such assumptions no longer hold in the case of reconstructing real objects, such as the intra-oral imaging environment of human teeth. In this work, we introduce a more realistic formulation of the SFS problem by considering the image formation components: the camera, the light source, and the surface reflectance.
This dissertation proposes a non-Lambertian SFS algorithm under perspective projection which benefits from camera calibration parameters. The attenuation of illumination due to near-field imaging is taken into account. The surface reflectance is modeled using the Oren-Nayar-Wolff model, which accounts for the retro-reflection case. In this context, a new variational formulation is proposed that relates an evolving surface model with image information, taking into consideration that the image is taken by a perspective camera with known parameters. A new energy functional is formulated to incorporate brightness, smoothness and integrability constraints. In addition, to further improve the accuracy and practicality of the results, 3D shape priors are incorporated in the proposed SFS formulation. This strategy is motivated by the fact that humans rely on strong prior information about the 3D world around us in order to perceive 3D shape information. Such information is statistically extracted from training 3D models of the human teeth. The proposed SFS algorithms have been used in two different frameworks in this dissertation: a) holistic, which stitches a sequence of images in order to cover the entire jaw and then applies SFS, and b) piece-wise, which focuses on a specific tooth or a segment of the human jaw and applies SFS using physical teeth illumination characteristics. To augment the visible portion, and in order to reconstruct the entire jaw without the use of CT, MRI or even X-rays, prior information gathered from a database of human jaws was added. This database has been constructed from an adult population with variations in teeth size, degradation and alignments. The database contains both shape and albedo information for the population. Using this database, a novel statistical shape from shading (SSFS) approach has been created.
Extending the work on human teeth analysis, Finite Element Analysis (FEA) is adapted for analyzing and calculating stresses and strains of dental structures. Previous Finite Element (FE) studies used approximate 2D models; in this dissertation, an accurate three-dimensional CAD model is proposed. 3D stress and displacement analyses of different tooth types are successfully carried out. A newly developed open-source finite element solver, Finite Elements for Biomechanics (FEBio), has been used. The limitations of the experimental and analytical approaches used for stress and displacement analysis are overcome by FEA benefits such as handling complex geometry and complex loading conditions.
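The near-field imaging idea from the SFS formulation above can be illustrated with a deliberately simplified brightness model: Lambertian shading with inverse-square attenuation from a point light at a finite position. This is only a sketch of the attenuation component; the dissertation uses the more elaborate Oren-Nayar-Wolff reflectance and a calibrated perspective camera.

```python
import numpy as np

def near_field_irradiance(points, normals, light, albedo=1.0):
    """Toy near-field brightness: Lambert's cosine law scaled by the
    inverse-square falloff from a point light at a finite position.
    points, normals: (N, 3); light: (3,); back-facing points get 0."""
    to_light = light - points
    r2 = (to_light ** 2).sum(axis=1)            # squared distance to light
    l_dir = to_light / np.sqrt(r2)[:, None]     # unit light direction
    cos = np.clip((normals * l_dir).sum(axis=1), 0.0, None)
    return albedo * cos / r2
```

Unlike the classical distant-light assumption, the predicted brightness here depends on the surface position itself through the 1/r^2 term, which is why near-field models matter for intra-oral cameras held close to the teeth.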

    Face modeling for face recognition in the wild.

    Face understanding is considered one of the most important topics in the field of computer vision, since the face is a rich source of information in social interaction. Not only does the face provide information about the identity of people, but also of their membership in broad demographic categories (including sex, race, and age), and about their current emotional state. Facial landmark extraction is the cornerstone of the success of different facial analysis and understanding applications. In this dissertation, a novel facial model is designed for facial landmark detection in unconstrained real-life environments from different image modalities, including infra-red and visible images. In the proposed facial landmark detector, a part-based model is incorporated with holistic face information. In the part-based model, the face is modeled by the appearance of different face parts (e.g., right eye, left eye, left eyebrow, nose, mouth) and their geometric relations. The appearance is described by a novel feature referred to as the pixel difference feature. This representation is three times faster than the state of the art in feature representation. To model the geometric relation between the face parts, the complex Bingham distribution is adapted from the statistical community into computer vision. The global information is incorporated with the local part model using a regression model. The model outperforms the state of the art in facial landmark detection. The proposed facial landmark detector is tested in two computer vision problems: boosting the performance of face detectors by rejecting pseudo faces, and camera steering in a multi-camera network.
To highlight the applicability of the proposed model to different image modalities, it has been studied in two face understanding applications: face recognition from visible images and physiological measurement for autistic individuals from thermal images. Recognizing identities from faces under different poses, expressions and lighting conditions against a complex background is a still-unsolved problem, even with accurate landmark detection. Therefore, a learned similarity measure is proposed. The proposed measure responds only to differences in identity and filters out illumination and pose variations; it makes use of statistical inference in the image plane. Additionally, the pose challenge is tackled by two new approaches: assigning different weights to different face parts based on their visibility in the image plane at different pose angles, and synthesizing virtual facial images for each subject at different poses from a single frontal image. The proposed framework is demonstrated to be competitive with top-performing state-of-the-art methods when evaluated on standard benchmarks for face recognition in the wild. The second face understanding application is physiological measurement for autistic individuals from infra-red images. In this framework, accurately detecting and tracking the Superficial Temporal Artery (STA) while the subject is moving, playing, and interacting in social communication is a must. Detecting and tracking the STA is very challenging, since the appearance of the STA region changes over time and is not discriminative enough from other areas of the face region. A novel concept in detection, called supporter collaboration, is introduced: the STA is detected and tracked with the help of face landmarks and geometric constraints. This research advances the field of emotion recognition.
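The pixel difference feature is not fully specified in the abstract. One common reading of such descriptors, sketched below as an assumption rather than the dissertation's definition, is a vector of intensity differences over pre-selected pixel pairs; since it needs no filtering or convolution, it is correspondingly cheap to evaluate.

```python
import numpy as np

def pixel_difference_feature(patch, pairs):
    """Hypothetical pixel-difference descriptor: the feature vector is
    the list of intensity differences between pre-chosen pixel pairs
    (pairs: (K, 2) flat indices into the patch)."""
    flat = patch.ravel().astype(np.float64)
    a, b = pairs[:, 0], pairs[:, 1]
    return flat[a] - flat[b]
```

The pairs would typically be chosen once per face part during training, so evaluation at test time is a handful of subtractions per feature, which is consistent with the claimed speed advantage.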