15 research outputs found

    Markov Chain Monte Carlo for Automated Face Image Analysis

    Get PDF
    We present a novel, fully probabilistic method to interpret a single face image with the 3D Morphable Model. The new method is based on Bayesian inference and makes use of unreliable image-based information. Rather than searching for a single optimal solution, we infer the posterior distribution of the model parameters given the target image. The method is a stochastic sampling algorithm with a propose-and-verify architecture based on the Metropolis–Hastings algorithm. The stochastic method can robustly integrate unreliable information and therefore does not rely on feed-forward initialization. The integrative concept is based on two ideas: a separation of proposal moves from their verification with the model (Data-Driven Markov Chain Monte Carlo), and filtering with the Metropolis acceptance rule. It does not need gradients and is less prone to local optima than standard fitters. We also introduce a new collective likelihood which models the average difference between the model and the target image rather than individual pixel differences. The average value shows a natural tendency towards a normal distribution, even when the individual pixel-wise differences are not Gaussian. We employ the new fitting method to calculate posterior models of 3D face reconstructions from single real-world images. A direct application of the algorithm with the 3D Morphable Model leads to a fully automatic face recognition system with competitive performance on the Multi-PIE database without any database adaptation.
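
    As an illustration of the propose-and-verify idea with such a collective likelihood, the following minimal Python sketch (our illustration, not the authors' code) scores the mean absolute model-to-image difference inside a Metropolis-Hastings loop; `render` and `log_prior` are assumed callables.

```python
# A minimal sketch (not the authors' code) of propose-and-verify
# Metropolis-Hastings with a "collective" likelihood over the mean difference.
import numpy as np

rng = np.random.default_rng(0)

def collective_log_likelihood(rendered, target, sigma=0.05):
    # Score the average absolute difference over the image; by the central
    # limit theorem this mean tends towards a normal distribution even when
    # per-pixel differences are not Gaussian.
    d = np.mean(np.abs(rendered - target))
    return -0.5 * (d / sigma) ** 2

def fit(theta0, target, render, log_prior, n_steps=1000, step=0.02):
    theta = theta0
    lp = log_prior(theta) + collective_log_likelihood(render(theta), target)
    for _ in range(n_steps):
        # Propose: a simple symmetric random-walk move in parameter space.
        candidate = theta + step * rng.standard_normal(theta.shape)
        lp_c = log_prior(candidate) + collective_log_likelihood(render(candidate), target)
        # Verify: the Metropolis acceptance rule filters unreliable proposals.
        if np.log(rng.uniform()) < lp_c - lp:
            theta, lp = candidate, lp_c
    return theta
```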

    Informed MCMC with Bayesian Neural Networks for Facial Image Analysis

    Full text link
    Computer vision tasks are difficult because of the large variability in the data that is induced by changes in light, background, and partial occlusion, as well as the varying pose, texture, and shape of objects. Generative approaches to computer vision allow us to overcome this difficulty by explicitly modeling the physical image formation process. Using generative object models, the analysis of an observed image is performed via Bayesian inference of the posterior distribution. This conceptually simple approach tends to fail in practice because of several difficulties stemming from sampling the posterior distribution: high dimensionality and multi-modality of the posterior distribution as well as expensive simulation of the rendering process. The main difficulty of sampling approaches in a computer vision context is choosing the proposal distribution accurately so that maxima of the posterior are explored early and the algorithm quickly converges to a valid image interpretation. In this work, we propose to use a Bayesian Neural Network for estimating an image-dependent proposal distribution. Compared to a standard Gaussian random-walk proposal, this accelerates the sampler in finding regions of high posterior probability. In this way, we can significantly reduce the number of samples needed to perform facial image analysis.
    Comment: Accepted to the Bayesian Deep Learning Workshop at NeurIPS 2018.
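
    The following sketch illustrates (under our own assumptions, not the paper's implementation) how an image-dependent Gaussian proposal predicted by a network changes the Metropolis-Hastings step: because the proposal is no longer symmetric, the acceptance ratio must include the proposal densities. `mu` and `sigma` stand in for the network's predictions.

```python
# A sketch (our assumptions) of an informed, image-dependent Gaussian proposal
# inside Metropolis-Hastings; mu and sigma stand in for network predictions.
import numpy as np

rng = np.random.default_rng(0)

def log_q(x, mu, sigma):
    # Log-density of the independence proposal N(mu, sigma^2), up to a constant.
    return float(np.sum(-0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)))

def informed_step(theta, log_post, mu, sigma):
    candidate = mu + sigma * rng.standard_normal(theta.shape)
    # Asymmetric proposal: the ratio q(theta | image) / q(candidate | image)
    # enters the acceptance probability.
    log_alpha = (log_post(candidate) + log_q(theta, mu, sigma)
                 - log_post(theta) - log_q(candidate, mu, sigma))
    return candidate if np.log(rng.uniform()) < log_alpha else theta
```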

    Morphable Face Models - An Open Framework

    Full text link
    In this paper, we present a novel open-source pipeline for face registration based on Gaussian processes, as well as an application to face image analysis. Non-rigid registration of faces is significant for many applications in computer vision, such as the construction of 3D Morphable face models (3DMMs). Gaussian Process Morphable Models (GPMMs) unify a variety of non-rigid deformation models, with B-splines and PCA models as examples. GPMMs separate problem-specific requirements from the registration algorithm by incorporating domain-specific adaptations as a prior model. The novelties of this paper are the following: (i) We present a strategy and modeling technique for face registration that considers symmetry, multi-scale and spatially-varying details. The registration is applied to neutral faces and facial expressions. (ii) We release an open-source software framework for registration and model-building, demonstrated on the publicly available BU-3DFE database. The released pipeline also contains an implementation of Analysis-by-Synthesis model adaptation to 2D face images, tested on the Multi-PIE and LFW databases. This enables the community to reproduce, evaluate and compare the individual steps, from registration to model-building and 3D/2D model fitting. (iii) Along with the framework release, we publish a new version of the Basel Face Model (BFM-2017) with an improved age distribution and an additional facial expression model.
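
    A toy sketch of the GPMM idea follows: the deformation prior is defined entirely by a kernel, so problem-specific structure such as multi-scale behavior is encoded by combining kernels, and a truncated eigendecomposition yields a low-rank parametric model. The point set and kernel parameters below are illustrative, not taken from BFM-2017.

```python
# A toy Gaussian Process Morphable Model: the prior over deformations is fully
# specified by a kernel, and a truncated eigendecomposition gives a low-rank
# parametric model. Points and kernel parameters are illustrative only.
import numpy as np

def gauss_kernel(x, y, s, scale):
    d2 = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return scale * np.exp(-d2 / s ** 2)

def multiscale_kernel(x, y):
    # Problem-specific prior: large smooth deformations plus fine details.
    return gauss_kernel(x, y, s=100.0, scale=10.0) + gauss_kernel(x, y, s=20.0, scale=1.0)

rng = np.random.default_rng(0)
points = rng.uniform(-100, 100, size=(200, 3))     # stand-in for reference surface points
K = multiscale_kernel(points, points)
evals, evecs = np.linalg.eigh(K)
r = 20                                             # low-rank truncation
basis = evecs[:, -r:] * np.sqrt(np.clip(evals[-r:], 0, None))

alphas = rng.standard_normal((r, 3))               # model parameters, one GP per axis
deformed = points + basis @ alphas                 # a random deformation of the reference
```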

    A Closest Point Proposal for MCMC-based Probabilistic Surface Registration

    Full text link
    We propose to view non-rigid surface registration as a probabilistic inference problem. Given a target surface, we estimate the posterior distribution of surface registrations. We demonstrate how the posterior distribution can be used to build shape models that generalize better and show how to visualize the uncertainty in the established correspondence. Furthermore, in a reconstruction task, we show how to estimate the posterior distribution of missing data without assuming a fixed point-to-point correspondence. We introduce the closest-point proposal for the Metropolis-Hastings algorithm. Our proposal overcomes the slow convergence of random-walk strategies. As the algorithm decouples inference from modeling the posterior using a propose-and-verify scheme, we show how to choose different distance measures for the likelihood model. All presented results are fully reproducible using publicly available data and our open-source implementation of the registration framework.
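
    A minimal sketch of the closest-point proposal idea (our reading, not the authors' implementation): an ICP-like update toward the closest target points serves as an informed proposal, which Metropolis-Hastings verification then accepts or rejects. `log_posterior` is an assumed callable; the correction for the asymmetric proposal density is omitted for brevity.

```python
# A minimal sketch of a closest-point proposal inside a propose-and-verify
# loop; log_posterior is an assumed callable, and the asymmetric proposal
# density correction is omitted for brevity, so the acceptance is approximate.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

def closest_point_proposal(model_points, target_points, step=0.5, noise=0.1):
    tree = cKDTree(target_points)
    _, idx = tree.query(model_points)                # closest target point per model point
    direction = target_points[idx] - model_points    # ICP-style informed move
    return model_points + step * direction + noise * rng.standard_normal(model_points.shape)

def register(model_points, target_points, log_posterior, n_steps=100):
    lp = log_posterior(model_points)
    for _ in range(n_steps):
        candidate = closest_point_proposal(model_points, target_points)
        lp_c = log_posterior(candidate)
        if np.log(rng.uniform()) < lp_c - lp:        # verification by acceptance rule
            model_points, lp = candidate, lp_c
    return model_points
```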

    CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images

    Full text link
    With the power of convolutional neural networks (CNNs), CNN-based face reconstruction has recently shown promising performance in reconstructing detailed face shape from 2D face images. The success of CNN-based methods relies on a large amount of labeled data. The state of the art synthesizes such data using a coarse morphable face model, which, however, has difficulty generating detailed photo-realistic images of faces (e.g., with wrinkles). This paper presents a novel face data generation method. Specifically, we render a large number of photo-realistic face images with different attributes based on inverse rendering. Furthermore, we construct a fine-detailed face image dataset by transferring different scales of details from one image to another. We also construct a large number of adjacent video frame pairs by simulating the distribution of real video data. With these carefully constructed datasets, we propose a coarse-to-fine learning framework consisting of three convolutional networks. The networks are trained for real-time detailed 3D face reconstruction from monocular video as well as from a single image. Extensive experimental results demonstrate that our framework can produce high-quality reconstructions with much less computation time than the state of the art. Moreover, our method is robust to pose, expression and lighting due to the diversity of the training data.
    Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
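
    As a toy illustration of the detail-transfer idea (an assumption about the mechanism, not the paper's method), one simple way to build fine-detailed training pairs is to swap the high-frequency band of one grayscale face image onto the smooth base of another:

```python
# A toy illustration (assumption, not the paper's method) of detail transfer:
# keep the low-frequency base of one face image and add the high-frequency
# band (wrinkles, pores) of another, yielding a synthetic fine-detailed image.
import numpy as np
from scipy.ndimage import gaussian_filter

def transfer_details(base_img, detail_img, sigma=3.0):
    """base_img, detail_img: float grayscale images in [0, 1], same shape."""
    base_low = gaussian_filter(base_img, sigma)                     # coarse shape and shading
    detail_high = detail_img - gaussian_filter(detail_img, sigma)   # fine details only
    return np.clip(base_low + detail_high, 0.0, 1.0)
```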

    Photo-Realistic Facial Details Synthesis from Single Image

    Full text link
    We present a single-image 3D face synthesis technique that can handle challenging facial expressions while recovering fine geometric details. Our technique employs expression analysis for proxy face geometry generation and combines supervised and unsupervised learning for facial detail synthesis. For proxy generation, we conduct emotion prediction to determine a new expression-informed proxy. For detail synthesis, we present a Deep Facial Detail Net (DFDN) based on a Conditional Generative Adversarial Net (CGAN) that employs both geometry and appearance loss functions. For geometry, we capture 366 high-quality 3D scans from 122 different subjects under 3 facial expressions. For appearance, we use an additional 20K in-the-wild face images and apply image-based rendering to accommodate lighting variations. Comprehensive experiments demonstrate that our framework can produce high-quality 3D faces with realistic details under challenging facial expressions.
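
    A hedged sketch of how a CGAN generator objective combining geometry and appearance terms might look, in the spirit of the DFDN description; the weights, the discriminator `D`, and the `render` helper are illustrative assumptions, not the paper's definitions.

```python
# A hedged sketch of a CGAN generator objective with geometry and appearance
# terms; weights, the discriminator D, and the render helper are assumptions.
import torch
import torch.nn.functional as F

def generator_loss(pred_detail, gt_detail, cond_img, target_img, D, render,
                   w_geo=1.0, w_app=1.0, w_adv=0.01):
    geometry = F.l1_loss(pred_detail, gt_detail)                       # supervised by 3D scans
    appearance = F.l1_loss(render(pred_detail, cond_img), target_img)  # image-based term
    adversarial = -torch.log(D(pred_detail, cond_img) + 1e-8).mean()   # CGAN realism term
    return w_geo * geometry + w_app * appearance + w_adv * adversarial
```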

    Reconstruction of three-dimensional facial geometric features related to fetal alcohol syndrome using adult surrogates

    Get PDF
    Fetal alcohol syndrome (FAS) is a condition caused by prenatal alcohol exposure. The diagnosis of FAS is based on the presence of central nervous system impairments, evidence of growth abnormalities and abnormal facial features. Direct anthropometry has traditionally been used to obtain facial data to assess the FAS facial features. Research efforts have focused on indirect anthropometry such as 3D surface imaging systems to collect facial data for facial analysis. However, 3D surface imaging systems are costly. As an alternative, approaches for 3D reconstruction from a single 2D image of the face using a 3D morphable model (3DMM) were explored in this research study. The research project was accomplished in several steps. 3D facial data were obtained from the publicly available BU-3DFE database, developed by the State University of New York. The 3D face scans in the training set were landmarked by different observers. The reliability and precision in selecting 3D landmarks were evaluated. The intraclass correlation coefficients for intra- and inter-observer reliability were greater than 0.95. The average intra-observer error was 0.26 mm and the average inter-observer error was 0.89 mm. A rigid registration was performed on the 3D face scans in the training set. Following rigid registration, a dense point-to-point correspondence across a set of aligned face scans was computed using the Gaussian process model fitting approach. A 3DMM of the face was constructed from the fully registered 3D face scans. The constructed 3DMM of the face was evaluated based on generalization, specificity, and compactness. The quantitative evaluations show that the constructed 3DMM achieves reliable results. 3D face reconstructions from single 2D images were estimated based on the 3DMM. The Metropolis-Hastings algorithm was used to fit the 3DMM features to 2D image features to generate the 3D face reconstruction. Finally, the geometric accuracy of the reconstructed 3D faces was evaluated based on ground-truth 3D face scans. The average root mean square error for the surface-to-surface comparisons between the reconstructed faces and the ground-truth face scans was 2.99 mm. In conclusion, a framework to estimate 3D face reconstructions from single 2D facial images was developed and the reconstruction errors were evaluated. The geometric accuracy of the 3D face reconstructions was comparable to that found in the literature. However, future work should consider minimizing reconstruction errors to acceptable clinical standards in order for the framework to be useful for 3D-from-2D reconstruction in general, and also for developing FAS applications. Finally, future work should consider estimating a 3D face using multi-view 2D images to increase the information available for 3D-from-2D reconstruction.
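
    For the final evaluation step, a minimal sketch of a surface-to-surface comparison: for each vertex of the reconstruction, take the distance to the closest ground-truth point and report the root mean square (closest-vertex distance is used here as a simple approximation of point-to-surface distance).

```python
# A minimal sketch of the surface-to-surface error metric used for evaluation.
# Closest-vertex distance approximates the true point-to-surface distance.
import numpy as np
from scipy.spatial import cKDTree

def surface_rmse(reconstructed_vertices, ground_truth_vertices):
    """Both inputs: (n, 3) arrays of vertex positions in mm."""
    tree = cKDTree(ground_truth_vertices)
    dists, _ = tree.query(reconstructed_vertices)   # distance to closest ground-truth vertex
    return float(np.sqrt(np.mean(dists ** 2)))      # RMSE in mm
```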

    Evaluating 3D human face reconstruction from a frontal 2D image, focusing on facial regions associated with foetal alcohol syndrome

    Get PDF
    Foetal alcohol syndrome (FAS) is a preventable condition caused by maternal alcohol consumption during pregnancy. The FAS facial phenotype is an important factor for diagnosis, alongside central nervous system impairments and growth abnormalities. Current methods for analysing the FAS facial phenotype rely on 3D facial image data, obtained from costly and complex surface scanning devices. An alternative is to use 2D images, which are easy to acquire with a digital camera or smartphone. However, 2D images lack the geometric accuracy required for accurate facial shape analysis. Our research offers a solution through the reconstruction of 3D human faces from single or multiple 2D images. We have developed a framework for evaluating 3D human face reconstruction from a single-input 2D image using a 3D face model for potential use in FAS assessment. We first built a generative morphable model of the face from a database of registered 3D face scans with diverse skin tones. Then we applied this model to reconstruct 3D face surfaces from single frontal images using a model-driven sampling algorithm. The accuracy of the predicted 3D face shapes was evaluated in terms of surface reconstruction error and the accuracy of FAS-relevant landmark locations and distances. Results show an average root mean square error of 2.62 mm. Our framework has the potential to estimate 3D landmark positions for parts of the face associated with the FAS facial phenotype. Future work aims to improve on the accuracy and adapt the approach for use in clinical settings. Significance: Our study presents a framework for constructing and evaluating a 3D face model from 2D face scans and evaluating the accuracy of 3D face shape predictions from single images. The results indicate low generalisation error and comparability to other studies. The reconstructions also provide insight into specific regions of the face relevant to FAS diagnosis. The proposed approach presents a potential cost-effective and easily accessible imaging tool for FAS screening, yet its clinical application needs further research.
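
    A toy sketch of building such a generative morphable model by PCA over registered scans (all scans in dense correspondence); the number of components is illustrative.

```python
# A toy sketch of PCA model building from registered scans; n_components is
# illustrative. Each scan must have the same vertices in dense correspondence.
import numpy as np

def build_morphable_model(scans, n_components=50):
    """scans: (n_scans, n_vertices, 3) array of registered face scans."""
    X = scans.reshape(len(scans), -1)
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]                        # principal deformation modes
    stddevs = S[:n_components] / np.sqrt(len(scans) - 1)  # per-mode standard deviations
    return mean, components, stddevs

def sample_face(mean, components, stddevs, rng):
    """Draw a random face shape from the model prior."""
    coeffs = rng.standard_normal(len(stddevs))
    return (mean + (coeffs * stddevs) @ components).reshape(-1, 3)
```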

    What computational model provides the best explanation of face representations in the primate brain?

    Get PDF
    Understanding how the brain represents the identity of complex objects is a central challenge of visual neuroscience. The principles governing object processing have been extensively studied in the macaque face patch system, a sub-network of inferotemporal (IT) cortex specialized for face processing (Tsao et al., 2006). A previous study reported that single face patch neurons encode axes of a generative model called the “active appearance” model (Chang and Tsao, 2017), which transforms 50-d feature vectors separately representing facial shape and facial texture into facial images (Cootes et al., 2001; Edwards et al., 1998). However, it remains unclear whether this model constitutes the best model for explaining face cell responses. Here, we recorded responses of cells in the most anterior face patch, AM, to a large set of real face images, and compared a large number of models for explaining neural responses. We found that the active appearance model explained responses better than any other model except CORnet-Z, a feedforward deep neural network trained to classify non-face objects; the active appearance model tied CORnet-Z on some face image sets and exceeded it on others. Surprisingly, deep neural networks trained specifically on facial identification did not explain neural responses well. A major reason is that units in such networks, unlike neurons, are less modulated by face-related factors that are irrelevant to facial identification, such as illumination.
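
    As an illustration of how candidate models are commonly compared as explanations of neural data (a standard encoding-model recipe, not necessarily this paper's exact pipeline): fit a cross-validated linear mapping from each model's features to the recorded responses and compare held-out accuracy.

```python
# A standard encoding-model comparison (not necessarily the paper's exact
# pipeline): ridge-regress neural responses on model features, cross-validated.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

def explanation_score(model_features, neural_responses):
    """model_features: (n_images, n_features); neural_responses: (n_images,)."""
    reg = RidgeCV(alphas=np.logspace(-3, 3, 13))
    return cross_val_score(reg, model_features, neural_responses,
                           scoring="r2", cv=5).mean()
```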