
    MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

    In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is our new differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based and model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation. Comment: International Conference on Computer Vision (ICCV) 2017 (Oral), 13 pages.
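
    As a rough illustration of the decoder described above, the following minimal PyTorch sketch decodes a semantic code vector with a linear morphable model. The block sizes (pose 6, shape 80, expression 64, reflectance 80, illumination 27) and the basis tensors are illustrative placeholders, not the paper's actual parameterization, and the differentiable image-formation step is omitted.

        import torch

        # Sketch of a MoFA-style parametric decoder over a linear 3D morphable
        # model. All dimensions and bases are hypothetical placeholders.
        class MorphableDecoder(torch.nn.Module):
            def __init__(self, mean_shape, shape_basis, expr_basis):
                super().__init__()
                self.register_buffer("mean_shape", mean_shape)    # (3N,)
                self.register_buffer("shape_basis", shape_basis)  # (3N, 80)
                self.register_buffer("expr_basis", expr_basis)    # (3N, 64)

            def forward(self, code):
                # Split the code into semantically defined blocks:
                # pose, shape, expression, reflectance, illumination.
                pose, shape, expr, refl, light = torch.split(
                    code, [6, 80, 64, 80, 27], dim=-1)
                geometry = (self.mean_shape
                            + shape @ self.shape_basis.T
                            + expr @ self.expr_basis.T)
                # A differentiable renderer would form the output image from
                # geometry, reflectance, pose and illumination here.
                return geometry, pose, refl, light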

    Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz

    The reconstruction of dense 3D models of face geometry and appearance from a single image is highly challenging and ill-posed. To constrain the problem, many approaches rely on strong priors, such as parametric face models learned from limited 3D scan data. However, prior models restrict generalization of the true diversity in facial geometry, skin reflectance and illumination. To alleviate this problem, we present the first approach that jointly learns 1) a regressor for face shape, expression, reflectance and illumination on the basis of 2) a concurrently learned parametric face model. Our multi-level face model combines the advantage of 3D Morphable Models for regularization with the out-of-space generalization of a learned corrective space. We train end-to-end on in-the-wild images without dense annotations by fusing a convolutional encoder with a differentiable expert-designed renderer and a self-supervised training loss, both defined at multiple detail levels. Our approach compares favorably to the state-of-the-art in terms of reconstruction quality, better generalizes to real world faces, and runs at over 250 Hz. Comment: CVPR 2018 (Oral). Project webpage: https://gvv.mpi-inf.mpg.de/projects/FML
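
    To make the two-level idea concrete, here is a hedged PyTorch sketch, assuming a fixed linear morphable-model base plus a learned linear corrective space over the same vertices; the layer sizes and the corrective code are invented for illustration, not taken from the paper.

        import torch

        # Sketch of a multi-level face model: a 3DMM base level plus a
        # jointly learned per-vertex corrective level. Sizes are illustrative.
        class MultiLevelFaceModel(torch.nn.Module):
            def __init__(self, mean_shape, shape_basis, corr_dim=32):
                super().__init__()
                self.register_buffer("mean_shape", mean_shape)    # (3N,)
                self.register_buffer("shape_basis", shape_basis)  # (3N, K)
                # Corrective space, learned jointly with the regressor.
                self.corrective = torch.nn.Linear(
                    corr_dim, mean_shape.numel(), bias=False)

            def forward(self, alpha, gamma):
                coarse = self.mean_shape + alpha @ self.shape_basis.T  # 3DMM level
                fine = self.corrective(gamma)          # learned correction level
                return coarse + fine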

    An Algorithm for Constructing Deformable 3D Face Models and a Justification of Its Applicability in Person Recognition Systems

    The article describes an algorithm for automatically constructing deformable 3D face models based on the Active Shape Models method, Shepard's method for restoring landscape surfaces, and a set of particular 3D face models. As an alternative to EER, an accuracy measure for face-based person recognition is proposed that fixes the FAR at a chosen value. Test results for the algorithm are presented. We demonstrate the performance of the obtained models within a recognition algorithm on a large database of several thousand images (the 2000 release of the FERET image database), containing photographs of people at angles of 0, 45 and 90 degrees relative to the optical axis of the camera. Analysis of the results showed that using deformable face models does not reduce the quality of face-based person recognition even under difficult initial conditions, and in some cases improves recognition results.
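
    Shepard's method, named in the abstract as the surface-restoration step, is ordinary inverse-distance-weighted interpolation. A minimal NumPy sketch follows; the power parameter p=2 is a common default rather than a value taken from the article.

        import numpy as np

        # Shepard (inverse-distance-weighted) interpolation: restore a surface
        # from scattered known samples, e.g. depths at detected landmarks.
        def shepard(points, values, query, p=2.0, eps=1e-12):
            """points: (N, 2) known positions; values: (N,) samples;
            query: (M, 2) positions where the surface is restored."""
            d = np.linalg.norm(query[:, None, :] - points[None, :, :], axis=-1)
            w = 1.0 / np.maximum(d, eps) ** p     # inverse-distance weights
            return (w @ values) / w.sum(axis=1)   # weighted average per query

    A query that coincides with a known point is dominated by that point's (clamped) weight, so the interpolant passes through the data up to numerical precision.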

    EgoFace: Egocentric Face Performance Capture and Videorealistic Reenactment

    Face performance capture and reenactment techniques use multiple cameras and sensors, positioned at a distance from the face or mounted on heavy wearable devices. This limits their applications in mobile and outdoor environments. We present EgoFace, a radically new lightweight setup for face performance capture and front-view videorealistic reenactment using a single egocentric RGB camera. Our lightweight setup allows operation in uncontrolled environments and lends itself to telepresence applications such as video-conferencing from dynamic environments. The input image is projected into a low-dimensional latent space of facial expression parameters, and careful adversarial training then turns synthetic renderings of this parameter space into a videorealistic animation. Our problem is challenging because the human visual system is sensitive to the smallest facial irregularities that could occur in the final results, and this sensitivity is even stronger for video. Our solution is trained in a pre-processing stage, in a supervised manner but without manual annotations. EgoFace captures a wide variety of facial expressions, including mouth movements and asymmetrical expressions. It works under varying illumination, backgrounds and movements, handles people of different ethnicities, and operates in real time.
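
    The adversarial-training step described above can be illustrated with a generic conditional GAN update; the networks G and D and their conditioning signature are placeholders standing in for the EgoFace architecture, which the abstract does not specify.

        import torch
        import torch.nn.functional as F

        # Illustrative adversarial update for mapping expression parameters to
        # a videorealistic frame. G and D are placeholder networks.
        def adversarial_step(G, D, opt_g, opt_d, params, real_frame):
            fake = G(params)  # frame synthesized from expression parameters
            # Discriminator: separate real frames from generated ones,
            # conditioned on the expression parameters.
            d_loss = (F.softplus(-D(real_frame, params)).mean()
                      + F.softplus(D(fake.detach(), params)).mean())
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()
            # Generator: produce frames the discriminator accepts as real.
            g_loss = F.softplus(-D(fake, params)).mean()
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()
            return d_loss.item(), g_loss.item()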

    Unsupervised Training for 3D Morphable Model Regression

    We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on-the-fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch distribution loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the network can correctly reinterpret its own output, and a multi-view identity loss that compares the features of the predicted 3D face and the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results. Comment: CVPR 2018 version with supplemental material (http://openaccess.thecvf.com/content_cvpr_2018/html/Genova_Unsupervised_Training_for_CVPR_2018_paper.html)
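
    The three objectives map naturally onto short loss terms. Below is a hedged PyTorch sketch, assuming a standard-normal morphable-model prior and placeholder components: a regressor R, a differentiable renderer render(params, view), and a face-recognition feature network face_net, none of which are the paper's actual implementations.

        import torch

        # Sketch of the three objectives: batch distribution, loopback,
        # and multi-view identity. All components are placeholders.
        def training_losses(R, render, face_net, images, views):
            params = R(face_net(images))  # regress morphable-model coordinates
            # 1) Batch distribution loss: batch statistics should match the
            #    (assumed standard-normal) morphable-model prior.
            dist_loss = (params.mean(0).pow(2).sum()
                         + (params.var(0) - 1).pow(2).sum())
            # 2) Loopback loss: re-encoding the rendered face should
            #    reproduce the predicted parameters.
            loop_loss = (R(face_net(render(params, views[0]))) - params).pow(2).mean()
            # 3) Multi-view identity loss: recognition features of renderings
            #    from several viewpoints should match the input photograph.
            id_loss = sum((face_net(render(params, v)) - face_net(images)).pow(2).mean()
                          for v in views)
            return dist_loss, loop_loss, id_loss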