
    Research on 3D reconstruction based on 2D face images.

    3D face reconstruction is a popular research area in computer vision, with a wide range of applications in fields such as animation design, virtual reality, medical guidance, and face recognition. Current commercial 3D face reconstruction generally relies on large image-scanning equipment that fuses multiple sensor images. However, this approach requires manual modelling, which is costly in both time and money, and the equipment itself is expensive, making it unpopular in practical applications. Compared with multi-image 3D face reconstruction, the single-image approach reduces computational time and economic cost, is relatively simple to implement, and does not require special hardware. We therefore focus on the single-image approach in this dissertation and contribute both research novelty and practical utility. The main work is as follows. A unique pre-processing pipeline is designed that separates face alignment from face reconstruction. The Active Shape Model (ASM) algorithm is used for face alignment to detect facial feature points in the image. The face data is pose-corrected so that the corrected face better matches the face pose of the UV position map, and UV coordinates are then used to map the 3D information onto the 2D image, creating a UV-3D mapping map. To enhance the effect, the dissertation also crops faces so that face data fills as much of the input as possible, and expands the face dataset through rotation, scaling, translation, and noise addition. The neural network model is improved using the idea of residual learning to train the model incrementally, emphasizing the reconstruction of depth information: face features are first extracted by the encoding and decoding layers and then learned by the residual learning layer. Compared with previous algorithms, we achieve a considerable lead on the 300W-LP face dataset, with a 35% reduction in accumulated NME error over the RPN algorithm. With the proposed pre-processing methods and residual structures, the experimental results show good performance on 3D face reconstruction. The end-to-end deep learning approach achieves better reconstruction quality and accuracy than traditional, model-based face reconstruction methods.
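The 35% improvement above is reported in NME (normalized mean error), the standard evaluation metric on 300W-LP-style benchmarks. A minimal sketch of how NME is typically computed follows; the normalization by bounding-box diagonal is an assumption for illustration (papers also normalize by interocular distance), and the function name is hypothetical:

```python
import numpy as np

def nme(pred, gt):
    """Normalized Mean Error between predicted and ground-truth landmark sets.

    pred, gt: (N, 3) arrays of landmark coordinates.
    Normalizes the mean per-landmark Euclidean error by the
    ground-truth bounding-box diagonal (one common convention).
    """
    per_point = np.linalg.norm(pred - gt, axis=1)   # Euclidean error per landmark
    extent = gt.max(axis=0) - gt.min(axis=0)        # bounding-box size per axis
    diagonal = np.sqrt((extent ** 2).sum())         # box diagonal
    return per_point.mean() / diagonal
```

A lower NME means predictions lie closer to ground truth relative to face size, which makes scores comparable across faces of different scales.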

    Deep face recognition in the wild

    Face recognition has attracted particular interest in biometric recognition, with wide applications in security, entertainment, health, and marketing. Recent years have witnessed rapid development of face recognition techniques in both academia and industry with the advent of (a) large amounts of annotated training data, (b) Convolutional Neural Network (CNN) based deep architectures, (c) affordable, powerful computation resources, and (d) advanced loss functions. Despite this significant progress, challenges remain to be tackled. This thesis contributes to in-the-wild face recognition from three perspectives: network design, model compression, and model explanation. Firstly, although facial landmarks capture pose, expression, and shape information, they are used only as a pre-processing step in the current face recognition pipeline, without considering their potential to improve the model's representation. We therefore propose the ``FAN-Face'' framework, which gradually integrates features from different layers of a facial landmark localization network into different layers of the recognition network. This operation breaks the align-and-crop data pre-processing routine yet achieves a simple, orthogonal improvement to deep face recognition. We attribute this success to the coarse-to-fine shape-related information stored in the alignment network, which helps establish correspondence for face matching. Secondly, motivated by the success of knowledge distillation for model compression in object classification, we examine current knowledge distillation methods for training lightweight face recognition models. Taking the classification problem at hand into account, we advocate a direct feature-matching approach in which the pre-trained classifier of the teacher validates the feature representation produced by the student network. In addition, since a teacher network trained on the labeled dataset alone captures rich relational information among labels in both class space and feature space, we make a first attempt to use unlabeled data to further enhance the model's performance under the knowledge distillation framework. Finally, to increase the interpretability of the ``black box'' deep face recognition model, we develop a new structure with dynamic convolution that provides clustering of faces in terms of facial attributes. In particular, we propose to cluster the routing weights of dynamic convolution experts to learn facial attributes in an unsupervised manner without forfeiting face recognition accuracy. We also introduce group convolution into dynamic convolution to increase expert granularity, and we further confirm that the routing vector benefits feature-based face reconstruction via the deep inversion technique.
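The direct feature-matching distillation described above, in which the teacher's frozen classifier validates the student's features, can be sketched as a two-term loss. All names here (`distill_loss`, `W_teacher`, the `alpha` weighting) are hypothetical, not the thesis's actual notation:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distill_loss(student_feat, teacher_feat, W_teacher, labels, alpha=0.5):
    """Sketch of direct feature-matching distillation.

    The frozen teacher classifier W_teacher scores the *student's*
    features, pushing the student into the teacher's feature space;
    an L2 term additionally matches the feature vectors directly.
    """
    logits = student_feat @ W_teacher                     # teacher head on student features
    p = softmax(logits)
    ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    l2 = ((student_feat - teacher_feat) ** 2).sum(axis=1).mean()
    return alpha * ce + (1 - alpha) * l2
```

When the student's features equal the teacher's, the L2 term vanishes and only the classification term remains, so the loss directly measures how "teacher-compatible" the student's representation is.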

    MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

    In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is our new differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.
    Comment: International Conference on Computer Vision (ICCV) 2017 (Oral), 13 pages
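The differentiable parametric decoder can be illustrated, in heavily simplified form, by a linear morphable-model decode that maps a semantic code vector to 3D geometry. This is an illustrative reduction with hypothetical names; the actual MoFA decoder additionally models skin reflectance, scene illumination, pose, and perspective projection:

```python
import numpy as np

def decode_face(code, mean_shape, id_basis, expr_basis):
    """Toy linear parametric decoder: code vector -> 3D vertices.

    code: (k_id + k_expr,) semantic parameter vector
    mean_shape: (3N,) mean face geometry, flattened
    id_basis: (3N, k_id) identity/shape basis
    expr_basis: (3N, k_expr) expression basis
    """
    k_id = id_basis.shape[1]
    alpha, delta = code[:k_id], code[k_id:]        # split code into shape and expression
    verts = mean_shape + id_basis @ alpha + expr_basis @ delta
    return verts.reshape(-1, 3)                    # N vertices in 3D
```

Because every operation is linear (hence differentiable), gradients flow from a rendering loss back through the decoder into the encoder, which is the property that makes end-to-end unsupervised training possible.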

    Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition

    Two approaches are proposed for cross-pose face recognition, one based on the 3D reconstruction of facial components and the other based on a deep Convolutional Neural Network (CNN). Unlike most 3D approaches that consider holistic faces, the proposed approach considers 3D facial components. It segments a 2D gallery face into components, reconstructs the 3D surface for each component, and recognizes a probe face by component features. The segmentation is based on the landmarks located by a hierarchical algorithm that combines the Faster R-CNN for face detection and the Reduced Tree Structured Model for landmark localization. The core part of the CNN-based approach is a revised VGG network. We study the performance with different settings on the training set, including the synthesized data from 3D reconstruction, the real-life data from an in-the-wild database, and both types of data combined. We investigate the performance of the network when it is employed as a classifier or designed as a feature extractor. The two recognition approaches and the fast landmark localization are evaluated in extensive experiments and compared to state-of-the-art methods to demonstrate their efficacy.
    Comment: 14 pages, 12 figures, 4 tables
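Recognizing a probe face by component features can be sketched as a nearest-gallery search over concatenated per-component descriptors. This is an illustrative reduction with hypothetical names, not the paper's exact matcher:

```python
import numpy as np

def component_match(gallery_feats, probe_feats):
    """Identify a probe face from per-component feature vectors.

    gallery_feats: dict mapping identity -> list of per-component vectors
    probe_feats: list of per-component vectors for the probe face
    Concatenates each face's component features into one descriptor
    and returns the gallery identity with highest cosine similarity.
    """
    def concat_norm(feats):
        v = np.concatenate(feats)
        return v / np.linalg.norm(v)               # unit-length descriptor

    p = concat_norm(probe_feats)
    scores = {gid: float(concat_norm(f) @ p) for gid, f in gallery_feats.items()}
    return max(scores, key=scores.get)             # best-matching identity
```

Matching on components rather than the whole face is what gives the approach robustness to pose: components that remain visible across poses still contribute discriminative features.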