
    A 3D Face Modelling Approach for Pose-Invariant Face Recognition in a Human-Robot Environment

    Face analysis techniques have become a crucial component of human-machine interaction in the fields of assistive and humanoid robotics. However, the variations in head pose that arise naturally in these environments remain a great challenge. In this paper, we present a real-time-capable 3D face modelling framework for 2D in-the-wild images that is applicable to robotics. The fitting of the 3D Morphable Model is based exclusively on automatically detected landmarks. After fitting, the face can be corrected in pose and transformed back to a frontal 2D representation that is more suitable for face recognition. We conduct face recognition experiments with non-frontal images from the MUCT database and uncontrolled, in-the-wild images from the PaSC database, the most challenging face recognition database to date, and show improved performance. Finally, we present our SCITOS G5 robot system, which incorporates our framework as a means of image pre-processing for face analysis.
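
    As a hedged illustration of the landmark-driven pose recovery such a framework builds on: given a handful of detected 2D landmarks and their counterparts on a generic 3D head, head pose can be estimated with a standard PnP solver. The 3D reference coordinates and the intrinsics below are illustrative assumptions, not the paper's 3DMM.

```python
import numpy as np
import cv2

# Generic 3D reference landmarks (nose tip, chin, eye corners, mouth corners)
# in a head-centred frame, in mm; coordinates are illustrative only.
MODEL_POINTS = np.array([
    [0.0, 0.0, 0.0],          # nose tip
    [0.0, -63.6, -12.8],      # chin
    [-43.3, 32.7, -26.0],     # left eye outer corner
    [43.3, 32.7, -26.0],      # right eye outer corner
    [-28.9, -28.9, -24.1],    # left mouth corner
    [28.9, -28.9, -24.1],     # right mouth corner
], dtype=np.float64)

def estimate_head_pose(landmarks_2d, image_hw):
    """Estimate head rotation/translation from six detected 2D landmarks."""
    h, w = image_hw
    camera = np.array([[w, 0.0, w / 2.0],   # crude pinhole intrinsics:
                       [0.0, w, h / 2.0],   # focal length approximated by width
                       [0.0, 0.0, 1.0]])
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS,
                                  np.asarray(landmarks_2d, dtype=np.float64),
                                  camera, None)
    if not ok:
        raise RuntimeError("PnP fitting failed")
    return rvec, tvec  # Rodrigues rotation vector and translation
```

    Once the pose is known, the fitted mesh can be rotated to a frontal view and its sampled texture re-rendered, which is the pose-correction step the abstract describes.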

    Using 3D morphable models for face recognition in video

    The 3D Morphable Face Model (3DMM) has been used for over a decade for creating 3D models from single images of faces. This model is based on a PCA model of the 3D shape and texture generated from a limited number of 3D scans. The goal of fitting a 3DMM to an image is to find the model coefficients, the lighting and the other imaging variables from which we can remodel that image as accurately as possible. The model coefficients consist of texture and shape descriptors, and can, without further processing, be used in verification and recognition experiments. Until now, little research has been performed on the influence of the diverse parameters of the 3DMM on recognition performance. In this paper we introduce a Bayesian method for texture backmapping from multiple images. Using the information from multiple (non-frontal) views, we construct a frontal view which can be used as input to 2D face recognition software. We also show how the number of triangles used in the fitting process influences recognition performance when using the shape descriptors. The verification results of the 3DMM are compared to state-of-the-art 2D face recognition software on the MultiPIE dataset. The 2D FR software outperforms the Morphable Model, but the Morphable Model can be useful as a preprocessor to synthesize a frontal view from a non-frontal view and to combine images from multiple views into a single frontal view. We show results for this preprocessing technique using an average face shape and a fitted face shape, with an MM texture, with the original texture and with a hybrid texture. The preprocessor has improved the verification results significantly on the dataset.
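
    A minimal sketch of the multi-view texture fusion idea: each view contributes a backmapped UV texture plus a per-texel confidence (e.g. visibility or foreshortening), and texels are combined by precision weighting, which is one simple Bayesian reading of the fusion step. The array layout and names are assumptions, not the paper's implementation.

```python
import numpy as np

def fuse_textures(textures, confidences, eps=1e-8):
    """Fuse per-view UV textures (V, H, W, 3) by per-texel confidence (V, H, W).

    Texels unseen in a view should carry (near-)zero confidence so the fused
    frontal texture is dominated by the views that actually observed them.
    """
    w = confidences[..., None]                    # broadcast weights over RGB
    return (w * textures).sum(axis=0) / (w.sum(axis=0) + eps)
```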

    Expression invariant face recognition using multi-stage 3D face fitting with 3D morphable face model

    This paper proposes a new, fully automated, three-dimensional-model-based, real-time-capable approach to recognizing facial expressions from visual images of human faces in real-time scenarios. A multi-stage 3D fitting algorithm is applied with a morphable model to ensure high accuracy and speed, in addition to eliminating pose and illumination effects during the recognition process. The idea of the model is to update the parameters at each stage of the fitting process. Feature extraction is performed using an active appearance model, while feature classification uses a tree model to ensure good processing speed. The proposed model is expected to show good results when shape, texture and extrinsic variations occur in the 3D domain, since the combination of the multi-stage fitting algorithm and the tree model can enhance the speed and accuracy of the system's recognition capabilities. This 3D morphable model algorithm can be widely used for 3D face analysis and 3D face recognition in real-time scenarios.
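
    The abstract is light on detail, but a stage-wise parameter-update loop of the kind described might look as follows: each stage frees one block of parameters (e.g. pose, then shape, then expression), minimises a reprojection residual, and locks the result before the next stage. Everything here, including the use of least_squares, is an assumption for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_multistage(residual_fn, params0, stages):
    """Refine parameters block by block; `stages` lists boolean masks over params."""
    params = params0.astype(float).copy()
    for free in stages:
        def staged(p_free, params=params, free=free):
            full = params.copy()
            full[free] = p_free                  # only this stage's block varies
            return residual_fn(full)
        params[free] = least_squares(staged, params[free]).x  # lock, then move on
    return params
```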

    Model based methods for locating, enhancing and recognising low resolution objects in video

    Visual perception is our most important sense, enabling us to detect and recognise objects even in low-detail video scenes. While humans perform such object detection and recognition tasks reliably, most computer vision algorithms struggle with wide-angle surveillance videos, where low resolution and poor object detail make automatic processing difficult. Additional problems arise from varying pose and lighting conditions as well as non-cooperative subjects. All these constraints pose problems for automatic scene interpretation of surveillance video, including object detection, tracking and object recognition.

    Therefore, the aim of this thesis is to detect, enhance and recognise objects by incorporating a priori information and by using model-based approaches. Motivated by the increasing demand for automatic methods for object detection, enhancement and recognition in video surveillance, different aspects of the video processing task are investigated, with a focus on human faces. In particular, the challenge of fully automatic face pose and shape estimation, by fitting a deformable generic 3D face model under varying pose and lighting conditions, is tackled. Principal Component Analysis (PCA) is utilised to build an appearance model that is then used within a particle-filter-based approach to fit the 3D face mask to the image, recovering face pose and person-specific shape information simultaneously. Experiments demonstrate the use at different resolutions and under varying pose and lighting conditions.

    Following that, a combined tracking and super-resolution approach enhances the quality of poor-detail video objects. A 3D object mask is subdivided such that every mask triangle is smaller than a pixel when projected into the image, and is then used for model-based tracking. The mask subdivision then allows for super-resolution of the object by combining several video frames. This approach achieves better results than traditional super-resolution methods without the use of interpolation or deblurring.

    Lastly, object recognition is performed in two different ways. The first recognition method is applied to characters and used for license plate recognition. A novel character model is proposed to create different appearances, which are then matched with the image of an unknown character for recognition. This allows for simultaneous character segmentation and recognition, and high recognition rates are achieved for low-resolution characters down to only five pixels in size. While this approach is only feasible for objects with a limited number of different appearances, like characters, the second recognition method is applicable to any object, including human faces. A generic 3D face model is automatically fitted to an image of a human face, and recognition is performed at the mask level rather than the image level. This approach requires neither an initial pose estimation nor the selection of feature points; face alignment is provided implicitly by the mask-fitting process.
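
    As a sketch of the particle-filter fitting used for the face mask: particles carry pose/shape hypotheses, the PCA appearance model supplies a likelihood for weighting, and resampling concentrates particles on good fits. The likelihood callable and the Gaussian diffusion model are placeholders, not the thesis implementation.

```python
import numpy as np

def particle_filter_step(particles, likelihood, motion_noise, rng):
    """One predict-weight-resample cycle over state vectors of shape (N, D)."""
    # Predict: diffuse each pose/shape hypothesis with Gaussian motion noise.
    particles = particles + rng.normal(scale=motion_noise, size=particles.shape)
    # Weight: score each hypothesis against the image via the appearance model.
    weights = np.array([likelihood(p) for p in particles])
    weights /= weights.sum()
    # Resample: draw particles proportionally to weight, resetting to uniform.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```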

    Fine-Grained Head Pose Estimation Without Keypoints

    Estimating the head pose of a person is a crucial problem with a large number of applications, such as aiding gaze estimation, modeling attention, fitting 3D models to video and performing face alignment. Traditionally, head pose is computed by estimating some keypoints from the target face and solving the 2D-to-3D correspondence problem with a mean human head model. We argue that this is a fragile method because it relies entirely on landmark detection performance, the extraneous head model and an ad-hoc fitting step. We present an elegant and robust way to determine pose by training a multi-loss convolutional neural network on 300W-LP, a large synthetically expanded dataset, to predict intrinsic Euler angles (yaw, pitch and roll) directly from image intensities through joint binned pose classification and regression. We present empirical tests on common in-the-wild pose benchmark datasets which show state-of-the-art results. Additionally, we test our method on a dataset usually used for pose estimation using depth, and begin to close the gap with state-of-the-art depth-based pose methods. We open-source our training and testing code and release our pre-trained models. (Comment: Accepted to the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018.)
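
    The joint binned-classification-plus-regression idea can be sketched directly: each angle is discretised into bins for a cross-entropy term, and a continuous prediction is recovered as the softmax expectation over bin centres for an MSE term. The 3-degree bins spanning roughly ±99° follow the common setup for this method; treat the exact values and the loss weighting as assumptions.

```python
import torch
import torch.nn.functional as F

def pose_loss(logits, angle_gt, edges, alpha=0.5):
    """logits: (B, n_bins) for one angle; angle_gt: (B,) degrees; edges: bin edges."""
    centers = (edges[:-1] + edges[1:]) / 2                  # (n_bins,) bin centres
    bin_gt = (torch.bucketize(angle_gt, edges) - 1).clamp(0, logits.size(1) - 1)
    cls = F.cross_entropy(logits, bin_gt)                   # coarse bin classification
    expected = (F.softmax(logits, dim=1) * centers).sum(1)  # expectation -> degrees
    return cls + alpha * F.mse_loss(expected, angle_gt)     # joint multi-loss

# e.g. edges = torch.arange(-99.0, 102.0, 3.0) gives 66 three-degree bins;
# the network carries three such heads, one per Euler angle (yaw, pitch, roll).
```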

    UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition

    Recently proposed robust 3D face alignment methods establish either dense or sparse correspondence between a 3D face model and a 2D facial image. The use of these methods presents new challenges as well as opportunities for facial texture analysis. In particular, by sampling the image using the fitted model, a facial UV map can be created. Unfortunately, due to self-occlusion, such a UV map is always incomplete. In this paper, we propose a framework for training a Deep Convolutional Neural Network (DCNN) to complete the facial UV map extracted from in-the-wild images. To this end, we first gather complete UV maps by fitting a 3D Morphable Model (3DMM) to various multi-view image and video datasets, as well as by leveraging a new 3D dataset with over 3,000 identities. Second, we devise a meticulously designed architecture that combines local and global adversarial DCNNs to learn an identity-preserving facial UV completion model. We demonstrate that by attaching the completed UV map to the fitted mesh and generating instances of arbitrary poses, we can increase pose variation for training deep face recognition/verification models and minimise pose discrepancy during testing, which leads to better performance. Experiments on both controlled and in-the-wild UV datasets prove the effectiveness of our adversarial UV completion model. We achieve state-of-the-art verification accuracy, 94.05%, under the CFP frontal-profile protocol, solely by combining pose augmentation during training and pose discrepancy reduction during testing. We will release the first in-the-wild UV dataset (which we refer to as WildUV), comprising complete facial UV maps from 1,892 identities, for research purposes.
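
    A hedged sketch of an identity-preserving completion objective in the spirit of the architecture described: a reconstruction term on visible texels, global and local adversarial terms, and an identity feature loss. The module names (d_global, d_local, id_net, crop) are caller-supplied placeholders, not the paper's networks.

```python
import torch.nn.functional as F

def completion_loss(gen_uv, real_uv, mask, d_global, d_local, crop, id_net,
                    w_adv=0.01, w_id=0.1):
    """mask marks texels visible in the input UV; weights are assumed values."""
    recon = F.l1_loss(gen_uv * mask, real_uv * mask)    # stay faithful where visible
    adv = -d_global(gen_uv).mean() \
          - d_local(crop(gen_uv)).mean()                # fool global + local critics
    ident = F.l1_loss(id_net(gen_uv), id_net(real_uv))  # preserve identity features
    return recon + w_adv * adv + w_id * ident
```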