
    Linear Facial Expression Transfer With Active Appearance Models

    The issue of transferring facial expressions from one person's face to another's has been an area of interest for the movie industry and the computer graphics community for quite some time. In recent years, with the proliferation of online image and video collections and web applications, such as Google Street View, the question of preserving privacy through face de-identification has gained interest in the computer vision community. In this paper, we focus on the problem of real-time dynamic facial expression transfer using an Active Appearance Model framework. We provide a theoretical foundation for a generalisation of two well-known expression transfer methods and demonstrate the improved visual quality of the proposed linear extrapolation transfer method on examples of face swapping and expression transfer using the AVOZES data corpus. Realistic talking faces can be generated in real-time at low computational cost.
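The linear extrapolation transfer operates in AAM parameter space; a minimal sketch might look like the following (the function name and the scaling factor `alpha` are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def transfer_expression(p_src_neutral, p_src_expr, p_tgt_neutral, alpha=1.0):
    """Linear extrapolation transfer in AAM parameter space: add the
    source subject's expression displacement to the target subject's
    neutral parameters, optionally scaled by alpha."""
    delta = p_src_expr - p_src_neutral    # expression displacement of the source
    return p_tgt_neutral + alpha * delta  # extrapolated target parameters
```

In practice the resulting parameter vector would be fed back through the target subject's AAM to synthesise the transferred expression.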

    3D facial geometric features for constrained local model

    We propose a 3D Constrained Local Model (CLM) framework for deformable face alignment in depth images. Our framework exploits the intrinsic 3D geometric information in depth data by utilizing robust histogram-based 3D geometric features derived from surface normal vectors. In addition, we demonstrate that fusing intensity data with the 3D features further improves facial landmark localization accuracy. The experiments are conducted on the publicly available FRGC database. The results show that our 3D feature-based CLM clearly outperforms the raw depth feature-based CLM in terms of fitting accuracy and robustness, and that the fusion of intensity and 3D depth features further improves the performance. Another benefit is that the proposed 3D features in our framework do not require any pre-processing of the data.
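The paper's exact feature definition is not reproduced here; as a rough illustration, a histogram over per-pixel normal directions estimated from depth gradients might be computed as follows (the azimuth binning scheme is an assumption):

```python
import numpy as np

def normal_histogram_feature(depth_patch, bins=8):
    """Histogram-based geometric feature for a depth patch: estimate a
    surface normal per pixel from the depth gradients, then histogram
    the in-plane (azimuth) angle of those normals."""
    gy, gx = np.gradient(depth_patch.astype(float))
    # Per-pixel normal direction, up to scale, is (-gx, -gy, 1);
    # its in-plane angle is the azimuth.
    azimuth = np.arctan2(-gy, -gx)
    hist, _ = np.histogram(azimuth, bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)  # L1-normalised histogram
```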

    A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

    Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually assessing the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic. (Comment: E. Antonakos and P. Snape contributed equally and have joint second authorship.)

    From pixels to response maps: discriminative image filtering for face alignment in the wild

    We propose a face alignment framework that relies on the texture model generated by the responses of discriminatively trained part-based filters. Unlike standard texture models built from pixel intensities or responses generated by generic filters (e.g. Gabor), our framework has two important advantages. Firstly, by virtue of discriminative training, invariance to external variations (like identity, pose, illumination and expression) is achieved. Secondly, we show that the responses generated by discriminatively trained filters (or patch-experts) are sparse and can be modeled using a very small number of parameters. As a result, the optimization methods based on the proposed texture model can better cope with unseen variations. We illustrate this point by formulating both part-based and holistic approaches for generic face alignment and show that our framework outperforms the state-of-the-art on multiple "wild" databases. The code and dataset annotations are available for research purposes from http://ibug.doc.ic.ac.uk/resources.
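As a sketch of the patch-expert idea, the response map of a trained filter over an image patch is just a valid cross-correlation; the discriminative training of the filter itself is omitted here, and the implementation below is an illustrative numpy-only version:

```python
import numpy as np

def response_map(patch, filt):
    """Valid cross-correlation of a patch-expert filter with an image
    patch; the peak of the resulting map indicates the most likely
    landmark position within the patch."""
    fh, fw = filt.shape
    h = patch.shape[0] - fh + 1
    w = patch.shape[1] - fw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # Dot product of the filter with each overlapping window.
            out[i, j] = (patch[i:i + fh, j:j + fw] * filt).sum()
    return out
```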

    Learning-based face synthesis for pose-robust recognition from single image

    Face recognition in real-world conditions requires the ability to deal with a number of conditions, such as variations in pose, illumination and expression. In this paper, we focus on variations in head pose and use a computationally efficient regression-based approach for synthesising face images in different poses, which are used to extend the face recognition training set. In this data-driven approach, the correspondences between facial landmark points in frontal and non-frontal views are learnt offline from manually annotated training data via Gaussian Process Regression. We then use this learner to synthesise non-frontal face images from any unseen frontal image. To demonstrate the utility of this approach, two frontal face recognition systems (the commonly used PCA and the recent Multi-Region Histograms) are augmented with synthesised non-frontal views for each person. This synthesis and augmentation approach is experimentally validated on the FERET dataset, showing a considerable improvement in recognition rates for ±40° and ±60° views, while maintaining high recognition rates for ±15° and ±25° views.
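A numpy-only sketch of the Gaussian Process Regression mapping step (posterior mean only, RBF kernel; the hyperparameters are illustrative and the paper's actual kernel choice is not assumed here):

```python
import numpy as np

def gpr_predict(X_train, Y_train, X_test, length_scale=1.0, noise=1e-6):
    """GPR posterior mean with an RBF kernel: maps frontal landmark
    vectors (rows of X) to non-frontal landmark vectors (rows of Y)."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-0.5 * d2 / length_scale ** 2)
    # Regularised Gram matrix over the training inputs.
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    return rbf(X_test, X_train) @ np.linalg.solve(K, Y_train)  # posterior mean
```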

    Evaluating AAM fitting methods for facial expression recognition


    A Weakly Supervised Approach to Emotion-change Prediction and Improved Mood Inference

    Whilst a majority of affective computing research focuses on inferring emotions, examining mood or understanding the mood-emotion interplay has received significantly less attention. Building on prior work, we (a) deduce and incorporate emotion-change (Δ) information for inferring mood, without resorting to annotated labels, and (b) attempt mood prediction for long duration video clips, in alignment with the characterisation of mood. We generate the emotion-change (Δ) labels via metric learning from a pre-trained Siamese Network, and use these in addition to mood labels for mood classification. Experiments evaluating unimodal (training only using mood labels) vs multimodal (training using mood plus Δ labels) models show that mood prediction benefits from the incorporation of emotion-change information, emphasising the importance of modelling the mood-emotion interplay for effective mood inference. (Comment: 9 pages, 3 figures, 6 tables, published in IEEE International Conference on Affective Computing and Intelligent Interaction.)
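One plausible way weak emotion-change (Δ) labels could be derived from a pre-trained Siamese encoder is by thresholding the embedding distance between frames; the thresholding rule below is purely an assumption for illustration, as the paper's exact metric-learning labelling procedure is not reproduced here:

```python
import numpy as np

def emotion_change_label(emb_prev, emb_curr, threshold=0.5):
    """Weak delta label from two frame embeddings of a Siamese encoder:
    1 if the embedding moved more than `threshold`, else 0."""
    return int(np.linalg.norm(emb_curr - emb_prev) > threshold)
```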

    The utility of synthetic images for face modelling and its applications

    In recent years, deformable face model based approaches have been a very active area of research. The Active Appearance Model (AAM) has been by far the most popular approach for generating the face models and has been used in several facial image analysis based applications. This thesis investigates in detail the utility of synthetically generated facial images for face modelling and its applications, such as facial performance transfer and pose-invariant face recognition, with a particular focus on AAM. Beginning with a detailed overview of the AAM framework, an extensive application oriented review is presented. This includes a comparative study of various existing 2D variants of AAM fitting methods for the task of automatic facial expression recognition (FER) and auditory-visual automatic speech recognition (AV-ASR). For FER, the experiments were performed under both person dependent and person independent scenarios. In contrast, for AV-ASR, the experiments were performed under person dependent scenarios where the main focus was on accurately capturing the lip movements. Overall, the best results were obtained by using the Iterative Error Bound Minimisation method, which consistently resulted in accurate face model alignment even when the initial face detection used to initialise the fitting procedure was poor. Furthermore, to demonstrate the utility of the existing AAM framework, a novel approach of learning the mapping between the parameters of two completely independent AAMs is presented to facilitate the facial performance transfer from one subject to another in a realistic manner, a problem which is of particular interest to the computer graphics community. The main advantage of modelling this parametric correspondence is that it allows a meaningful transfer of both the non-rigid shape and texture across faces irrespective of the speaker's gender, shape and size of the faces, and illumination conditions.
Although this application oriented review shows the potential benefits of the AAM framework, its usability is limited due to the requirement of pseudo-dense annotation of landmark points for every training image, which typically have to be annotated in a tedious and error-prone manual process. Therefore, a method for automatic annotation of face images, with arbitrary expressions and poses, and automatic model building is presented. Firstly, an approach that utilises the MPEG-4 based facial animation system to generate a set of synthetic frontal face images, with different facial expressions, from a single annotated frontal face image is proposed. Secondly, a regression-based approach for automatic annotation of non-frontal face images with arbitrary expressions that uses only the annotated frontal face images is presented. This approach employs the idea of initially learning the mapping between the landmark points of frontal face images and the corresponding non-frontal face images at various poses. Using this learnt mapping, synthetic images of unseen faces at various poses are generated by predicting the new landmark locations and warping the texture from the frontal image. These synthetic face images are used for generating a synthetic deformable face model that is used to perform fitting on unseen face images and, hence, annotate them automatically. This drastically reduces the problem of automatic face annotation and deformable model building to a problem of annotating a single frontal face image. The generalisability of the proposed approach is demonstrated by automatically annotating the face images from five publicly available databases and the results are verified by comparing them to the ground truth obtained from manual annotations. Furthermore, a fully automatic pose-invariant face recognition system is presented that can handle continuous pose variations, is not database specific, and can achieve high accuracy without any manual intervention.
The main idea is to explore the problem of pose normalising each gallery and probe image (i.e. to synthesise a frontal view of each face image) before performing the face recognition task. Firstly, to achieve full automation, a robust and fully automatic view-based AAM system is presented for locating the facial landmark points and estimating the 3D head pose from an unseen single 2D face image. Secondly, novel 2D and 3D pose normalisation methods are proposed that leverage the accurate 2D facial feature points and head pose information extracted by the view-based AAM system. The current pose-invariant face recognition system can handle pose variation up to ±45° in yaw angle and ±30° in pitch angle. Extensive face recognition experiments were conducted on five publicly available databases. The results clearly show excellent generalisability of the proposed system by achieving high accuracy on all five databases and outperforming other automatic methods convincingly, with the proposed 3D pose normalisation method outperforming the proposed 2D pose normalisation method.
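The parametric correspondence between two independent AAMs described in the thesis could, in its simplest form, be a least-squares linear map between the two parameter spaces; the sketch below is a hypothetical minimal version, and the thesis's actual learning method may differ:

```python
import numpy as np

def learn_parameter_mapping(P_src, P_tgt):
    """Least-squares linear map W between two AAM parameter spaces.
    Rows of P_src and P_tgt are corresponding parameter vectors of the
    source and target subjects; afterwards p_src @ W approximates the
    target-model parameters for the same facial performance."""
    W, *_ = np.linalg.lstsq(P_src, P_tgt, rcond=None)
    return W
```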