352 research outputs found

    CNN Based 3D Facial Expression Recognition Using Masking And Landmark Features

    Get PDF
    Automatically recognizing facial expression is an important part for human-machine interaction. In this paper, we first review the previous studies on both 2D and 3D facial expression recognition, and then summarize the key research questions to solve in the future. Finally, we propose a 3D facial expression recognition (FER) algorithm using convolutional neural networks (CNNs) and landmark features/masks, which is invariant to pose and illumination variations due to the solely use of 3D geometric facial models without any texture information. The proposed method has been tested on two public 3D facial expression databases: BU-4DFE and BU-3DFE. The results show that the CNN model benefits from the masking, and the combination of landmark and CNN features can further improve the 3D FER accuracy

    3D Face Reconstruction by Learning from Synthetic Data

    Full text link
    Fast and robust three-dimensional reconstruction of facial geometric structure from a single image is a challenging task with numerous applications. Here, we introduce a learning-based approach for reconstructing a three-dimensional face from a single image. Recent face recovery methods rely on accurate localization of key characteristic points. In contrast, the proposed approach is based on a Convolutional-Neural-Network (CNN) which extracts the face geometry directly from its image. Although such deep architectures outperform other models in complex computer vision problems, training them properly requires a large dataset of annotated examples. In the case of three-dimensional faces, currently, there are no large volume data sets, while acquiring such big-data is a tedious task. As an alternative, we propose to generate random, yet nearly photo-realistic, facial images for which the geometric form is known. The suggested model successfully recovers facial shapes from real images, even for faces with extreme expressions and under various lighting conditions.Comment: The first two authors contributed equally to this wor

    Py-Feat: Python Facial Expression Analysis Toolbox

    Full text link
    Studying facial expressions is a notoriously difficult endeavor. Recent advances in the field of affective computing have yielded impressive progress in automatically detecting facial expressions from pictures and videos. However, much of this work has yet to be widely disseminated in social science domains such as psychology. Current state of the art models require considerable domain expertise that is not traditionally incorporated into social science training programs. Furthermore, there is a notable absence of user-friendly and open-source software that provides a comprehensive set of tools and functions that support facial expression research. In this paper, we introduce Py-Feat, an open-source Python toolbox that provides support for detecting, preprocessing, analyzing, and visualizing facial expression data. Py-Feat makes it easy for domain experts to disseminate and benchmark computer vision models and also for end users to quickly process, analyze, and visualize face expression data. We hope this platform will facilitate increased use of facial expression data in human behavior research.Comment: 25 pages, 3 figures, 5 table

    Fiducial Focus Augmentation for Facial Landmark Detection

    Full text link
    Deep learning methods have led to significant improvements in the performance on the facial landmark detection (FLD) task. However, detecting landmarks in challenging settings, such as head pose changes, exaggerated expressions, or uneven illumination, continue to remain a challenge due to high variability and insufficient samples. This inadequacy can be attributed to the model's inability to effectively acquire appropriate facial structure information from the input images. To address this, we propose a novel image augmentation technique specifically designed for the FLD task to enhance the model's understanding of facial structures. To effectively utilize the newly proposed augmentation technique, we employ a Siamese architecture-based training mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss to achieve collective learning of high-level feature representations from two different views of the input images. Furthermore, we employ a Transformer + CNN-based network with a custom hourglass module as the robust backbone for the Siamese framework. Extensive experiments show that our approach outperforms multiple state-of-the-art approaches across various benchmark datasets.Comment: Accepted to BMVC'2

    3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame

    Full text link
    Facial expression spotting is the preliminary step for micro- and macro-expression analysis. The task of reliably spotting such expressions in video sequences is currently unsolved. The current best systems depend upon optical flow methods to extract regional motion features, before categorisation of that motion into a specific class of facial movement. Optical flow is susceptible to drift error, which introduces a serious problem for motions with long-term dependencies, such as high frame-rate macro-expression. We propose a purely deep learning solution which, rather than track frame differential motion, compares via a convolutional model, each frame with two temporally local reference frames. Reference frames are sampled according to calculated micro- and macro-expression durations. We show that our solution achieves state-of-the-art performance (F1-score of 0.126) in a dataset of high frame-rate (200 fps) long video sequences (SAMM-LV) and is competitive in a low frame-rate (30 fps) dataset (CAS(ME)2). In this paper, we document our deep learning model and parameters, including how we use local contrast normalisation, which we show is critical for optimal results. We surpass a limitation in existing methods, and advance the state of deep learning in the domain of facial expression spotting

    Face De-Identification for Privacy Protection

    Get PDF
    The ability to record, store and analyse images of faces economically, rapidly and on a vast scale brings people’s attention to privacy. The current privacy protection approaches for face images are mainly through masking, blurring or black-out which, however, removes data utilities along with the identifying information. As a result, these ad hoc methods are hardly used for data publishing or in further researches. The technique of de-identification attempts to remove identifying information from a dataset while preserving the data utility as much as possible. The research on de-identify structured data has been established while it remains a challenge to de-identify unstructured data such as face data in images and videos. The k-Same face de-identification was the first method that attempted to use an established de-identification theory, k-anonymity, to de-identify a face image dataset. The k-Same face de-identification is also the starting point of this thesis. Re-identification risk and data utility are two incompatible aspects in face de-identification. The focus of this thesis is to improve the privacy protection performance of a face de-identification system while providing data utility preserving solutions for different application scenarios. This thesis first proposes the k-Same furthest face de-identification method which introduces the wrong-map protection to the k-Same-M face de-identification, where the identity loss is maximised by replacing an original face with the face that has the least similarity to it. The data utility of face images has been considered from two aspects in this thesis, the dataset-wise data utility such as data distribution of the data set and the individual-wise data utility such as the facial expression in an individual image. With the aim to preserve the diversity of a face image dataset, the k-Diff-furthest face de-identification method is proposed, which extends the k-Same-furthest method and can provide the wrong-map protection. With respect to the data utility of an individual face image, the visual quality and the preservation of facial expression are discussed in this thesis. A method to merge the isolated de-identified face region and its original image background is presented. The described method can increase the visual quality of a de-identified face image in terms of fidelity and intelligibility. A novel solution to preserving facial expressions in de-identified face images is presented, which can preserve not only the category of facial expressions but also the intensity of face Action Units. Finally, an integration of the Active Appearance Model (AAM) and Generative Adversarial Network (GAN) is presented, which achieves the synthesis of realistic face images with shallow neural network architectures
    • …
    corecore