CNN Based 3D Facial Expression Recognition Using Masking And Landmark Features
Automatically recognizing facial expressions is an important part of human-machine interaction. In this paper, we first review previous studies on both 2D and 3D facial expression recognition, and then summarize the key research questions that remain open. Finally, we propose a 3D facial expression recognition (FER) algorithm using convolutional neural networks (CNNs) and landmark features/masks, which is invariant to pose and illumination variations because it uses only 3D geometric facial models without any texture information. The proposed method has been tested on two public 3D facial expression databases: BU-4DFE and BU-3DFE. The results show that the CNN model benefits from the masking, and that combining landmark and CNN features further improves 3D FER accuracy.
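As a rough illustration of the landmark-masking idea, the sketch below weights a 3D geometry map (here a depth image) by a soft mask built from facial landmark locations before it would be fed to a CNN. The landmark coordinates, mask width, and map resolution are all hypothetical, not the paper's settings.

```python
import numpy as np

def landmark_mask(landmarks, res=96, sigma=6.0):
    """Build a soft mask that emphasises regions around facial landmarks."""
    yy, xx = np.mgrid[0:res, 0:res]
    mask = np.zeros((res, res))
    for (x, y) in landmarks:
        mask = np.maximum(mask, np.exp(-((xx - x) ** 2 + (yy - y) ** 2)
                                       / (2 * sigma ** 2)))
    return mask

depth_map = np.random.rand(96, 96)                    # stand-in geometry map
landmarks = [(30, 40), (66, 40), (48, 60), (48, 78)]  # toy eye/nose/mouth points
masked_input = depth_map * landmark_mask(landmarks)   # input to the CNN
```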
3D Face Reconstruction by Learning from Synthetic Data
Fast and robust three-dimensional reconstruction of facial geometric
structure from a single image is a challenging task with numerous applications.
Here, we introduce a learning-based approach for reconstructing a
three-dimensional face from a single image. Recent face recovery methods rely
on accurate localization of key characteristic points. In contrast, the
proposed approach is based on a Convolutional-Neural-Network (CNN) which
extracts the face geometry directly from its image. Although such deep
architectures outperform other models in complex computer vision problems,
training them properly requires a large dataset of annotated examples. For three-dimensional faces, however, no large-volume datasets are currently available, and acquiring such data is a tedious task. As an alternative, we
propose to generate random, yet nearly photo-realistic, facial images for which
the geometric form is known. The suggested model successfully recovers facial
shapes from real images, even for faces with extreme expressions and under
various lighting conditions.

Comment: The first two authors contributed equally to this work.
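To make the synthetic-supervision idea concrete, here is a toy sketch of generating (image, geometry) training pairs from a linear morphable model. The model matrices, coefficient count, and the stub depth renderer are invented for illustration and stand in for the paper's nearly photo-realistic rendering pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5000                                    # vertices in a toy face model
mean_shape = rng.normal(size=(N, 3))        # hypothetical mean geometry
basis = rng.normal(size=(N, 3, 50)) * 0.01  # 50 hypothetical shape components

def sample_face():
    """Draw random model coefficients and return a random 3D face shape."""
    coeffs = rng.normal(size=50)
    return mean_shape + basis @ coeffs

def render_depth(shape, res=64):
    """Stub renderer: an orthographic depth image, standing in for the
    photo-realistic renders used for training in the paper."""
    img = np.full((res, res), np.inf)
    xy = shape[:, :2]
    xy = ((xy - xy.min(axis=0)) / (np.ptp(xy, axis=0) + 1e-9)
          * (res - 1)).astype(int)
    for (x, y), z in zip(xy, shape[:, 2]):
        img[y, x] = min(img[y, x], z)
    img[np.isinf(img)] = 0.0
    return img

# Each (image, shape) pair can supervise a CNN that regresses geometry
# directly from the image.
shape = sample_face()
image = render_depth(shape)
```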
Py-Feat: Python Facial Expression Analysis Toolbox
Studying facial expressions is a notoriously difficult endeavor. Recent
advances in the field of affective computing have yielded impressive progress
in automatically detecting facial expressions from pictures and videos.
However, much of this work has yet to be widely disseminated in social science
domains such as psychology. Current state-of-the-art models require
considerable domain expertise that is not traditionally incorporated into
social science training programs. Furthermore, there is a notable absence of
user-friendly and open-source software that provides a comprehensive set of
tools and functions that support facial expression research. In this paper, we
introduce Py-Feat, an open-source Python toolbox that provides support for
detecting, preprocessing, analyzing, and visualizing facial expression data.
Py-Feat makes it easy for domain experts to disseminate and benchmark computer
vision models and also for end users to quickly process, analyze, and visualize
facial expression data. We hope this platform will facilitate increased use of
facial expression data in human behavior research.

Comment: 25 pages, 3 figures, 5 tables
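For readers who want to try the toolbox, the snippet below follows Py-Feat's documented quick-start pattern; the input path is hypothetical, and method names may differ slightly between toolbox versions.

```python
# Quick-start style usage of Py-Feat; "photo.jpg" is a hypothetical path.
from feat import Detector

detector = Detector()                      # loads default face/AU/emotion models
fex = detector.detect_image("photo.jpg")   # returns a Fex data object

# Fex behaves like a pandas DataFrame with columns for face boxes,
# landmarks, Action Units, and emotion probabilities.
print(fex.emotions)
print(fex.aus)
```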
Fiducial Focus Augmentation for Facial Landmark Detection
Deep learning methods have led to significant improvements in the performance
on the facial landmark detection (FLD) task. However, detecting landmarks in
challenging settings, such as head pose changes, exaggerated expressions, or
uneven illumination, remains difficult due to high variability and
insufficient samples. This inadequacy can be attributed to the model's
inability to effectively acquire appropriate facial structure information from
the input images. To address this, we propose a novel image augmentation
technique specifically designed for the FLD task to enhance the model's
understanding of facial structures. To effectively utilize the newly proposed
augmentation technique, we employ a Siamese architecture-based training
mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss to
achieve collective learning of high-level feature representations from two
different views of the input images. Furthermore, we employ a Transformer +
CNN-based network with a custom hourglass module as the robust backbone for the
Siamese framework. Extensive experiments show that our approach outperforms
multiple state-of-the-art approaches across various benchmark datasets.

Comment: Accepted to BMVC'2
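As a rough sketch of the training signal, the loss below maximises correlation between the features produced by the two Siamese branches. It is a simplified stand-in for the full DCCA objective, which jointly whitens both views; the tensor shapes are assumptions.

```python
import torch

def correlation_loss(f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
    """Simplified correlation-maximising loss in the spirit of DCCA.
    f1, f2: (batch, dim) features from the two augmented views.
    The full DCCA objective whitens both views; this sketch only
    maximises per-dimension Pearson correlation."""
    f1 = f1 - f1.mean(dim=0)
    f2 = f2 - f2.mean(dim=0)
    f1 = f1 / (f1.norm(dim=0) + 1e-8)
    f2 = f2 / (f2.norm(dim=0) + 1e-8)
    corr = (f1 * f2).sum(dim=0)   # per-dimension correlation in [-1, 1]
    return -corr.mean()           # minimising this maximises correlation
```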
3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame
Facial expression spotting is the preliminary step for micro- and
macro-expression analysis. The task of reliably spotting such expressions in
video sequences is currently unsolved. The current best systems depend upon
optical flow methods to extract regional motion features, before categorisation
of that motion into a specific class of facial movement. Optical flow is
susceptible to drift error, which introduces a serious problem for motions with
long-term dependencies, such as macro-expressions in high frame-rate video. We propose a purely deep-learning solution which, rather than tracking frame-differential motion, uses a convolutional model to compare each frame with two temporally
local reference frames. Reference frames are sampled according to calculated
micro- and macro-expression durations. We show that our solution achieves
state-of-the-art performance (F1-score of 0.126) in a dataset of high
frame-rate (200 fps) long video sequences (SAMM-LV) and is competitive in a low
frame-rate (30 fps) dataset (CAS(ME)2). In this paper, we document our deep
learning model and parameters, including how we use local contrast
normalisation, which we show is critical for optimal results. We surpass a
limitation in existing methods, and advance the state of deep learning in the
domain of facial expression spotting.
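Since the paper flags local contrast normalisation as critical for optimal results, here is a minimal sketch of one common formulation (Gaussian-weighted local mean and variance). The sigma and eps values are illustrative choices, not the paper's exact settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_contrast_normalise(frame: np.ndarray, sigma: float = 3.0,
                             eps: float = 1e-6) -> np.ndarray:
    """Subtract a local (Gaussian-weighted) mean and divide by the
    local standard deviation over the same window."""
    local_mean = gaussian_filter(frame, sigma)
    centred = frame - local_mean
    local_var = gaussian_filter(centred ** 2, sigma)
    return centred / np.sqrt(local_var + eps)
```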
Face De-Identification for Privacy Protection
The ability to record, store, and analyse images of faces economically, rapidly, and on a vast scale has drawn attention to privacy. Current privacy protection approaches for face images rely mainly on masking, blurring, or black-out, which, however, remove data utility along with the identifying information. As a result, these ad hoc methods are of little use for data publishing or further research. De-identification attempts to remove identifying information from a dataset while preserving its utility as much as possible. Research on de-identifying structured data is well established, while de-identifying unstructured data, such as faces in images and videos, remains a challenge. The k-Same face de-identification was the first method to apply an established de-identification theory, k-anonymity, to a face image dataset, and it is the starting point of this thesis.
Re-identification risk and data utility are two conflicting aspects of face de-identification. The focus of this thesis is to improve the privacy protection performance of a face de-identification system while providing data-utility-preserving solutions for different application scenarios. The thesis first proposes the k-Same-furthest face de-identification method, which introduces wrong-map protection into k-Same-M face de-identification: identity loss is maximised by replacing an original face with the face least similar to it.
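The furthest-replacement idea can be sketched on vectorised faces (e.g. model parameters) as below; this is a toy illustration only, and a real implementation must additionally enforce the k-anonymity guarantee over the published set, which this sketch omits.

```python
import numpy as np

def k_same_furthest(faces: np.ndarray, k: int) -> np.ndarray:
    """Toy sketch of k-Same-furthest de-identification.
    faces: (n, d) array of vectorised faces. Each face is replaced by
    the mean of the k faces *furthest* from it, maximising identity loss."""
    out = np.empty_like(faces)
    for i, f in enumerate(faces):
        dists = np.linalg.norm(faces - f, axis=1)
        furthest = np.argsort(dists)[-k:]      # k least-similar faces
        out[i] = faces[furthest].mean(axis=0)
    return out

deidentified = k_same_furthest(np.random.rand(100, 32), k=5)
```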
The data utility of face images is considered from two aspects in this thesis: dataset-wise utility, such as the data distribution of the dataset, and individual-wise utility, such as the facial expression in an individual image. To preserve the diversity of a face image dataset, the k-Diff-furthest face de-identification method is proposed, which extends the k-Same-furthest method and also provides wrong-map protection.
With respect to the data utility of an individual face image, the thesis discusses visual quality and the preservation of facial expression. A method is presented for merging the isolated de-identified face region with its original image background, which increases the visual quality of a de-identified face image in terms of fidelity and intelligibility. A novel solution to preserving facial expressions in de-identified face images is also presented, which preserves not only the category of the facial expression but also the intensity of facial Action Units.
Finally, an integration of the Active Appearance Model (AAM) and the Generative Adversarial Network (GAN) is presented, which achieves the synthesis of realistic face images with shallow neural network architectures.
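To give a flavour of what a shallow AAM-to-image generator might look like, here is a hypothetical sketch; the discriminator and adversarial training loop are omitted, and all layer sizes are invented.

```python
import torch
import torch.nn as nn

class ShallowGenerator(nn.Module):
    """Hypothetical shallow generator mapping AAM parameters
    (shape + appearance coefficients) to a face image."""
    def __init__(self, aam_dim: int = 60, img_size: int = 64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(aam_dim, 512), nn.ReLU(),
            nn.Linear(512, img_size * img_size * 3), nn.Tanh(),
        )

    def forward(self, aam_params: torch.Tensor) -> torch.Tensor:
        x = self.net(aam_params)
        return x.view(-1, 3, self.img_size, self.img_size)

g = ShallowGenerator()
fake_faces = g(torch.randn(4, 60))   # batch of 4 synthetic faces
```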