Machine Analysis of Facial Expressions
No abstract
Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
Facial action unit (AU) detection and face alignment are two highly
correlated tasks since facial landmarks can provide precise AU locations to
facilitate the extraction of meaningful local features for AU detection. Most
existing AU detection works often treat face alignment as a preprocessing step and
handle the two tasks independently. In this paper, we propose a novel
end-to-end deep learning framework for joint AU detection and face alignment,
which has not been explored before. In particular, multi-scale shared features
are learned first, and high-level features of face alignment are fed into AU
detection. Moreover, to extract precise local features, we propose an adaptive
attention learning module to refine the attention map of each AU adaptively.
Finally, the assembled local features are integrated with face alignment
features and global features for AU detection. Experiments on BP4D and DISFA
benchmarks demonstrate that our framework significantly outperforms the
state-of-the-art methods for AU detection. Comment: This paper has been accepted by ECCV 201
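The adaptive attention idea above can be illustrated with a minimal numpy sketch: start from a predefined attention map (a Gaussian centred on the AU's associated facial landmark) and refine it with a learned residual. The function names and the Gaussian initialisation are illustrative assumptions, not the authors' exact module.

```python
import numpy as np

def landmark_attention(h, w, cx, cy, sigma=4.0):
    """Predefined attention map: a Gaussian centred on the AU's landmark."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def refine_attention(init_att, learned_residual):
    """Adaptive refinement: add a learned residual (here a constant stand-in),
    clip to non-negative values, and renormalise so the peak is at most 1."""
    att = np.clip(init_att + learned_residual, 0.0, None)
    return att / (att.max() + 1e-8)

att0 = landmark_attention(16, 16, cx=8, cy=8)
refined = refine_attention(att0, 0.1 * np.ones((16, 16)))
```

In the actual framework the residual would come from a trained sub-network; the sketch only shows the refine-then-normalise shape of the computation.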
Unsupervised Training for 3D Morphable Model Regression
We present a method for training a regression network from image pixels to 3D
morphable model coordinates using only unlabeled photographs. The training loss
is based on features from a facial recognition network, computed on-the-fly by
rendering the predicted faces with a differentiable renderer. To make training
from features feasible and avoid network fooling effects, we introduce three
objectives: a batch distribution loss that encourages the output distribution
to match the distribution of the morphable model, a loopback loss that ensures
the network can correctly reinterpret its own output, and a multi-view identity
loss that compares the features of the predicted 3D face and the input
photograph from multiple viewing angles. We train a regression network using
these objectives, a set of unlabeled photographs, and the morphable model
itself, and demonstrate state-of-the-art results. Comment: CVPR 2018 version with supplemental material
(http://openaccess.thecvf.com/content_cvpr_2018/html/Genova_Unsupervised_Training_for_CVPR_2018_paper.html)
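The batch distribution loss can be sketched in a few lines of numpy, under the common assumption that the morphable model's coefficient prior is a standard normal per dimension (the exact prior and weighting in the paper may differ): penalise the batch mean deviating from 0 and the batch variance deviating from 1.

```python
import numpy as np

def batch_distribution_loss(coeffs):
    """Encourage a batch of predicted 3DMM coefficients to match a standard
    normal prior: penalise batch mean != 0 and batch variance != 1."""
    mu = coeffs.mean(axis=0)
    var = coeffs.var(axis=0)
    return float((mu ** 2).sum() + ((var - 1.0) ** 2).sum())

rng = np.random.default_rng(0)
well_matched = rng.standard_normal((256, 80))   # batch of 80-dim coefficients
shifted = well_matched + 2.0                    # collapsed toward one region
loss_good = batch_distribution_loss(well_matched)
loss_bad = batch_distribution_loss(shifted)
```

A batch whose coefficients drift away from the prior (here, shifted by 2) incurs a much larger loss, which is exactly the regularising pressure the objective is meant to apply.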
A dynamic texture based approach to recognition of facial actions and their temporal models
In this work, we propose a dynamic texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domain. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method has been tested using the Cohn-Kanade database. Finally, we also explored the performance on spontaneous expressions in the Sensitive Artificial Listener data set
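The Motion History Image representation mentioned above has a simple recursive form, sketched here in numpy (parameter names `tau` and `thresh` are illustrative): pixels where motion is detected are set to the maximal duration value, while static pixels decay by one step per frame, so recent motion appears brighter than older motion.

```python
import numpy as np

def update_mhi(mhi, frame_diff, tau=10, thresh=0.1):
    """One MHI update: moving pixels are set to tau; static pixels
    decay by one step, with a floor at zero."""
    moving = frame_diff > thresh
    decayed = np.maximum(mhi - 1, 0)
    return np.where(moving, tau, decayed)

mhi = np.zeros((4, 4))
diff1 = np.zeros((4, 4))
diff1[1, 1] = 1.0                          # motion at one pixel
mhi = update_mhi(mhi, diff1)               # that pixel is set to tau
mhi = update_mhi(mhi, np.zeros((4, 4)))    # no motion: everything decays
```

Orientation histograms over such a map (in space and time) then give the descriptors the paper feeds to the GentleBoost/HMM pipeline.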
A Review on Facial Expression Recognition Techniques
Facial expression has been a topic of active research over the past few decades. Recognising and extracting various emotions from facial expressions, and validating those emotions, have become very important in human-computer interaction. Interpreting such human expressions remains difficult, and much research is still required on the way they relate to human affect. Apart from human-computer interfaces, other applications include awareness systems, medical diagnosis, surveillance, law enforcement, automated tutoring systems, and many more. In recent years, different techniques have been put forward for developing automated facial expression recognition systems. This paper presents a quick survey of some of the facial expression recognition techniques. A comparative study is carried out using various feature extraction techniques. We define a taxonomy of the field and cover all the steps from face detection to facial expression classification.
Exploiting Emotional Dependencies with Graph Convolutional Networks for Facial Expression Recognition
Over the past few years, deep learning methods have shown remarkable results
in many face-related tasks including automatic facial expression recognition
(FER) in-the-wild. Meanwhile, numerous models describing the human emotional
states have been proposed by the psychology community. However, we have no
clear evidence as to which representation is more appropriate and the majority
of FER systems use either the categorical or the dimensional model of affect.
Inspired by recent work in multi-label classification, this paper proposes a
novel multi-task learning (MTL) framework that exploits the dependencies
between these two models using a Graph Convolutional Network (GCN) to recognize
facial expressions in-the-wild. Specifically, a shared feature representation
is learned for both discrete and continuous recognition in an MTL setting.
Moreover, the facial expression classifiers and the valence-arousal regressors
are learned through a GCN that explicitly captures the dependencies between
them. To evaluate the performance of our method under real-world conditions we
perform extensive experiments on the AffectNet and Aff-Wild2 datasets. The
results of our experiments show that our method is capable of improving the
performance across different datasets and backbone architectures. Finally, we
also surpass the previous state-of-the-art methods on the categorical model of
AffectNet. Comment: 9 pages, 8 figures, 5 tables, revised submission to the 16th IEEE
International Conference on Automatic Face and Gesture Recognition
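The core GCN propagation used to capture label dependencies can be sketched in numpy. The toy graph below (7 discrete expressions plus valence and arousal, fully connected) and the layer sizes are assumptions for illustration; the paper learns its dependency structure from data.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution step over label nodes: add self-loops,
    symmetrically normalise the adjacency, propagate, project, ReLU."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# Hypothetical label graph: 7 expression classes + valence + arousal.
A = np.ones((9, 9)) - np.eye(9)
X = np.random.default_rng(1).standard_normal((9, 16))  # node embeddings
W = np.random.default_rng(2).standard_normal((16, 8))  # layer weights
H = gcn_layer(A, X, W)
```

Each output row is a label embedding that has mixed in information from related labels, which is how the classifiers and regressors can share dependency structure.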
AuE-IPA: An AU Engagement Based Infant Pain Assessment Method
Recent studies have found that pain in infancy has a significant impact on
infant development, including psychological problems, possible brain injury,
and pain sensitivity in adulthood. However, due to the lack of specialists and
the fact that infants are unable to verbally express their experience of pain,
it is difficult to assess infant pain. Most existing infant pain assessment
systems directly apply adult methods to infants, ignoring the differences
between infant expressions and adult expressions. Meanwhile, as the study of
the Facial Action Coding System continues to advance, the use of action units (AUs)
opens up new possibilities for expression recognition and pain assessment. In
this paper, a novel AuE-IPA method is proposed for assessing infant pain by
leveraging different engagement levels of AUs. First, different engagement
levels of AUs in infant pain are revealed, by analyzing the class activation
map of an end-to-end pain assessment model. The intensities of top-engaged AUs
are then used in a regression model for achieving automatic infant pain
assessment. The proposed model is trained and evaluated on the YouTube
Immunization, YouTube Blood Test, and iCOPEVid datasets. The
experimental results show that our AuE-IPA method is more applicable to infants
and possesses stronger generalization ability than the end-to-end assessment model
and the classic PSPI metric.
An audiovisual and contextual approach for categorical and continuous emotion recognition in-the-wild
In this work we tackle the task of video-based audio-visual emotion
recognition, within the premises of the 2nd Workshop and Competition on
Affective Behavior Analysis in-the-wild (ABAW2). Poor illumination conditions,
head/body orientation and low image resolution constitute factors that can
potentially hinder performance in case of methodologies that solely rely on the
extraction and analysis of facial features. In order to alleviate this problem,
we leverage both bodily and contextual features, as part of a broader emotion
recognition framework. We choose to use a standard CNN-RNN cascade as the
backbone of our proposed model for sequence-to-sequence (seq2seq) learning.
Apart from learning through the RGB input modality, we construct an aural
stream which operates on sequences of extracted mel-spectrograms. Our extensive
experiments on the challenging and newly assembled Aff-Wild2 dataset verify the
validity of our intuitive multi-stream and multi-modal approach towards emotion
recognition in-the-wild. Emphasis is laid on the beneficial influence
of the human body and scene context, as aspects of the emotion recognition
process that have been left relatively unexplored up to this point. All the
code was implemented using PyTorch and is publicly available. Comment: 7 pages, 1 figure, 3 tables, accepted to the 2nd Workshop and
Competition on Affective Behavior Analysis In-the-Wild (ABAW2)
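The multi-stream design above can be sketched as a simple late-fusion step in numpy: per-frame features from the face, aural (mel-spectrogram), body, and scene-context streams are concatenated, then pooled over time. The feature dimensions are illustrative assumptions, and the real model fuses with a CNN-RNN cascade rather than mean pooling.

```python
import numpy as np

def fuse_streams(visual_feats, aural_feats, body_feats, context_feats):
    """Late-fusion sketch: concatenate per-frame features from each
    stream, then average over time for a clip-level representation."""
    per_frame = np.concatenate(
        [visual_feats, aural_feats, body_feats, context_feats], axis=1)
    return per_frame.mean(axis=0)

T = 8  # frames in a clip
clip = fuse_streams(np.ones((T, 512)),   # face-crop CNN features
                    np.ones((T, 128)),   # mel-spectrogram CNN features
                    np.ones((T, 256)),   # body-crop features
                    np.ones((T, 256)))   # scene-context features
```

The point of the sketch is only the shape of the fusion: each modality contributes a block of the per-frame feature vector, so body and context cues survive even when the face itself is poorly lit or low-resolution.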