Multi-modality Empowered Network For Facial Action Unit Detection
This paper presents a new thermal empowered multi-task network (TEMT-Net) to improve facial action unit detection. Our primary goal is to exploit the setting in which the training set has multi-modality data while the application scenario has only one modality. Thermal images are robust to illumination and face color. In the proposed multi-task framework, we utilize data from both modalities. Action unit detection and facial landmark detection are correlated tasks. To exploit the advantages of, and the correlations between, different modalities and different tasks, we propose a novel thermal empowered multi-task deep neural network learning approach that performs action unit detection, facial landmark detection and thermal image reconstruction simultaneously. The thermal image generator and facial landmark detection provide regularization on the learned features with shared factors as the input color images. Extensive experiments are conducted on the BP4D and MMSE databases, with comparisons against the state-of-the-art methods. The experiments show that the multi-modality framework improves AU detection significantly.
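The abstract describes three tasks trained together but does not state how their losses are combined. A minimal sketch of one common choice, a weighted sum, is given here; the loss types (binary cross-entropy, mean squared error, mean absolute error), the function names, and the weights `w_landmark` and `w_thermal` are all assumptions made for illustration, not details taken from the paper.

```python
import math

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over AU probabilities (assumed AU loss)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def mean_squared_error(pred, target):
    """Mean squared error, used here as a stand-in landmark loss."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)

def mean_absolute_error(pred, target):
    """Mean absolute error, used here as a stand-in reconstruction loss."""
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(pred)

def multi_task_loss(au_probs, au_labels,
                    lm_pred, lm_true,
                    thermal_pred, thermal_true,
                    w_landmark=0.5, w_thermal=0.5):
    """Weighted sum of the three task losses (weights are hypothetical)."""
    au = binary_cross_entropy(au_probs, au_labels)
    lm = mean_squared_error(lm_pred, lm_true)
    th = mean_absolute_error(thermal_pred, thermal_true)
    return au + w_landmark * lm + w_thermal * th
```

In such a setup the landmark and thermal terms act only during training; at test time only the AU branch would be evaluated on color images, which matches the single-modality application scenario the abstract describes.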
Affective Behavior Analysis using Action Unit Relation Graph and Multi-task Cross Attention
Facial behavior analysis is a broad topic with various categories such as
facial emotion recognition, age, and gender recognition. Many studies focus on
individual tasks, while multi-task learning remains an open research issue.
In this paper, we present our solution and experimental results for the
Multi-Task Learning challenge of the
Affective Behavior Analysis in-the-wild competition. The challenge is a
combination of three tasks: action unit detection, facial expression
recognition, and valence-arousal estimation. To address this challenge, we
introduce a cross-attentive module to improve multi-task learning performance.
Additionally, a facial graph is applied to capture the association among action
units. As a result, we achieve the evaluation measure of 128.8 on the
validation data provided by the organizers, which outperforms the baseline
result of 30
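The abstract does not say how the facial graph over action units is constructed. One widely used construction, shown here purely as an assumption, derives edge weights from label co-occurrence in the training annotations: the weight from AU i to AU j estimates how often AU j is active when AU i is active. The function name and the input layout are hypothetical.

```python
def cooccurrence_graph(labels):
    """Build a directed AU relation matrix from multi-label annotations.

    labels: list of samples, each a list of 0/1 activations, one per AU.
    Returns adj where adj[i][j] estimates P(AU j active | AU i active),
    with the diagonal left at 0.0.
    """
    n = len(labels[0])
    counts = [[0] * n for _ in range(n)]
    for row in labels:
        active = [k for k, v in enumerate(row) if v == 1]
        for i in active:
            for j in active:
                counts[i][j] += 1  # counts[i][i] is the occurrence count of AU i

    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        occurrences = counts[i][i]
        for j in range(n):
            if i != j and occurrences > 0:
                adj[i][j] = counts[i][j] / occurrences
    return adj
```

A matrix of this kind could serve as the adjacency input to a graph layer; whether the authors use co-occurrence statistics, a learned graph, or something else is not stated in the abstract.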
Multi-Conditional Latent Variable Model for Joint Facial Action Unit Detection
We propose a novel multi-conditional latent variable model for simultaneous facial feature fusion and detection of facial action units. In our approach we exploit the structure-discovery capabilities of generative models such as Gaussian processes, and the discriminative power of classifiers such as the logistic function. This leads to superior performance compared to existing classifiers for the target task that exploit either the discriminative or the generative property, but not both. Model learning is performed via an efficient, newly proposed Bayesian learning strategy based on Monte Carlo sampling. Consequently, the learned model is robust to overfitting, regardless of the number of input features and of jointly estimated facial action units. Extensive qualitative and quantitative experimental evaluations are performed on three publicly available datasets (CK+, Shoulder-pain and DISFA). We show that the proposed model outperforms the state-of-the-art methods for the target task on (i) feature fusion, and (ii) multiple facial action unit detection.
Multi-scale Promoted Self-adjusting Correlation Learning for Facial Action Unit Detection
Facial Action Unit (AU) detection is a crucial task in affective computing
and social robotics as it helps to identify emotions expressed through facial
expressions. Anatomically, there are innumerable correlations between AUs,
which contain rich information and are vital for AU detection. Previous methods
used fixed AU correlations based on expert experience or statistical rules on
specific benchmarks, but it is challenging to comprehensively reflect complex
correlations between AUs via hand-crafted settings. There are alternative
methods that employ a fully connected graph to learn these dependencies
exhaustively. However, these approaches can lead to a computational explosion
and a heavy dependence on large datasets. To address these challenges, this
paper proposes a novel self-adjusting AU-correlation learning (SACL) method
with less computation for AU detection. This method adaptively learns and
updates AU correlation graphs by efficiently leveraging the characteristics of
different levels of AU motion and emotion representation information extracted
in different stages of the network. Moreover, this paper explores the role of
multi-scale learning in correlation information extraction, and designs a simple
yet effective multi-scale feature learning (MSFL) method to promote better
performance in AU detection. By integrating AU correlation information with
multi-scale features, the proposed method obtains a more robust feature
representation for the final AU detection. Extensive experiments show that the
proposed method outperforms the state-of-the-art methods on widely used AU
detection benchmark datasets, with only 28.7\% and 12.0\% of the parameters and
FLOPs of the best method, respectively. The code for this method is available
at https://github.com/linuxsino/Self-adjusting-AU.
Comment: 13 pages, 7 figures
Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
Facial action unit (AU) detection and face alignment are two highly
correlated tasks since facial landmarks can provide precise AU locations to
facilitate the extraction of meaningful local features for AU detection. Most
existing AU detection works treat face alignment as a preprocessing step and
handle the two tasks independently. In this paper, we propose a novel
end-to-end deep learning framework for joint AU detection and face alignment,
which has not been explored before. In particular, multi-scale shared features
are learned first, and high-level features of face alignment are fed into AU
detection. Moreover, to extract precise local features, we propose an adaptive
attention learning module to refine the attention map of each AU adaptively.
Finally, the assembled local features are integrated with face alignment
features and global features for AU detection. Experiments on BP4D and DISFA
benchmarks demonstrate that our framework significantly outperforms the
state-of-the-art methods for AU detection.
Comment: This paper has been accepted by ECCV 201
Facial Action Unit Detection With Deep Convolutional Neural Networks
Facial features are the most important tool for understanding an individual's state of mind. Automated recognition of facial expressions, and particularly of the Facial Action Units defined by the Facial Action Coding System (FACS), is a challenging research problem in computer vision and machine learning. Researchers are working on deep learning algorithms to improve the state of the art in the area. Automated recognition of facial action units has many applications, ranging from developmental psychology to human-robot interface design, where companies use this technology to improve their consumer devices (for example, phone unlocking) and for entertainment such as FaceApp. Recent studies suggest that detecting these facial features, which is a multi-label classification problem, can be solved using a problem transformation approach in which the multi-label problem is converted into single-label problems with a BinaryRelevance classifier.
In this thesis, a convolutional neural network is used because it can be made substantially deeper and more accurate, though it requires a large amount of data to train. It usually yields a significant feature map from each layer of the network. We introduce Modified DenseNet, using DenseNet as the baseline model. Averaging the features obtained from each DenseNet block gives weight to every level of features, some of which can be lost when layers are concatenated in DenseNet and other state-of-the-art classification models.
Facial Action Units (AUs) are detected by selecting a threshold for the probabilities produced by the trained Modified DenseNet model. The threshold can be selected with the help of the Matthews correlation coefficient. Using the Matthews correlation coefficient allows AU correlation to be taken into account, which was missing in previous studies that used the BinaryRelevance classifier, since that classifier treats every target variable independently and therefore ignores label correlation. Modifying the DenseNet model helped to improve results by reusing features and alleviating the vanishing-gradient problem.
We evaluated our proposed architecture on a competitive facial action unit detection task, the EmotioNet database, which includes 950,000 images with annotated AUs. Modified DenseNet obtains significant improvements over the state-of-the-art methods on most of them, as measured by accuracy and other evaluation metrics, while requiring less computation time than problem transformation methods.
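The threshold selection step described in this abstract can be illustrated with a small sketch. It assumes, without confirmation from the thesis, that thresholds are chosen per AU by scanning a fixed grid of candidates on held-out data and keeping the one with the highest Matthews correlation coefficient; the function names and the candidate grid are hypothetical.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when a row or column of the table is empty
    return (tp * tn - fp * fn) / denom

def best_threshold(y_true, probs, candidates=None):
    """Return (threshold, score) maximizing MCC over a candidate grid."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # assumed grid: 0.01 to 0.99
    best_t, best_score = candidates[0], float("-inf")
    for t in candidates:
        preds = [1 if p >= t else 0 for p in probs]
        score = mcc(y_true, preds)
        if score > best_score:  # keeps the lowest threshold among ties
            best_t, best_score = t, score
    return best_t, best_score

def thresholds_per_au(label_columns, prob_columns):
    """Apply best_threshold independently to each AU's labels and probabilities."""
    return [best_threshold(y, p)[0] for y, p in zip(label_columns, prob_columns)]
```

For example, `best_threshold([0, 0, 1, 1], [0.1, 0.4, 0.6, 0.9])` returns a threshold between 0.4 and 0.6 with an MCC of 1.0, since any cut in that range separates the two classes perfectly.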
Facial Action Unit Detection Using Attention and Relation Learning
The attention mechanism has recently attracted increasing attention in the field
of facial action unit (AU) detection. By finding the region of interest of each
AU with the attention mechanism, AU-related local features can be captured.
Most of the existing attention based AU detection works use prior knowledge to
predefine fixed attentions or refine the predefined attentions within a small
range, which limits their capacity to model various AUs. In this paper, we
propose an end-to-end deep learning based attention and relation learning
framework for AU detection with only AU labels, which has not been explored
before. In particular, multi-scale features shared by each AU are learned
first, and then both channel-wise and spatial attentions are adaptively
learned to select and extract AU-related local features. Moreover, pixel-level
relations for AUs are further captured to refine spatial attentions so as to
extract more relevant local features. Without changing the network
architecture, our framework can be easily extended for AU intensity estimation.
Extensive experiments show that our framework (i) soundly outperforms the
state-of-the-art methods for both AU detection and AU intensity estimation on
the challenging BP4D, DISFA, FERA 2015 and BP4D+ benchmarks, (ii) can
adaptively capture the correlated regions of each AU, and (iii) also works well
under severe occlusions and large poses.
Comment: This paper is accepted by IEEE Transactions on Affective Computing