LEARNet Dynamic Imaging Network for Micro Expression Recognition
Unlike prevalent facial expressions, micro expressions have subtle,
involuntary muscle movements which are short-lived in nature. These minute
muscle movements reflect true emotions of a person. Due to the short duration
and low intensity, these micro-expressions are very difficult to perceive and
interpret correctly. In this paper, we propose the dynamic representation of
micro-expressions to preserve facial movement information of a video in a
single frame. We also propose a Lateral Accretive Hybrid Network (LEARNet) to
capture micro-level features of an expression in the facial region. The LEARNet
refines the salient expression features in an accretive manner by incorporating
accretion layers (AL) in the network. The response of the AL holds the hybrid
feature maps generated by prior laterally connected convolution layers.
Moreover, the LEARNet architecture incorporates the cross decoupled relationship
between convolution layers which helps in preserving the tiny but influential
facial muscle change information. The visual responses of the proposed LEARNet
depict the effectiveness of the system by preserving both high- and micro-level
edge features of facial expression. The effectiveness of the proposed LEARNet
is evaluated on four benchmark datasets: CASME-I, CASME-II, CAS(ME)^2 and SMIC.
The experimental results show a significant improvement of
4.03%, 1.90%, 1.79% and 2.82% over ResNet on the CASME-I, CASME-II,
CAS(ME)^2 and SMIC datasets, respectively.
Comment: Dynamic imaging, accretion, lateral, micro expression recognitio
Automatic Analysis of Facial Expressions Based on Deep Covariance Trajectories
In this paper, we propose a new approach for facial expression recognition
using deep covariance descriptors. The solution is based on the idea of
encoding local and global Deep Convolutional Neural Network (DCNN) features
extracted from still images, in compact local and global covariance
descriptors. The space geometry of the covariance matrices is that of Symmetric
Positive Definite (SPD) matrices. By conducting the classification of static
facial expressions using Support Vector Machine (SVM) with a valid Gaussian
kernel on the SPD manifold, we show that deep covariance descriptors are more
effective than the standard classification with fully connected layers and
softmax. In addition, we propose a new solution to model
the temporal dynamics of facial expressions as deep trajectories on the SPD
manifold. As an extension of the classification pipeline of covariance
descriptors, we apply SVM with valid positive definite kernels derived from
global alignment for deep covariance trajectories classification. By performing
extensive experiments on the Oulu-CASIA, CK+, and SFEW datasets, we show that
both the proposed static and dynamic approaches achieve state-of-the-art
performance for facial expression recognition outperforming many recent
approaches.
Comment: A preliminary version of this work appeared in "Otberdout N, Kacem A,
Daoudi M, Ballihi L, Berretti S. Deep Covariance Descriptors for Facial
Expression Recognition, in British Machine Vision Conference 2018, BMVC 2018,
Northumbria University, Newcastle, UK, September 3-6, 2018. ; 2018 :159."
arXiv admin note: substantial text overlap with arXiv:1805.0386
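The static pipeline in the abstract above (local features, then a covariance descriptor, then a Gaussian kernel on the SPD manifold) can be illustrated with a minimal sketch. The abstract does not say which SPD metric the authors use, so the log-Euclidean distance here is an assumption, as are the function names, the regularizer `eps`, the bandwidth `gamma`, and the random arrays standing in for DCNN feature maps.

```python
import numpy as np

def covariance_descriptor(features, eps=1e-5):
    """Build an SPD covariance descriptor from local features.

    features: (n_locations, d) array, one d-dimensional feature per location.
    eps: hypothetical regularizer that keeps the matrix strictly positive definite.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(features.shape[0] - 1, 1)
    return cov + eps * np.eye(features.shape[1])

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def log_euclidean_gaussian_kernel(a, b, gamma=0.1):
    """Gaussian kernel on the log-Euclidean distance (assumed metric)."""
    diff = spd_log(a) - spd_log(b)
    return float(np.exp(-gamma * np.sum(diff * diff)))

# Random stand-ins for two feature maps with 49 locations and 8 channels.
rng = np.random.default_rng(0)
desc_a = covariance_descriptor(rng.standard_normal((49, 8)))
desc_b = covariance_descriptor(rng.standard_normal((49, 8)))
similarity = log_euclidean_gaussian_kernel(desc_a, desc_b)
```

A kernel matrix built from such pairwise values could then be passed to an SVM that accepts precomputed kernels.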
MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Face Images
This paper is aimed at creating extremely small and fast convolutional neural
networks (CNN) for the problem of facial expression recognition (FER) from
frontal face images. To this end, we employed the popular knowledge
distillation (KD) method and identified two major shortcomings with its use: 1)
a fine-grained grid search is needed for tuning the temperature hyperparameter
and 2) to find the optimal size-accuracy balance, one needs to search for the
final network size (or the compression rate). On the other hand, KD proved
useful for model compression for the FER problem, and we discovered that
its effect becomes more and more significant as the model size decreases. In
addition, we hypothesized that translation invariance achieved using
max-pooling layers would not be useful for the FER problem as the expressions
are sensitive to small, pixel-wise changes around the eye and the mouth.
However, we have found an intriguing improvement on generalization when
max-pooling is used. We conducted experiments on two widely used FER datasets,
CK+ and Oulu-CASIA. Our smallest model (MicroExpNet), obtained using knowledge
distillation, is less than 1MB in size and works at 1851 frames per second on
an Intel i7 CPU. Despite being less accurate than the state-of-the-art,
MicroExpNet still provides significant insights for designing a
microarchitecture for the FER problem.
Comment: International Conference on Image Processing Theory, Tools and
Applications (IPTA) 2019 camera ready version. Codes are available at:
https://github.com/cuguilke/microexpne
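The temperature hyperparameter mentioned in the abstract above belongs to the standard knowledge distillation loss, which can be sketched as follows. This is the commonly used soft-target formulation, not necessarily the exact loss in the paper; the weighting `alpha`, the `temperature ** 2` scaling, and all function names are assumptions made for illustration.

```python
import numpy as np

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature softens both distributions; alpha (hypothetical weight)
    balances the teacher-matching term against the ground-truth term.
    """
    soft_teacher = np.exp(log_softmax(teacher_logits, temperature))
    log_soft_student = log_softmax(student_logits, temperature)
    # Cross-entropy between softened teacher and softened student.
    soft_loss = -(soft_teacher * log_soft_student).sum(axis=-1).mean()
    # Ordinary cross-entropy against the integer class labels.
    log_student = log_softmax(student_logits)
    hard_loss = -log_student[np.arange(len(labels)), labels].mean()
    # temperature ** 2 keeps the soft term's gradient scale comparable.
    return alpha * temperature ** 2 * soft_loss + (1 - alpha) * hard_loss

teacher = np.array([[5.0, 1.0, 0.0], [0.0, 4.0, 1.0]])
student = np.array([[2.0, 1.0, 0.5], [0.5, 1.5, 1.0]])
labels = np.array([0, 1])
loss = distillation_loss(student, teacher, labels)
```

Raising `temperature` flattens the teacher distribution, which is why the abstract notes that it has to be tuned by grid search.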
Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition
Two approaches are proposed for cross-pose face recognition: one is based on
the 3D reconstruction of facial components and the other is based on the deep
Convolutional Neural Network (CNN). Unlike most 3D approaches that consider
holistic faces, the proposed approach considers 3D facial components. It
segments a 2D gallery face into components, reconstructs the 3D surface for
each component, and recognizes a probe face by component features. The
segmentation is based on the landmarks located by a hierarchical algorithm that
combines the Faster R-CNN for face detection and the Reduced Tree Structured
Model for landmark localization. The core part of the CNN-based approach is a
revised VGG network. We study the performances with different settings on the
training set, including the synthesized data from 3D reconstruction, the
real-life data from an in-the-wild database, and both types of data combined.
We investigate the performances of the network when it is employed as a
classifier or designed as a feature extractor. The two recognition approaches
and the fast landmark localization are evaluated in extensive experiments, and
compared to state-of-the-art methods to demonstrate their efficacy.
Comment: 14 pages, 12 figures, 4 table