A temporal latent topic model for facial expression recognition
Posters: no. 128. LNCS v. 6495 is the conference proceedings of the 10th Asian Conference on Computer Vision (ACCV).
In this paper we extend the latent Dirichlet allocation (LDA) topic model to model facial expression dynamics. Our topic model integrates the temporal information of image sequences by redefining the topic generation probability, without introducing new latent variables or making inference more difficult. A collapsed Gibbs sampler is derived for batch learning with a labeled training dataset, and an efficient learning method for testing data is also discussed. We describe the resulting temporal latent topic model (TLTM) in detail and show how it can be applied to facial expression recognition. Experiments on the CMU expression database illustrate that the proposed TLTM is very efficient in facial expression recognition.
© 2011 Springer-Verlag Berlin Heidelberg.
Postprint. The 10th Asian Conference on Computer Vision (ACCV 2010), Queenstown, New Zealand, 8-12 November 2010. In Lecture Notes in Computer Science, 2010, v. 6495, p. 51-6
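For orientation, the following is a minimal sketch of collapsed Gibbs sampling for standard LDA, the baseline model the abstract says it extends. It does not implement the TLTM: the temporal redefinition of the topic generation probability is not specified in the abstract and is left out. The function name `lda_collapsed_gibbs`, the hyperparameter defaults, and the representation of documents as lists of integer word ids are all assumptions made for illustration.

```python
import random


def lda_collapsed_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
                        n_iters=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA (no temporal term).

    docs: list of documents, each a list of integer word ids in
          [0, vocab_size). All names and defaults here are illustrative.
    Returns (doc_topic_counts, topic_word_counts, assignments).
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]               # document-topic counts
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                                # tokens per topic
    z = []                                              # topic of each token

    # Randomly initialise the topic assignment of every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1

                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return n_dk, n_kw, z
```

Because the topic proportions and topic-word distributions are integrated out, the sampler only tracks count tables; a temporal variant in the spirit of the abstract would presumably change the document-topic factor in `weights`, but that is a guess, not the paper's stated formula.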
Spatio-Temporal Facial Expression Recognition Using Convolutional Neural Networks and Conditional Random Fields
Automated Facial Expression Recognition (FER) has been a challenging task for
decades. Many of the existing works use hand-crafted features such as LBP, HOG,
LPQ, and Histogram of Optical Flow (HOF) combined with classifiers such as
Support Vector Machines for expression recognition. These methods often require
rigorous hyperparameter tuning to achieve good results. Recently, Deep Neural
Networks (DNNs) have been shown to outperform traditional methods in visual object
recognition. In this paper, we propose a two-part network consisting of a
DNN-based architecture followed by a Conditional Random Field (CRF) module for
facial expression recognition in videos. The first part captures the spatial
relation within facial images using convolutional layers followed by three
Inception-ResNet modules and two fully-connected layers. To capture the
temporal relation between the image frames, we use a linear-chain CRF in the
second part of our network. We evaluate our proposed network on three publicly
available databases, viz. CK+, MMI, and FERA. Experiments are performed in
subject-independent and cross-database manners. Our experimental results show
that cascading the deep network architecture with the CRF module considerably
improves the recognition of facial expressions in videos. In particular, it
outperforms the state-of-the-art methods in the cross-database experiments and
yields comparable results in the subject-independent experiments.
Comment: To appear in 12th IEEE Conference on Automatic Face and Gesture
Recognition Worksho
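As a rough illustration of how a linear-chain model can tie per-frame predictions together, the sketch below runs Viterbi decoding over per-frame label scores plus pairwise transition scores. The abstract does not say how the CRF is trained or decoded, so this is only one plausible reading; the function name `viterbi_decode`, the idea that `unary` holds per-frame network scores, and the toy numbers are assumptions.

```python
def viterbi_decode(unary, transition):
    """Highest-scoring label sequence under a linear-chain model.

    unary[t][k]:      score of label k at frame t (assumed here to come
                      from a per-frame classifier; illustrative only).
    transition[j][k]: score of moving from label j to label k.
    """
    n_labels = len(unary[0])
    score = list(unary[0])   # best score of any path ending in each label
    backptr = []             # backptr[t-1][k]: best previous label for k at t

    for t in range(1, len(unary)):
        new_score, ptr = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels),
                         key=lambda j: score[j] + transition[j][k])
            ptr.append(best_j)
            new_score.append(score[best_j] + transition[best_j][k]
                             + unary[t][k])
        score = new_score
        backptr.append(ptr)

    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy example: frame 1 weakly prefers label 1, but transitions reward
# staying in the same label, so the decoded sequence is smoothed.
frame_scores = [[2.0, 0.0], [0.0, 0.5], [2.0, 0.0]]
stay_bonus = [[1.0, -1.0], [-1.0, 1.0]]
print(viterbi_decode(frame_scores, stay_bonus))
```

With all transition scores set to zero the decoder reduces to a per-frame argmax, which is one way to see what the temporal part contributes.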
The Many Moods of Emotion
This paper presents a novel approach to the facial expression generation
problem. Building upon the assumption of the psychological community that
emotion is intrinsically continuous, we first design our own continuous emotion
representation with a 3-dimensional latent space derived from a neural network
trained on discrete emotion classification. The resulting representation can
be used to annotate large in-the-wild datasets and later to train a
Generative Adversarial Network. We first show that our model is able to map
back to discrete emotion classes with objectively and subjectively better
image quality than usual discrete approaches. We also show that we are able
to cover the larger space of possible facial expressions, generating the many
moods of emotion. Moreover, two axes in this space can be found that generate
expression changes similar to those in traditional continuous representations
such as arousal-valence. Finally, we show through visual interpretation that
the third remaining dimension is highly related to the well-known dominance
dimension from psychology.