32,834 research outputs found
An Empirical Comparison of Graph-based Dimensionality Reduction Algorithms on Facial Expression Recognition Tasks
Facial expression recognition is a topic of interest both in industry and academia. Recent approaches to facial expression recognition are based on mapping expressions to low dimensional manifolds. In this paper we revisit various dimensionality reduction algorithms using a graph-based paradigm. We compare eight dimensionality reduction algorithms on a facial expression recognition task. For this task, experimental results show that although Linear Discriminant Analysis (LDA) is the simplest and oldest supervised approach, its results are comparable to those of more flexible recent algorithms. Moreover, LDA is much simpler to tune, since it depends on only one parameter.
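The single-parameter tuning the abstract mentions can be illustrated with a minimal numpy-only sketch of classical LDA; the synthetic data and class shift below are illustrative stand-ins for facial-expression features, not the paper's setup, and `n_components` is the one parameter.

```python
import numpy as np

def lda_project(X, y, n_components):
    """Classical LDA: project X onto the directions that maximize
    between-class scatter relative to within-class scatter.
    The only tuning knob is n_components (at most n_classes - 1)."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # top eigenvectors of Sw^{-1} Sb span the projection
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:n_components]].real
    return X @ W

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 16))      # 300 samples, 16-dim synthetic features
X[:100] += 2.0                      # shift one class to make it separable
y = np.repeat([0, 1, 2], 100)       # 3 toy classes
Z = lda_project(X, y, n_components=2)
print(Z.shape)  # (300, 2)
```

With seven expression classes, as in the paper's task, the projection would have at most six dimensions.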
Exploiting Emotional Dependencies with Graph Convolutional Networks for Facial Expression Recognition
Over the past few years, deep learning methods have shown remarkable results
in many face-related tasks including automatic facial expression recognition
(FER) in-the-wild. Meanwhile, numerous models describing the human emotional
states have been proposed by the psychology community. However, we have no
clear evidence as to which representation is more appropriate and the majority
of FER systems use either the categorical or the dimensional model of affect.
Inspired by recent work in multi-label classification, this paper proposes a
novel multi-task learning (MTL) framework that exploits the dependencies
between these two models using a Graph Convolutional Network (GCN) to recognize
facial expressions in-the-wild. Specifically, a shared feature representation
is learned for both discrete and continuous recognition in a MTL setting.
Moreover, the facial expression classifiers and the valence-arousal regressors
are learned through a GCN that explicitly captures the dependencies between
them. To evaluate the performance of our method under real-world conditions we
perform extensive experiments on the AffectNet and Aff-Wild2 datasets. The
results of our experiments show that our method is capable of improving the
performance across different datasets and backbone architectures. Finally, we
also surpass the previous state-of-the-art methods on the categorical model of
AffectNet.
Comment: 9 pages, 8 figures, 5 tables, revised submission to the 16th IEEE International Conference on Automatic Face and Gesture Recognition
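The dependency-capturing step can be pictured with a single Kipf-style graph-convolution layer over nodes standing for the expression classifiers and valence-arousal regressors. This is a generic GCN sketch, not the paper's architecture; the node count and dependency graph below are hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: add self-loops, symmetrically
    normalize the adjacency, aggregate neighbor features, then apply
    a linear map with ReLU."""
    A_hat = A + np.eye(A.shape[0])            # self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # aggregate + ReLU

rng = np.random.default_rng(1)
# 9 hypothetical nodes: e.g. 7 expression classifiers + 2 VA regressors
A = (rng.random((9, 9)) > 0.5).astype(float)  # made-up dependency graph
A = np.maximum(A, A.T)                        # make it undirected
H = rng.normal(size=(9, 16))                  # node features
W = rng.normal(size=(16, 8))                  # layer weights
H_next = gcn_layer(A, H, W)
print(H_next.shape)  # (9, 8)
```

Each node's output mixes in its neighbors' features, which is how dependencies between the categorical and dimensional heads would propagate.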
Time-Efficient Hybrid Approach for Facial Expression Recognition
Facial expression recognition is an emerging research area for improving human-computer interaction. This research plays a significant role in the fields of social communication, commercial enterprise, law enforcement, and other computer interactions. In this paper, we propose a time-efficient hybrid design for facial expression recognition, combining image pre-processing steps and different Convolutional Neural Network (CNN) structures to provide better accuracy and greatly improved training time. We predict seven basic emotions of human faces: sadness, happiness, disgust, anger, fear, surprise, and neutral. The model performs well on challenging cases of facial expression recognition where the expressed emotion could be one of several with quite similar facial characteristics, such as anger, disgust, and sadness. The experiment to test the model was conducted across multiple databases and different facial orientations; to the best of our knowledge, the model achieved an accuracy of about 89.58% on the KDEF dataset, 100% on the JAFFE dataset, and 71.975% on the combined (KDEF + JAFFE + SFEW) dataset across these different scenarios. Performance evaluation was done with cross-validation techniques to avoid bias towards a specific set of images from a database.
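A common image pre-processing step of the kind the abstract alludes to is histogram equalization, which spreads pixel intensities to reduce lighting variation across face crops. The paper does not specify its exact pipeline, so this is only an illustrative sketch on a synthetic low-contrast crop.

```python
import numpy as np

def hist_equalize(img):
    """Histogram equalization for an 8-bit grayscale image: build the
    intensity CDF and use it as a lookup table so the output spans
    the full 0-255 range."""
    hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.clip(
        np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255), 0, 255
    ).astype(np.uint8)
    return lut[img]

rng = np.random.default_rng(2)
img = rng.integers(100, 156, size=(48, 48), dtype=np.uint8)  # low-contrast crop
eq = hist_equalize(img)
print(img.min(), img.max())  # values confined to a narrow band
print(eq.min(), eq.max())    # stretched to 0 and 255
```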
FML: Face Model Learning from Videos
Monocular image-based 3D reconstruction of faces is a long-standing problem
in computer vision. Since image data is a 2D projection of a 3D face, the
resulting depth ambiguity makes the problem ill-posed. Most existing methods
rely on data-driven priors that are built from limited 3D face scans. In
contrast, we propose multi-frame video-based self-supervised training of a deep
network that (i) learns a face identity model both in shape and appearance
while (ii) jointly learning to reconstruct 3D faces. Our face model is learned
using only corpora of in-the-wild video clips collected from the Internet. This
virtually endless source of training data enables learning of a highly general
3D face model. In order to achieve this, we propose a novel multi-frame
consistency loss that ensures consistent shape and appearance across multiple
frames of a subject's face, thus minimizing depth ambiguity. At test time we
can use an arbitrary number of frames, so that we can perform both monocular as
well as multi-frame reconstruction.
Comment: CVPR 2019 (Oral). Video: https://www.youtube.com/watch?v=SG2BwxCw0lQ, Project Page: https://gvv.mpi-inf.mpg.de/projects/FML19
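The idea behind a multi-frame consistency loss can be sketched as penalizing how far per-frame identity codes stray from their mean, so all frames of one subject are pushed toward a shared shape and appearance. This is a simplified stand-in, not the paper's actual loss, and the code dimensions are arbitrary.

```python
import numpy as np

def multiframe_consistency_loss(codes):
    """Mean squared deviation of per-frame identity codes from their
    mean; zero when every frame predicts the same code."""
    mean = codes.mean(axis=0, keepdims=True)
    return float(np.mean((codes - mean) ** 2))

rng = np.random.default_rng(3)
base = rng.normal(size=(1, 32))                    # one subject's identity code
consistent = np.repeat(base, 4, axis=0)            # 4 frames, identical codes
noisy = consistent + rng.normal(scale=0.5, size=(4, 32))

print(multiframe_consistency_loss(consistent))  # 0.0
print(multiframe_consistency_loss(noisy) > 0)   # True
```

Minimizing such a term across frames is one way depth ambiguity gets constrained: a single identity must explain every view.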
Generating 3D faces using Convolutional Mesh Autoencoders
Learned 3D representations of human faces are useful for computer vision
problems such as 3D face tracking and reconstruction from images, as well as
graphics applications such as character generation and animation. Traditional
models learn a latent representation of a face using linear subspaces or
higher-order tensor generalizations. Due to this linearity, they cannot
capture extreme deformations and non-linear expressions. To address this, we
introduce a versatile model that learns a non-linear representation of a face
using spectral convolutions on a mesh surface. We introduce mesh sampling
operations that enable a hierarchical mesh representation that captures
non-linear variations in shape and expression at multiple scales within the
model. In a variational setting, our model samples diverse realistic 3D faces
from a multivariate Gaussian distribution. Our training data consists of 20,466
meshes of extreme expressions captured over 12 different subjects. Despite
limited training data, our trained model outperforms state-of-the-art face
models with 50% lower reconstruction error, while using 75% fewer parameters.
We also show that replacing the expression space of an existing
state-of-the-art face model with our autoencoder achieves a lower
reconstruction error. Our data, model, and code are available at
http://github.com/anuragranj/com
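The variational sampling the abstract describes amounts to drawing latent vectors from a standard multivariate Gaussian and decoding them into meshes. The linear decoder and sizes below are hypothetical placeholders for the paper's spectral-convolution decoder.

```python
import numpy as np

rng = np.random.default_rng(4)
latent_dim, n_vertices = 8, 500                 # hypothetical sizes
# stand-in linear decoder; the real model decodes with mesh convolutions
decoder_W = rng.normal(size=(latent_dim, n_vertices * 3)) * 0.01

z = rng.normal(size=(5, latent_dim))            # 5 draws, z ~ N(0, I)
meshes = (z @ decoder_W).reshape(5, n_vertices, 3)
print(meshes.shape)  # (5, 500, 3): 5 sampled meshes of 500 3D vertices
```

Each independent draw of z yields a different but plausible face, which is what makes the variational setting useful for generation.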