3,712 research outputs found
Automatic Analysis of Facial Expressions Based on Deep Covariance Trajectories
In this paper, we propose a new approach for facial expression recognition
using deep covariance descriptors. The solution is based on the idea of
encoding local and global Deep Convolutional Neural Network (DCNN) features
extracted from still images, in compact local and global covariance
descriptors. The space geometry of the covariance matrices is that of Symmetric
Positive Definite (SPD) matrices. By conducting the classification of static
facial expressions using Support Vector Machine (SVM) with a valid Gaussian
kernel on the SPD manifold, we show that deep covariance descriptors are more
effective than the standard classification with fully connected layers and
softmax. Besides, we propose a completely new and original solution to model
the temporal dynamic of facial expressions as deep trajectories on the SPD
manifold. As an extension of the classification pipeline of covariance
descriptors, we apply SVM with valid positive definite kernels derived from
global alignment for deep covariance trajectories classification. By performing
extensive experiments on the Oulu-CASIA, CK+, and SFEW datasets, we show that
both the proposed static and dynamic approaches achieve state-of-the-art
performance for facial expression recognition outperforming many recent
approaches.Comment: A preliminary version of this work appeared in "Otberdout N, Kacem A,
Daoudi M, Ballihi L, Berretti S. Deep Covariance Descriptors for Facial
Expression Recognition, in British Machine Vision Conference 2018, BMVC 2018,
Northumbria University, Newcastle, UK, September 3-6, 2018. ; 2018 :159."
arXiv admin note: substantial text overlap with arXiv:1805.0386
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Deep representation learning for human motion prediction and classification
Generative models of 3D human motion are often restricted to a small number
of activities and can therefore not generalize well to novel movements or
applications. In this work we propose a deep learning framework for human
motion capture data that learns a generic representation from a large corpus of
motion capture data and generalizes well to new, unseen, motions. Using an
encoding-decoding network that learns to predict future 3D poses from the most
recent past, we extract a feature representation of human motion. Most work on
deep learning for sequence prediction focuses on video and speech. Since
skeletal data has a different structure, we present and evaluate different
network architectures that make different assumptions about time dependencies
and limb correlations. To quantify the learned features, we use the output of
different layers for action classification and visualize the receptive fields
of the network units. Our method outperforms the recent state of the art in
skeletal motion prediction even though these use action specific training data.
Our results show that deep feedforward networks, trained from a generic mocap
database, can successfully be used for feature extraction from human motion
data and that this representation can be used as a foundation for
classification and prediction.Comment: This paper is published at the IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 201
- …