Search CORE

385 research outputs found

MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Face Images

Author: Akbaş Emre
Çuğu İlke
Şener Eren
Publication venue
Publication date: 01/01/2019
Field of study

This paper is aimed at creating extremely small and fast convolutional neural networks (CNN) for the problem of facial expression recognition (FER) from frontal face images. To this end, we employed the popular knowledge distillation (KD) method and identified two major shortcomings with its use: 1) a fine-grained grid search is needed for tuning the temperature hyperparameter and 2) to find the optimal size-accuracy balance, one needs to search for the final network size (or the compression rate). On the other hand, KD is proved to be useful for model compression for the FER problem, and we discovered that its effects gets more and more significant with the decreasing model size. In addition, we hypothesized that translation invariance achieved using max-pooling layers would not be useful for the FER problem as the expressions are sensitive to small, pixel-wise changes around the eye and the mouth. However, we have found an intriguing improvement on generalization when max-pooling is used. We conducted experiments on two widely-used FER datasets, CK+ and Oulu-CASIA. Our smallest model (MicroExpNet), obtained using knowledge distillation, is less than 1MB in size and works at 1851 frames per second on an Intel i7 CPU. Despite being less accurate than the state-of-the-art, MicroExpNet still provides significant insights for designing a microarchitecture for the FER problem.Comment: International Conference on Image Processing Theory, Tools and Applications (IPTA) 2019 camera ready version. Codes are available at: https://github.com/cuguilke/microexpne

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)

Discriminatively Trained Latent Ordinal Model for Video Classification

Author: Sharma Gaurav
Sikka Karan
Publication venue
Publication date: 14/08/2017
Field of study

We study the problem of video classification for facial analysis and human action recognition. We propose a novel weakly supervised learning method that models the video as a sequence of automatically mined, discriminative sub-events (eg. onset and offset phase for "smile", running and jumping for "highjump"). The proposed model is inspired by the recent works on Multiple Instance Learning and latent SVM/HCRF -- it extends such frameworks to model the ordinal aspect in the videos, approximately. We obtain consistent improvements over relevant competitive baselines on four challenging and publicly available video based facial analysis datasets for prediction of expression, clinical pain and intent in dyadic conversations and on three challenging human action datasets. We also validate the method with qualitative results and show that they largely support the intuitions behind the method.Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text overlap with arXiv:1604.0150

arXiv.org e-Print Archive

MPG.PuRe

Automatic Analysis of Facial Expressions Based on Deep Covariance Trajectories

Author: Ballihi Lahoucine
Berretti Stefano
Daoudi Mohamed
Kacem Anis
Otberdout Naima
Publication venue
Publication date: 03/10/2019
Field of study

In this paper, we propose a new approach for facial expression recognition using deep covariance descriptors. The solution is based on the idea of encoding local and global Deep Convolutional Neural Network (DCNN) features extracted from still images, in compact local and global covariance descriptors. The space geometry of the covariance matrices is that of Symmetric Positive Definite (SPD) matrices. By conducting the classification of static facial expressions using Support Vector Machine (SVM) with a valid Gaussian kernel on the SPD manifold, we show that deep covariance descriptors are more effective than the standard classification with fully connected layers and softmax. Besides, we propose a completely new and original solution to model the temporal dynamic of facial expressions as deep trajectories on the SPD manifold. As an extension of the classification pipeline of covariance descriptors, we apply SVM with valid positive definite kernels derived from global alignment for deep covariance trajectories classification. By performing extensive experiments on the Oulu-CASIA, CK+, and SFEW datasets, we show that both the proposed static and dynamic approaches achieve state-of-the-art performance for facial expression recognition outperforming many recent approaches.Comment: A preliminary version of this work appeared in "Otberdout N, Kacem A, Daoudi M, Ballihi L, Berretti S. Deep Covariance Descriptors for Facial Expression Recognition, in British Machine Vision Conference 2018, BMVC 2018, Northumbria University, Newcastle, UK, September 3-6, 2018. ; 2018 :159." arXiv admin note: substantial text overlap with arXiv:1805.0386

arXiv.org e-Print Archive

HAL Descartes

Open Repository and Bibliography - Luxembourg

Hal-Diderot

Identity-adaptive Facial Expression Recognition Through Expression Regeneration Using Conditional Generative Adversarial Networks

Author: Yang Huiyuan
Yin Lijun
Zhang Zheng
Publication venue: Scholars\u27 Mine
Publication date: 05/06/2018
Field of study

Subject variation is a challenging issue for facial expression recognition, especially when handling unseen subjects with small-scale labeled facial expression databases. Although transfer learning has been widely used to tackle the problem, the performance degrades on new data. In this paper, we present a novel approach (so-called IA-gen) to alleviate the issue of subject variations by regenerating expressions from any input facial images. First of all, we train conditional generative models to generate six prototypic facial expressions from any given query face image while keeping the identity related information unchanged. Generative Adversarial Networks are employed to train the conditional generative models, and each of them is designed to generate one of the prototypic facial expression images. Second, a regular CNN (FER-Net) is fine- tuned for expression classification. After the corresponding prototypic facial expressions are regenerated from each facial image, we output the last FC layer of FER-Net as features for both the input image and the generated images. Based on the minimum distance between the input image and the generated expression images in the feature space, the input image is classified as one of the prototypic expressions consequently. Our proposed method can not only alleviate the influence of inter-subject variations but will also be flexible enough to integrate with any other FER CNNs for person-independent facial expression recognition. Our method has been evaluated on CK+, Oulu-CASIA, BU-3DFE and BU-4DFE databases, and the results demonstrate the effectiveness of our proposed method

Crossref

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine