7 research outputs found

Emotion Recognition by Two View SVM_2K Classifier on Dynamic Facial Expression Features

Author: Bianchi-Berthouze N
Meng H
Romera-Paredes B
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/03/2011

A novel emotion recognition system is proposed for classifying facial expressions in videos. First, two types of basic facial appearance descriptors are extracted. The first, the Motion History Histogram (MHH), is used to detect temporal changes at each pixel of the face. The second, the Histogram of Local Binary Patterns (LBP), is applied to each frame of the video to capture local textural patterns. Second, based on these two basic descriptors, two new dynamic facial expression features, MHH_EOH and LBP_MCF, are proposed. These two features incorporate both dynamic and local information. Finally, the Two View SVM_2K classifier is built to integrate these two dynamic features in an efficient way. The experimental results show that this method outperforms the baseline results set by the FERA'11 challenge.
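
The per-frame LBP histogram mentioned in this abstract can be illustrated with a minimal sketch. It assumes the basic 3x3, 8-neighbour LBP operator applied to a grayscale frame stored as a list of rows; the function names `lbp_code` and `lbp_histogram` are hypothetical, and this is not the authors' implementation (the exact LBP variant and the construction of LBP_MCF are not given in the abstract).

```python
from typing import List, Sequence

# Neighbour offsets (row, col), clockwise from the top-left pixel.
# This bit ordering is an assumption; any fixed ordering gives a valid LBP code.
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Sequence[Sequence[int]], r: int, c: int) -> int:
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: Sequence[Sequence[int]]) -> List[int]:
    """Count LBP codes over all interior pixels (border pixels are skipped)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


if __name__ == "__main__":
    frame = [
        [10, 10, 10, 10],
        [10, 50, 20, 10],
        [10, 20, 50, 10],
        [10, 10, 10, 10],
    ]
    hist = lbp_histogram(frame)
    print({code: n for code, n in enumerate(hist) if n})
```

A histogram like this would be computed once per video frame; how the per-frame histograms are combined into a dynamic feature is left out because the abstract does not describe it.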

UCL Discovery

Relative Facial Action Unit Detection

Author: Khademi Mahmoud
Morency Louis-Philippe
Publication date: 30/04/2014

This paper presents a subject-independent facial action unit (AU) detection method by introducing the concept of relative AU detection, for scenarios where the neutral face is not provided. We propose a new classification objective function which analyzes the temporal neighborhood of the current frame to decide if the expression recently increased, decreased or showed no change. This approach is a significant change from the conventional absolute method, which decides about AU classification using the current frame, without an explicit comparison with its neighboring frames. Our proposed method improves robustness to individual differences such as face scale and shape, age-related wrinkles, and transitions among expressions (e.g., lower intensity of expressions). Our experiments on three publicly available datasets (Extended Cohn-Kanade (CK+), Bosphorus, and DISFA databases) show significant improvement of our approach over conventional absolute techniques. Keywords: facial action coding system (FACS); relative facial action unit detection; temporal information. Comment: Accepted at IEEE Winter Conference on Applications of Computer Vision, Steamboat Springs Colorado, USA, 201

arXiv.org e-Print Archive

Crossref

Real-time Emotional State Detection from Facial Expression on Embedded Devices

Author: Juhar J
Meng H
Pleva M
Swash RM
Turabzadeh S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Over the last decade, research on human facial emotion recognition has shown that computing models built on regression modelling can achieve usable performance. However, many systems need extensive computing power, which prevents their wide use in applications such as robots and smart devices. In the proposed system, a real-time automatic facial expression system was designed, implemented and tested on an embedded device (an FPGA), which could be a first step towards a dedicated facial expression recognition chip for a social robot. The system was built and simulated in MATLAB and then built on the FPGA; it can carry out continuous real-time emotional state recognition at 30 fps with 47.44% accuracy. The proposed graphical user interface displays the participant video together with two-dimensional predicted labels of the emotion in real time. The research presented in this paper was supported partially by the Slovak Research and Development Agency under the research projects APVV-15-0517 & APPV-15-0731 and by the Ministry of Education, Science, Research and Sport of the Slovak Republic under the project VEGA 1/0075/15.

Brunel University Research Archive

Automatic emotional state detection and analysis on embedded devices

Author: Turabzadeh Saeed
Publication venue: Brunel University London
Publication date: 01/01/2015

This thesis was submitted for the award of Master of Philosophy and was awarded by Brunel University London. Over the last decade, studies on human facial emotion recognition have shown that computing models based on regression modelling can achieve usable performance. In this study, an automatic real-time facial expression system was built and tested. The methods used in this study have been widely used in different areas: the Local Binary Pattern (LBP) method, which has been used in many machine vision research projects, and the K-Nearest Neighbour (K-NN) algorithm, which is used here for regression modelling. In this study, these two techniques were implemented side by side on an FPGA for the first time and joined together to create a model that displays continuous, automatic emotional state detection on a monitor. To evaluate the effectiveness of the classifier technique for human emotion recognition from video, the model was designed and tested in the MATLAB environment and then in the MATLAB Simulink environment, where it recognized continuous facial expression in real time at a rate of 1 frame per second on a desktop PC. It was evaluated on a testing dataset and the experimental results were promising, with an accuracy of 51.28%. The datasets and labels used in this study were made from videos recorded twice from 5 participants while they watched a video. To run in real time at a faster frame rate, the facial expression recognition system was built on an FPGA, the Atlys™ Spartan-6 FPGA Development Board. It can perform continuous emotional state recognition in real time at a frame rate of 30 with an accuracy of 47.44%. A graphical user interface was designed to display the participant video in real time together with two-dimensional predicted labels of the emotion.
This is the first time that automatic emotional state detection has been successfully implemented on an FPGA using LBP and K-NN techniques in such a way as to display a continuous, automatic emotional state detection model on the monitor.
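
The K-NN regression step named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: it assumes Euclidean distance between feature vectors (for example LBP histograms), an unweighted mean over the k nearest training labels, and two-dimensional labels. The function name `knn_regress` and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def knn_regress(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int = 3,
) -> List[float]:
    """Predict a label vector as the mean label of the k nearest training samples."""
    if not train_x or len(train_x) != len(train_y):
        raise ValueError("train_x and train_y must be non-empty and equally long")
    k = min(k, len(train_x))
    # Rank training samples by Euclidean distance to the query feature vector.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    dims = len(train_y[0])
    # Average each label dimension independently over the k neighbours.
    return [sum(train_y[i][d] for i in nearest) / k for d in range(dims)]


if __name__ == "__main__":
    # Toy 1-D features with 2-D labels, purely illustrative.
    features = [[0.0], [1.0], [10.0]]
    labels = [[0.0, 0.0], [1.0, 2.0], [5.0, 5.0]]
    print(knn_regress(features, labels, [0.4], k=2))
```

Because every query is compared against every stored training sample, the cost grows with the size of the training set, which is one reason a hardware implementation can help at higher frame rates.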

Brunel University Research Archive

Facial expression recognition in the wild : from individual to group

Author: Dhall Abhinav
Publication date: 21/11/2018

The progress in computing technology has increased the demand for smart systems capable of understanding human affect and emotional manifestations. One of the crucial factors in designing systems equipped with such intelligence is to have accurate automatic Facial Expression Recognition (FER) methods. In computer vision, automatic facial expression analysis has been an active field of research for over two decades. However, many questions remain unanswered. The research presented in this thesis attempts to address some of the key issues of FER in challenging conditions, as follows: 1) creating a facial expressions database representing real-world conditions; 2) devising Head Pose Normalisation (HPN) methods which are independent of facial parts location; 3) creating automatic methods for the analysis of the mood of a group of people. The central hypothesis of the thesis is that extracting close to real-world data from movies and performing facial expression analysis on movies is a stepping stone towards the analysis of faces in real-world, unconstrained conditions. A temporal facial expressions database, Acted Facial Expressions in the Wild (AFEW), is proposed. The database is constructed and labelled using a semi-automatic process based on keyword search over closed caption subtitles. Currently, AFEW is the largest facial expressions database representing challenging conditions available to the research community. To provide a common platform for researchers to evaluate and extend their state-of-the-art FER methods, the first Emotion Recognition in the Wild (EmotiW) challenge based on AFEW is proposed. An image-only facial expressions database, Static Facial Expressions In The Wild (SFEW), extracted from AFEW, is also proposed. Furthermore, the thesis focuses on HPN for real-world images. Earlier methods were based on fiducial points. However, as fiducial point detection is an open problem for real-world images, HPN can be error-prone. An HPN method based on response maps generated from part-detectors is proposed. The proposed shape-constrained method does not require fiducial points or head pose information, which makes it suitable for real-world images. Data from movies and the internet, representing real-world conditions, pose another major challenge to the research community: the presence of multiple subjects. This defines another focus of this thesis, where a novel approach for modeling the perception of the mood of a group of people in an image is presented. A new database is constructed from Flickr based on keywords related to social events. Three models are proposed: the averaging-based Group Expression Model (GEM), the Weighted Group Expression Model (GEM_w) and the Augmented Group Expression Model (GEM_LDA). GEM_w is based on social contextual attributes, which are used as weights on each person's contribution towards the overall group's mood. Further, GEM_LDA is based on a topic model and feature augmentation. The proposed framework is applied to group candid shot selection and event summarisation. The application of the Structural SIMilarity (SSIM) index metric is explored for finding similar facial expressions. The proposed framework is applied to the problems of creating image albums based on facial expressions and finding corresponding expressions for training facial performance transfer algorithms.
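
The difference between the averaging-based GEM and the weighted GEM_w described in this abstract can be sketched in a few lines. The sketch assumes each person in the image already has a scalar expression intensity and, for GEM_w, a non-negative weight derived from social context; how the thesis computes those intensities and weights is not stated here, so the function names and numbers below are hypothetical.

```python
from typing import Sequence


def group_mood_average(intensities: Sequence[float]) -> float:
    """Averaging-based estimate: every person contributes equally."""
    if not intensities:
        raise ValueError("at least one face is required")
    return sum(intensities) / len(intensities)


def group_mood_weighted(intensities: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted estimate: each person's contribution is scaled by a context weight."""
    if not intensities or len(intensities) != len(weights):
        raise ValueError("intensities and weights must be non-empty and equally long")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise by the weight sum so the result stays on the intensity scale.
    return sum(w * x for w, x in zip(weights, intensities)) / total


if __name__ == "__main__":
    smiles = [0.0, 4.0]      # hypothetical per-person intensities
    context = [3.0, 1.0]     # hypothetical weights, e.g. larger for prominent faces
    print(group_mood_average(smiles), group_mood_weighted(smiles, context))
```

With equal weights the weighted estimate reduces to the plain average, so the weighted form can be read as a generalisation of the averaging model.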

The Australian National University

Multitask and transfer learning for multi-aspect data

Author: Romera Paredes B
Publication venue: UCL (University College London)
Publication date: 28/12/2014

Supervised learning aims to learn functional relationships between inputs and outputs. Multitask learning tackles supervised learning tasks by performing them simultaneously to exploit commonalities between them. In this thesis, we focus on the problem of eliminating negative transfer in order to achieve better performance in multitask learning. We start by considering a general scenario in which the relationship between tasks is unknown. We then narrow our analysis to the case where data are characterised by a combination of underlying aspects, e.g., a dataset of images of faces, where each face is determined by a person's facial structure, the emotion being expressed, and the lighting conditions. In machine learning there have been numerous efforts based on multilinear models to decouple these aspects but these have primarily used techniques from the field of unsupervised learning. In this thesis we take inspiration from these approaches and hypothesize that supervised learning methods can also benefit from exploiting these aspects. The contributions of this thesis are as follows: 1. A multitask learning and transfer learning method that avoids negative transfer when there is no prescribed information about the relationships between tasks. 2. A multitask learning approach that takes advantage of a lack of overlapping features between known groups of tasks associated with different aspects. 3. A framework which extends multitask learning using multilinear algebra, with the aim of learning tasks associated with a combination of elements from different aspects. 4. A novel convex relaxation approach that can be applied both to the suggested framework and more generally to any tensor recovery problem. Through theoretical validation and experiments on both synthetic and real-world datasets, we show that the proposed approaches allow fast and reliable inferences. 
Furthermore, when performing learning tasks on an aspect of interest, accounting for secondary aspects leads to significantly more accurate results than using traditional approaches.

UCL Discovery

Apprentissage neuronal de caractéristiques spatio-temporelles pour la classification automatique de séquences vidéo (Neural learning of spatio-temporal features for automatic classification of video sequences)

Author: BACCOUCHE Moez
BASKURT Atilla
WOLF Christian
Publication date: 01/01/2013

This thesis focuses on the automatic classification of video sequences. Through this work we aim to depart from the dominant methodology, which relies on so-called hand-crafted features, by proposing generic and problem-independent models. This is done by automating the feature extraction process, which in our case is learned from training examples without any prior knowledge. To do so, we rely on existing neural methods for object recognition in still images and investigate their extension to the video case. More concretely, we introduce two learning-based models to extract spatio-temporal features for video classification: (i) a deep learning model, trained in a supervised way, which can be considered an extension of the popular ConvNets model to the video case, and (ii) an unsupervised learning model that relies on an auto-encoder scheme and a sparse over-complete representation. An additional contribution of this work is a comparative study of several of the most popular sequence classification models in the state of the art. This study was performed using hand-crafted features designed for the soccer action recognition problem. The results made it possible to select the best-performing classifier (a bidirectional long short-term memory recurrent neural network, BLSTM) and to justify its use for the remaining experiments. Finally, to validate the genericity of the two proposed models, experiments were carried out on two different problems: human action recognition (using the KTH dataset) and facial expression recognition (using the GEMEP-FERA dataset). The results validate the approaches and show that they achieve performance among the best of the related work (a recognition rate of 95.83% for the KTH dataset and 87.57% for the GEMEP-FERA dataset).

OpenGrey Repository