45 research outputs found

    A Review on Facial Expression Recognition Techniques

    Facial expression has been a topic of active research over the past few decades. Recognising, extracting and validating emotions from facial expressions has become very important in human-computer interaction. Interpreting such expressions remains difficult, and much research is still required on how they relate to human affect. Apart from human-computer interfaces, other applications include awareness systems, medical diagnosis, surveillance, law enforcement, automated tutoring systems and many more. In recent years, different techniques have been put forward for developing automated facial expression recognition systems. This paper presents a quick survey of some of these facial expression recognition techniques, and a comparative study is carried out using various feature extraction techniques. We define a taxonomy of the field and cover all the steps from face detection to facial expression classification.
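
    The pipeline the survey covers - face detection, feature extraction and expression classification - can be made concrete with a short sketch. The Python code below is an illustration only, not a technique from the paper: it uses OpenCV's stock Haar-cascade detector, normalised pixel intensities as a naive appearance feature, and a scikit-learn SVM; the labelled training images are an assumed input from any expression dataset.

        import cv2
        import numpy as np
        from sklearn.svm import SVC

        # Stage 1: face detection with OpenCV's bundled frontal-face Haar cascade.
        detector = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

        def extract_face_features(image_bgr, size=48):
            """Detect the largest face and return a normalised pixel vector."""
            gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            if len(faces) == 0:
                return None
            x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep largest box
            crop = cv2.resize(gray[y:y + h, x:x + w], (size, size))
            return crop.astype(np.float32).ravel() / 255.0  # naive appearance feature

        # Stage 2: expression classification with an off-the-shelf SVM baseline.
        def train_classifier(train_images, train_labels):
            feats = [extract_face_features(img) for img in train_images]
            pairs = [(f, l) for f, l in zip(feats, train_labels) if f is not None]
            X, y = zip(*pairs)
            clf = SVC(kernel="rbf", C=10.0)
            clf.fit(np.stack(X), np.array(y))
            return clf

    Swapping the pixel features for any of the feature extraction techniques compared in the paper only changes extract_face_features; the surrounding pipeline stays the same.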

    Timing is everything: A spatio-temporal approach to the analysis of facial actions

    This thesis presents a fully automatic facial expression analysis system based on the Facial Action Coding System (FACS). FACS is the best known and most commonly used system for describing facial activity in terms of facial muscle actions (i.e., action units, AUs). We will present our research on the analysis of the morphological, spatio-temporal and behavioural aspects of facial expressions. In contrast with most other researchers in the field, who use appearance-based techniques, we use a geometric feature-based approach. We will argue that this approach is more suitable for analysing the temporal dynamics of facial expressions. Our system is capable of explicitly exploring the temporal aspects of facial expressions in an input colour video in terms of their onset (start), apex (peak) and offset (end). The fully automatic system presented here detects 20 facial points in the first frame and tracks them throughout the video. From the tracked points we compute geometry-based features which serve as the input to the remainder of the system. The AU activation detection system uses GentleBoost feature selection and a Support Vector Machine (SVM) classifier to find which AUs were present in an expression. Temporal dynamics of active AUs are recognised by a hybrid GentleBoost-SVM-Hidden Markov Model classifier. The system is capable of analysing 23 out of 27 existing AUs with high accuracy. The main contributions of the work presented in this thesis are the following: we have created a method for fully automatic AU analysis with state-of-the-art recognition results; we have proposed, for the first time, a method for recognising the four temporal phases of an AU; we have built the largest comprehensive database of facial expressions to date; and we present, for the first time in the literature, two studies on the automatic distinction between posed and spontaneous expressions.
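
    The core of the AU activation detector - geometric features from tracked points, boosted feature selection, then an SVM - can be approximated as follows. This is a loose sketch under stated substitutions, not the thesis's implementation: GentleBoost is not available in scikit-learn, so AdaBoost over decision stumps stands in for the feature selector, and the tracked point arrays and per-frame AU labels are assumed inputs.

        import numpy as np
        from itertools import combinations
        from sklearn.ensemble import AdaBoostClassifier  # stand-in for GentleBoost
        from sklearn.svm import SVC

        def geometric_features(points, neutral):
            """points, neutral: (20, 2) arrays of tracked facial point coordinates.
            Features: all pairwise inter-point distances, expressed relative to the
            neutral first frame so identity-specific face geometry cancels out."""
            pairs = list(combinations(range(len(points)), 2))
            dists = np.array([np.linalg.norm(points[i] - points[j]) for i, j in pairs])
            base = np.array([np.linalg.norm(neutral[i] - neutral[j]) for i, j in pairs])
            return dists - base

        def train_au_detector(frame_points, neutral_points, au_labels, n_features=30):
            X = np.stack([geometric_features(p, n)
                          for p, n in zip(frame_points, neutral_points)])
            y = np.array(au_labels)  # 1 if the AU is active in the frame, else 0
            # Boosting on decision stumps ranks the features; keep the top ones.
            booster = AdaBoostClassifier(n_estimators=100).fit(X, y)
            keep = np.argsort(booster.feature_importances_)[-n_features:]
            svm = SVC(kernel="rbf").fit(X[:, keep], y)
            return svm, keep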

    Automatic detection of ADHD and ASD from expressive behaviour in RGBD data

    Attention Deficit Hyperactivity Disorder (ADHD) and Autism Spectrum Disorder (ASD) are neurodevelopmental conditions which impact a significant number of children and adults. Currently, the diagnosis of such disorders is made by experts who administer standard questionnaires and look for certain behavioural markers through manual observation. Such diagnostic methods are not only subjective, difficult to repeat and costly, but also extremely time-consuming. In this work, we present a novel methodology to aid diagnostic predictions about the presence or absence of ADHD and ASD by automatic visual analysis of a person's behaviour. To do so, we conduct the questionnaires in a computer-mediated way while recording participants with modern RGBD (Colour+Depth) sensors. In contrast to previous automatic approaches, which have focussed only on detecting certain behavioural markers, our approach provides a fully automatic end-to-end system to directly predict ADHD and ASD in adults. Using state-of-the-art facial expression analysis based on Dynamic Deep Learning and 3D analysis of behaviour, we attain classification rates of 96% for the Controls vs Condition (ADHD/ASD) groups and 94% for the Comorbid (ADHD+ASD) vs ASD-only groups. We show that our system is a potentially useful, time-saving contribution to the clinical diagnosis of ADHD and ASD.
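
    At its simplest, an end-to-end predictor of this kind reduces to fusing the two feature streams (facial expression descriptors and 3D behaviour statistics) and training a classifier on participant-level labels. The sketch below shows plain early fusion with scikit-learn; it is a generic illustration, not the paper's Dynamic Deep Learning system, and all feature matrices are assumed inputs.

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        def fuse_and_classify(face_feats, motion_feats, labels):
            """face_feats, motion_feats: (n_participants, d) arrays of
            per-participant descriptors; labels: 0 = control, 1 = condition."""
            X = np.hstack([face_feats, motion_feats])  # early fusion by concatenation
            y = np.array(labels)
            clf = SVC(kernel="linear")
            scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
            return clf.fit(X, y), scores.mean()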

    Emotion detection in real-time on an Android Smartphone

    During the last decade, the proliferation and capabilities of smartphones have exploded, reaching a situation in which, nowadays, almost everybody carries one of these devices, with enormous computational power, cameras and several sensors, at all times. This omnipresence, these capabilities and the tools available to facilitate developing applications are the perfect combination for proposing solutions to numerous problems, some existing and some created. In this research, an approach to the widely researched topic of automatically detecting human emotions is brought to the Android platform by means of an application. A system is developed using some existing tools in order to detect emotions with few computations, so that it may run in real time on smartphones with less computational power. Given the limited time available for developing this project, the scope of this work is to set a foundation that can be improved in the future by other researchers; therefore, some limitations are set, and some stages are not fully optimized. Finally, future improvements are proposed to facilitate the continuation of this project.
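
    Running such detection in real time on weak hardware usually comes down to bounding the per-frame cost. The sketch below shows one standard trick, frame skipping, written with OpenCV's Python bindings against a desktop webcam; on Android the same loop structure would sit behind OpenCV's Java/Kotlin camera bindings. The classify callback is a hypothetical placeholder for any cheap emotion classifier.

        import cv2

        detector = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

        def run_realtime(classify, camera_index=0, every_n=3):
            """classify: cheap function mapping a grayscale face crop to a label.
            Processing only every n-th frame bounds per-frame cost on weak CPUs."""
            cap = cv2.VideoCapture(camera_index)
            frame_id, label = 0, ""
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                frame_id += 1
                if frame_id % every_n == 0:  # skip frames to stay real-time
                    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                    faces = detector.detectMultiScale(gray, 1.2, 5)
                    if len(faces):
                        x, y, w, h = faces[0]
                        label = classify(gray[y:y + h, x:x + w])
                cv2.putText(frame, label, (10, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
                cv2.imshow("emotion", frame)
                if cv2.waitKey(1) & 0xFF == ord("q"):
                    break
            cap.release()
            cv2.destroyAllWindows()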

    Laugh machine

    The Laugh Machine project aims at endowing virtual agents with the capability to laugh naturally, at the right moment and with the correct intensity, when interacting with human participants. In this report we present the technical development and evaluation of such an agent in one specific scenario: watching TV along with a participant. The agent must be able to react to both the video and the participant's behaviour. A full processing chain has been implemented, integrating components to sense the human behaviours, decide when and how to laugh and, finally, synthesize audiovisual laughter animations. The system was evaluated on its capability to enhance the affective experience of naive participants, with the help of pre- and post-experiment questionnaires. Three interaction conditions were compared: laughter-enabled or not, and reacting to the participant's behaviour or not. Preliminary results (the number of experiments is currently too small to obtain statistically significant differences) show that the interactive, laughter-enabled agent is positively perceived and increases the emotional dimension of the experiment.
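
    The middle link of the chain - deciding when and how to laugh - can be caricatured as a thresholded loop with a refractory period, so that the agent neither ignores the participant nor laughs continuously. This is a deliberately simple rule-based stand-in, not the project's actual decision module; sense() and synthesize() are hypothetical hooks for the sensing and audiovisual laughter synthesis components.

        import time
        import random

        def laughter_decision_loop(sense, synthesize, threshold=0.6, refractory=4.0):
            """sense() -> dict with the participant's estimated laughter intensity
            in [0, 1] and a flag for humorous video events; synthesize(intensity)
            triggers one audiovisual laugh of the given intensity."""
            last_laugh = 0.0
            while True:
                cues = sense()
                # Mirror the participant, or react to the video content itself.
                drive = max(cues["participant_laughter"],
                            0.8 if cues["video_event"] else 0.0)
                if drive >= threshold and time.time() - last_laugh > refractory:
                    # Jitter the intensity slightly so repeated laughs vary.
                    synthesize(intensity=min(1.0, drive + random.uniform(-0.1, 0.1)))
                    last_laugh = time.time()
                time.sleep(0.1)  # ~10 Hz decision rate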

    Computationally efficient deformable 3D object tracking with a monocular RGB camera

    Monocular RGB cameras are present in most settings and devices, including embedded environments like robots, cars and home automation. Most of these environments have in common a significant presence of human operators with whom the system has to interact. This context provides the motivation to use the captured monocular images to improve the understanding of the operator and the surrounding scene for more accurate results and applications. However, monocular images do not have depth information, which is a crucial element in understanding the 3D scene correctly. Estimating the three-dimensional information of an object in the scene using a single two-dimensional image is already a challenge. The challenge grows if the object is deformable (e.g., a human body or a human face) and there is a need to track its movements and interactions in the scene. Several methods attempt to solve this task, including modern regression methods based on Deep Neural Networks. However, despite their great results, most are computationally demanding and therefore unsuitable for several environments. Computational efficiency is a critical feature for computationally constrained setups like embedded or onboard systems present in robotics and automotive applications, among others. This study proposes computationally efficient methodologies to reconstruct and track three-dimensional deformable objects, such as human faces and human bodies, using a single monocular RGB camera. To model the deformability of faces and bodies, it considers two types of deformations: non-rigid deformations for face tracking, and rigid multi-body deformations for body pose tracking. Furthermore, it studies their performance on computationally restricted devices like smartphones and onboard systems used in the automotive industry. The information extracted from such devices gives valuable insight into human behaviour, a crucial element in improving human-machine interaction. We tested the proposed approaches in different challenging application fields like onboard driver monitoring systems, human behaviour analysis from monocular videos, and human face tracking on embedded devices.
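
    For the rigid part of such tracking, a classic, computationally cheap building block is perspective-n-point pose estimation from a handful of 2D landmarks and a sparse 3D model. The sketch below uses OpenCV's solvePnP with generic anthropometric 3D points; it is a rigid head-pose illustration, not the thesis's deformable models, and the 2D landmark detector is an assumed input.

        import cv2
        import numpy as np

        # Sparse rigid 3D face model (millimetres): nose tip, chin, eye corners,
        # mouth corners. Generic anthropometric estimates, not a learned model.
        MODEL_3D = np.array([
            (0.0,    0.0,    0.0),    # nose tip
            (0.0,   -63.6,  -12.5),   # chin
            (-43.3,  32.7,  -26.0),   # left eye outer corner
            (43.3,   32.7,  -26.0),   # right eye outer corner
            (-28.9, -28.9,  -24.1),   # left mouth corner
            (28.9,  -28.9,  -24.1),   # right mouth corner
        ], dtype=np.float64)

        def head_pose(landmarks_2d, frame_size):
            """landmarks_2d: (6, 2) image points matching MODEL_3D, from any
            lightweight landmark detector. Returns rotation and translation."""
            h, w = frame_size
            f = w  # rough focal length guess; enough for pose, not metrology
            K = np.array([[f, 0, w / 2], [0, f, h / 2], [0, 0, 1]], dtype=np.float64)
            ok, rvec, tvec = cv2.solvePnP(
                MODEL_3D, np.asarray(landmarks_2d, dtype=np.float64), K, None,
                flags=cv2.SOLVEPNP_ITERATIVE)
            return (rvec, tvec) if ok else (None, None)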

    Recognition of facial action units from video streams with recurrent neural networks : a new paradigm for facial expression recognition

    Philosophiae Doctor - PhD. This research investigated the application of recurrent neural networks (RNNs) to the recognition of facial expressions based on the Facial Action Coding System (FACS). Support vector machines (SVMs) were used to validate the results obtained by the RNNs. In this approach, instead of recognizing whole facial expressions, the focus was on the recognition of the action units (AUs) that are defined in FACS. Recurrent neural networks are capable of gaining knowledge from temporal data, while SVMs, which are time invariant, are known to be very good classifiers. Thus, the research consists of four important components: a comparison of the use of image sequences against single static images, the benchmarking of feature selection and network optimization approaches, a study of inter-AU correlations by implementing multiple-output RNNs, and a study of difference images as an approach for performance improvement. In the comparative studies, image sequences were classified using a combination of Gabor filters and RNNs, while single static images were classified using Gabor filters and SVMs. Sets of 11 FACS AUs were classified by both approaches, where a single RNN/SVM classifier was used for classifying each AU. Results indicated that classifying FACS AUs using image sequences yielded better results than using static images. The average recognition rate (RR) and false alarm rate (FAR) using image sequences were 82.75% and 7.61%, respectively, while classification using single static images yielded an RR and FAR of 79.47% and 9.22%, respectively. The better performance with image sequences can be attributed to the ability of RNNs, as stated above, to extract knowledge from time-series data. Subsequent research then investigated benchmarking dimensionality reduction, feature selection and network optimization techniques, in order to improve on the performance obtained with image sequences. Results showed that an optimized network, using weight decay, gave the best RR and FAR of 85.38% and 6.24%, respectively. The next study examined the inter-AU correlations existing in the Cohn-Kanade database and their effect on classification models. To accomplish this, a model was developed for the classification of a set of AUs by a single multiple-output RNN. Results indicated that high inter-AU correlations do in fact help classification models to gain more knowledge and, thus, perform better. However, this was limited to AUs that start and reach apex at almost the same time. This suggests the need for a larger database of AUs, which could provide both individual AUs and AU combinations for further investigation. The final part of this research investigated the use of difference images to track the motion of image pixels. Difference images provide both noise and feature reduction, an aspect that was studied. Results showed that the use of difference image sequences provided the best results, with an RR and FAR of 87.95% and 3.45%, respectively, which is shown to be significant when compared to the use of normal image sequences classified using RNNs. In conclusion, the research demonstrates that the use of RNNs for the classification of image sequences is a new and improved paradigm for facial expression recognition.
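
    The two ingredients behind the best-performing configuration - difference-image sequences and a recurrent classifier with one output per AU - can be sketched as follows. PyTorch's GRU stands in for the thesis's recurrent network, and the per-frame feature vector (a flattened difference image, or its Gabor responses) is an assumption.

        import torch
        import torch.nn as nn

        def difference_sequence(frames):
            """frames: (T, H, W) float tensor of registered face images.
            Consecutive-frame differences keep motion, suppress static texture."""
            return frames[1:] - frames[:-1]              # (T-1, H, W)

        class AUSequenceClassifier(nn.Module):
            """One sigmoid output per AU gives a single multiple-output model,
            as in the inter-AU correlation study."""
            def __init__(self, feat_dim, num_aus, hidden=64):
                super().__init__()
                self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
                self.head = nn.Linear(hidden, num_aus)

            def forward(self, x):                        # x: (B, T, feat_dim)
                _, h = self.rnn(x)                       # h: (1, B, hidden)
                return torch.sigmoid(self.head(h[-1]))   # per-AU score in [0, 1]

        # Usage sketch: flatten each difference image into one vector per step.
        # diffs = difference_sequence(frames)            # (T-1, H, W)
        # x = diffs.reshape(1, diffs.shape[0], -1)       # (1, T-1, H*W)
        # probs = AUSequenceClassifier(x.shape[-1], num_aus=11)(x)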

    Affective Computing

    This book provides an overview of state-of-the-art research in Affective Computing. It presents new ideas, original results and practical experiences in this increasingly important research field. The book consists of 23 chapters, categorized into four sections. Since one of the most important means of human communication is facial expression, the first section of this book (Chapters 1 to 7) presents research on the synthesis and recognition of facial expressions. Given that we use not only the face but also body movements to express ourselves, in the second section (Chapters 8 to 11) we present research on the perception and generation of emotional expressions using full-body motions. The third section of the book (Chapters 12 to 16) presents computational models of emotion, as well as findings from neuroscience research. In the last section of the book (Chapters 17 to 22) we present applications related to affective computing.