    Multi-Sensory Emotion Recognition with Speech and Facial Expression

    Emotion plays an important role in human beings’ daily lives. Understanding emotions and knowing how to react to others’ feelings are fundamental to engaging in successful social interactions. Emotion recognition is not only significant in daily life but also a hot topic in academic research, as new techniques such as emotion recognition from speech context offer insight into how emotions relate to the content being uttered. The demand for and importance of emotion recognition have increased greatly in recent years in applications such as video games, human-computer interaction, cognitive computing, and affective computing. Emotion can be recognized from many sources, including text, speech, hand and body gestures, and facial expression. Presently, most emotion recognition methods use only one of these sources. Human emotion changes from moment to moment, and relying on a single modality may not reflect the emotion correctly. This research is motivated by the desire to understand and evaluate human emotion through multiple modalities, namely speech and facial expressions. In this dissertation, multi-sensory emotion recognition is explored. The proposed framework can recognize emotion from speech, from facial expression, or from both. There are three important parts in the design of the system: the facial emotion recognizer, the speech emotion recognizer, and the information fusion. The information fusion part takes the results of the speech and facial emotion recognizers, integrates them with a novel weighted method, and produces a final emotion decision after the fusion. The experiments show that the weighted fusion method improves accuracy by an average of 3.66% compared with fusion without weights. The improvement in recognition rate reaches 18.27% and 5.66% compared with speech emotion recognition and facial expression recognition alone, respectively. By improving emotion recognition accuracy, the proposed multi-sensory emotion recognition system can help to improve the naturalness of human-computer interaction.
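
    The weighted fusion step described above can be sketched in a few lines. This is a minimal illustration only, not the dissertation's implementation: the emotion set, the per-class scores, and the weight values are placeholders.

```python
import numpy as np

# Hypothetical per-class confidence scores from the two recognizers.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

def weighted_fusion(speech_scores, face_scores, w_speech=0.4, w_face=0.6):
    """Combine two per-class score vectors with modality weights and
    return the emotion with the highest fused score."""
    speech = np.asarray(speech_scores, dtype=float)
    face = np.asarray(face_scores, dtype=float)
    fused = w_speech * speech + w_face * face
    return EMOTIONS[int(np.argmax(fused))], fused

label, fused = weighted_fusion(
    speech_scores=[0.10, 0.05, 0.05, 0.55, 0.10, 0.10, 0.05],
    face_scores=[0.05, 0.05, 0.10, 0.40, 0.05, 0.30, 0.05],
)
print(label)  # fused decision, e.g. "happiness"
```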

    Facial Emotion Recognition Using Context Based Multimodal Approach

    Emotions play a crucial role in person-to-person interaction. In recent years, there has been growing interest in improving all aspects of interaction between humans and computers. The ability to understand human emotions, especially by observing facial expressions, is desirable for the computer in several applications. This paper explores ways of human-computer interaction that enable the computer to be more aware of the user’s emotional expressions. We present an approach for emotion recognition from facial expression and from hand and body posture. Our multimodal emotion recognition system uses two different models, one for facial expression recognition and one for hand and body posture recognition, and then combines the results of both classifiers using a third classifier, which gives the resulting emotion. The multimodal system gives more accurate results than a single-modality or bimodal system.
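
    The combination of two modality-specific classifiers through a third classifier can be illustrated with a small late-fusion sketch. The paper's actual models are not reproduced here; the probability matrices, labels, and the choice of logistic regression as the combiner are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical probability outputs from two modality-specific classifiers
# on the same training clips: shape (n_samples, n_emotions) each.
face_probs = np.random.rand(200, 6)
posture_probs = np.random.rand(200, 6)
labels = np.random.randint(0, 6, size=200)  # placeholder ground-truth emotions

# The "third classifier": learns how to weigh and combine the two modalities.
meta_features = np.hstack([face_probs, posture_probs])
combiner = LogisticRegression(max_iter=1000).fit(meta_features, labels)

# At test time, feed the two classifiers' outputs through the combiner.
test_meta = np.hstack([np.random.rand(1, 6), np.random.rand(1, 6)])
print(combiner.predict(test_meta))
```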

    Facial Expression Recognition Using SVM Classifier

    Facial feature tracking and facial action recognition from image sequences have attracted great attention in the computer vision field. Computational facial expression analysis is a challenging research topic in computer vision, required by many applications such as human-computer interaction, computer graphics animation, and automatic facial expression recognition. In recent years, many computer vision techniques have been developed to track or recognize facial activities at three levels. First, at the bottom level, facial feature tracking, which usually detects and tracks prominent landmarks surrounding facial components (i.e., mouth, eyebrows, etc.), captures detailed face shape information. Second, facial action recognition, i.e., recognizing the facial action units (AUs) defined in FACS, attempts to recognize meaningful facial activities (i.e., lid tightener, eyebrow raiser, etc.). Third, at the top level, facial expression analysis attempts to recognize facial expressions that represent human emotional states. The proposed algorithm initially detects the eyes and mouth; features of the eyes and mouth are extracted using Gabor filters and Local Binary Patterns (LBP), and PCA is used to reduce the dimensionality of the features. Finally, an SVM is used to classify the expression and facial action units.
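
    A rough sketch of the described feature extraction and classification stages (Gabor filters, LBP, PCA, SVM) is shown below. The eye/mouth detection step is omitted, and the patches, filter parameters, and class labels are placeholders rather than the paper's settings.

```python
import numpy as np
from skimage.filters import gabor
from skimage.feature import local_binary_pattern
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

def region_features(patch):
    """Gabor + LBP descriptor for one grayscale eye/mouth patch."""
    real, _ = gabor(patch, frequency=0.6)  # real part of the Gabor response
    lbp = local_binary_pattern(patch, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([real.ravel(), lbp_hist])

# Placeholder patches; in practice these come from the eye/mouth detector.
patches = [(np.random.rand(32, 32) * 255).astype(np.uint8) for _ in range(100)]
labels = np.random.randint(0, 7, size=100)  # 7 expression classes

X = np.stack([region_features(p) for p in patches])
clf = make_pipeline(PCA(n_components=20), SVC(kernel="rbf")).fit(X, labels)
print(clf.predict(X[:3]))
```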

    FPGA Implementation Of A Novel Robust Facial Expression Recognition Algorithm

    A facial expression recognition system reveals the state of mind of a person by identifying their emotions, and thus has potential applications in various fields of human-computer interaction (HCI), such as aiding autistic children, robot control, and many more. This work presents a robust and hardware-efficient algorithm for facial expression recognition that gives a very high rate of accuracy. Broadly, human facial expressions have been categorized into seven categories: anger, disgust, fear, happiness, sadness, and surprise, together with the basic neutral emotion. The process of emotion recognition starts with image capture, detecting the face whose emotion is to be recognized, extracting robust and unique features that make categorization efficient, and classifying the features into one of the above-mentioned emotion categories. Face detection in an image is done using the existing Bayesian discriminating feature method. An algorithm is proposed for facial expression recognition that integrates a Gabor filter bank for feature extraction, statistical modelling using principal component analysis (PCA) and a conditional density function for modelling the features, and an extended Bayes classifier for multi-class classification of the emotion in a detected face. The multi-class classification strategy is based on the highest log-likelihood value after training on the different emotion classes. Robust features are extracted using Gabor filters with 8 frequencies and 8 orientations. The FPGA implementation of the extended Bayes classifier is done on a Virtex-II Pro FPGA with Xilinx 10.1, using a CORDIC unit for the trigonometric functions. Facial expression images from the JAFFE database have been used for training as well as testing. Very high emotion recognition accuracy (96.73%) has been obtained with the proposed method.
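
    The log-likelihood-based multi-class decision can be sketched as follows. This is only an illustration under a simplifying assumption of one Gaussian conditional density per emotion class over PCA-reduced Gabor features; the paper's extended Bayes classifier and its FPGA/CORDIC implementation are not reproduced here.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_class_models(train_features):
    """Fit one Gaussian density per emotion class (a simplifying assumption)."""
    models = {}
    for emotion, feats in train_features.items():
        mean = feats.mean(axis=0)
        cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
        models[emotion] = (mean, cov)
    return models

def classify(models, x):
    """Pick the class with the highest log-likelihood, as in the abstract."""
    scores = {e: multivariate_normal.logpdf(x, mean=m, cov=c)
              for e, (m, c) in models.items()}
    return max(scores, key=scores.get)

# Placeholder PCA-reduced feature vectors, 30 per class, 5 dimensions each.
rng = np.random.default_rng(0)
train = {e: rng.normal(loc=i, size=(30, 5)) for i, e in
         enumerate(["anger", "disgust", "fear", "happy", "sad", "surprise", "neutral"])}
models = fit_class_models(train)
print(classify(models, rng.normal(loc=3, size=5)))  # likely "happy" (index 3)
```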

    Enhanced Emotion Recognition in Videos: A Convolutional Neural Network Strategy for Human Facial Expression Detection and Classification

    The human face is essential in conveying emotions, as facial expressions serve as effective, natural, and universal indicators of emotional states. Automated emotion recognition has garnered increasing interest due to its potential applications in various fields, such as human-computer interaction, machine learning, robotic control, and driver emotional state monitoring. With advancements in artificial intelligence and computational power, visual emotion recognition has become a prominent research area. Despite extensive research employing machine learning algorithms like convolutional neural networks (CNN), challenges remain concerning input data processing, emotion classification scope, data size, optimal CNN configurations, and performance evaluation. To address these issues, we propose a comprehensive CNN-based model for real-time detection and classification of five primary emotions: anger, happiness, neutrality, sadness, and surprise. We employ the Amsterdam Dynamic Facial Expression Set – Bath Intensity Variations (ADFES-BIV) video dataset, extracting image frames from the video samples. Image processing techniques such as histogram equalization, color conversion, cropping, and resizing are applied to the frames before labeling. The Viola-Jones algorithm is then used for face detection on the processed grayscale images. We develop and train a CNN on the processed image data, implementing dropout, batch normalization, and L2 regularization to reduce overfitting. The ideal hyperparameters are determined through trial and error, and the model's performance is evaluated. The proposed model achieves a recognition accuracy of 99.38%, with the confusion matrix, recall, precision, F1 score, and processing time further quantifying its performance characteristics. The model's generalization performance is assessed using images from the Warsaw Set of Emotional Facial Expression Pictures (WSEFEP) and Extended Cohn-Kanade Database (CK+) datasets. The results demonstrate the efficiency and usability of our proposed approach, contributing valuable insights into real-time visual emotion recognition.
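
    A minimal tf.keras sketch of a CNN of the kind described (batch normalization, dropout, and L2 regularization, with five output classes) is given below. The layer sizes, input resolution, and hyperparameters are illustrative assumptions, not the paper's tuned configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

NUM_CLASSES = 5            # anger, happiness, neutrality, sadness, surprise
INPUT_SHAPE = (48, 48, 1)  # assumed grayscale face-crop size

def build_cnn(l2=1e-4, dropout=0.5):
    """Small CNN with batch normalization, dropout, and L2 regularization."""
    model = tf.keras.Sequential([
        layers.Conv2D(32, 3, padding="same", activation="relu",
                      kernel_regularizer=regularizers.l2(l2),
                      input_shape=INPUT_SHAPE),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu",
                      kernel_regularizer=regularizers.l2(l2)),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(dropout),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cnn()
model.summary()
```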

    Artificial Intelligence Tools for Facial Expression Analysis.

    Inner emotions show visibly on the human face and are understood as a basic guide to an individual’s inner world. It is, therefore, possible to determine a person’s attitudes, and the effects of others’ behaviour on their deeper feelings, by examining facial expressions. In real-world applications, machines that interact with people need strong facial expression recognition. Such recognition holds advantages for varied applications in affective computing, advanced human-computer interaction, security, stress and depression analysis, robotic systems, and machine learning. This thesis starts by proposing a benchmark of dynamic versus static methods for facial Action Unit (AU) detection. An AU activation is a set of local, individual facial muscle movements that occur in unison, constituting a natural facial expression event. Detecting AUs automatically can provide explicit benefits since it considers both static and dynamic facial features. For this research, AU occurrence detection was conducted by extracting static and dynamic features, from both nominal hand-crafted and deep-learning representations, from each static image of a video. This confirmed the superior ability of a pretrained model, which leaps ahead in performance. Next, temporal modelling was investigated to detect the underlying temporal variation phases from dynamic sequences using supervised and unsupervised methods. During these processes, the importance of stacking dynamic features on top of static ones was discovered when encoding deep features for learning temporal information, combining the spatial and temporal schemes simultaneously. This study also found that fusing spatial and temporal features gives more long-term temporal pattern information. Moreover, we hypothesised that using an unsupervised method would enable the extraction of invariant information from dynamic textures. Recently, fresh cutting-edge developments have been produced by approaches based on Generative Adversarial Networks (GANs). In the second section of this thesis, we propose a model based on the adoption of an unsupervised DCGAN for facial feature extraction and classification to achieve the following: the creation of facial expression images under different arbitrary poses (frontal, multi-view, and in the wild), and the recognition of emotion categories and AUs, in an attempt to resolve the problem of recognising the static seven classes of emotion in the wild. Thorough cross-database experimentation demonstrates that this approach can improve generalization results. Additionally, we show that the features learnt by the DCGAN are poorly suited to encoding facial expressions when observed under multiple views, or when trained from a limited number of positive examples. Finally, this research focuses on disentangling identity from expression for facial expression recognition. A novel technique was implemented for emotion recognition from a single monocular image. A large-scale dataset (Face vid) was created from facial image videos rich in variations and distribution of facial dynamics, appearance, identities, expressions, and 3D poses. This dataset was used to train a DCNN (ResNet) to regress the expression parameters of a 3D Morphable Model jointly with a back-end classifier.
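
    The final stage described above, a ResNet regressing 3D Morphable Model expression parameters jointly with a back-end classifier, can be sketched roughly as follows. The number of expression parameters, the emotion classes, the backbone depth, and the loss weighting are assumptions, not the thesis's actual setup.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_EXPR_PARAMS = 29   # assumed size of the 3DMM expression coefficient vector
NUM_EMOTIONS = 7       # assumed seven basic emotion classes

class ExpressionNet(nn.Module):
    """ResNet backbone with a regression head for 3DMM expression
    parameters and a classification head for emotion labels."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()          # use ResNet as a feature extractor
        self.backbone = backbone
        self.regressor = nn.Linear(feat_dim, NUM_EXPR_PARAMS)
        self.classifier = nn.Linear(feat_dim, NUM_EMOTIONS)

    def forward(self, x):
        feats = self.backbone(x)
        return self.regressor(feats), self.classifier(feats)

net = ExpressionNet()
images = torch.randn(4, 3, 224, 224)              # placeholder batch
params, logits = net(images)
labels = torch.randint(0, NUM_EMOTIONS, (4,))
target_params = torch.randn(4, NUM_EXPR_PARAMS)   # placeholder 3DMM targets
loss = (nn.functional.mse_loss(params, target_params)
        + nn.functional.cross_entropy(logits, labels))
loss.backward()
```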

    A multimodal emotion detection system during human-robot interaction

    In this paper, a multimodal user-emotion detection system for social robots is presented. This system is intended to be used during human-robot interaction, and it is integrated as part of the overall interaction system of the robot: the Robotics Dialog System (RDS). Two modalities are used to detect emotions: voice analysis and facial expression analysis. In order to analyze the voice of the user, a new component has been developed: Gender and Emotion Voice Analysis (GEVA), which is written in the ChucK language. For emotion detection in facial expressions, the system Gender and Emotion Facial Analysis (GEFA) has also been developed. This last system integrates two third-party solutions: the Sophisticated High-speed Object Recognition Engine (SHORE) and the Computer Expression Recognition Toolbox (CERT). Once these new components (GEVA and GEFA) give their results, a decision rule is applied in order to combine the information given by both of them. The result of this rule, the detected emotion, is integrated into the dialog system through communicative acts. Hence, each communicative act gives, among other things, the detected emotion of the user to the RDS so that it can adapt its strategy in order to achieve a greater degree of satisfaction during the human-robot dialog. Each of the new components, GEVA and GEFA, can also be used individually. Moreover, they are integrated with the robotic control platform ROS (Robot Operating System). Several experiments with real users were performed to determine the accuracy of each component and to set the final decision rule. The results obtained from applying this decision rule in these experiments show a high success rate in automatic user emotion recognition, improving on the results given by the two information channels (audio and visual) separately. The authors gratefully acknowledge the funds provided by the Spanish MICINN (Ministry of Science and Innovation) through the project “Aplicaciones de los robots sociales”, DPI2011-26980, from the Spanish Ministry of Economy and Competitiveness. Moreover, the research leading to these results has received funding from the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and co-funded by Structural Funds of the EU.
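
    A toy version of a confidence-based decision rule combining the audio and visual channels is sketched below. The actual rule, thresholds, and the GEVA/GEFA interfaces are not given in the abstract, so everything here is a hypothetical placeholder.

```python
def combine_emotions(voice_result, face_result):
    """Toy decision rule: prefer agreement between channels, otherwise
    trust the more confident channel. Inputs are (label, confidence) tuples."""
    voice_emotion, voice_conf = voice_result
    face_emotion, face_conf = face_result
    if voice_emotion == face_emotion:
        return voice_emotion
    return voice_emotion if voice_conf >= face_conf else face_emotion

# Hypothetical outputs from the audio and visual channels.
print(combine_emotions(("happiness", 0.62), ("surprise", 0.80)))  # "surprise"
```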

    Facial feature representation and recognition

    Facial expression provides an important behavioral measure for studies of emotion, cognitive processes, and social interaction. Facial expression representation and recognition have become a promising research area in recent years, with applications including human-computer interfaces, human emotion analysis, and medical care. In this dissertation, the fundamental techniques are first reviewed, and the newly developed algorithms and theorems are presented afterwards. The objective of the proposed algorithm is to provide a reliable, fast, and integrated procedure to recognize either the seven prototypical, emotion-specified expressions (e.g., happy, neutral, angry, disgust, fear, sad, and surprise in the JAFFE database) or the action units in the Cohn-Kanade AU-coded facial expression image database. A new application area developed by the Infant COPE project is the recognition of neonatal facial expressions of pain (e.g., air puff, cry, friction, pain, and rest in the Infant COPE database). It has been reported in the medical literature that health care professionals have difficulty distinguishing newborns' facial expressions of pain from facial reactions to other stimuli. Since pain is a major indicator of medical problems and the quality of patient care depends on the quality of pain management, it is vital that the methods developed accurately distinguish an infant's signal of pain from a host of minor distress signals. The evaluation protocol used in the Infant COPE project considers two conditions: person-dependent and person-independent. Person-dependent means that some data from a subject are used for training and other data from the same subject for testing. Person-independent means that the data of all subjects except one are used for training and the left-out subject is used for testing. In this dissertation, experiments are carried out under both evaluation protocols. The Infant COPE research on neonatal pain classification is a first attempt at applying state-of-the-art face recognition technologies to actual medical problems. The objective of the Infant COPE project is to bypass these observational problems by developing a machine classification system to diagnose neonatal facial expressions of pain. Since machine assessment of pain is based on pixel states, a machine classification system for pain will remain objective and will exploit the full spectrum of information available in a neonate's facial expressions. Furthermore, it will be capable of monitoring a neonate's facial expressions when he or she is left unattended. Experimental results using the Infant COPE database and evaluation protocols indicate that the application of face classification techniques to pain assessment and management is a promising area of investigation. One of the challenging problems in building an automatic facial expression recognition system is how to automatically locate the principal facial parts, since most existing algorithms capture the necessary face parts by cropping images manually. In this dissertation, two systems are developed to detect facial features, especially the eyes. The purpose is to develop a fast and reliable system to detect facial features automatically and correctly. By incorporating the proposed facial feature detection, the facial expression and neonatal pain recognition systems can be made robust and efficient.
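
    The person-independent protocol corresponds to leave-one-subject-out cross-validation, which can be sketched with standard tooling as follows. The features, labels, subject groupings, and the SVM classifier are placeholders, not the dissertation's data or models.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVC

# Placeholder data: 120 face feature vectors from 12 subjects.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 50))
y = rng.integers(0, 5, size=120)         # e.g. air puff / cry / friction / pain / rest
subjects = np.repeat(np.arange(12), 10)  # subject ID for each sample

# Person-independent protocol: leave one subject out entirely for testing.
logo = LeaveOneGroupOut()
scores = []
for train_idx, test_idx in logo.split(X, y, groups=subjects):
    clf = SVC().fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))
print(f"person-independent mean accuracy: {np.mean(scores):.3f}")
```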