4 research outputs found

    Multimodal Approach for Emotion Recognition Using a Formal Computational Model

    Emotions play a crucial role in human-computer interaction. They are generally expressed and perceived through multiple modalities such as speech, facial expressions, and physiological signals. The complexity of emotions makes their acquisition difficult and makes unimodal systems (i.e., systems that observe only one source of emotion) unreliable and often unfeasible in highly complex applications. Moreover, the lack of a standard for modeling human emotions hinders the sharing of affective information between applications. In this paper, we present a multimodal approach to emotion recognition from multiple sources of information. It aims to provide a multimodal system for emotion recognition and exchange that facilitates inter-system exchanges and improves the credibility of emotional interaction between users and computers. We develop a multimodal emotion recognition method from physiological data based on signal-processing algorithms. The method can recognize emotions composed of several aspects, such as simulated and masked emotions, and uses a new multidimensional model of emotional states based on an algebraic representation. Experimental results show that the proposed multimodal method improves recognition rates compared with the unimodal approach, and compared with state-of-the-art multimodal techniques it gives good results, with 72% correct recognition.
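    A minimal sketch of what feature-level fusion of physiological channels can look like is given below. The channel names (skin conductance, heart rate), the toy statistical features, the SVM classifier, and the synthetic data are illustrative assumptions, not the paper's actual pipeline; the point is only that concatenating per-modality feature vectors before classification lets the multimodal classifier be compared against each unimodal one.

```python
# Sketch of feature-level fusion for multimodal emotion recognition (assumed setup).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def extract_features(signal):
    """Toy statistical features from one physiological channel: mean, std, range, mean abs diff."""
    return np.array([signal.mean(), signal.std(), np.ptp(signal), np.abs(np.diff(signal)).mean()])

# Synthetic example: two channels (say, skin conductance and heart rate),
# 200 trials, 3 emotion classes; real data would come from physiological sensors.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=200)
gsr = rng.normal(labels[:, None], 1.0, size=(200, 512))   # modality 1
hr = rng.normal(labels[:, None], 2.0, size=(200, 512))    # modality 2

X_gsr = np.array([extract_features(s) for s in gsr])
X_hr = np.array([extract_features(s) for s in hr])
X_fused = np.hstack([X_gsr, X_hr])                         # feature-level fusion

# Compare unimodal classifiers against the fused one with 5-fold cross-validation.
for name, X in [("GSR only", X_gsr), ("HR only", X_hr), ("fused", X_fused)]:
    acc = cross_val_score(SVC(kernel="rbf"), X, labels, cv=5).mean()
    print(f"{name:9s} accuracy: {acc:.2f}")
```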

    An Efficient Boosted Classifier Tree-Based Feature Point Tracking System for Facial Expression Analysis

    The study of facial movement and expression has been a prominent area of research since the early work of Charles Darwin. The Facial Action Coding System (FACS), developed by Paul Ekman, introduced the first universal method of coding and measuring facial movement. Human-Computer Interaction seeks to make human interaction with computer systems more effective, easier, safer, and more seamless. Facial expression recognition can be broken down into three distinctive subsections: Facial Feature Localization, Facial Action Recognition, and Facial Expression Classification. The first and most important stage in any facial expression analysis system is the localization of key facial features. Localization must be accurate and efficient to ensure reliable tracking and to leave time for computation and comparison against learned facial models while maintaining real-time performance. Two possible methods for localizing facial features are discussed in this dissertation. The Active Appearance Model is a statistical model that describes an object's parameters through both shape and texture models, resulting in appearance. Statistical model-based training for object recognition takes multiple instances of the object class of interest (positive samples) and multiple negative samples, i.e., images that do not contain objects of interest. Viola and Jones present a highly robust real-time face detection system based on a statistically boosted attentional detection cascade composed of many weak feature detectors. A basic algorithm for eliminating unnecessary sub-frames during Viola-Jones face detection is presented to further reduce image search time. A real-time emotion detection system is presented which is capable of identifying seven affective states (agreeing, concentrating, disagreeing, interested, thinking, unsure, and angry) from a near-infrared video stream. The Active Appearance Model is used to place 23 landmark points around key areas of the eyes, brows, and mouth. A prioritized binary decision tree then detects, based on the actions of these key points, whether one of the seven emotional states occurs as frames pass. The completed system runs accurately and achieves a real-time frame rate of approximately 36 frames per second. A novel facial feature localization technique utilizing a nested cascade classifier tree is proposed. A coarse-to-fine search is performed in which the regions of interest are defined by the responses of the Haar-like features comprising the cascade classifiers. The individual responses of the Haar-like features are also used to activate finer-level searches. A specially cropped training set derived from the Cohn-Kanade AU-Coded database is also developed and tested. Extensions of this research include further testing to verify the novel facial feature localization technique for a full 26-point face model, and implementation of a real-time, intensity-sensitive, automated Facial Action Coding System.
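    Below is a hedged sketch of Viola-Jones face detection using OpenCV's pretrained Haar cascade, with a simple search-region restriction between frames in the spirit of the sub-frame elimination step described above. The margin value and the reset logic are assumptions for illustration, not the dissertation's exact algorithm.

```python
# Sketch: Haar-cascade face detection with a reduced search region between frames.
import cv2

cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)           # webcam; replace with a video file path as needed
roi = None                          # (x, y, w, h) of the last detection, if any
margin = 40                         # pixels of slack around the previous detection (assumed value)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Search only near the previous detection when one exists; otherwise scan the whole frame.
    if roi is not None:
        x, y, w, h = roi
        x0, y0 = max(x - margin, 0), max(y - margin, 0)
        sub = gray[y0:y0 + h + 2 * margin, x0:x0 + w + 2 * margin]
    else:
        x0, y0, sub = 0, 0, gray

    faces = cascade.detectMultiScale(sub, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 0:
        fx, fy, fw, fh = faces[0]
        roi = (x0 + fx, y0 + fy, fw, fh)
        cv2.rectangle(frame, (roi[0], roi[1]),
                      (roi[0] + roi[2], roi[1] + roi[3]), (0, 255, 0), 2)
    else:
        roi = None                  # detection lost: fall back to a full-frame search next time

    cv2.imshow("face", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```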

    Bimodal information analysis for emotion recognition

    No full text
    We present an audio-visual information analysis system for automatic emotion recognition. We propose an approach for the analysis of video sequences that combines facial expressions observed visually with acoustic features to automatically recognize five universal emotion classes: Anger, Disgust, Happiness, Sadness, and Surprise. The visual component of our system evaluates facial expressions using a bank of 20 Gabor filters that spatially sample the images. The audio analysis is based on global statistics of voice pitch and intensity along with temporal features such as speech rate and Mel-Frequency Cepstral Coefficients. We combine the two modalities at feature level and at score level to compare the respective joint emotion recognition rates. The emotions are instantaneously classified using a Support Vector Machine, and temporal inference is drawn from the scores output by the classifier. This approach is validated on a posed audio-visual database and a natural interactive database to test the robustness of our algorithm. The experiments performed on these databases provide encouraging results, with the best combined recognition rate being 82%.
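    A rough sketch of two pieces mentioned in this abstract follows: building a 20-filter Gabor bank (assumed here to be 4 orientations x 5 scales, since the split is not stated) and score-level fusion of visual and audio SVM probabilities. The filter parameters, the fusion weight, and the synthetic data are illustrative assumptions, not the authors' settings.

```python
# Sketch: Gabor filter bank features and score-level fusion of two unimodal SVMs.
import cv2
import numpy as np
from sklearn.svm import SVC

def gabor_bank(n_orient=4, n_scale=5, ksize=21):
    """Build n_orient x n_scale Gabor kernels (4 x 5 = 20 by default)."""
    kernels = []
    for s in range(n_scale):
        lambd = 4.0 * (s + 1)                       # wavelength grows with scale (assumed)
        for o in range(n_orient):
            theta = o * np.pi / n_orient
            kernels.append(cv2.getGaborKernel((ksize, ksize), sigma=4.0,
                                              theta=theta, lambd=lambd,
                                              gamma=0.5, psi=0))
    return kernels

def gabor_features(face_gray, kernels):
    """Mean and std of each filter response, concatenated (2 x 20 = 40 values)."""
    feats = []
    for k in kernels:
        resp = cv2.filter2D(face_gray.astype(np.float32), cv2.CV_32F, k)
        feats.extend([resp.mean(), resp.std()])
    return np.array(feats)

def fuse_scores(p_visual, p_audio, w=0.5):
    """Score-level fusion: weighted average of per-class probabilities."""
    return w * p_visual + (1 - w) * p_audio

# Tiny synthetic demo; real use would apply gabor_features to face crops and
# extract pitch/intensity statistics plus MFCCs from the audio track.
rng = np.random.default_rng(1)
kernels = gabor_bank()
print("Gabor feature vector shape:", gabor_features(rng.normal(size=(64, 64)), kernels).shape)

y = rng.integers(0, 5, size=100)                    # 5 emotion classes
X_vis = rng.normal(y[:, None], 1.0, size=(100, 40))
X_aud = rng.normal(y[:, None], 1.5, size=(100, 30))
svm_vis = SVC(probability=True).fit(X_vis[:80], y[:80])
svm_aud = SVC(probability=True).fit(X_aud[:80], y[:80])
fused = fuse_scores(svm_vis.predict_proba(X_vis[80:]), svm_aud.predict_proba(X_aud[80:]))
print("fused accuracy:", (fused.argmax(axis=1) == y[80:]).mean())
```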