8 research outputs found

    Video Based Emotion Recognition Using CNN and BRNN

    Video-based emotion recognition is more challenging than still-image vision tasks: it must model the spatial information of each image frame as well as the temporal contextual correlations among sequential frames. For this purpose, we propose a hierarchical deep network architecture to extract high-level spatio-temporal features. Two classic neural networks, the convolutional neural network (CNN) and the bi-directional recurrent neural network (BRNN), are employed to capture facial texture characteristics in the spatial domain and dynamic emotion changes in the temporal domain. We coordinate the two networks by optimizing each of them, boosting emotion recognition performance and achieving greater accuracy than the baselines.
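
    A minimal sketch of the general CNN + BRNN idea described above, assuming PyTorch; the layer sizes, the GRU cell, and the temporal averaging are illustrative choices, not the authors' configuration.

```python
# Sketch: frame-level CNN features fed to a bi-directional RNN over the clip.
import torch
import torch.nn as nn

class CnnBrnnEmotion(nn.Module):
    def __init__(self, num_emotions=7, feat_dim=128, hidden=64):
        super().__init__()
        # Spatial stream: a small CNN applied to every frame independently.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(32 * 4 * 4, feat_dim), nn.ReLU(),
        )
        # Temporal stream: a bi-directional RNN over the per-frame features.
        self.brnn = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_emotions)

    def forward(self, clip):                     # clip: (batch, time, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1))     # (batch*time, feat_dim)
        feats = feats.view(b, t, -1)
        seq, _ = self.brnn(feats)                # (batch, time, 2*hidden)
        return self.classifier(seq.mean(dim=1))  # pool over time, then classify

logits = CnnBrnnEmotion()(torch.randn(2, 8, 3, 64, 64))  # two 8-frame clips
```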

    Background suppressing Gabor energy filtering

    In the field of facial emotion recognition, early research advanced with the use of Gabor filters. However, these filters generalize poorly and produce undesirably large feature vectors. More recent work has therefore turned to other local appearance features. Two desirable characteristics of a facial appearance feature are generalization capability and compactness of representation. In this paper, we propose a novel texture feature inspired by Gabor energy filters, called background-suppressing Gabor energy filtering. The feature has a generalization component that removes background texture, a reduced feature vector size due to maximal representation and soft orientation histograms, and a white-box representation. We demonstrate improved performance on the non-trivial Audio/Visual Emotion Challenge 2012 grand-challenge dataset, by a factor of 7.17 over the Gabor filter on the development set. We also demonstrate the applicability of our approach beyond facial emotion recognition: it improves the classification rate over the Gabor filter on four bioimaging datasets by an average of 8.22%.
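
    For reference, a hedged sketch of the plain Gabor energy filter bank that serves as the baseline here, assuming OpenCV and NumPy; the paper's background-suppression and soft-orientation-histogram steps are not reproduced, and all filter parameters below are illustrative.

```python
# Gabor energy: response magnitude of a quadrature (even/odd) filter pair.
import cv2
import numpy as np

def gabor_energy_features(gray, ksize=31, sigma=4.0, lambd=10.0, gamma=0.5,
                          n_orientations=8):
    """Mean Gabor energy per orientation for a grayscale image."""
    feats = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        # Quadrature pair: even (cosine, psi=0) and odd (sine, psi=pi/2) kernels.
        even = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma, psi=0)
        odd = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma,
                                 psi=np.pi / 2)
        r_even = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, even)
        r_odd = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, odd)
        energy = np.sqrt(r_even ** 2 + r_odd ** 2)   # local Gabor energy
        feats.append(energy.mean())
    return np.array(feats)

face = np.random.rand(96, 96).astype(np.float32)     # stand-in for a face crop
print(gabor_energy_features(face))
```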

    Emotion Recognition in Polish-Language Texts Using the Keyword Method

    The dynamic development of social networks has made the Internet the most popular communication medium. The vast majority of messages are exchanged in text form and very often reflect the authors' emotional states. Detecting emotions in text is widely used in e-commerce and telemedicine, and has become a milestone in the field of human-computer interaction. The paper presents a method of emotion recognition in Polish-language texts based on a keyword-detection algorithm with lemmatization. The obtained accuracy is about 60%. The first Polish-language database of keywords expressing emotions has also been developed.
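
    A minimal illustrative sketch of keyword detection with lemmatization; the lemma map and emotion lexicon below are invented English placeholders, not the paper's Polish-language keyword database.

```python
# Keyword-based emotion detection: lemmatize tokens, look them up in a lexicon,
# and return the most frequent emotion label (or "neutral" on no hits).
from collections import Counter

LEMMA_MAP = {"loved": "love", "loving": "love", "hated": "hate", "fears": "fear"}
EMOTION_KEYWORDS = {
    "love": "joy",
    "happy": "joy",
    "hate": "anger",
    "fear": "fear",
    "sad": "sadness",
}

def lemmatize(token):
    return LEMMA_MAP.get(token, token)   # fall back to the surface form

def detect_emotion(text):
    tokens = [lemmatize(t.strip(".,!?").lower()) for t in text.split()]
    hits = Counter(EMOTION_KEYWORDS[t] for t in tokens if t in EMOTION_KEYWORDS)
    return hits.most_common(1)[0][0] if hits else "neutral"

print(detect_emotion("I loved the gift, I am so happy!"))   # -> "joy"
```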

    3D shape estimation in video sequences provides high precision evaluation of facial expressions

    Person-independent and pose-invariant estimation of facial expressions and action unit (AU) intensities is important for situation analysis and for automated video annotation. We evaluated the raw 2D shape data of the CK+ database, using Procrustes transformation and a multi-class SVM with the leave-one-out method for classification. We found close to 100% performance, demonstrating the relevance and strength of the shape details. Precise 3D shape information was computed by means of Constrained Local Models (CLM) on video sequences. Such sequences offer the opportunity to compute a time-averaged '3D Personal Mean Shape' (PMS) from the estimated CLM shapes, which, upon subtraction, gives rise to person-independent emotion estimation. On CK+ data, PMS showed significant improvements over AU0 normalization; performance reached and sometimes surpassed state-of-the-art results on emotion classification and on AU intensity estimation. 3D PMS from 3D CLM offers pose-invariant emotion estimation, which we studied by rendering a 3D emotional database for different poses and different subjects from the BU-4DFE database. Frontal shapes derived from CLM fits of the 3D shape were evaluated. The results demonstrate that shape estimation alone can be used for robust, high-quality, pose-invariant emotion classification and AU intensity estimation.
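
    A sketch of the personal-mean-shape idea under simplifying assumptions: landmark shapes are aligned with a Procrustes transform, averaged over the sequence, and the mean is subtracted to obtain person-normalized residuals. The landmarks below are random stand-ins, not CLM-estimated shapes from CK+ or BU-4DFE.

```python
# Personal Mean Shape (PMS) sketch: align, average over time, subtract.
import numpy as np
from scipy.spatial import procrustes

def personal_mean_shape(shapes):
    """shapes: (T, L, D) landmark sequence; returns aligned shapes and their mean."""
    reference = shapes[0]
    aligned = []
    for s in shapes:
        # procrustes returns standardized versions of both inputs plus a disparity.
        _, s_aligned, _ = procrustes(reference, s)
        aligned.append(s_aligned)
    aligned = np.stack(aligned)
    return aligned, aligned.mean(axis=0)

seq = np.random.rand(30, 68, 3)            # 30 frames, 68 landmarks, 3D
aligned, pms = personal_mean_shape(seq)
residuals = aligned - pms                  # person-independent expression signal
```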

    Sparse models for positive definite matrices

    University of Minnesota Ph.D. dissertation. February 2015. Major: Electrical Engineering. Advisor: Nikolaos P. Papanikolopoulos. 1 computer file (PDF); ix, 141 pages.

    Sparse models have proven to be extremely successful in image processing, computer vision, and machine learning. However, most of the effort has focused on vector-valued signals. Higher-order signals like matrices are usually vectorized as a pre-processing step and treated like vectors thereafter for sparse modeling. Symmetric positive definite (SPD) matrices arise in probability and statistics and in the many domains built upon them. In computer vision, a certain type of feature descriptor called the region covariance descriptor, used to characterize an object or image region, belongs to this class of matrices. Region covariances are immensely popular in object detection, tracking, and classification. Human detection and recognition, texture classification, face recognition, and action recognition are some of the problems tackled with this powerful class of descriptors. They have also caught on as useful features for speech processing and recognition.

    Due to the popularity of sparse modeling in the vector domain, it is enticing to apply sparse representation techniques to SPD matrices as well. However, SPD matrices cannot be directly vectorized for sparse modeling, since their implicit structure is lost in the process and the resulting vectors do not adhere to the geometry of the positive definite manifold. Therefore, to extend the benefits of sparse modeling to the space of positive definite matrices, we must develop dedicated sparse algorithms that respect the positive definite structure and the geometry of the manifold. The primary goal of this thesis is to develop sparse modeling techniques for symmetric positive definite matrices. First, we propose a novel sparse coding technique for representing SPD matrices using sparse linear combinations of a dictionary of atomic SPD matrices. Next, we present a dictionary learning approach wherein these atoms are themselves learned from the given data in a task-driven manner. The sparse coding and dictionary learning approaches are then specialized to the case of rank-1 positive semi-definite matrices. A discriminative dictionary learning approach from vector sparse modeling is extended to the scenario of positive definite dictionaries. We present efficient algorithms and implementations of the proposed techniques, with practical applications in image processing and computer vision.
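
    For context, a short sketch of the region covariance descriptor mentioned above, assuming a standard per-pixel feature set (position, intensity, gradient magnitudes); the particular feature choice is illustrative, and the thesis' sparse coding of such matrices is not reproduced here.

```python
# Region covariance: summarize per-pixel feature vectors by their covariance,
# yielding a small symmetric positive (semi-)definite descriptor matrix.
import numpy as np

def region_covariance(gray):
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = np.gradient(gray.astype(np.float64))
    # Each pixel contributes a feature vector [x, y, I, |Ix|, |Iy|].
    feats = np.stack([xs.ravel(), ys.ravel(), gray.ravel(),
                      np.abs(gx).ravel(), np.abs(gy).ravel()], axis=1)
    return np.cov(feats, rowvar=False)      # 5x5 covariance descriptor

patch = np.random.rand(32, 32)              # stand-in for an image region
C = region_covariance(patch)
print(C.shape)                              # (5, 5)
```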

    Emotion Recognition from Arbitrary View Facial Images

    Emotion recognition from facial images is a very active research topic in human-computer interaction (HCI). However, most previous approaches focus only on frontal or nearly frontal view facial images. In contrast to frontal/nearly-frontal views, emotion recognition from non-frontal or even arbitrary view facial images is much more difficult yet of more practical utility. To handle emotion recognition from arbitrary view facial images, in this paper we propose a novel method based on the regional covariance matrix (RCM) representation of facial images. We also develop a new discriminant analysis theory, aiming at reducing the dimensionality of the facial feature vectors while preserving the most discriminative information, by minimizing an estimated multiclass Bayes error derived under the Gaussian mixture model (GMM). We further propose an efficient algorithm to solve for the optimal discriminant vectors of the proposed discriminant analysis method. We render thousands of multi-view 2D facial images from the BU-3DFE database and conduct extensive experiments on the generated database to demonstrate the effectiveness of the proposed method. It is worth noting that our method requires neither face alignment nor facial landmark localization, which makes it very attractive.
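
    The paper's own discriminant analysis (minimizing a GMM-based Bayes error estimate) is not reproduced here; as a rough, hedged stand-in, the sketch below maps each covariance matrix to a vector via the matrix logarithm (the standard log-Euclidean trick) and applies ordinary LDA for dimensionality reduction on toy data.

```python
# Baseline sketch: log-map SPD matrices, vectorize, then reduce with standard LDA.
import numpy as np
from scipy.linalg import logm
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def log_vectorize(spd):
    """Matrix logarithm of an SPD matrix, flattened to its upper triangle."""
    L = np.real(logm(spd))
    return L[np.triu_indices_from(L)]       # symmetric, so the upper triangle suffices

rng = np.random.default_rng(0)
def random_spd(d=5):
    A = rng.normal(size=(d, d))
    return A @ A.T + d * np.eye(d)          # guaranteed SPD toy descriptor

X = np.array([log_vectorize(random_spd()) for _ in range(40)])
y = np.array([0] * 20 + [1] * 20)           # two toy classes
lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
print(lda.transform(X).shape)               # (40, 1) discriminative projection
```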