
    Machine Analysis of Facial Expressions

    No abstract available.

    Affective Man-Machine Interface: Unveiling human emotions through biosignals

    As has been known for centuries, humans exhibit an electrical profile. This profile is altered by various psychological and physiological processes, which can be measured through biosignals, e.g., electromyography (EMG) and electrodermal activity (EDA). These biosignals can reveal our emotions and, as such, can serve as an advanced man-machine interface (MMI) for empathic consumer products. However, such an MMI requires the correct classification of biosignals into emotion classes. This chapter starts with an introduction to biosignals for emotion detection. Next, a state-of-the-art review of automatic emotion classification is presented. Moreover, guidelines are presented for affective MMI. Subsequently, a study is presented that explores the use of EDA and three facial EMG signals to determine neutral, positive, negative, and mixed emotions, using recordings of 21 people. A range of techniques was tested, resulting in a generic framework for automated emotion classification with up to 61.31% correct classification of the four emotion classes, without the need for personal profiles. Among various other directives for future research, the results emphasize the need for parallel processing of multiple biosignals.
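    The classification step described above can be sketched as follows: summary statistics are extracted per channel and fed to a standard classifier. This is a minimal illustration under stated assumptions, not the authors' framework; the feature set, the classifier choice, and the synthetic recordings are all stand-ins.

    ```python
    # Minimal sketch (not the authors' framework): classify four emotion
    # classes from statistical features of EDA plus three facial EMG channels.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    def biosignal_features(eda, emg_channels):
        """Summary statistics per channel; a stand-in for the chapter's features."""
        feats = []
        for sig in [eda, *emg_channels]:
            feats += [sig.mean(), sig.std(), np.ptp(sig),
                      np.mean(np.abs(np.diff(sig)))]  # first-difference activity
        return np.array(feats)

    # Synthetic stand-in data: 84 trials (21 subjects x 4 emotions).
    rng = np.random.default_rng(0)
    X = np.array([biosignal_features(rng.standard_normal(320),
                                     [rng.standard_normal(320) for _ in range(3)])
                  for _ in range(84)])
    y = rng.integers(0, 4, size=84)  # neutral / positive / negative / mixed

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    print(cross_val_score(clf, X, y, cv=5).mean())
    ```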

    A combined cepstral distance method for emotional speech recognition

    Affective computing is not only a direction of reform in artificial intelligence but also a hallmark of advanced intelligent machines. Emotion is the biggest difference between humans and machines; a machine that behaves with emotion will be accepted by more people. Voice is the most natural, easily understood, and widely accepted medium of daily communication, and the recognition of emotion in voice is an important field of artificial intelligence. However, in emotion recognition, two emotions are often particularly vulnerable to confusion. This article presents a combined cepstral distance method for two-group multi-class emotion classification in emotional speech recognition. Cepstral distance combined with speech energy is widely used for speech signal endpoint detection in speech recognition. In this work, cepstral distance is used to measure the similarity between frames in emotional signals and in neutral signals. These features are input to a directed acyclic graph support vector machine (DAG-SVM) classifier. Finally, a two-group classification strategy is adopted to resolve confusion in multi-emotion recognition. In the experiments, a Chinese Mandarin emotion database is used, and a large training set (1134 + 378 utterances) ensures a powerful modelling capability for predicting emotion. The experimental results show that cepstral distance increases the recognition rate of the sad emotion and balances the recognition results while eliminating overfitting. On the German Berlin emotional speech database, the recognition rate between sadness and boredom, which are very difficult to distinguish, reaches 95.45%.
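    As a rough illustration of the core quantity, the sketch below computes a frame-wise cepstral distance between an "emotional" frame and a neutral reference frame. The real-cepstrum formulation, frame length, and coefficient count are assumptions, not the paper's exact method.

    ```python
    # Illustrative frame-wise cepstral distance, the similarity measure the
    # paper combines with speech energy; parameters here are assumptions.
    import numpy as np

    def real_cepstrum(frame, n_coeffs=12):
        spectrum = np.abs(np.fft.rfft(frame)) + 1e-10  # avoid log(0)
        return np.fft.irfft(np.log(spectrum))[:n_coeffs]

    def cepstral_distance(frame_a, frame_b, n_coeffs=12):
        ca = real_cepstrum(frame_a, n_coeffs)
        cb = real_cepstrum(frame_b, n_coeffs)
        return np.sqrt(np.sum((ca - cb) ** 2))  # Euclidean cepstral distance

    # Compare a synthetic "emotional" frame against a neutral reference frame.
    rng = np.random.default_rng(1)
    neutral = rng.standard_normal(512)
    emotional = neutral + 0.5 * rng.standard_normal(512)
    print(cepstral_distance(emotional, neutral))
    ```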

    Subject-Independent Emotion Recognition Based on Physiological Signals: A Three-Stage Decision Method

    Background: Collaboration between humans and computers has become pervasive and ubiquitous; however, current computer systems are limited in that they fail to address the emotional component. An accurate understanding of human emotions is necessary for these computers to trigger proper feedback. Among multiple emotional channels, physiological signals are synchronous with emotional responses; therefore, analyzing physiological changes is a recognized way to estimate human emotions. In this paper, a three-stage decision method is proposed to recognize four emotions based on physiological signals in the multi-subject context. Emotion detection is achieved by using a stage-divided strategy in which each stage deals with a fine-grained goal. Methods: The decision method consists of three stages. During the training process, the initial stage transforms mixed training subjects into separate groups, thus eliminating the effect of individual differences. The second stage categorizes the four emotions into two emotion pools in order to reduce recognition complexity. The third stage trains a classifier on the emotions in each emotion pool. During the testing process, a test trial is initially classified into a group, then classified into an emotion pool in the second stage, and finally assigned an emotion in the third stage. In this paper we consider two different ways of allocating the four emotions into two emotion pools. A comparative analysis is also carried out between the proposed method and other methods. Results: An average recognition accuracy of 77.57% was achieved on the recognition of four emotions, with a best accuracy of 86.67% for recognizing the positive and excited emotion. Using different ways of allocating the four emotions into two emotion pools, we found differences in how effectively a classifier learns each emotion. Compared to other methods, the proposed method demonstrates a significant improvement in recognizing four emotions in the multi-subject context. Conclusions: The proposed three-stage decision method addresses a crucial issue, 'individual differences', in multi-subject emotion recognition and overcomes the suboptimal performance of direct classification of multiple emotions. Our study supports the observation that the proposed method represents a promising methodology for recognizing multiple emotions in the multi-subject context.
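    The stage-divided decision can be sketched schematically as below: a test trial is routed to a subject group, then to one of two emotion pools, then to a final emotion label. The classifiers, the group/pool structure, and the synthetic data are illustrative stand-ins for the paper's method.

    ```python
    # Schematic three-stage decision sketch; all models and data are stand-ins.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(2)
    X = rng.standard_normal((200, 16))      # physiological feature vectors
    groups = rng.integers(0, 3, 200)        # stage 1 target: subject group
    emotions = rng.integers(0, 4, 200)      # stage 3 target: four emotions
    pools = emotions // 2                   # stage 2 target: two emotion pools

    stage1 = SVC().fit(X, groups)
    stage2 = {g: SVC().fit(X[groups == g], pools[groups == g]) for g in range(3)}
    stage3 = {(g, p): SVC().fit(X[(groups == g) & (pools == p)],
                                emotions[(groups == g) & (pools == p)])
              for g in range(3) for p in range(2)}

    def classify(x):
        g = stage1.predict([x])[0]           # stage 1: handle individual differences
        p = stage2[g].predict([x])[0]        # stage 2: coarse emotion-pool decision
        return stage3[g, p].predict([x])[0]  # stage 3: fine-grained emotion

    print(classify(X[0]))
    ```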

    Automatic Speech Emotion Recognition Using Machine Learning

    This chapter presents a comparative study of speech emotion recognition (SER) systems. A theoretical definition, a categorization of affective states, and the modalities of emotion expression are presented. To carry out this study, an SER system based on different classifiers and different feature-extraction methods is developed. Mel-frequency cepstrum coefficients (MFCC) and modulation spectral (MS) features are extracted from the speech signals and used to train different classifiers. Feature selection (FS) is applied in order to identify the most relevant feature subset. Several machine learning paradigms are used for the emotion classification task. A recurrent neural network (RNN) classifier is first used to classify seven emotions; its performance is then compared to multivariate linear regression (MLR) and support vector machine (SVM) techniques, which are widely used in the field of emotion recognition for spoken audio signals. The Berlin and Spanish databases are used as the experimental data sets. This study shows that, for the Berlin database, all classifiers achieve an accuracy of 83% when speaker normalization (SN) and feature selection are applied to the features. For the Spanish database, the best accuracy (94%) is achieved by the RNN classifier without SN and with FS.
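    A minimal sketch of such an MFCC-plus-classifier pipeline is given below, using an SVM with feature standardization as a stand-in for the chapter's setup; the synthetic one-second utterances and all parameter choices are assumptions.

    ```python
    # SER-style pipeline sketch: MFCC statistics per utterance, then an SVM.
    import numpy as np
    import librosa
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler  # feature normalization
    from sklearn.pipeline import make_pipeline

    sr = 16000
    rng = np.random.default_rng(3)

    def utterance_features(y):
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Synthetic one-second "utterances" with 7 emotion labels, as in Berlin EMO-DB.
    X = np.array([utterance_features(rng.standard_normal(sr).astype(np.float32))
                  for _ in range(70)])
    labels = rng.integers(0, 7, 70)

    model = make_pipeline(StandardScaler(), SVC())
    model.fit(X, labels)
    print(model.predict(X[:3]))
    ```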

    Cross validation of bi-modal health-related stress assessment

    This study explores the feasibility of objective and ubiquitous stress assessment. Twenty-five post-traumatic stress disorder patients participated in a controlled storytelling (ST) study and an ecologically valid reliving (RL) study. The two studies were meant to represent an early and a late therapy session, and each consisted of a "happy" and a "stress triggering" part. Two instruments were chosen to assess the stress level of the patients at various points in time during therapy: (i) speech, used as an objective and ubiquitous stress indicator, and (ii) the subjective unit of distress (SUD), a clinically validated Likert scale. In total, 13 statistical parameters were derived from each of five speech features: amplitude, zero-crossings, power, high-frequency power, and pitch. To model the emotional state of the patients, 28 parameters were selected from this set by means of a linear regression model and subsequently compressed into 11 principal components. The SUD and the speech model were cross-validated using three machine learning algorithms. Between 90% (2 SUD levels) and 39% (10 SUD levels) correct classification was achieved. The two sessions could be discriminated in 89% (ST) and 77% (RL) of the cases. This report fills a gap between laboratory and clinical studies, and its results emphasize the usefulness of Computer Aided Diagnostics (CAD) for mental health care.
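    The feature pipeline described above (statistical parameters per speech feature, compression into principal components, cross-validated classification) might be sketched as follows. The particular statistics, the classifier, and the data are illustrative assumptions rather than the study's exact model.

    ```python
    # Sketch: per-feature statistics -> PCA compression -> cross-validation.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    def frame_stats(track):
        """Statistical parameters over one speech feature track (e.g. pitch)."""
        q1, med, q3 = np.percentile(track, [25, 50, 75])
        return [track.mean(), track.std(), track.min(), track.max(), q1, med, q3]

    rng = np.random.default_rng(4)
    # Five synthetic stand-in feature tracks per recording: amplitude,
    # zero-crossings, power, high-frequency power, pitch.
    X = np.array([np.concatenate([frame_stats(rng.standard_normal(300))
                                  for _ in range(5)]) for _ in range(100)])
    sud = rng.integers(0, 2, 100)  # binary SUD level (the 2-level case)

    model = make_pipeline(PCA(n_components=11), KNeighborsClassifier())
    print(cross_val_score(model, X, sud, cv=5).mean())
    ```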

    Detection of emotions in Parkinson's disease using higher order spectral features from brain's electrical activity

    Non-motor symptoms in Parkinson's disease (PD) involving cognition and emotion have progressively received more attention in recent times. Electroencephalogram (EEG) signals, being an activity of the central nervous system, can reflect the underlying true emotional state of a person. This paper presents a computational framework for classifying PD patients versus healthy controls (HC) using emotional information from the brain's electrical activity.
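    Higher-order spectral analysis typically builds features from quantities such as the bispectrum. The hedged sketch below computes a segment-averaged bispectrum of one synthetic EEG channel and two common summary features; the segment length and the summaries are assumptions, not the paper's exact feature set.

    ```python
    # Segment-averaged bispectrum B(f1, f2) = E[X(f1) X(f2) conj(X(f1+f2))].
    import numpy as np

    def bispectrum(x, nfft=128):
        segs = [x[i:i + nfft] for i in range(0, len(x) - nfft + 1, nfft)]
        B = np.zeros((nfft // 2, nfft // 2), dtype=complex)
        for s in segs:
            X = np.fft.fft(s * np.hanning(nfft))
            for f1 in range(nfft // 2):
                for f2 in range(nfft // 2):
                    B[f1, f2] += X[f1] * X[f2] * np.conj(X[f1 + f2])
        return B / len(segs)

    rng = np.random.default_rng(5)
    eeg = rng.standard_normal(1280)  # synthetic stand-in for one EEG channel
    B = bispectrum(eeg)

    # Example summary features: mean bispectral magnitude and bispectral entropy.
    mag = np.abs(B).ravel()
    p = mag / mag.sum()
    print(mag.mean(), -(p * np.log(p + 1e-12)).sum())
    ```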

    Affect Recognition in Human Emotional Speech using Probabilistic Support Vector Machines

    The problem of automatically inferring human emotional state from speech has become one of the central problems in Man-Machine Interaction (MMI). Though Support Vector Machines (SVMs) have been used in several works on emotion recognition from speech, the potential of probabilistic SVMs for this task has not been explored. The emphasis of the current work is on how to use probabilistic SVMs for the efficient recognition of emotions from speech. Emotional speech corpora for two Dravidian languages, Telugu and Tamil, were constructed for assessing the recognition accuracy of probabilistic SVMs. The recognition accuracy of the proposed model is analyzed on both the Telugu and Tamil emotional speech corpora and compared with three existing works. Experimental results indicate that the proposed model performs significantly better than the existing methods.
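    In scikit-learn terms, a probabilistic SVM corresponds to an SVM whose decision values are calibrated into class posteriors (Platt scaling, enabled via probability=True). The sketch below is a minimal illustration with synthetic data, not the paper's model; the feature dimensions and emotion labels are placeholders.

    ```python
    # Probabilistic SVM sketch: calibrated class posteriors per emotion.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(6)
    X = rng.standard_normal((120, 20))  # stand-in acoustic feature vectors
    y = rng.integers(0, 4, 120)         # four emotion classes

    clf = SVC(probability=True, random_state=0).fit(X, y)
    posteriors = clf.predict_proba(X[:1])  # per-emotion probabilities
    print(posteriors, posteriors.argmax())
    ```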