749 research outputs found

    Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema

    No full text
    In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition. Performance is enhanced because commonly confused pairs of emotions are distinguishable from one another. Extracted features are related to statistics of pitch, formants, and energy contours, as well as spectrum, cepstrum, perceptual and temporal features, autocorrelation, MPEG-7 descriptors, Fujisakis model parameters, voice quality, jitter, and shimmer. Selected features are fed as input to K nearest neighborhood classifier and to support vector machines. Two kernels are tested for the latter: Linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is tested on the Berlin emotional speech database for each gender separately. The best emotion recognition accuracy, achieved by support vector machines with linear kernel, equals 87.7%, outperforming state-of-the-art approaches. Statistical analysis is first carried out with respect to the classifiers error rates and then to evaluate the information expressed by the classifiers confusion matrices. © Springer Science+Business Media, LLC 2011

    Emotion Recognition from Acted and Spontaneous Speech

    Get PDF
    Dizertační práce se zabývá rozpoznáním emočního stavu mluvčích z řečového signálu. Práce je rozdělena do dvou hlavních častí, první část popisuju navržené metody pro rozpoznání emočního stavu z hraných databází. V rámci této části jsou představeny výsledky rozpoznání použitím dvou různých databází s různými jazyky. Hlavními přínosy této části je detailní analýza rozsáhlé škály různých příznaků získaných z řečového signálu, návrh nových klasifikačních architektur jako je například „emoční párování“ a návrh nové metody pro mapování diskrétních emočních stavů do dvou dimenzionálního prostoru. Druhá část se zabývá rozpoznáním emočních stavů z databáze spontánní řeči, která byla získána ze záznamů hovorů z reálných call center. Poznatky z analýzy a návrhu metod rozpoznání z hrané řeči byly využity pro návrh nového systému pro rozpoznání sedmi spontánních emočních stavů. Jádrem navrženého přístupu je komplexní klasifikační architektura založena na fúzi různých systémů. Práce se dále zabývá vlivem emočního stavu mluvčího na úspěšnosti rozpoznání pohlaví a návrhem systému pro automatickou detekci úspěšných hovorů v call centrech na základě analýzy parametrů dialogu mezi účastníky telefonních hovorů.Doctoral thesis deals with emotion recognition from speech signals. The thesis is divided into two main parts; the first part describes proposed approaches for emotion recognition using two different multilingual databases of acted emotional speech. The main contributions of this part are detailed analysis of a big set of acoustic features, new classification schemes for vocal emotion recognition such as “emotion coupling” and new method for mapping discrete emotions into two-dimensional space. The second part of this thesis is devoted to emotion recognition using multilingual databases of spontaneous emotional speech, which is based on telephone records obtained from real call centers. The knowledge gained from experiments with emotion recognition from acted speech was exploited to design a new approach for classifying seven emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis also examines the influence of speaker’s emotional state on gender recognition performance and proposes system for automatic identification of successful phone calls in call center by means of dialogue features.

    Gender Classification in Emotional Speech

    Get PDF

    Speaker-independent negative emotion recognition

    Get PDF
    This work aims to provide a method able to distinguish between negative and non-negative emotions in vocal interaction. A large pool of 1418 features is extracted for that purpose. Several of those features are tested in emotion recognition for the first time. Next, feature selection is applied separately to male and female utterances. In particular, a bidirectional Best First search with backtracking is applied. The first contribution is the demonstration that a significant number of features, first tested here, are retained after feature selection. The selected features are then fed as input to support vector machines with various kernel functions as well as to the K nearest neighbors classifier. The second contribution is in the speaker-independent experiments conducted in order to cope with the limited number of speakers present in the commonly used emotion speech corpora. Speaker-independent systems are known to be more robust and present a better generalization ability than the speaker-dependent ones. Experimental results are reported for the Berlin emotional speech database. The best performing classifier is found to be the support vector machine with the Gaussian radial basis function kernel. Correctly classified utterances are 86.73%±3.95% for male subjects and 91.73%±4.18% for female subjects. The last contribution is in the statistical analysis of the performance of the support vector machine classifier against the K nearest neighbors classifier as well as the statistical analysis of the various support vector machine kernels impact. © 2010 IEEE

    The classification problem in machine learning: an overview with study cases in emotion recognition and music-speech differentiation

    Get PDF
    This work addresses the well-known classification problem in machine learning -- The goal of this study is to approach the reader to the methodological aspects of the feature extraction, feature selection and classifier performance through simple and understandable theoretical aspects and two study cases -- Finally, a very good classification performance was obtained for the emotion recognition from speec

    Exploring Language-Independent Emotional Acoustic Features via Feature Selection

    Full text link
    We propose a novel feature selection strategy to discover language-independent acoustic features that tend to be responsible for emotions regardless of languages, linguistics and other factors. Experimental results suggest that the language-independent feature subset discovered yields the performance comparable to the full feature set on various emotional speech corpora.Comment: 15 pages, 2 figures, 6 table

    Recognition of Emotion from Speech: A Review

    Get PDF

    Towards Cognizant Hearing Aids: Modeling of Content, Affect and Attention

    Get PDF

    On the development of an automatic voice pleasantness classification and intensity estimation system

    Get PDF
    In the last few years, the number of systems and devices that use voice based interaction has grown significantly. For a continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However there are currently very few studies that try to evaluate how pleasant is a voice from a perceptual point of view when the final application is a speech based interface. In this paper we present an objective definition for voice pleasantness based on the composition of a representative feature subset and a new automatic voice pleasantness classification and intensity estimation system. Our study is based on a database composed by European Portuguese female voices but the methodology can be extended to male voices or to other languages. In the objective performance evaluation the system achieved a 9.1% error rate for voice pleasantness classification and a 15.7% error rate for voice pleasantness intensity estimation.Work partially supported by ERDF funds, the Spanish Government (TEC2009-14094-C04-04), and Xunta de Galicia (CN2011/019, 2009/062
    corecore