143 research outputs found

    Foreword ACII 2013

    Get PDF

    AV+EC 2015 – The first affect recognition challenge bridging across audio, video, and physiological data

    Get PDF
    We present the first Audio-Visual + Emotion recognition Challenge and workshop (AV+EC 2015), aimed at the comparison of multimedia processing and machine learning methods for automatic audio, visual, and physiological emotion analysis. This is the fifth event in the AVEC series, but the very first Challenge that bridges across audio, video, and physiological data. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing, to bring together the audio, video, and physiological emotion recognition communities, to compare the relative merits of the three approaches to emotion recognition under well-defined and strictly comparable conditions, and to establish to what extent fusion of the approaches is possible and beneficial. This paper presents the challenge, the dataset, and the performance of the baseline system.

    AVEC 2016 – Depression, mood, and emotion recognition workshop and challenge

    Get PDF
    The Audio/Visual Emotion Challenge and Workshop (AVEC 2016) "Depression, Mood and Emotion" will be the sixth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audio, visual, and physiological depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the depression and emotion recognition communities, as well as the audio, video, and physiological processing communities, to compare the relative merits of the various approaches to depression and emotion recognition under well-defined and strictly comparable conditions, and to establish to what extent fusion of the approaches is possible and beneficial. This paper presents the challenge guidelines, the common data used, and the performance of the baseline system on the two tasks.

    Continuous Estimation of Emotions in Speech by Dynamic Cooperative Speaker Models

    Get PDF
    Automatic emotion recognition from speech has recently focused on the prediction of time-continuous dimensions (e.g., arousal and valence) of spontaneous and realistic expressions of emotion, as found in real-life interactions. The automatic prediction of such emotions, however, poses several challenges, such as the subjectivity involved in defining a gold standard from a pool of raters and the scarcity of training data. In this work, we introduce a novel emotion recognition system based on an ensemble of single-speaker regression models (SSRMs). The emotion estimate is obtained by combining a subset of the initial pool of SSRMs, selecting those that are most concordant with one another. The proposed approach allows speakers to be added to or removed from the ensemble without rebuilding the entire machine learning system. The simplicity of this aggregation strategy, the flexibility afforded by the modular architecture, and the promising results obtained on the RECOLA database highlight the potential of the proposed method in real-life scenarios, and in particular in web-based applications.
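    A minimal Python sketch of the concordance-based aggregation idea described in this abstract. It assumes each SSRM's frame-level predictions for a test sequence are already available as rows of a NumPy array; the helper names (ccc, select_concordant) and the use of the mean pairwise concordance correlation coefficient as the concordance measure are illustrative assumptions, not the paper's exact criterion.

        import numpy as np

        def ccc(x, y):
            # concordance correlation coefficient between two prediction series
            mx, my = x.mean(), y.mean()
            cov = ((x - mx) * (y - my)).mean()
            return 2 * cov / (x.var() + y.var() + (mx - my) ** 2)

        def select_concordant(predictions, k):
            # predictions: (n_models, n_frames) array, one row per SSRM;
            # keep the k rows with the highest mean pairwise CCC, average them
            n = len(predictions)
            scores = [np.mean([ccc(predictions[i], predictions[j])
                               for j in range(n) if j != i])
                      for i in range(n)]
            top = np.argsort(scores)[-k:]
            return predictions[top].mean(axis=0)  # fused time-continuous estimate

    Because the selection only inspects model outputs, adding or removing a speaker's SSRM just adds or drops a row, which is what makes the ensemble modular.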

    Feature Learning from Spectrograms for Assessment of Personality Traits

    Full text link
    Several methods have recently been proposed to analyze speech and automatically infer the personality of the speaker. These methods often rely on prosodic and other hand-crafted speech processing features extracted with off-the-shelf toolboxes. To achieve high accuracy, numerous features are typically extracted using complex and highly parameterized algorithms. In this paper, a new method based on feature learning and spectrogram analysis is proposed to simplify the feature extraction process while maintaining a high level of accuracy. The proposed method learns a dictionary of discriminant features from patches extracted from the spectrogram representations of training speech segments. Each speech segment is then encoded using the dictionary, and the resulting feature set is used to perform classification of personality traits. Experiments indicate that the proposed method achieves state-of-the-art results with a significant reduction in complexity when compared to the most recent reference methods. The number of features and the difficulties linked to the feature extraction process are greatly reduced, as only one type of descriptor is used, whose six parameters can be tuned automatically. In contrast, the simplest reference method uses four types of descriptors to which six functionals are applied, resulting in over 20 parameters to tune.
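    The pipeline this abstract describes (patch extraction, dictionary learning, encoding, classification) can be sketched with scikit-learn as follows. The patch size, dictionary size, max-pooling, and the linear SVM are assumptions made for illustration, not the paper's exact parameters; train_spectrograms and trait_labels stand in for the actual training data and per-trait labels.

        import numpy as np
        from sklearn.decomposition import MiniBatchDictionaryLearning
        from sklearn.feature_extraction.image import extract_patches_2d
        from sklearn.svm import LinearSVC

        PATCH = (16, 16)  # assumed patch size

        def patches_of(spectrogram):
            # random patches from one 2-D (time x frequency) spectrogram
            p = extract_patches_2d(spectrogram, PATCH, max_patches=200,
                                   random_state=0)
            return p.reshape(len(p), -1)

        # learn a dictionary of patch features from the training spectrograms
        dico = MiniBatchDictionaryLearning(n_components=128, random_state=0)
        dico.fit(np.vstack([patches_of(s) for s in train_spectrograms]))

        def encode(spectrogram):
            # encode a segment as max-pooled dictionary activations of its patches
            return dico.transform(patches_of(spectrogram)).max(axis=0)

        X = np.array([encode(s) for s in train_spectrograms])
        clf = LinearSVC().fit(X, trait_labels)  # assumed high/low trait labels

    One such classifier per personality trait turns the learned codes into trait predictions.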

    Integration of Wavelet and Recurrence Quantification Analysis in Emotion Recognition of Bilinguals

    Get PDF
    Background: This study offers a robust framework for the classification of autonomic signals into five affective states during picture viewing. To this end, the following emotion categories were studied: five classes of the arousal-valence plane (5C), three classes of arousal (3A), and three categories of valence (3V). For the first time, linguality information was also incorporated into the recognition procedure. Specifically, the main objective of this paper was to present a fundamental approach for evaluating and classifying the emotions of monolingual and bilingual college students.
    Methods: Using nonlinear dynamics, recurrence quantification measures of the wavelet coefficients were extracted. To optimize the feature space, different feature selection approaches, including generalized discriminant analysis (GDA), principal component analysis (PCA), kernel PCA, and linear discriminant analysis (LDA), were examined. Finally, taking linguality information into account, classification was performed using a probabilistic neural network (PNN).
    Results: Using LDA and the PNN, the highest recognition rates of 95.51%, 95.7%, and 95.98% were attained for the 5C, 3A, and 3V, respectively. Considering the linguality information, a further improvement of the classification rates was accomplished.
    Conclusion: The proposed methodology can provide a valuable tool for discriminating affective states in practical applications within the area of human-computer interfaces.
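    A minimal Python sketch of the wavelet-plus-RQA feature extraction step, assuming a 1-D autonomic signal. Restricting the analysis to the approximation sub-band and to two recurrence measures (recurrence rate and determinism) is a simplification; the study extracts a richer set of recurrence quantification measures before feature selection and PNN classification. The db4 wavelet and the quantile threshold are illustrative choices.

        import numpy as np
        import pywt  # PyWavelets

        def diag_line_points(rp, lmin=2):
            # recurrent points lying on off-diagonal lines of length >= lmin
            n, count = rp.shape[0], 0
            for k in range(1, n):
                run = 0
                for v in list(np.diagonal(rp, offset=k)) + [False]:
                    if v:
                        run += 1
                    else:
                        if run >= lmin:
                            count += run
                        run = 0
            return 2 * count  # the recurrence plot is symmetric

        def rqa_of_wavelet_band(signal, wavelet="db4", level=4, q=0.1):
            # recurrence rate and determinism of one wavelet sub-band
            band = pywt.wavedec(signal, wavelet, level=level)[0]  # approximation
            dist = np.abs(band[:, None] - band[None, :])
            rp = dist <= np.quantile(dist, q)  # thresholded recurrence plot
            n = len(band)
            recurrent = rp.sum() - n  # exclude the trivial main diagonal
            rr = recurrent / (n * n - n)
            det = diag_line_points(rp) / recurrent if recurrent else 0.0
            return np.array([rr, det])

    Feature vectors built this way per sub-band would then be reduced (e.g., with LDA) and fed to the PNN.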