4,202 research outputs found

Learning Representations of Emotional Speech with Deep Convolutional Generative Adversarial Networks

Authors: Chang Jonathan; Scherer Stefan
Publication date: 22/04/2017

Automatically assessing emotional valence in human speech has historically been a difficult task for machine learning algorithms. The subtle changes in the speaker's voice that indicate positive or negative emotional states are often "overshadowed" by voice characteristics relating to emotional intensity or emotional activation. In this work we explore a representation learning approach that automatically derives discriminative representations of emotional speech. In particular, we investigate two machine learning strategies to improve classifier performance: (1) utilization of unlabeled data using a deep convolutional generative adversarial network (DCGAN), and (2) multitask learning. In our extensive experiments we leverage a multitask-annotated emotional corpus as well as a large unlabeled meeting corpus (around 100 hours). Our speaker-independent classification experiments show that the use of unlabeled data in particular improves classifier performance, and that both fully supervised baseline approaches are outperformed considerably. We improve the classification of emotional valence on a discrete 5-point scale to 43.88% and on a 3-point scale to 49.80%, which is competitive with state-of-the-art performance.

arXiv.org e-Print Archive

Crossref

Affective Music Information Retrieval

Authors: Wang Hsin-Min; Wang Ju-Chiang; Yang Yi-Hsuan
Publication date: 18/02/2015

Much of the appeal of music lies in its power to convey emotions and moods and to evoke them in listeners. Consequently, the past decade has witnessed growing interest in modeling emotions from musical signals in the music information retrieval (MIR) community. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimensional model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG), better accounts for the subjectivity of emotion perception by using probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition and emotion-based music retrieval. We report evaluations of these applications of AEG on a larger-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted for research on emotion-based MIR. Directions for future work are also discussed. Comment: 40 pages, 18 figures, 5 tables, author version
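
As a rough illustration of what representing emotion as a probability distribution in the VA space can look like, the sketch below evaluates a two-component Gaussian mixture over (valence, arousal) points. It is not the AEG model from the abstract, which also ties the mixture to acoustic features; the weights, means, and covariances are hypothetical values chosen only for the example, and the function names are assumptions.

```python
import numpy as np

def gaussian_pdf_2d(x, mean, cov):
    """Density of a 2-D Gaussian at point x = (valence, arousal)."""
    diff = np.asarray(x, dtype=float) - mean
    inv_cov = np.linalg.inv(cov)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * diff @ inv_cov @ diff)

def mixture_pdf(x, weights, means, covs):
    """Weighted sum of component densities at x."""
    return sum(w * gaussian_pdf_2d(x, m, c)
               for w, m, c in zip(weights, means, covs))

def mixture_mean(weights, means):
    """Expected (valence, arousal) under the mixture."""
    return weights @ means

# Hypothetical mixture: two groups of listeners who perceive the
# same clip differently (values are illustrative, not from the paper).
weights = np.array([0.6, 0.4])
means = np.array([[0.5, 0.3], [-0.4, 0.6]])
covs = np.array([np.eye(2) * 0.05, np.eye(2) * 0.1])

point_estimate = mixture_mean(weights, means)
density_near_mode = mixture_pdf(means[0], weights, means, covs)
density_far_away = mixture_pdf([-1.0, -1.0], weights, means, covs)
```

A single point estimate such as `point_estimate` hides the disagreement between the two components, whereas the full mixture keeps it, which is the intuition behind modeling subjective annotations as a distribution.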

arXiv.org e-Print Archive

CiteSeerX

Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives

Authors: Cummins Nicholas; Han Jing; Schuller Björn; Zhang Zixing
Publication date: 21/09/2018

Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains. Because it is a potentially crucial technique for the development of the next generation of emotional AI systems, we provide here a comprehensive overview of the application of adversarial training to affective computing and sentiment analysis. Various representative adversarial training algorithms, each aimed at tackling diverse challenges associated with emotional AI systems, are explained and discussed. Further, we highlight a range of potential future research directions. We expect that this overview will help facilitate the development of adversarial training for affective computing and sentiment analysis in both the academic and industrial communities.

arXiv.org e-Print Archive

OPUS Augsburg

Using the Fisher Vector Representation for Audio-based Emotion Recognition

Author: Gosztolya Gábor
Publication venue: 'Obuda University'
Publication date: 01/01/2020

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Repository of the Academy's Library