4,890 research outputs found

    Affective Music Information Retrieval

    Much of the appeal of music lies in its power to convey emotions and moods and to evoke them in listeners. As a consequence, the past decade has witnessed growing interest in the music information retrieval (MIR) community in modeling emotions from musical signals. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimensional model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG), better accounts for the subjectivity of emotion perception through the use of probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space, with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition and emotion-based music retrieval. We report evaluations of these applications of AEG on a larger-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted for research on emotion-based MIR. Directions for future work are also discussed. Comment: 40 pages, 18 figures, 5 tables, author version
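
    The idea of describing emotion perception with a probability distribution rather than a single point can be illustrated with a much simpler case than AEG itself: fitting one bivariate Gaussian to the VA ratings that several subjects gave to the same clip. The sketch below is an assumption-laden illustration, not the AEG algorithm (which learns a mixture tied to acoustic features); the function name `fit_va_gaussian` and the example ratings are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_va_gaussian(
    annotations: Sequence[Tuple[float, float]],
) -> Tuple[Tuple[float, float], List[List[float]]]:
    """Fit a single bivariate Gaussian to (valence, arousal) ratings.

    Returns the mean vector and the 2x2 maximum-likelihood covariance
    matrix (sums divided by n, not n - 1).
    """
    n = len(annotations)
    if n == 0:
        raise ValueError("at least one annotation is required")

    # Mean rating over subjects, per dimension.
    mean_v = sum(v for v, _ in annotations) / n
    mean_a = sum(a for _, a in annotations) / n

    # Spread of the ratings: a wide covariance means subjects disagree.
    var_v = sum((v - mean_v) ** 2 for v, _ in annotations) / n
    var_a = sum((a - mean_a) ** 2 for _, a in annotations) / n
    cov_va = sum((v - mean_v) * (a - mean_a) for v, a in annotations) / n

    return (mean_v, mean_a), [[var_v, cov_va], [cov_va, var_a]]


# Hypothetical ratings from three subjects for one clip.
ratings = [(0.2, 0.4), (0.6, 0.8), (0.4, 0.0)]
mean, cov = fit_va_gaussian(ratings)
```

    The covariance is what a point estimate discards: two clips with the same mean rating can differ sharply in how much listeners disagree about them.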

    Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons

    The temporal structure of music is essential for the cognitive processes related to the emotions expressed in music. However, such temporal information is often disregarded in typical Music Information Retrieval modeling tasks that predict higher-level cognitive or semantic aspects of music such as emotions, genre, and similarity. This paper tests the specific hypothesis that temporal information is essential for predicting expressed emotions in music, as a prototypical example of a cognitive aspect of music. We propose to test this hypothesis using a novel processing pipeline: 1) Extracting audio features for each track, resulting in a multivariate "feature time series". 2) Using generative models to represent these time series (acquiring a complete track representation). Specifically, we explore the Gaussian Mixture model, Vector Quantization, Autoregressive model, Markov and Hidden Markov models. 3) Utilizing the generative models in a discriminative setting by selecting the Probability Product Kernel as the natural kernel for all considered track representations. We evaluate the representations using a kernel-based model specifically extended to support the robust two-alternative forced choice self-report paradigm, which is used for eliciting expressed emotions in music. The methods are evaluated on two data sets and show increased predictive performance when temporal information is used, thus supporting the overall hypothesis.
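
    Step 3 of the pipeline above relies on the Probability Product Kernel, which compares two distributions p and q through the integral of p(x)^rho * q(x)^rho over x. As a minimal sketch, assume each track is summarized by one Gaussian with diagonal covariance and take rho = 1 (the expected likelihood kernel), for which the integral has a closed form: per dimension it equals a Gaussian density evaluated at one mean, centred on the other, with the two variances added. These simplifications and the function names are assumptions for illustration; the abstract does not state which rho was used, and the other representations it lists (mixtures, Markov models) need different computations.

```python
import math
from typing import Sequence


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def expected_likelihood_kernel(
    mean_p: Sequence[float],
    var_p: Sequence[float],
    mean_q: Sequence[float],
    var_q: Sequence[float],
) -> float:
    """Probability product kernel with rho = 1 for diagonal Gaussians.

    For independent dimensions the integral of p(x) * q(x) factorizes,
    and each factor is N(mean_p; mean_q, var_p + var_q).
    """
    if not (len(mean_p) == len(var_p) == len(mean_q) == len(var_q)):
        raise ValueError("all inputs must have the same dimensionality")

    k = 1.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        k *= gaussian_pdf(mp, mq, vp + vq)
    return k


# Two hypothetical 2-dimensional track summaries.
k_same = expected_likelihood_kernel([0.0, 1.0], [1.0, 0.5], [0.0, 1.0], [1.0, 0.5])
k_far = expected_likelihood_kernel([0.0, 1.0], [1.0, 0.5], [3.0, -2.0], [1.0, 0.5])
```

    Tracks whose feature distributions overlap more receive a larger kernel value, so `k_same` exceeds `k_far`; the resulting kernel matrix can then be passed to any kernel-based model.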

    Feature extraction based on bio-inspired model for robust emotion recognition

    Emotional state identification is an important step toward more natural interactive speech systems. Ideally, these systems should also be able to work in real environments, in which some kind of noise is generally present. Several bio-inspired representations have been applied to artificial systems for speech processing under noise conditions. In this work, an auditory signal representation is used to obtain a novel bio-inspired set of features for emotional speech signals. These features, together with other spectral and prosodic features, are used for emotion recognition under noise conditions. Neural models were trained as classifiers, and results were compared to those obtained with the well-known mel-frequency cepstral coefficients. Results show that, using the proposed representations, it is possible to significantly improve the robustness of an emotion recognition system. The results were also validated in a speaker-independent scheme and with two emotional speech corpora.
    Affiliation: Albornoz, Enrique Marcelo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Affiliation: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
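
    Robustness experiments of the kind described in this abstract typically need speech corrupted at a controlled signal-to-noise ratio (SNR). The abstract does not say how the noisy conditions were produced, so the following is only a sketch of one common approach, with hypothetical function names: scale the noise so that 10 * log10(P_speech / P_noise) equals the target SNR in dB, then add it to the clean signal.

```python
import math
from typing import List, Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average power (mean of squared samples) of a signal."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(
    speech: Sequence[float], noise: Sequence[float], snr_db: float
) -> List[float]:
    """Add noise to speech so that the mixture has the requested SNR in dB."""
    if len(speech) != len(noise) or len(speech) == 0:
        raise ValueError("speech and noise must be non-empty and equally long")

    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise must have non-zero power")

    # Required noise power is p_speech / 10^(snr_db / 10); the gain is the
    # amplitude factor that brings the given noise to that power.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals standing in for real recordings.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

    Features such as the bio-inspired set or mel-frequency cepstral coefficients would then be extracted from `noisy` at several SNR levels to compare how quickly recognition accuracy degrades.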

    The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism

    The INTERSPEECH 2013 Computational Paralinguistics Challenge provides for the first time a unified test-bed for Social Signals such as laughter in speech. It further introduces conflict in group discussions as a new task and picks up on autism and its manifestations in speech. Finally, emotion is revisited as a task, albeit with a broader range of twelve emotional states overall. In this paper, we describe these four Sub-Challenges, the Challenge conditions, the baselines, and a new feature set extracted with the openSMILE toolkit and provided to the participants. Authors: Björn Schuller, Stefan Steidl, Anton Batliner, Alessandro Vinciarelli, Klaus Scherer, Fabien Ringeval, Mohamed Chetouani, Felix Weninger, Florian Eyben, Erik Marchi, Hugues Salamin, Anna Polychroniou, Fabio Valente, Samuel Kim

    Biosignals as an Advanced Man-Machine Interface

    As has been known for centuries, humans exhibit an electrical profile. This profile is altered by various physiological processes, which can be measured through biosignals, e.g., electromyography (EMG) and electrodermal activity (EDA). These biosignals can reveal our emotions and, as such, can serve as an advanced man-machine interface (MMI) for empathic consumer products. However, such an MMI requires the correct classification of biosignals into emotion classes. This paper explores the use of EDA and three facial EMG signals to determine neutral, positive, negative, and mixed emotions, using recordings of 24 people. A range of techniques was tested, resulting in a generic framework for automated emotion classification with up to 61.31% correct classification of the four emotion classes, without the need for personal profiles. Among various other directions for future research, the results emphasize the need for both personalized biosignal profiles and the recording of multiple biosignals in parallel.