Affective Music Information Retrieval

Author: Wang Hsin-Min
Wang Ju-Chiang
Yang Yi-Hsuan
Publication date: 18/02/2015

Much of the appeal of music lies in its power to convey emotions/moods and to evoke them in listeners. In consequence, the past decade witnessed a growing interest in modeling emotions from musical signals in the music information retrieval (MIR) community. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimension model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG), better accounts for the subjectivity of emotion perception by the use of probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition and emotion-based music retrieval. We report evaluations of the aforementioned applications of AEG on a larger-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted for research on emotion-based MIR. Directions of future work are also discussed.

Comment: 40 pages, 18 figures, 5 tables, author version

arXiv.org e-Print Archive

CiteSeerX

The Emotional and Chromatic Layers of Urban Smells

Author: Aiello Luca Maria
Quercia Daniele
Schifanella Rossano
Publication date: 01/01/2016

People are able to detect up to 1 trillion odors. Yet, city planning is concerned only with a few bad odors, mainly because odors are currently captured only through complaints made by urban dwellers. To capture both good and bad odors, we resort to a methodology that has been recently proposed and relies on tagging information of geo-referenced pictures. In doing so for the cities of London and Barcelona, this work makes three new contributions. We study 1) how the urban smellscape changes in time and space; 2) which emotions people share at places with specific smells; and 3) what the color of a smell is, if it exists. Without social media data, insights about those three aspects have been difficult to produce in the past, further delaying the creation of urban restorative experiences.

Comment: 11 pages, 18 figures, final version published in the Proceedings of the Tenth International Conference on Web and Social Media (ICWSM 2016)

arXiv.org e-Print Archive

Institutional Research Information System University of Turin

Using Generic Summarization to Improve Music Information Retrieval Tasks

Author: de Matos David Martins
Raposo Francisco
Ribeiro Ricardo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

In order to satisfy processing time constraints, many MIR tasks process only a segment of the whole music signal. This practice may lead to decreasing performance, since the most important information for the tasks may not be in those processed segments. In this paper, we leverage generic summarization algorithms, previously applied to text and speech summarization, to summarize items in music datasets. These algorithms build summaries that are both concise and diverse by selecting appropriate segments from the input signal, which makes them good candidates to summarize music as well. We evaluate the summarization process on binary and multiclass music genre classification tasks, by comparing the performance obtained using summarized datasets against the performance obtained using continuous segments (which is the traditional method used for addressing the previously mentioned time constraints) and full songs of the same original dataset. We show that GRASSHOPPER, LexRank, LSA, MMR, and a Support Sets-based Centrality model improve classification performance when compared to selected 30-second baselines. We also show that summarized datasets lead to a classification performance whose difference is not statistically significant from using full songs. Furthermore, we make an argument stating the advantages of sharing summarized datasets for future MIR research.

Comment: 24 pages, 10 tables; Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing
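As a rough illustration of how one of the named algorithms, MMR (Maximal Marginal Relevance), can pick segments that are both relevant and diverse, here is a minimal sketch. It is not the authors' implementation: the feature vectors are made up, relevance is measured against the centroid of all segments (an assumption), and the function name `mmr_select` and the weight `lam` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(segments, k, lam=0.7):
    """Greedily pick k segment indices, trading relevance against redundancy.

    Assumption: relevance is the similarity of a segment to the centroid of
    all segments; redundancy is its highest similarity to any segment
    already selected.
    """
    dim = len(segments[0])
    centroid = [sum(s[i] for s in segments) / len(segments) for i in range(dim)]
    selected = []
    candidates = list(range(len(segments)))

    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(segments[i], centroid)
            redundancy = max(
                (cosine(segments[i], segments[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Return indices in temporal order so the summary keeps the song's ordering.
    return sorted(selected)


# Made-up 2-D feature vectors for three segments; the first two are identical.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(mmr_select(features, k=2, lam=0.5))
```

With the two identical segments above, the redundancy penalty steers the second pick toward the dissimilar third segment rather than the duplicate.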

arXiv.org e-Print Archive

Repositório Institucional do ISCTE-IUL

Stacked Convolutional and Recurrent Neural Networks for Music Emotion Recognition

Author: Adavanne Sharath
Drossos Konstantinos
Jarina Roman
Malik Miroslav
Ticha Dasa
Virtanen Tuomas
Publication date: 01/01/2017

This paper studies emotion recognition from musical tracks in the 2-dimensional valence-arousal (V-A) emotional space. We propose a method based on convolutional (CNN) and recurrent neural networks (RNN) that has significantly fewer parameters than the state-of-the-art method for the same task. We utilize one CNN layer followed by two branches of RNNs trained separately for arousal and valence. The method was evaluated using the 'MediaEval2015 emotion in music' dataset. We achieved an RMSE of 0.202 for arousal and 0.268 for valence, which is the best result reported on this dataset.

Comment: Accepted for Sound and Music Computing (SMC 2017)
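For readers unfamiliar with the reported metric, the following minimal sketch shows how a root-mean-square error (RMSE) is computed separately for each emotion dimension. The prediction and annotation values are made up for illustration and are not taken from the paper or the dataset; the helper name `rmse` is hypothetical.

```python
import math


def rmse(predicted, target):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared_error / len(predicted))


# Made-up per-frame arousal predictions and annotations (illustrative only).
arousal_pred = [0.1, -0.3, 0.5, 0.0]
arousal_true = [0.3, -0.1, 0.2, 0.1]

# Valence would be scored the same way with its own predictions and annotations.
print(f"arousal RMSE: {rmse(arousal_pred, arousal_true):.3f}")
```

Lower values mean the predictions are closer to the annotations.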

arXiv.org e-Print Archive

Trepo - Institutional Repository of Tampere University

Analysing user physiological responses for affective video summarisation

Author: Arthur G. Money
Harry Agius
Publication venue: Elsevier BV
Publication date: 01/04/2009

This is the post-print version of the final paper published in Displays. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2009 Elsevier B.V.

Video summarisation techniques aim to abstract the most significant content from a video stream. This is typically achieved by processing low-level image, audio and text features which are still quite disparate from the high-level semantics that end users identify with (the ‘semantic gap’). Physiological responses are potentially rich indicators of memorable or emotionally engaging video content for a given user. Consequently, we investigate whether they may serve as a suitable basis for a video summarisation technique by analysing a range of user physiological response measures, specifically electro-dermal response (EDR), respiration amplitude (RA), respiration rate (RR), blood volume pulse (BVP) and heart rate (HR), in response to a range of video content in a variety of genres including horror, comedy, drama, sci-fi and action. We present an analysis framework for processing the user responses to specific sub-segments within a video stream based on percent rank value normalisation. The application of the analysis framework reveals that users respond significantly to the most entertaining video sub-segments in a range of content domains. Specifically, horror content seems to elicit significant EDR, RA, RR and BVP responses, and comedy content elicits comparatively lower levels of EDR, but does seem to elicit significant RA, RR, BVP and HR responses. Drama content seems to elicit less significant physiological responses in general, and both sci-fi and action content seem to elicit significant EDR responses. We discuss the implications this may have for future affective video summarisation approaches.
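The abstract does not spell out its percent rank value normalisation, so the sketch below uses the common spreadsheet-style definition as an assumption: a value's percent rank is the share of the other values that fall strictly below it. The readings and the function name `percent_rank` are hypothetical.

```python
def percent_rank(values):
    """Map each value to its percent rank in [0, 1].

    Assumed definition: (number of values strictly smaller) / (n - 1),
    so the minimum maps to 0.0, the maximum to 1.0, and ties share a rank.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to rank")
    return [sum(1 for other in values if other < v) / (n - 1) for v in values]


# Made-up electro-dermal response readings, one per video sub-segment.
edr_by_segment = [0.2, 0.9, 0.4, 0.4, 0.7]
print(percent_rank(edr_by_segment))
```

Normalising in this way puts measures with different units, such as EDR and heart rate, on a common 0 to 1 scale, which makes responses comparable across users and signals.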

Crossref

Brunel University Research Archive