3 research outputs found

    Stacked Convolutional and Recurrent Neural Networks for Music Emotion Recognition

    This paper studies emotion recognition from musical tracks in the 2-dimensional valence-arousal (V-A) emotional space. We propose a method based on convolutional (CNN) and recurrent neural networks (RNN), with significantly fewer parameters than the state-of-the-art method for the same task. We utilize one CNN layer followed by two branches of RNNs trained separately for arousal and valence. The method was evaluated using the 'MediaEval2015 emotion in music' dataset. We achieved an RMSE of 0.202 for arousal and 0.268 for valence, which is the best result reported on this dataset. Comment: Accepted for Sound and Music Computing (SMC 2017).
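The abstract describes a shared CNN front end feeding two separate RNN branches, one per emotional dimension. The paper's exact layer sizes and cell types are not given here, so the following is a minimal NumPy sketch of that topology with hypothetical dimensions (40 mel bands, 8 conv filters, 16 vanilla-tanh RNN units), not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    # x: (time, feat), w: (kernel, feat, out) -> ReLU feature maps
    k, _, out = w.shape
    t = x.shape[0] - k + 1
    y = np.zeros((t, out))
    for i in range(t):
        y[i] = np.tensordot(x[i:i + k], w, axes=([0, 1], [0, 1])) + b
    return np.maximum(y, 0.0)

def rnn_last(x, wx, wh, b):
    # simple tanh RNN; return the final hidden state
    h = np.zeros(wh.shape[0])
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ wx + h @ wh + b)
    return h

# hypothetical sizes, not from the paper
n_mels, n_filt, n_hid, kernel = 40, 8, 16, 5
w_c = rng.normal(0, 0.1, (kernel, n_mels, n_filt))
b_c = np.zeros(n_filt)

def branch_params():
    # each V/A branch gets its own RNN weights and readout vector
    return (rng.normal(0, 0.1, (n_filt, n_hid)),
            rng.normal(0, 0.1, (n_hid, n_hid)),
            np.zeros(n_hid),
            rng.normal(0, 0.1, n_hid))

arousal_p, valence_p = branch_params(), branch_params()

def predict(spec, p):
    wx, wh, b, w_out = p
    feats = conv1d(spec, w_c, b_c)   # shared CNN layer
    h = rnn_last(feats, wx, wh, b)   # branch-specific RNN
    return float(np.tanh(h @ w_out)) # regression value in [-1, 1]

spec = rng.normal(size=(100, n_mels))  # stand-in log-mel spectrogram
a = predict(spec, arousal_p)
v = predict(spec, valence_p)
```

Training each branch separately, as the abstract states, would simply mean optimizing `arousal_p` and `valence_p` against their own RMSE targets while sharing (or duplicating) the CNN front end.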

    Emotional quantification of soundscapes by learning between samples

    Predicting the emotional responses of humans to soundscapes is a relatively recent field of research with a wide range of promising applications. This work presents the design of two convolutional neural networks, namely ArNet and ValNet, each one responsible for quantifying the arousal and valence evoked by soundscapes. We build on the knowledge acquired from the application of traditional machine learning techniques in this domain and design a suitable deep learning framework. Moreover, we propose the use of artificially created mixed soundscapes, whose distributions lie between those of the available samples, a process that increases the variance of the dataset and leads to significantly better performance. The reported results outperform the state of the art on a soundscape dataset following Schafer's standardized categorization, considering both the sound's identity and the respective listening context.
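The augmentation idea above, creating samples whose distributions sit between those of existing recordings, resembles mixup-style interpolation. The abstract does not specify the exact mixing procedure, so this is a hedged sketch in which two hypothetical clips (and their arousal/valence labels) are linearly blended with a random coefficient:

```python
import numpy as np

rng = np.random.default_rng(1)

def mix_soundscapes(x1, y1, x2, y2, lam):
    # linear blend of waveforms and of (arousal, valence) labels;
    # lam in (0, 1) places the mixed sample between the two originals
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * np.asarray(y1) + (1.0 - lam) * np.asarray(y2)
    return x, y

# two hypothetical 1-second clips at 16 kHz with made-up V/A labels
sr = 16000
park = rng.normal(0, 0.05, sr)
park_y = (-0.4, 0.6)    # low arousal, positive valence (assumed)
street = rng.normal(0, 0.2, sr)
street_y = (0.7, -0.3)  # high arousal, negative valence (assumed)

lam = rng.uniform(0.3, 0.7)
x_mix, y_mix = mix_soundscapes(park, park_y, street, street_y, lam)
```

Generating many such mixes with varying `lam` fills in the space between the original samples, which is one plausible way to obtain the increased dataset variance the abstract credits for the performance gain.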

    Spatial sound and emotions: A literature survey on the relationship between spatially rendered audio and listeners’ affective responses

    With the development of the entertainment industry, the need for immersive and emotionally impactful sound design has emerged. Utilization of spatial sound is potentially the next step toward improving audio experiences for listeners in terms of their emotional engagement. Hence, the relationship between spatial audio characteristics and the emotional responses of listeners has been the main focus of several recent studies. This paper provides a systematic overview of these reports, including an analysis of commonly utilized methodology and technology. The survey was undertaken using four literature repositories, namely Google Scholar, Scopus, IEEE Xplore, and the AES E-Library. The reviewed papers were selected according to the empirical validity and quality of the reported studies. According to the survey outcomes, there is growing evidence of a positive influence of selected spatial audio characteristics on listeners' affective responses. However, more data is required to build reliable, universal, and useful models explaining this relationship. Furthermore, two research trends on this topic were identified: the studies undertaken so far can be classified as either technology-oriented or technology-agnostic, depending on the research questions or experimental factors examined. Prospective future research directions regarding this topic are identified and discussed. They include better utilization of scene-based paradigms, affective computing techniques, and exploring the emotional effects of dynamic changes in spatial audio scenes.