
    Music emotion recognition: a multimodal machine learning approach

    Music emotion recognition (MER) is an emerging domain within the Music Information Retrieval (MIR) community, and searching for music by emotion is one of the selection methods most preferred by web users. As the world goes digital, the musical content in online databases such as Last.fm has expanded exponentially, requiring substantial manual effort to manage and keep up to date. The demand for innovative and adaptable search mechanisms that can be personalised to a user's emotional state has therefore gained increasing attention in recent years. This thesis addresses the music emotion recognition problem by presenting several classification models fed by textual features as well as audio attributes extracted from the music. We build both supervised and semi-supervised classification designs across four research experiments that address the emotional role of audio features such as tempo, acousticness, and energy, and the impact of textual features extracted by two different approaches, TF-IDF and Word2Vec. Furthermore, we propose a multimodal approach using a combined feature set consisting of features from the audio content as well as from context-aware data. For this purpose, we generated a ground-truth dataset containing over 1,500 labelled song lyrics, together with an unlabelled corpus of more than 2.5 million Turkish documents, in order to build an accurate automatic emotion classification system. The analytical models were built by applying several algorithms to cross-validated data using Python. The best performance attained with audio features alone was 44.2% accuracy, whereas textual features yielded better performances of 46.3% and 51.3% accuracy under the supervised and semi-supervised learning paradigms, respectively. Finally, even though we created a comprehensive feature set combining audio and textual features, this approach did not yield any significant improvement in classification performance.
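
    The thesis's pipeline of lyric-based textual features feeding a cross-validated classifier can be sketched roughly as below. This is a minimal illustration, not the author's actual code: the file name, column names, choice of a linear SVM, and five-fold split are all assumptions.

        # Minimal sketch: TF-IDF lyric features feeding a cross-validated
        # supervised classifier; file/column names and model choice are assumed.
        import pandas as pd
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        # Hypothetical ground-truth file: one labelled lyric per row.
        data = pd.read_csv("labeled_lyrics.csv")      # columns: "lyrics", "emotion"

        pipeline = make_pipeline(
            TfidfVectorizer(max_features=20000),      # sparse TF-IDF lyric features
            LinearSVC(),                              # linear classifier on those features
        )

        # Five-fold cross-validated accuracy, analogous to the scores reported above.
        scores = cross_val_score(pipeline, data["lyrics"], data["emotion"], cv=5)
        print(f"mean accuracy: {scores.mean():.3f}")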

    The MoodStripe - An evaluation of a novel visual interface as an alternative for online response gathering

    We present an innovative dynamic visual interface, the MoodStripe, which provides a continuous-scale, multi-parameter drag-and-drop alternative to the standard n-degree (Likert) scale widgets commonly used in online evaluation processes. We elaborate on the motivation for developing the new user input interface, and present the results of a cross-evaluation of the GMail product using the SUS questionnaire with both the standard and the proposed MoodStripe interfaces. The overall goal is to design a more intuitive interface by reducing the noise and task load inherent in traditional interfaces for standardized user-feedback gathering tests. The results show that the MoodStripe interface outperforms the standard scale approach in terms of both intuitiveness and functionality. Additionally, the cross-evaluation of both approaches shows comparable SUS scores.
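
    For reference, the SUS score used in this comparison is computed from ten 1-5 Likert responses: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a 0-100 score. A minimal sketch of that calculation (the example responses are hypothetical):

        # Standard 10-item SUS scoring on a 0-100 scale.
        def sus_score(responses):
            """responses: ten Likert answers in 1..5, item 1 first."""
            assert len(responses) == 10
            total = 0
            for i, r in enumerate(responses, start=1):
                # Odd (positively worded) items score r-1; even items score 5-r.
                total += (r - 1) if i % 2 == 1 else (5 - r)
            return total * 2.5

        # Hypothetical responses from one participant for one interface variant.
        print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))   # -> 85.0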

    Exploring the Emotional Spectrum. Investigating the Impact of Colour and Sounds on Human Emotions through Experimental Studies on Audiovisual Stimuli.

    In an era where effective communication is paramount, Sound Design is gaining more importance every day in the design of multimedia products, enabling them to engender profound emotional responses in their audiences; this is fundamental for its applications in marketing and multimedia, but also in more artistic fields such as the cinema and video game industries. Our objective is to delve deep into the world of sound, investigating the intricate dynamics of the emotions it evokes when combined with other stimuli: colours, contents, and camera motion techniques, based on objective scientific observations. In particular, we want to verify which are the main types of stimuli that influence the human perception of emotions. To do so we built two questionnaires in which participants were asked to watch short videos containing the stimuli and to rate the emotions they felt. The first focuses on the specific relationship between sounds, colours, and evoked emotions; the second aims to recreate more completely the set of stimuli found in a cinematic scene: sounds, colours, contents, camera movements, and their relationship with the emotions the scene evokes. We then collected and analysed the data. The results of the first experiment highlighted an almost complete dependence of perceived emotions on the sounds presented in the videos, while the influence of colours was, in only some cases, reinforcing. The results of the second experiment demonstrated that the main influence on the evoked emotions when watching a cinematic scene comes from the content of the scene itself, with a secondary influence from the auditory stimuli. Colours and camera motion techniques were found to have a minimal influence and no influence, respectively, on the overall emotional perception. We believe the results of these experiments may contribute to a deeper understanding of the intricate interplay between stimuli in multimedia design, enabling professionals across various industries to harness the power of Sound Design more effectively to elicit specific emotional responses from their audiences.

    Music information retrieval: conceptual framework, annotation and user behaviour

    Understanding music is a process both based on and influenced by the knowledge and experience of the listener. Although content-based music retrieval has been given increasing attention in recent years, much of the research still focuses on bottom-up retrieval techniques. In order to make a music information retrieval system appealing and useful to the user, more effort should be spent on constructing systems that both operate directly on the encoding of the physical energy of music and are flexible with respect to users’ experiences. This thesis is based on a user-centred approach, taking into account the mutual relationship between music as an acoustic phenomenon and as an expressive phenomenon. The issues it addresses are: the lack of a conceptual framework, the shortage of annotated musical audio databases, the lack of understanding of the behaviour of system users, and the shortage of user-dependent knowledge with respect to high-level features of music. In the theoretical part of this thesis, a conceptual framework for content-based music information retrieval is defined. The proposed conceptual framework - the first of its kind - is conceived as a coordinating structure between the automatic description of low-level music content and the description of high-level content by the system users. A general framework for the manual annotation of musical audio is outlined as well. A new methodology for the manual annotation of musical audio is introduced and tested in case studies. The results from these studies show that manually annotated music files can be of great help in the development of accurate analysis tools for music information retrieval. Empirical investigation is the foundation on which the aforementioned theoretical framework is built. Two elaborate studies involving different experimental issues are presented. In the first study, elements of signification related to spontaneous user behaviour are clarified. In the second study, a global profile of music information retrieval system users is given and their description of high-level content is discussed. This study has uncovered relationships between the users’ demographic background and their perception of expressive and structural features of music. Such a multi-level approach is exceptional as it included a large sample of the population of real users of interactive music systems. Tests have shown that the findings of this study are representative of the targeted population. Finally, the multi-purpose material provided by the theoretical background and the results from empirical investigations are put into practice in three music information retrieval applications: a prototype of a user interface based on a taxonomy, an annotated database of experimental findings, and a prototype semantic user recommender system. Results are presented and discussed for all methods used. They show that, if reliably generated, the use of knowledge about users can significantly improve the quality of music content analysis. This thesis demonstrates that an informed knowledge of human approaches to music information retrieval provides valuable insights, which may be of particular assistance in the development of user-friendly, content-based access to digital music collections.
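
    As a purely illustrative sketch of how a manual annotation record might couple low-level content descriptors with user-supplied high-level descriptions (the field names are assumptions, not the thesis's actual schema):

        # Illustrative record for a time-stamped manual annotation of an audio
        # segment, pairing low-level content descriptors with high-level,
        # user-supplied labels; field names are assumptions.
        from dataclasses import dataclass, field

        @dataclass
        class SegmentAnnotation:
            track_id: str
            start_s: float                                   # segment start (seconds)
            end_s: float                                     # segment end (seconds)
            low_level: dict = field(default_factory=dict)    # e.g. {"tempo": 96.0}
            high_level: dict = field(default_factory=dict)   # e.g. {"mood": "serene"}
            annotator_id: str = ""                           # who supplied the labels

        example = SegmentAnnotation("track_001", 0.0, 30.0,
                                    low_level={"tempo": 96.0},
                                    high_level={"mood": "melancholic"},
                                    annotator_id="user_042")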

    ESCOM 2017 Book of Abstracts


    Understanding Agreement and Disagreement in Listeners’ Perceived Emotion in Live Music Performance

    Emotion perception of music is subjective and time-dependent. Most computational music emotion recognition (MER) systems overlook time- and listener-dependent factors by averaging emotion judgments across listeners. In this work, we investigate the influence of music, setting (live vs. lab vs. online), and individual factors on music emotion perception over time. In an initial study, we explore changes in perceived music emotions among audience members during live classical music performances. Fifteen audience members used a mobile application to annotate time-varying emotion judgments based on the valence-arousal model. Inter-rater reliability analyses indicate that consistency in emotion judgments varies significantly across rehearsal segments, with systematic disagreements in certain segments. In a follow-up study, we examine listeners' reasons for their ratings in segments with high and low agreement. We relate these reasons to acoustic features and individual differences. Twenty-one listeners annotated perceived emotions while watching a recorded video of the live performance. They then reflected on their judgments and provided explanations retrospectively. Disagreements were attributed to listeners attending to different musical features or being uncertain about the expressed emotions. Emotion judgments were significantly associated with personality traits, gender, cultural background, and music preference. Thematic analysis of explanations revealed cognitive processes underlying music emotion perception, highlighting attributes less frequently discussed in MER studies, such as instrumentation, arrangement, musical structure, and multimodal factors related to performer expression. Exploratory models incorporating these semantic features and individual factors were developed to predict perceived music emotion over time. Regression analyses confirmed the significance of listener-informed semantic features as independent variables, with individual factors acting as moderators between loudness, pitch range, and arousal. In our final study, we analyzed the effects of individual differences on music emotion perception among 128 participants with diverse backgrounds. Participants annotated perceived emotions for 51 piano performances of different compositions from the Western canon, spanning various eras. Linear mixed effects models revealed significant variations in valence and arousal ratings, as well as in the frequency of emotion ratings, with regard to several individual factors: music sophistication, music preferences, personality traits, and mood states. Additionally, participants' ratings of arousal, valence, and emotional agreement were significantly associated with the historical time periods of the examined clips. This research highlights the complexity of music emotion perception, revealing it to be a dynamic, individual, and context-dependent process. It paves the way for the development of more individually nuanced, time-based models in music psychology, opening up new avenues for personalised music emotion recognition and recommendation, music emotion-driven generation, and therapeutic applications.
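
    The final study's linear mixed effects analysis could be sketched roughly as follows, with a random intercept per participant and individual-difference measures as fixed effects; the variable names and model formula are assumptions, not the study's actual specification.

        # Rough sketch of a linear mixed effects model for arousal ratings with
        # individual-difference predictors and a random intercept per participant;
        # column names and formula are assumptions.
        import pandas as pd
        import statsmodels.formula.api as smf

        # Hypothetical long-format table: one row per participant x performance clip.
        ratings = pd.read_csv("ratings.csv")
        # expected columns: participant, clip, arousal, music_sophistication, openness

        model = smf.mixedlm(
            "arousal ~ music_sophistication + openness",   # fixed effects
            data=ratings,
            groups=ratings["participant"],                  # random intercept per listener
        )
        result = model.fit()
        print(result.summary())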

    Presence 2005: the eighth annual international workshop on presence, 21-23 September 2005, University College London (Conference proceedings)

    OVERVIEW (taken from the CALL FOR PAPERS) Academics and practitioners with an interest in the concept of (tele)presence are invited to submit their work for presentation at PRESENCE 2005 at University College London in London, England, September 21-23, 2005. The eighth in a series of highly successful international workshops, PRESENCE 2005 will provide an open discussion forum to share ideas regarding concepts and theories, measurement techniques, technology, and applications related to presence, the psychological state or subjective perception in which a person fails to accurately and completely acknowledge the role of technology in an experience, including the sense of 'being there' experienced by users of advanced media such as virtual reality. The concept of presence in virtual environments has been around for at least 15 years, and the earlier idea of telepresence at least since Minsky's seminal paper in 1980. Recently, for the first time, there has been a burst of funded research activity in this area with the European FET Presence Research initiative. What do we really know about presence and its determinants? How can presence be successfully delivered with today's technology? This conference invites papers that are based on empirical results from studies of presence and related issues and/or which contribute to the technology for the delivery of presence. Papers that make substantial advances in the theoretical understanding of presence are also welcome. The interest is not solely in virtual environments but also in mixed reality environments. Submissions will be reviewed more rigorously than in previous conferences. High quality papers are therefore sought which make substantial contributions to the field. Approximately 20 papers will be selected for two successive special issues of the journal Presence: Teleoperators and Virtual Environments. PRESENCE 2005 takes place in London and is hosted by University College London. The conference is organized by ISPR, the International Society for Presence Research, and is supported by the European Commission's FET Presence Research Initiative through the Presencia and IST OMNIPRES projects and by University College London.