MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation
Pre-trained language models have achieved impressive results in various music
understanding and generation tasks. However, existing pre-training methods for
symbolic melody generation struggle to capture multi-scale, multi-dimensional
structural information in note sequences, due to the domain knowledge
discrepancy between text and music. Moreover, the lack of available large-scale
symbolic melody datasets limits improvements from pre-training. In this paper, we
propose MelodyGLM, a multi-task pre-training framework for generating melodies
with long-term structure. We design the melodic n-gram and long span sampling
strategies to create local and global blank infilling tasks for modeling the
local and global structures in melodies. Specifically, we incorporate pitch
n-grams, rhythm n-grams, and their combined n-grams into the melodic n-gram
blank infilling tasks for modeling the multi-dimensional structures in
melodies. To this end, we have constructed a large-scale symbolic melody
dataset, MelodyNet, containing more than 0.4 million melody pieces. MelodyNet
is utilized for large-scale pre-training and domain-specific n-gram lexicon
construction. Both subjective and objective evaluations demonstrate that
MelodyGLM surpasses the standard and previous pre-training methods. In
particular, subjective evaluations show that, on the melody continuation task,
MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in
consistency, rhythmicity, structure, and overall quality, respectively.
Notably, MelodyGLM nearly matches the quality of human-composed melodies on the
melody inpainting task.
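The melodic n-gram construction described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation; the (pitch, duration) note representation and the n-gram size are assumptions made here for clarity:

```python
from collections import Counter

def melodic_ngrams(notes, n=3):
    """Count pitch, rhythm, and combined n-grams in a note sequence.

    `notes` is a list of (pitch, duration) tuples, a simplified stand-in
    for the paper's symbolic melody representation.
    """
    pitch_grams, rhythm_grams, combined_grams = Counter(), Counter(), Counter()
    for i in range(len(notes) - n + 1):
        window = notes[i : i + n]
        pitch_grams[tuple(p for p, _ in window)] += 1
        rhythm_grams[tuple(d for _, d in window)] += 1
        combined_grams[tuple(window)] += 1
    return pitch_grams, rhythm_grams, combined_grams

# A toy melody in which one 3-note motif repeats.
melody = [(60, 1.0), (62, 0.5), (64, 0.5), (60, 1.0), (62, 0.5), (64, 0.5)]
p, r, c = melodic_ngrams(melody, n=3)
print(p.most_common(1))  # the repeated pitch trigram (60, 62, 64)
```

In the paper, salient n-grams drawn from a domain-specific lexicon are masked for blank infilling; here, raw frequency counts stand in as a crude saliency proxy.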
Understanding Agreement and Disagreement in Listeners’ Perceived Emotion in Live Music Performance
Emotion perception of music is subjective and time-dependent. Most computational music emotion recognition (MER) systems overlook time- and listener-dependent factors by averaging emotion judgments across listeners. In this work, we investigate the influence of music, setting (live vs. lab vs. online), and individual factors on music emotion perception over time. In an initial study, we explore changes in perceived music emotions among audience members during live classical music performances. Fifteen audience members used a mobile application to annotate time-varying emotion judgments based on the valence-arousal model. Inter-rater reliability analyses indicate that consistency in emotion judgments varies significantly across rehearsal segments, with systematic disagreements in certain segments. In a follow-up study, we examine listeners' reasons for their ratings in segments with high and low agreement. We relate these reasons to acoustic features and individual differences. Twenty-one listeners annotated perceived emotions while watching a recorded video of the live performance. They then reflected on their judgments and provided explanations retrospectively. Disagreements were attributed to listeners attending to different musical features or being uncertain about the expressed emotions. Emotion judgments were significantly associated with personality traits, gender, cultural background, and music preference. Thematic analysis of explanations revealed cognitive processes underlying music emotion perception, highlighting attributes less frequently discussed in MER studies, such as instrumentation, arrangement, musical structure, and multimodal factors related to performer expression. Exploratory models incorporating these semantic features and individual factors were developed to predict perceived music emotion over time.
Regression analyses confirmed the significance of listener-informed semantic features as independent variables, with individual factors acting as moderators between loudness, pitch range, and arousal. In our final study, we analyzed the effects of individual differences on music emotion perception among 128 participants with diverse backgrounds. Participants annotated perceived emotions for 51 piano performances of different compositions from the Western canon, spanning various eras. Linear mixed-effects models revealed significant variations in valence and arousal ratings, as well as in the frequency of emotion ratings, with regard to several individual factors: music sophistication, music preferences, personality traits, and mood states. Additionally, participants' ratings of arousal, valence, and emotional agreement were significantly associated with the historical time periods of the examined clips. This research highlights the complexity of music emotion perception, revealing it to be a dynamic, individual, and context-dependent process. It paves the way for the development of more individually nuanced, time-based models in music psychology, opening up new avenues for personalised music emotion recognition and recommendation, music emotion-driven generation, and therapeutic applications.
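As a toy illustration of segment-wise (dis)agreement, one could measure the across-listener spread of ratings within each segment. This is a crude proxy, not the formal inter-rater reliability statistics or mixed-effects models used in the studies, and the ratings and segmentation below are invented:

```python
import statistics

def segment_disagreement(ratings_by_listener, segments):
    """Mean across-listener standard deviation of ratings in each segment.

    `ratings_by_listener` is a list of equal-length rating time series, one
    per listener; `segments` is a list of (start, end) index pairs. Higher
    values indicate more disagreement within that segment.
    """
    results = []
    for start, end in segments:
        spreads = [
            statistics.pstdev(series[t] for series in ratings_by_listener)
            for t in range(start, end)
        ]
        results.append(sum(spreads) / len(spreads))
    return results

# Three listeners rating arousal at four time points: they agree early on
# and diverge later.
ratings = [
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.2, 0.2, 0.1],
    [0.1, 0.2, 0.5, 0.5],
]
print(segment_disagreement(ratings, [(0, 2), (2, 4)]))  # low, then high
```

Segments with a high mean spread are the natural candidates for the kind of retrospective "why did you rate it that way?" follow-up described above.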
Proceedings of the 19th Sound and Music Computing Conference
Proceedings of the 19th Sound and Music Computing Conference - June 5-12, 2022 - Saint-Étienne (France).
https://smc22.grame.f
Safe and Sound: Proceedings of the 27th Annual International Conference on Auditory Display
Complete proceedings of the 27th International Conference on Auditory Display (ICAD2022), June 24-27, 2022. Online virtual conference.
Proceedings of the 1st International Conference on Live Coding
Open-access, peer-reviewed papers on live coding published at the 1st International Conference on Live Coding (ICLC) in Leeds.
Detection and Evaluation of Clusters within Sequential Data
Motivated by theoretical advancements in dimensionality reduction techniques,
we use a recent model, called Block Markov Chains, to conduct a practical study
of clustering in real-world sequential data. Clustering algorithms for Block
Markov Chains possess theoretical optimality guarantees and can be deployed in
sparse data regimes. Despite these favorable theoretical properties, a thorough
evaluation of these algorithms in realistic settings has been lacking.
We address this issue and investigate the suitability of these clustering
algorithms in exploratory data analysis of real-world sequential data. In
particular, our sequential data is derived from human DNA, written text, animal
movement data and financial markets. In order to evaluate the determined
clusters, and the associated Block Markov Chain model, we further develop a set
of evaluation tools. These tools include benchmarking, spectral noise analysis
and statistical model selection tools. An efficient implementation of the
clustering algorithm and the new evaluation tools is made available together
with this paper.
Practical challenges associated with real-world data are encountered and
discussed. It is ultimately found that the Block Markov Chain model assumption,
together with the tools developed here, can indeed produce meaningful insights
in exploratory data analyses despite the complexity and sparsity of real-world
data.
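The spectral idea behind such clustering algorithms can be sketched as follows. This is a heavily simplified two-cluster illustration on an invented toy chain, not the paper's optimality-guaranteed algorithm: it builds the empirical transition count matrix, embeds each state via the leading singular vectors, and runs a few plain k-means (Lloyd) iterations on the rows.

```python
import numpy as np

def bmc_spectral_clusters(sequence, n_states):
    """Split the states of a sequence into two clusters.

    Uses a rank-2 SVD embedding of the empirical transition count matrix,
    followed by farthest-point-initialised Lloyd iterations.
    """
    counts = np.zeros((n_states, n_states))
    for a, b in zip(sequence[:-1], sequence[1:]):
        counts[a, b] += 1
    u, s, _ = np.linalg.svd(counts)
    emb = u[:, :2] * s[:2]  # low-rank embedding of each state (one row each)
    # Deterministic initialisation: row 0 and the row farthest from it.
    far = int(np.argmax(((emb - emb[0]) ** 2).sum(axis=1)))
    centers = np.stack([emb[0], emb[far]])
    for _ in range(20):  # plain Lloyd iterations
        dists = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = emb[labels == k].mean(axis=0)
    return labels

# Toy block-structured chain: states {0, 1} mostly transition among
# themselves, as do states {2, 3}, with rare cross-block jumps.
rng = np.random.default_rng(1)
seq, state = [0], 0
for _ in range(2000):
    block = state // 2
    if rng.random() < 0.95:
        state = int(rng.integers(0, 2)) + 2 * block        # stay in block
    else:
        state = int(rng.integers(0, 2)) + 2 * (1 - block)  # jump blocks
    seq.append(state)
labels = bmc_spectral_clusters(seq, n_states=4)
print(labels)  # states 0-1 and 2-3 should land in different clusters
```

The real algorithms refine this picture considerably (e.g. trimming and improvement steps for sparse regimes), which is precisely what gives them the theoretical guarantees mentioned in the abstract.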
CREATING A COHERENT SCORE: THE MUSIC OF SINGLE-PLAYER FANTASY COMPUTER ROLE-PLAYING GAMES
This thesis provides a comprehensive exploration into the music of the ludic genre (Hourigan, 2005) known as a Computer Role-Playing Game (CRPG) and its two main sub-divisions: Japanese and Western Role-Playing Games (JRPGs & WRPGs). It focuses on the narrative category known as genre fiction, concentrating on fantasy fiction (Turco, 1999) and seeks to address one overall question: How do fantasy CRPG composers incorporate the variety of musical material needed to create a coherent score across the JRPG and WRPG divide?
Seven main chapters form the thesis text. Chapter One provides an introduction to the thesis, detailing the research contributions in addition to outlining a variety of key terms that must be understood to continue with the rest of the text. A database accompanying this thesis showcases the vast range of CRPGs available; a literature review tackles relevant existing materials. Chapters Two and Three seek to provide the first canonical history of soundtracks used in CRPGs by dissecting typical narrative structures for games so as to provide context to their musical scores. Through analysis of existing game composer interviews, cultural influences are revealed. Chapters Four and Five mirror one another with detailed discussion respectively regarding JRPG and WRPG music, including the influence that anime and Hollywood cinema have had upon them. In Chapter Six, the use of CRPG music outside of video games is explored, particularly the popularity of JRPG soundtracks in the concert hall. Chapter Seven concludes the thesis, summarising research contributions achieved and areas for future work. Throughout these chapters, the core task is to explain how the two primary sub-genres of CRPGs parted ways and why the music used to accompany these games differs so drastically.