Multiple scale music segmentation using rhythm, timbre and harmony
The segmentation of music into intro, chorus, verse, outro, and similar segments is a difficult problem. A method for automatic segmentation based on features related to rhythm, timbre, and harmony is presented; the features are compared with each other and against manual segmentations of a database of 48 songs. Standard information retrieval performance measures are used in the comparison, and the timbre-related feature is shown to perform best.
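As an illustration (not taken from the paper itself), the standard information retrieval measures for boundary detection can be sketched as precision, recall, and F-measure with a matching tolerance; the tolerance value and function name below are illustrative assumptions.

```python
def boundary_prf(estimated, reference, tolerance=0.5):
    """Precision, recall, and F-measure for segment boundaries.

    An estimated boundary counts as a hit if it lies within
    `tolerance` seconds of a not-yet-matched reference boundary.
    """
    est = sorted(estimated)
    ref = sorted(reference)
    matched = set()
    hits = 0
    for e in est:
        for i, r in enumerate(ref):
            if i not in matched and abs(e - r) <= tolerance:
                matched.add(i)
                hits += 1
                break
    precision = hits / len(est) if est else 0.0
    recall = hits / len(ref) if ref else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f
```

Comparing two features then amounts to scoring each feature's estimated boundaries against the manual segmentation and ranking by F-measure.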
Bringing 'Musicque into the tableture': machine-learning models for polyphonic transcription of 16th-century lute tablature
A large corpus of music written in lute tablature, spanning some three-and-a-half centuries, has survived. This music has so far escaped systematic musicological research because of its notational format. Being a practical instruction for the player, tablature reveals very little of the polyphonic structure of the music it encodes—and is therefore relatively inaccessible to non-specialists. Automatic polyphonic transcription into modern music notation can help unlock the corpus to a larger audience, and thus facilitate musicological research.
In this study we present four variants of a machine-learning model for voice separation and duration reconstruction in 16th-century lute tablature. These models are intended to form the heart of an interactive system for automatic polyphonic transcription that can assist users in making editions tailored to their own preferences. Additionally, such models can provide new methods for analysing different aspects of polyphonic structure.
We have experimented with modelling only voice and modelling voice and duration simultaneously, applying each in a forward- and in a backward-processing approach. The models are evaluated on a dataset containing 15 three- and four-voice intabulations. Each processing approach has its advantages, and the results vary between the models. With accuracy rates between approximately 80 and 90 per cent, both for voice prediction and for duration prediction, the best models' performance is promising. Even in this early stage of the research, such models yield a useful initial transcription system.
Partitura: A Python Package for Symbolic Music Processing
Partitura is a lightweight Python package for handling symbolic musical information. It provides easy access to features commonly used in music information retrieval tasks, like note arrays (lists of timed pitched events) and 2D piano roll matrices, as well as other score elements such as time and key signatures, performance directives, and repeat structures. Partitura can load musical scores (in MEI, MusicXML, Humdrum **kern, and MIDI formats), MIDI performances, and score-to-performance alignments. The package includes some tools for music analysis, such as automatic pitch spelling, key signature identification, and voice separation. Partitura is an open-source project and is available at https://github.com/CPJKU/partitura/
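To illustrate the two representations named above without depending on the package itself, the sketch below builds a note array and a binary piano-roll matrix by hand in NumPy. The field names and the `pianoroll` helper are illustrative assumptions; Partitura's own API should be consulted for the actual interface.

```python
import numpy as np

# A "note array": one row per note, with onset (in beats),
# duration, and MIDI pitch. (Illustrative structured array;
# field names are assumptions, not Partitura's exact schema.)
notes = np.array(
    [(0.0, 1.0, 60), (1.0, 1.0, 64), (2.0, 2.0, 67)],
    dtype=[("onset_beat", "f4"), ("duration_beat", "f4"), ("pitch", "i4")],
)

def pianoroll(note_array, time_div=4, n_pitches=128):
    """Render a note array as a binary 2D piano-roll matrix
    (pitches x time steps), with `time_div` steps per beat."""
    total = int(np.ceil((note_array["onset_beat"]
                         + note_array["duration_beat"]).max() * time_div))
    roll = np.zeros((n_pitches, total), dtype=np.int8)
    for onset, dur, pitch in note_array:
        start = int(round(onset * time_div))
        end = int(round((onset + dur) * time_div))
        roll[pitch, start:end] = 1
    return roll

roll = pianoroll(notes)
```

Both views are lossy in opposite directions: the note array keeps event identity but no grid, while the piano roll trades event identity for a fixed-resolution matrix convenient for matrix-based MIR methods.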
A Probabilistic Model of Meter Perception: Simulating Enculturation
HH is supported by a Distinguished Lorentz Fellowship granted by the Lorentz Center for the Sciences and the Netherlands Institute for Advanced Study in the Humanities and Social Sciences (NIAS), and by a Horizon grant of the Netherlands Organization for Scientific Research (NWO). BW and MP also received support from the EPSRC Digital Music Platform Grant held at Queen Mary (EP/K009559/1). MP is supported by a grant from the UK Engineering and Physical Sciences Research Council (EPSRC, EP/M000702/1).
Signal Processing Methods for Music Synchronization, Audio Matching, and Source Separation
The field of music information retrieval (MIR) aims at developing techniques and tools for organizing, understanding, and searching multimodal information in large music collections in a robust, efficient, and intelligent manner. In this context, this thesis presents novel, content-based methods for music synchronization, audio matching, and source separation. In general, music synchronization denotes a procedure which, for a given position in one representation of a piece of music, determines the corresponding position within another representation. Here, the thesis presents three complementary synchronization approaches, which improve upon previous methods in terms of robustness, reliability, and accuracy. The first approach employs a late-fusion strategy based on multiple, conceptually different alignment techniques to identify those music passages that allow for reliable alignment results. The second approach is based on the idea of employing musical structure analysis methods in the context of synchronization to derive reliable synchronization results even in the presence of structural differences between the versions to be aligned. Finally, the third approach employs several complementary strategies for increasing the accuracy and time resolution of synchronization results.

Given a short query audio clip, the goal of audio matching is to automatically retrieve all musically similar excerpts in different versions and arrangements of the same underlying piece of music. In this context, chroma-based audio features are a well-established tool as they possess a high degree of invariance to variations in timbre. This thesis describes a novel procedure for making chroma features even more robust to changes in timbre while keeping their discriminative power. Here, the idea is to identify and discard timbre-related information using techniques inspired by the well-known MFCC features, which are usually employed in speech processing.
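The MFCC-inspired idea of discarding timbre-related information can be sketched as follows: log-compress the feature values, apply a DCT across the feature bins (as the MFCC pipeline does across mel bands), zero out the low-order coefficients that capture the coarse spectral envelope, and transform back. This is a simplified sketch, not the thesis's exact procedure; the parameter values and the choice to operate directly on a 12-bin chroma matrix are assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    m[0] *= 1 / np.sqrt(2)
    return m * np.sqrt(2 / n)

def timbre_reduced(features, discard=1, eta=100.0):
    """Log-compress a feature matrix (bins x frames), DCT-transform
    each frame across the bins, zero the first `discard` low-order
    coefficients (which, as with MFCCs, capture the coarse envelope,
    i.e. timbre), and invert the transform."""
    v = np.log(1.0 + eta * features)
    d = dct_matrix(v.shape[0])
    coeffs = d @ v
    coeffs[:discard] = 0.0    # drop timbre-related coefficients
    return d.T @ coeffs       # inverse of the orthonormal DCT
```

Because the DCT is orthonormal, discarding coefficients is an exact projection: the result keeps the fine, pitch-class-related structure of each frame while removing its broad envelope.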
Given a monaural music recording, the goal of source separation is to extract musically meaningful sound sources corresponding, for example, to a melody, an instrument, or a drum track from the recording. To facilitate this complex task, one can exploit additional information provided by a musical score. Based on this idea, this thesis presents two novel, conceptually different approaches to source separation. Using score information provided by a given MIDI file, the first approach employs a parametric model to describe a given audio recording of a piece of music. The resulting model is then used to extract sound sources as specified by the score. As a computationally less demanding and easier-to-implement alternative, the second approach employs the additional score information to guide a decomposition based on non-negative matrix factorization (NMF).
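A common way to let a score guide NMF (a minimal sketch of the general technique, not the thesis's specific implementation) is through constrained initialization: entries of the template and activation matrices that the score marks as silent are initialized to zero, and the multiplicative updates then keep them at zero.

```python
import numpy as np

def score_informed_nmf(V, W_mask, H_mask, n_iter=200, eps=1e-9):
    """NMF with multiplicative updates (V ~ W @ H, all non-negative).

    Zeros in W_mask / H_mask encode score constraints: entries the
    score declares inactive start at zero and, because every update
    is a multiplication, remain zero throughout."""
    rng = np.random.default_rng(0)
    W = rng.random(W_mask.shape) * W_mask
    H = rng.random(H_mask.shape) * H_mask
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

After factorization, one source is reconstructed from its subset of components, e.g. `W[:, [k]] @ H[[k], :]`, typically followed by soft masking of the original spectrogram.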
Automatic Generation of Dynamic Musical Transitions in Computer Games
In video games, music must often change quickly from one piece to another due to player interaction, such as when moving between different areas. This quick change can sound jarring if the two pieces are very different from each other. Several transition techniques are used in industry, such as the abrupt cut, crossfading, horizontal resequencing, and vertical reorchestration. However, while several claims are made about their effectiveness (or lack thereof), none of these has been experimentally tested.
To investigate how effective each transition technique is, this dissertation empirically evaluates each technique in a study informed by music psychology, based on several features identified as being important for successful transitions. The results led to a novel approach to musical transitions in video games: a multiple viewpoint system, with viewpoints modelled using Markov models. This algorithm allows the seamless generation of music that serves as a transition between two composed pieces of music. While transitions in games normally occur at a zone boundary, the algorithm presented in this dissertation operates over a transition region, giving the generated music enough time to transition.
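The core idea of generating over a transition region can be sketched with a single first-order Markov viewpoint (the full system uses multiple viewpoints; the single-viewpoint model, the linear weighting ramp, and all names below are illustrative assumptions): train one model per piece, then sample each step from a convex combination of the two models whose weight ramps from piece A to piece B across the region.

```python
import random
from collections import defaultdict

def markov_model(sequence):
    """First-order Markov model: counts of next symbol given current."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(sequence, sequence[1:]):
        counts[a][b] += 1
    return counts

def sample_step(counts_a, counts_b, state, weight):
    """Sample the next symbol from a convex combination of the two
    models; `weight` in [0, 1] moves from piece A (0) to piece B (1)."""
    probs = defaultdict(float)
    for counts, w in ((counts_a, 1 - weight), (counts_b, weight)):
        total = sum(counts[state].values())
        if total:
            for sym, c in counts[state].items():
                probs[sym] += w * c / total
    symbols = list(probs)
    if not symbols:          # no continuation in either model
        return state
    return random.choices(symbols, weights=[probs[s] for s in symbols])[0]

def transition(piece_a, piece_b, length, seed=0):
    """Generate `length` events, ramping from A's statistics to B's."""
    random.seed(seed)
    ma, mb = markov_model(piece_a), markov_model(piece_b)
    state = piece_a[-1]
    out = []
    for i in range(length):
        state = sample_step(ma, mb, state, (i + 1) / length)
        out.append(state)
    return out
```

Early in the region the output follows A's statistics, later B's, so the generated material drifts between the two styles instead of switching abruptly at a boundary.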
This novel approach was evaluated in a bespoke video game environment, where participants navigated through several pairs of different game environments and rated the resulting musical transitions. The results indicate that the generated transitions perform as well as crossfading, a technique commonly used in the industry. Since crossfading is not always appropriate, being able to use generated transitions gives composers another tool in their toolbox. Furthermore, the principled approach taken opens up avenues for further research.