84 research outputs found
Non-Negative Group Sparsity with Subspace Note Modelling for Polyphonic Transcription
This work was supported by EPSRC Platform Grant EP/K009559/1, and EPSRC Grants EP/L027119/1 and EP/J010375/1.
Deep Polyphonic ADSR Piano Note Transcription
We investigate a late-fusion approach to piano transcription, combined with a
strong temporal prior in the form of a handcrafted Hidden Markov Model (HMM).
The network architecture under consideration is compact in terms of its number
of parameters and easy to train with gradient descent. The network outputs are
fused over time in the final stage to obtain note segmentations, with an HMM
whose transition probabilities are chosen based on a model of attack, decay,
sustain, release (ADSR) envelopes, commonly used for sound synthesis. The note
segments are then subject to a final binary decision rule that rejects note
segment hypotheses that are too weak. We obtain state-of-the-art results on the
MAPS dataset, and outperform other approaches by a large margin when predicting
complete note regions from onsets to offsets.
Comment: 5 pages, 2 figures, published as ICASSP'1
Investigating the Perceptual Validity of Evaluation Metrics for Automatic Piano Music Transcription
Automatic Music Transcription (AMT) is usually evaluated using low-level criteria, typically by counting the numbers of errors, with equal weighting. Yet, some errors (e.g. out-of-key notes) are more salient than others. In this study, we design an online listening test to gather judgements about AMT quality. These judgements take the form of pairwise comparisons of transcriptions of the same music by pairs of different AMT systems. We investigate how these judgements correlate with benchmark metrics, and find that although they match in many cases, agreement drops when comparing pairs with similar scores, or pairs of poor transcriptions. We show that onset-only notewise F-measure is the benchmark metric that correlates best with human judgement, all the more so with higher onset tolerance thresholds. We define a set of features related to various musical attributes, and use them to design a new metric that correlates significantly better with listeners' quality judgements. We examine which musical aspects were important to raters by conducting an ablation study on the defined metric, highlighting the importance of the rhythmic dimension (tempo, meter). We make the collected data fully available for further study, in particular to evaluate the perceptual relevance of new AMT metrics.
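Onset-only notewise F-measure, the metric the study finds correlates best with listeners, can be sketched as follows. Notes are (onset, pitch) pairs; the matching here is a simple greedy scheme for illustration, whereas benchmark toolkits such as mir_eval use a more careful matching, so treat this as a sketch rather than the reference implementation:

```python
def onset_f_measure(ref, est, tol=0.05):
    """Match each estimated note to an unused reference note of the same
    pitch whose onset lies within +/- tol seconds; report P, R, F."""
    used = set()
    tp = 0
    for e_on, e_pitch in est:
        for i, (r_on, r_pitch) in enumerate(ref):
            if i in used:
                continue
            if r_pitch == e_pitch and abs(r_on - e_on) <= tol:
                used.add(i)
                tp += 1
                break
    precision = tp / len(est) if est else 0.0
    recall = tp / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

ref = [(0.00, 60), (0.50, 64), (1.00, 67)]            # reference notes
est = [(0.02, 60), (0.60, 64), (1.01, 68)]            # 2nd late, 3rd wrong pitch
print(onset_f_measure(ref, est, tol=0.05))            # strict tolerance
print(onset_f_measure(ref, est, tol=0.15))            # looser tolerance
```

Raising the tolerance from 50 ms to 150 ms lets the slightly late second note count as correct, which is exactly the effect of the higher onset tolerance thresholds the abstract discusses.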
Learning and Evaluation Methodologies for Polyphonic Music Sequence Prediction with LSTMs
Music language models (MLMs) play an important role for various music signal and symbolic music processing tasks, such as music generation, symbolic music classification, or automatic music transcription (AMT). In this paper, we investigate Long Short-Term Memory (LSTM) networks for polyphonic music prediction, in the form of binary piano rolls. A preliminary experiment, assessing the influence of the timestep of piano rolls on system performance, highlights the need for more musical evaluation metrics. We introduce a range of metrics, focusing on temporal and harmonic aspects. We propose to combine them into a parametrisable loss to train our network. We then conduct a range of experiments with this new loss, both for polyphonic music prediction (intrinsic evaluation) and using our predictive model as a language model for AMT (extrinsic evaluation). Intrinsic evaluation shows that tuning the behaviour of a model is possible by adjusting loss parameters, with consistent results across timesteps. Extrinsic evaluation shows consistent behaviour across timesteps in terms of precision and recall with respect to the loss parameters, leading to an improvement in AMT performance without changing the complexity of the model. In particular, we show that intrinsic performance (in terms of cross entropy) is not related to extrinsic performance, highlighting the importance of using custom training losses for each specific application. Our model also compares favourably with previously proposed MLMs.
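The idea of a parametrisable loss over binary piano rolls can be sketched as frame-wise cross entropy plus a musically motivated term whose weight is tunable. The paper combines several metric-derived terms; the single temporal-smoothness term and the weights below are illustrative assumptions, not the authors' actual loss:

```python
import numpy as np

def piano_roll_loss(pred, target, alpha=1.0, beta=0.1, eps=1e-7):
    """pred, target: arrays of shape (time, pitch), pred in (0, 1).
    alpha weights frame-wise cross entropy; beta weights a hypothetical
    smoothness term discouraging spurious note on/off flips."""
    pred = np.clip(pred, eps, 1 - eps)
    bce = -(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean()
    # Penalise rapid changes in predicted activity between adjacent frames.
    smooth = np.abs(np.diff(pred, axis=0)).mean()
    return alpha * bce + beta * smooth

rng = np.random.default_rng(0)
target = (rng.random((8, 4)) > 0.7).astype(float)     # toy binary piano roll
noisy = np.clip(target + 0.2 * rng.standard_normal(target.shape), 0.01, 0.99)
print(piano_roll_loss(noisy, target, beta=0.0))       # pure cross entropy
print(piano_roll_loss(noisy, target, beta=0.5))       # smoothness penalised
```

Adjusting `beta` trades raw frame accuracy against temporal stability, which mirrors the abstract's point that loss parameters let one tune the model's behaviour without changing its complexity.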
Monophonic Automatic Music Transcription With Convolutional Neural Networks
This thesis utilizes convolutional neural networks for monophonic automatic music transcription of piano music. We present three different systems utilizing CNNs to perform onset, pitch, and offset detection, producing a final output of sheet music. Our TCN system, based on Bai et al.'s TCN architecture, achieved the best results owing to its superior offset detection, and produced fairly accurate sheet music.
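The core building block of the TCN architecture referenced above is a causal, dilated 1-D convolution. This numpy sketch shows that operation on a single feature channel; the real model stacks many such layers with learned weights and residual connections, so this is only a minimal illustration:

```python
import numpy as np

def causal_dilated_conv(x, w, dilation=1):
    """y[t] = sum_k w[k] * x[t - k * dilation], using only past samples,
    so the receptive field grows with the dilation factor."""
    k = len(w)
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for i in range(k):
            j = t - i * dilation
            if j >= 0:
                y[t] += w[i] * x[j]
    return y

x = np.arange(6, dtype=float)                  # toy input signal
w = np.array([0.5, 0.5])                       # 2-tap averaging kernel
print(causal_dilated_conv(x, w, dilation=1))   # [0.  0.5 1.5 2.5 3.5 4.5]
print(causal_dilated_conv(x, w, dilation=2))   # [0.  0.5 1.  2.  3.  4. ]
```

Stacking such layers with exponentially increasing dilations is what lets a TCN cover long temporal contexts, useful for detecting note offsets that depend on slowly decaying piano energy.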