33 research outputs found
Recommended from our members
Score-informed transcription for automatic piano tutoring
In this paper, a score-informed transcription method for automatic piano tutoring is proposed. The method takes as input a recording made by a student which may contain mistakes, along with a reference score. The recording and the aligned synthesized score are automatically transcribed using the non-negative matrix factorization algorithm for multi-pitch estimation and hidden Markov models for note tracking. By comparing the two transcribed recordings, common errors occurring in transcription algorithms such as extra octave notes can be suppressed. The result is a piano-roll description which shows the mistakes made by the student along with the correctly played notes. Evaluation was performed on six pieces recorded using a Disklavier piano, using both manually-aligned and automatically-aligned scores as an input. Results comparing the system output with ground-truth annotation of the original recording reach a weighted F-measure of 93%, indicating that the proposed method can successfully analyze the student's performance
Identifying Cover Songs Using Information-Theoretic Measures of Similarity
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/This paper investigates methods for quantifying similarity between audio signals, specifically for the task of cover song detection. We consider an information-theoretic approach, where we compute pairwise measures of predictability between time series. We compare discrete-valued approaches operating on quantized audio features, to continuous-valued approaches. In the discrete case, we propose a method for computing the normalized compression distance, where we account for correlation between time series. In the continuous case, we propose to compute information-based measures of similarity as statistics of the prediction error between time series. We evaluate our methods on two cover song identification tasks using a data set comprised of 300 Jazz standards and using the Million Song Dataset. For both datasets, we observe that continuous-valued approaches outperform discrete-valued approaches. We consider approaches to estimating the normalized compression distance (NCD) based on string compression and prediction, where we observe that our proposed normalized compression distance with alignment (NCDA) improves average performance over NCD, for sequential compression algorithms. Finally, we demonstrate that continuous-valued distances may be combined to improve performance with respect to baseline approaches. Using a large-scale filter-and-refine approach, we demonstrate state-of-the-art performance for cover song identification using the Million Song Dataset.The work of P. Foster was supported by an Engineering and Physical Sciences Research Council Doctoral Training Account studentship
Recommended from our members
Improving instrument recognition in polyphonic music through system integration
A method is proposed for instrument recognition in polyphonic music which combines two independent detector systems. A polyphonic musical instrument recognition system using a missing feature approach and an automatic music transcription system based on shift invariant probabilistic latent component analysis that includes instrument assignment. We propose a method to integrate the two systems by fusing the instrument contributions estimated by the first system onto the transcription system in the form of Dirichlet priors. Both systems, as well as the integrated system are evaluated using a dataset of continuous polyphonic music recordings. Detailed results that highlight a clear improvement in the performance of the integrated system are reported for different training conditions
Instrumentation-based music similarity using sparse representations
International audienc
Automatic Music Transcription: Breaking the Glass Ceiling
Automatic music transcription is considered by many to be the Holy Grail in the field of music signal analysis. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general purpose models which are unable to capture the rich diversity found in music signals. In order to overcome the limited performance of transcription systems, algorithms have to be tailored to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available are a rich potential source of training data, via forced alignment of audio to scores, but large scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information across different methods and musical aspects
Missing template estimation for user-assisted music transcription
For a user-assisted music transcription system in which the user is asked to label some notes for each instrument in the recording, we investigate ways to limit the amount of information the user has to provide. Different methods are proposed and experimentally compared that enable the estimation of template spectra at pitch positions that have not been annotated by the user, in order to derive a full set of instrument templates that can be used within a non-negative matrix factorisation framework. A set of error metrics is presented that enables the evaluation of the NMF gain matrix. The results show that purely data-driven methods outperform more refined instrument models when the user annotates notes at many different pitches for each instrument. When notes are labelled at a smaller number of different pitches, the highest accuracies are obtained using pre-stored instrument templates that are adapted to the instruments in the mixture. © 2013 IEEE