IDENTIFICATION OF COVER SONGS USING INFORMATION THEORETIC MEASURES OF SIMILARITY
13 pages, 5 figures, 4 tables. v3: Accepted version
Music fingerprinting based on bhattacharya distance for song and cover song recognition
People often have trouble recognizing a song, especially if it is performed by someone other than the original artist, which is called a cover song. Hence, an identification system might be used to help recognize a song or to detect copyright violation. In this study, we try to recognize a song and a cover song by using the fingerprint of the song, represented by features extracted from MPEG-7. The fingerprint of the song is represented by Audio Signature Type. Moreover, the fingerprint of the cover song is represented by Audio Spectrum Flatness and Audio Spectrum Projection. Furthermore, we propose a sliding algorithm and k-Nearest Neighbor (k-NN) with Bhattacharyya distance for song recognition and cover song recognition. The results of this experiment show that the proposed fingerprint technique has an accuracy of 100% for song recognition and an accuracy of 85.3% for cover song recognition.
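As a rough illustration of the k-NN step with Bhattacharyya distance mentioned in the abstract above, the following minimal sketch classifies a query by its nearest reference fingerprints. It assumes each fingerprint has already been reduced to a non-negative histogram; the function names, the histogram representation, and the toy labels are hypothetical and are not taken from the paper, which uses MPEG-7 descriptors and a sliding algorithm not shown here.

```python
import math
from collections import Counter


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two non-negative histograms.

    Both inputs are normalised to sum to 1 before comparison.
    """
    sp, sq = sum(p), sum(q)
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))
    if bc == 0:
        return float("inf")  # no overlap at all
    # Clamp at zero to absorb floating-point rounding when bc is ~1.
    return max(0.0, -math.log(bc))


def knn_label(query, references, k=1):
    """Return the majority label among the k nearest references.

    `references` is a list of (label, histogram) pairs (hypothetical layout).
    """
    ranked = sorted(
        references, key=lambda item: bhattacharyya_distance(query, item[1])
    )
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    refs = [("song_a", [8, 1, 1]), ("song_b", [1, 1, 8])]
    print(knn_label([7, 2, 1], refs))
```

With k = 1 this reduces to picking the single closest reference; larger k trades sensitivity to individual references for robustness.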
Sequential Complexity as a Descriptor for Musical Similarity
We propose string compressibility as a descriptor of temporal structure in
audio, for the purpose of determining musical similarity. Our descriptors are
based on computing track-wise compression rates of quantised audio features,
using multiple temporal resolutions and quantisation granularities. To verify
that our descriptors capture musically relevant information, we incorporate our
descriptors into similarity rating prediction and song year prediction tasks.
We base our evaluation on a dataset of 15500 track excerpts of Western popular
music, for which we obtain 7800 web-sourced pairwise similarity ratings. To
assess the agreement among similarity ratings, we perform an evaluation under
controlled conditions, obtaining a rank correlation of 0.33 between intersected
sets of ratings. Combined with bag-of-features descriptors, we obtain
performance gains of 31.1% and 10.9% for similarity rating prediction and song
year prediction. For both tasks, analysis of selected descriptors reveals that
representing features at multiple time scales benefits prediction accuracy. Comment: 13 pages, 9 figures, 8 tables. Accepted version
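The idea of track-wise compression rates at multiple temporal resolutions and quantisation granularities can be sketched as follows. This is only an illustrative approximation: it assumes a one-dimensional feature sequence, uniform quantisation, block-average downsampling, and zlib as the compressor, none of which are specified by the abstract. All function names and default parameters are hypothetical.

```python
import zlib


def quantise(values, levels):
    """Uniformly quantise a sequence into `levels` symbols (levels <= 256)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    return bytes(min(levels - 1, int((v - lo) / span * levels)) for v in values)


def downsample(values, factor):
    """Reduce temporal resolution by averaging consecutive blocks."""
    blocks = (values[i:i + factor] for i in range(0, len(values), factor))
    return [sum(block) / len(block) for block in blocks]


def compression_rate(symbols):
    """Compressed size divided by original size; lower means more regular."""
    return len(zlib.compress(symbols, 9)) / len(symbols)


def complexity_descriptor(values, factors=(1, 2, 4), levels=(4, 16)):
    """One compression rate per (time scale, quantisation level) pair."""
    return [
        compression_rate(quantise(downsample(values, f), n))
        for f in factors
        for n in levels
    ]


if __name__ == "__main__":
    periodic = [i % 4 for i in range(512)]
    print(complexity_descriptor(periodic))
```

A strongly repetitive sequence yields low rates at every scale, whereas an irregular one compresses poorly; the resulting vector could then be fed to a downstream predictor alongside other descriptors.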
Towards a style-specific basis for computational beat tracking
Outlined in this paper are a number of sources of evidence, from psychological, ethnomusicological and engineering grounds, to suggest that current approaches to computational beat tracking are incomplete. It is contended that the degree to which cultural knowledge, that is, the specifics of style and associated learnt representational schema, underlie the human faculty of beat tracking has been severely underestimated. Difficulties in building general beat tracking solutions, which can provide both period and phase locking across a large corpus of styles, are highlighted. It is probable that no universal beat tracking model exists which does not utilise a switching model to recognise style and context prior to application
Information-Theoretic Measures of Predictability for Music Content Analysis.
PhD thesis. This thesis is concerned with determining similarity in musical audio, for the purpose of applications
in music content analysis. With the aim of determining similarity, we consider the
problem of representing temporal structure in music. To represent temporal structure, we propose
to compute information-theoretic measures of predictability in sequences. We apply our
measures to track-wise representations obtained from musical audio; thereafter we consider the
obtained measures predictors of musical similarity. We demonstrate that our approach benefits
music content analysis tasks based on musical similarity.
For the intermediate-specificity task of cover song identification, we compare contrasting
discrete-valued and continuous-valued measures of pairwise predictability between sequences.
In the discrete case, we devise a method for computing the normalised compression distance
(NCD) which accounts for correlation between sequences. We observe that our measure improves
average performance over NCD, for sequential compression algorithms. In the continuous
case, we propose to compute information-based measures as statistics of the prediction error
between sequences. Evaluated using 300 Jazz standards and using the Million Song Dataset,
we observe that continuous-valued approaches outperform discrete-valued approaches. Further,
we demonstrate that continuous-valued measures of predictability may be combined to improve
performance with respect to baseline approaches. Using a filter-and-refine approach, we demonstrate
state-of-the-art performance using the Million Song Dataset.
For the low-specificity tasks of similarity rating prediction and song year prediction, we propose
descriptors based on computing track-wise compression rates of quantised audio features,
using multiple temporal resolutions and quantisation granularities. We evaluate our descriptors
using a dataset of 15 500 track excerpts of Western popular music, for which we have 7 800
web-sourced pairwise similarity ratings. Combined with bag-of-features descriptors, we obtain
performance gains of 31.1% and 10.9% for similarity rating prediction and song year prediction.
For both tasks, analysis of selected descriptors reveals that representing features at multiple time
scales benefits prediction accuracy. This work was supported by a UK EPSRC DTA studentship.
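For reference, the standard normalised compression distance (NCD) named in the abstract above can be sketched as below. This is the plain textbook form, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), and not the thesis's modified variant that accounts for correlation between sequences. The use of zlib as the compressor C and the function names are assumptions made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for C)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard normalised compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # C of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    a = bytes((i * i) % 251 for i in range(2000))
    b = bytes((i * i * i + 7) % 241 for i in range(2000))
    print(ncd(a, a), ncd(a, b))
```

Values near 0 suggest that one sequence is largely predictable from the other, while values near 1 suggest little shared structure; with a real compressor the distance of a sequence to itself is small but usually not exactly 0.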
Automatic Drum Transcription and Source Separation
While research has been carried out on automated polyphonic music transcription, to date the problem of automated polyphonic percussion transcription has not received the same degree of attention. A related problem is that of sound source separation, which attempts to separate a mixture signal into its constituent sources. This thesis focuses on the task of polyphonic percussion transcription and sound source separation of a limited set of drum instruments, namely the drums found in the standard rock/pop drum kit. As there was little previous research on polyphonic percussion transcription, a broad review of music information retrieval methods, including previous polyphonic percussion systems, was also carried out to determine whether any methods were of potential use in the area of polyphonic drum transcription. Following on from this, a review was conducted of general source separation and redundancy reduction techniques, such as Independent Component Analysis and Independent Subspace Analysis, as these techniques have shown potential in separating mixtures of sources. Upon completion of the review, it was decided that a combination of the blind separation approach, Independent Subspace Analysis (ISA), with the use of prior knowledge as used in music information retrieval methods, was the best approach to tackling the problem of polyphonic percussion transcription as well as that of sound source separation. A number of new algorithms which combine the use of prior knowledge with the source separation abilities of techniques such as ISA are presented. These include sub-band ISA, Prior Subspace Analysis (PSA), and an automatic modelling and grouping technique which is used in conjunction with PSA to perform polyphonic percussion transcription. These approaches are demonstrated to be effective in the task of polyphonic percussion transcription, and PSA is also demonstrated to be capable of transcribing drums in the presence of pitched instruments.
Feature Extraction for Music Information Retrieval
Copyright © 2009 Jesper Højvang Jensen, except where otherwise stated.