14 research outputs found

Improving generalization for polyphonic piano transcription

Authors: Ellis, Daniel P. W.; Poliner, Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

In this paper, we present methods to improve the generalization capabilities of a classification-based approach to polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances, and the independent classifications are temporally constrained via hidden Markov model post-processing. Semi-supervised learning and multiconditioning are investigated, and transcription results are reported for a compiled set of piano recordings. Combining multiconditioning and semi-supervised classification reduced the frame-level transcription error score by 10%.

Columbia University Academic Commons

A Discriminative Model for Polyphonic Piano Transcription

Authors: Ellis, Daniel P. W.; Poliner, Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2006

We present a discriminative model for polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances. The classifier outputs are temporally constrained via hidden Markov models, and the proposed system is used to transcribe both synthesized and real piano recordings. A frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.
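
The temporal constraint mentioned in this abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes one two-state (off/on) hidden Markov model per note, treats the per-frame classifier outputs as state likelihoods, and decodes with the Viterbi algorithm. The function name `viterbi_smooth` and the values `p_stay=0.9` and `prior_on=0.5` are hypothetical choices made only for illustration.

```python
import numpy as np

def viterbi_smooth(p_on, p_stay=0.9, prior_on=0.5):
    """Decode the most likely off(0)/on(1) sequence for one note.

    p_on is a non-empty sequence of per-frame "note is on" probabilities.
    p_stay and prior_on are illustrative values, not taken from the paper.
    """
    p_on = np.clip(np.asarray(p_on, dtype=float), 1e-6, 1.0 - 1e-6)
    # Assumption: classifier outputs stand in for per-frame state likelihoods.
    log_emit = np.log(np.stack([1.0 - p_on, p_on], axis=1))   # shape (T, 2)
    log_trans = np.log(np.array([[p_stay, 1.0 - p_stay],
                                 [1.0 - p_stay, p_stay]]))    # [from, to]
    log_prior = np.log(np.array([1.0 - prior_on, prior_on]))

    n_frames = len(p_on)
    score = np.empty((n_frames, 2))
    back = np.zeros((n_frames, 2), dtype=int)
    score[0] = log_prior + log_emit[0]
    for t in range(1, n_frames):
        cand = score[t - 1][:, None] + log_trans   # cand[i, j]: state i -> j
        back[t] = np.argmax(cand, axis=0)          # best predecessor per state
        score[t] = np.max(cand, axis=0) + log_emit[t]

    # Trace the best path backwards from the final frame.
    path = np.empty(n_frames, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With a high self-transition probability, a single-frame spike or dropout in the classifier output tends to be absorbed into its neighbours, which is the kind of smoothing the abstract refers to.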

CiteSeerX

Columbia University Academic Commons

Springer - Publisher Connector

Directory of Open Access Journals

Identifying 'Cover Songs' with Chroma Features and Dynamic Programming Beat Tracking

Authors: Ellis, Daniel P. W.; Poliner, Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

Large music collections, ranging from thousands to millions of tracks, are unsuited to manual searching, motivating the development of automatic search methods. When different musicians perform the same underlying song or piece, these are known as 'cover' versions. We describe a system that attempts to identify such a relationship between music audio recordings. To overcome variability in tempo, we use beat tracking to describe each piece with one feature vector per beat. To deal with variation in instrumentation, we use 12-dimensional 'chroma' feature vectors that collect spectral energy supporting each semitone of the octave. To compare two recordings, we simply cross-correlate the entire beat-by-chroma representation for two tracks and look for sharp peaks indicating good local alignment between the pieces. Evaluation on several databases indicates good performance, including best performance on an independent international evaluation, where the system achieved a mean reciprocal ranking of 0.49 for true cover versions among top-10 returns.
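
The comparison step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published system: it takes two beat-by-chroma matrices of shape (12, number of beats) as given, removes each matrix's mean and scales it to unit norm (a choice made here so that the peak stands out), and sums the per-chroma-bin cross-correlations over all beat lags. The function name `beat_chroma_xcorr` is hypothetical, and beat tracking and chroma extraction are outside the sketch.

```python
import numpy as np

def beat_chroma_xcorr(a, b):
    """Cross-correlate two (12, n_beats) chroma matrices over beat lags.

    Returns (lags, xc). A peak at lag k means that beat n + k of `a`
    aligns with beat n of `b`.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Assumption: centre and scale each matrix so peaks are comparable.
    a = a - a.mean()
    b = b - b.mean()
    a = a / (np.linalg.norm(a) + 1e-12)
    b = b / (np.linalg.norm(b) + 1e-12)
    # Sum the 1-D cross-correlations of matching chroma bins.
    xc = np.zeros(a.shape[1] + b.shape[1] - 1)
    for k in range(a.shape[0]):
        xc += np.correlate(a[k], b[k], mode="full")
    lags = np.arange(-(b.shape[1] - 1), a.shape[1])
    return lags, xc

# Synthetic check: embed `b` inside a longer random track at beat 7.
rng = np.random.default_rng(0)
b = rng.random((12, 20))
a = np.concatenate([rng.random((12, 7)), b, rng.random((12, 5))], axis=1)
lags, xc = beat_chroma_xcorr(a, b)
best_lag = int(lags[np.argmax(xc)])
```

In this synthetic case the strongest peak should fall at the lag where the embedded excerpt starts. Real recordings would also need handling for key transposition and tempo errors, which this sketch leaves out.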

Crossref

Columbia University Academic Commons

Identifying "Cover Songs" with Beat-Synchronous Chroma Features

Authors: Ellis, Daniel P. W.; Poliner, Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

Describes the problem of identifying cover songs, how to calculate chroma features and track beats with dynamic programming, and how to match beat-chroma matrices.

Columbia University Academic Commons

A Classification Approach to Melody Transcription

Authors: Ellis, Daniel P. W.; Poliner, Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2005

Melodies provide an important conceptual summarization of polyphonic audio. The extraction of melodic content has practical applications ranging from content-based audio retrieval to the analysis of musical structure. In contrast to previous transcription systems based on a model of the harmonic (or periodic) structure of musical pitches, we present a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data. We evaluate our algorithm by predicting the melody of the ISMIR 2004 Melody Competition evaluation set and of newly generated test data. We show that a Support Vector Machine melodic classifier produces results comparable to state-of-the-art model-based transcription systems.

CiteSeerX

Columbia University Academic Commons

Melody Transcription From Music Audio: Approaches and Evaluation

Authors: Ehmann, Andreas F.; Ellis, Daniel P. W.; Gomez, Emilia; Ong, Beesuan; Poliner, Graham E.; Streich, Sebastian
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

Although the process of analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that may be tractably extracted and yet still enable interesting applications. We discuss melody (roughly, the part a listener might whistle or hum) as one such reduced descriptor of music audio, and consider how to define it and what it might be used for. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence or absence of the melody. Melodies transcribed at this level are readily recognizable, and show promise for practical applications.

Columbia University Academic Commons

Support Vector Machine Active Learning for Music Retrieval

Authors: Ellis, Daniel P. W.; Mandel, Michael I.; Poliner, Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2006

Searching and organizing growing digital music collections requires a computational model of music similarity. This paper describes a system for performing flexible music similarity queries using SVM active learning. We evaluated our system by classifying 1210 pop songs according to mood and style (from an online music guide) and by performing artist. In comparing a number of representations for songs, we found the statistics of mel-frequency cepstral coefficients to perform best in precision-at-20 comparisons. We also show that by choosing training examples intelligently, active learning requires half as many labeled examples to achieve the same accuracy as a standard scheme.

Columbia University Academic Commons

Melody Transcription From Music Audio: Approaches and Evaluation

Authors: Andreas F. Ehmann; Beesuan Ong; Daniel P. W. Ellis; Emilia Gomez; Graham E. Poliner; Sebastian Streich
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Classification-based melody transcription

Authors: Daniel P. W. Ellis; Graham E. Poliner
Publication date: 01/01/2006

The melody of a musical piece – informally, the part you would hum along with – is a useful and compact summary of a full audio recording. The extraction of melodic content has practical applications ranging from content-based audio retrieval to the analysis of musical structure. Whereas previous systems generate transcriptions based on a model of the harmonic (or periodic) structure of musical pitches, we present a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data. We evaluate our algorithm by predicting the melody of the ADC 2004 Melody Competition evaluation set, and we show that a simple frame-level note classifier, temporally smoothed by post-processing with a hidden Markov model, produces results comparable to state-of-the-art model-based transcription systems.

CiteSeerX