67 research outputs found

Recommended from our members

Improving generalization for polyphonic piano transcription

Author: Ellis Daniel P. W.
Poliner Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

In this paper, we present methods to improve the generalization capabilities of a classification-based approach to polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances, and the independent classifications are temporally constrained via hidden Markov model post-processing. Semi-supervised learning and multiconditioning are investigated, and transcription results are reported for a compiled set of piano recordings. Combining multiconditioning and semi-supervised classification reduced the frame-level transcription error score by 10%.

Columbia University Academic Commons

A Discriminative Model for Polyphonic Piano Transcription

Author: Ellis Daniel P. W.
Poliner Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2006

We present a discriminative model for polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances. The classifier outputs are temporally constrained via hidden Markov models, and the proposed system is used to transcribe both synthesized and real piano recordings. A frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.

CiteSeerX

Columbia University Academic Commons

Springer - Publisher Connector

Directory of Open Access Journals
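
As a rough illustration of the temporal-constraint step described in the abstract above, the sketch below smooths one note's frame-level classifier outputs with a two-state (off/on) hidden Markov model decoded by the Viterbi algorithm. It is not the authors' implementation: the function name `viterbi_smooth`, the self-transition probability `p_stay`, the prior `prior_on`, and the use of classifier posteriors directly as emission scores are all assumptions made for illustration.

```python
import math


def viterbi_smooth(p_on, p_stay=0.9, prior_on=0.5):
    """Return the most likely off/on (0/1) path for one note.

    p_on     -- per-frame probability that the note is sounding
                (assumed here to come from a frame-level classifier)
    p_stay   -- hypothetical probability of remaining in the same state
    prior_on -- hypothetical probability that the first frame is "on"
    """
    if not p_on:
        return []
    eps = 1e-12  # keeps log() finite for probabilities of exactly 0 or 1

    def emit(p):
        p = min(max(p, eps), 1.0 - eps)
        return [math.log(1.0 - p), math.log(p)]  # [off, on]

    log_trans = [
        [math.log(p_stay), math.log(1.0 - p_stay)],  # from off
        [math.log(1.0 - p_stay), math.log(p_stay)],  # from on
    ]

    first = emit(p_on[0])
    score = [math.log(1.0 - prior_on) + first[0],
             math.log(prior_on) + first[1]]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1

    for p in p_on[1:]:
        e = emit(p)
        new_score, pointers = [], []
        for s in (0, 1):
            prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s] + e[s])
        back.append(pointers)
        score = new_score

    # Trace the best path backwards from the best final state.
    state = 1 if score[1] > score[0] else 0
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # A one-frame dropout inside a held note is filled in by the smoothing.
    print(viterbi_smooth([0.9, 0.9, 0.2, 0.9, 0.9]))
```

With a high `p_stay`, single-frame dropouts and spikes cost two state switches and are therefore smoothed away, which is the intuition behind constraining independent frame decisions in time.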

Identifying 'Cover Songs' with Chroma Features and Dynamic Programming Beat Tracking

Author: Ellis Daniel P. W.
Poliner Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

Large music collections, ranging from thousands to millions of tracks, are unsuited to manual searching, motivating the development of automatic search methods. When different musicians perform the same underlying song or piece, these are known as 'cover' versions. We describe a system that attempts to identify such a relationship between music audio recordings. To overcome variability in tempo, we use beat tracking to describe each piece with one feature vector per beat. To deal with variation in instrumentation, we use 12-dimensional 'chroma' feature vectors that collect spectral energy supporting each semitone of the octave. To compare two recordings, we simply cross-correlate the entire beat-by-chroma representation for two tracks and look for sharp peaks indicating good local alignment between the pieces. Evaluation on several databases indicates good performance, including best performance on an independent international evaluation, where the system achieved a mean reciprocal ranking of 0.49 for true cover versions among top-10 returns.

Crossref

Columbia University Academic Commons
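
The matching step in the abstract above (cross-correlating two beat-by-chroma representations and looking for a sharp peak) can be sketched as follows. This is a minimal illustration, not the published system: the helper names `one_hot`, `normalize_beats`, `cross_correlate`, and `best_alignment` are hypothetical, each track is assumed to be already reduced to a list of 12-value chroma vectors (one per beat), and the unit-norm scaling of each beat is an assumption. Beat tracking and chroma extraction are not shown.

```python
import math


def one_hot(pitch_class):
    """Toy chroma vector with all energy in one of the 12 semitone bins."""
    return [1.0 if k == pitch_class else 0.0 for k in range(12)]


def normalize_beats(chroma):
    """Scale each beat's chroma vector to unit Euclidean norm."""
    out = []
    for frame in chroma:
        norm = math.sqrt(sum(v * v for v in frame))
        out.append([v / norm if norm > 0 else 0.0 for v in frame])
    return out


def cross_correlate(a, b):
    """Cross-correlate two beat-chroma sequences over all beat lags.

    Returns a dict mapping lag -> sum over i of dot(a[i], b[i + lag]),
    counting only the beats where the two sequences overlap.
    """
    scores = {}
    for lag in range(-(len(a) - 1), len(b)):
        total = 0.0
        for i, frame_a in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += sum(x * y for x, y in zip(frame_a, b[j]))
        scores[lag] = total
    return scores


def best_alignment(a, b):
    """Return (lag, score) of the highest cross-correlation peak."""
    scores = cross_correlate(normalize_beats(a), normalize_beats(b))
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


if __name__ == "__main__":
    song = [one_hot(p) for p in (0, 4, 7, 0, 5, 9)]
    # The "cover" is the same beat sequence preceded by two unrelated beats.
    cover = [one_hot(2), one_hot(2)] + song
    print(best_alignment(song, cover))
```

In this toy case the peak sits at a lag of two beats, where every beat of `song` lines up with its copy in `cover`. A fuller treatment would also need to decide how sharp a peak must be to count as a match, which this sketch leaves open.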

Identifying "Cover Songs" with Beat-Synchronous Chroma Features

Author: Ellis Daniel P. W.
Poliner Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

Describes the problem of cover songs, how to calculate chroma features and track beats with dynamic programming, and how to match beat-chroma matrices.

Columbia University Academic Commons

A Classification Approach to Melody Transcription

Author: Ellis Daniel P. W.
Poliner Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2005

Melodies provide an important conceptual summarization of polyphonic audio. The extraction of melodic content has practical applications ranging from content-based audio retrieval to the analysis of musical structure. In contrast to previous transcription systems based on a model of the harmonic (or periodic) structure of musical pitches, we present a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data. We evaluate the success of our algorithm by predicting the melody of the ISMIR 2004 Melody Competition evaluation set and of newly generated test data. We show that a Support Vector Machine melodic classifier produces results comparable to state-of-the-art model-based transcription systems.

CiteSeerX

Columbia University Academic Commons

Melody Transcription From Music Audio: Approaches and Evaluation

Author: Ehmann Andreas F.
Ellis Daniel P. W.
Gomez Emilia
Ong Beesuan
Poliner Graham E.
Streich Sebastian
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2007

Although the process of analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that may be tractably extracted and yet still enable interesting applications. We discuss melody (roughly, the part a listener might whistle or hum) as one such reduced descriptor of music audio, and consider how to define it and what use it might be. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence or absence of the melody. Melodies transcribed at this level are readily recognizable, and show promise for practical applications.

Columbia University Academic Commons

Support Vector Machine Active Learning for Music Retrieval

Author: Ellis Daniel P. W.
Mandel Michael I.
Poliner Graham E.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2006

Searching and organizing growing digital music collections requires a computational model of music similarity. This paper describes a system for performing flexible music similarity queries using SVM active learning. We evaluated the success of our system by classifying 1210 pop songs according to mood and style (from an online music guide) and by the performing artist. In comparing a number of representations for songs, we found the statistics of mel-frequency cepstral coefficients to perform best in precision-at-20 comparisons. We also show that by choosing training examples intelligently, active learning requires half as many labeled examples to achieve the same accuracy as a standard scheme.

Columbia University Academic Commons
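
For readers unfamiliar with the precision-at-20 measure mentioned in the abstract above, the sketch below computes precision at k for a ranked list of returned items. The function name `precision_at_k` and the convention of dividing by k even when fewer than k items are returned are assumptions; the paper's exact evaluation protocol is not given in the abstract.

```python
def precision_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of the top-k returned items that are relevant.

    ranked_ids   -- item identifiers ordered from best to worst match
    relevant_ids -- identifiers considered correct for the query
    k            -- cutoff rank (20 for precision-at-20)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    hits = sum(1 for item in ranked_ids[:k] if item in relevant)
    # Assumed convention: always divide by k, so short result lists are penalized.
    return hits / k


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]
    print(precision_at_k(ranked, {"a", "c", "z"}, k=4))
```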

Methodology issues concerning the accuracy of kinematic data collection and analysis using the Ariel Performance Analysis System

Author: Carroll Amy E.
Klute Glenn K.
Poliner Jeff
Rajulu Sudhakar
Stanush Julie
Stuart Mark A.
Wilmington R. P.

Kinematics, the study of motion exclusive of the influences of mass and force, is one of the primary methods used for the analysis of human biomechanical systems as well as other types of mechanical systems. The Anthropometry and Biomechanics Laboratory (ABL) in the Crew Interface Analysis section of the Man-Systems Division performs both human body kinematics and mechanical system kinematics using the Ariel Performance Analysis System (APAS). The APAS supports both analysis of analog signals (e.g., force plate data collection) and digitization and analysis of video data. The current evaluations address several methodology issues concerning the accuracy of the kinematic data collection and analysis used in the ABL. This document describes a series of evaluations performed to gain quantitative data pertaining to position and constant angular velocity movements under several operating conditions. Both two-dimensional and three-dimensional data collection and analyses were completed in a controlled laboratory environment using typical hardware setups. In addition, an evaluation was performed to assess the impact of a single-axis camera offset on accuracy. Segment length and positional data exhibited errors within 3 percent when using three-dimensional analysis and within 8 percent when using two-dimensional analysis (Direct Linear Software). Peak angular velocities displayed errors within 6 percent through three-dimensional analyses and exhibited errors of 12 percent when using two-dimensional analysis (Direct Linear Software). The specific results from this series of evaluations and their impacts on the methodology issues of kinematic data collection and analyses are presented in detail. The accuracy levels observed in these evaluations are also presented.

NASA Technical Reports Server

Multiple-instrument polyphonic music transcription using a temporally constrained shift-invariant model

Author: Bay M.
Benetos E.
de Cheveigné A.
Dempster A. P.
Dessein A.
Dixon S.
Emmanouil Benetos
Fuentes B.
Goto M.
Lee C.-T.
Nakano M.
Pertusa A.
Poliner G.
Ryynänen M.
Simon Dixon
Smaragdis P.
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/03/2013

A method for automatic transcription of polyphonic music is proposed in this work that models the temporal evolution of musical tones. The model extends the shift-invariant probabilistic latent component analysis method by supporting the use of spectral templates that correspond to sound states such as attack, sustain, and decay. The order of these templates is controlled using hidden Markov model-based temporal constraints. In addition, the model can exploit multiple templates per pitch and instrument source. The shift-invariant aspect of the model makes it suitable for music signals that exhibit frequency modulations or tuning changes. Pitch-wise hidden Markov models are also utilized in a postprocessing step for note tracking. For training, sound state templates were extracted for various orchestral instruments using isolated note samples. The proposed transcription system was tested on multiple-instrument recordings from various datasets. Experimental results show that the proposed model is superior to a non-temporally constrained model and also outperforms various state-of-the-art transcription systems for the same experiment.

City Research Online

Crossref

Melody Transcription From Music Audio: Approaches and Evaluation

Author: Andreas F. Ehmann
Beesuan Ong
Daniel P. W. Ellis
Emilia Gomez
Graham E. Poliner
Sebastian Streich
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref