Search CORE

4 research outputs found

Recommended from our members

A Shift-Invariant Latent Variable Model for Automatic Music Transcription

Author: Bay M.
Benetos E.
Dessein A.
Emmanouil Benetos
Fuentes B.
Goto M.
Grindlay G.
Mysore G.
Poliner G.
Schörkhuber C.
Simon Dixon
Smaragdis P.
Publication venue: 'MIT Press - Journals'
Publication date: 01/12/2012
Field of study

In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source. Thus, this method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note range. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature, using several error metrics

City Research Online

Crossref

Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription

Author: Emmanouil Benetos
Simon Dixon
Student Member
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011
Field of study

In this paper, a method for automatic transcription of music signals based on joint multiple-F0 estimation is proposed. As a time-frequency representation, the constant-Q resonator time-frequency image is employed, while a novel noise suppression technique based on pink noise assumption is applied in a preprocessing step. In the multiple-F0 estimation stage, the optimal tuning and inharmonicity parameters are computed and a salience function is proposed in order to select pitch candidates. For each pitch candidate combination, an overlapping partial treatment procedure is used, which is based on a novel spectral envelope estimation procedure for the log-frequency domain, in order to compute the harmonic envelope of candidate pitches. In order to select the optimal pitch combination for each time frame, a score function is proposed which combines spectral and temporal characteristics of the candidate pitches and also aims to suppress harmonic errors. For postprocessing, hidden Markov models (HMMs) and conditional random fields (CRFs) trained on MIDI data are employed, in order to boost transcription accuracy. The system was trained on isolated piano sounds from the MAPS database and was tested on classic and jazz recordings from the RWC database, as well as on recordings from a Disklavier piano. A comparison with several state-of-the-art systems is provided using a variety of error metrics, where encouraging results are indicated

CiteSeerX

City Research Online

Crossref

Research on Effective Designs and Evaluation for Speech Interface Systems

Author: 西本卓也
Publication venue: [出版者不明]
Publication date: 01/02/2011
Field of study

制度:新 ; 報告番号:乙2305号 ; 学位の種類:博士(工学) ; 授与年月日:2011/2/25 ; 早大学位記番号:新564

Waseda University Repository

Combined audio and video analysis for guitar chord identification

Author: Hrybyk Alex
Publication venue: Drexel University
Publication date
Field of study

This thesis presents a multi-modal approach to automatically identifying guitar chords using audio and video of the performer. Chord identi cation is typically performed by analyzing the audio, using a chroma based feature to extract pitch class information, then identifying the chord with the appropriate label. Even if this method proves perfectly accurate, stringed instruments add extra ambiguity as a single chord or melody may be played in di erent positions on the fretboard. Preserving this information is important, because it signi es the original ngering, and implied \easiest" way to perform the selection. This chord identi cation system combines analysis of audio to determine the general chord scale (i.e. A major, G minor), and video of the guitarist to determine chord voicing (i.e. open, barred, inversion), to accurately identify the guitar chord.M.S., Electrical Engineering -- Drexel University, 201

Drexel Libraries E-Repository and Archives