Deep Polyphonic ADSR Piano Note Transcription
We investigate a late-fusion approach to piano transcription, combined with a
strong temporal prior in the form of a handcrafted Hidden Markov Model (HMM).
The network architecture under consideration is compact in terms of its number
of parameters and easy to train with gradient descent. The network outputs are
fused over time in the final stage to obtain note segmentations, with an HMM
whose transition probabilities are chosen based on a model of attack, decay,
sustain, release (ADSR) envelopes, commonly used for sound synthesis. The note
segments are then subject to a final binary decision rule that rejects note
segment hypotheses that are too weak. We obtain state-of-the-art results on the MAPS
dataset, and are able to outperform other approaches by a large margin, when
predicting complete note regions from onsets to offsets.
Comment: 5 pages, 2 figures, published as ICASSP'1
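The temporal fusion step described in this abstract can be illustrated with a minimal sketch for a single pitch. Everything specific here is an assumption made for illustration and is not taken from the paper: the five-state layout, the transition probabilities in TRANS, the mapping from a per-frame activation p to emission scores, and the mean-activation threshold standing in for the final binary decision rule.

```python
import numpy as np

# State indices: 0 silence, 1 attack, 2 decay, 3 sustain, 4 release.
# Hypothetical transition probabilities (rows: from, columns: to).
# Each state may only repeat or advance to the next envelope phase.
TRANS = np.array([
    [0.9, 0.1, 0.0, 0.0, 0.0],  # silence -> silence or attack
    [0.0, 0.5, 0.5, 0.0, 0.0],  # attack  -> attack or decay
    [0.0, 0.0, 0.7, 0.3, 0.0],  # decay   -> decay or sustain
    [0.0, 0.0, 0.0, 0.9, 0.1],  # sustain -> sustain or release
    [0.3, 0.0, 0.0, 0.0, 0.7],  # release -> silence or release
])


def viterbi(log_emis, trans, start=0):
    """Most likely state path; log_emis has shape (frames, states)."""
    n_frames, n_states = log_emis.shape
    with np.errstate(divide="ignore"):
        log_trans = np.log(trans)  # forbidden transitions become -inf
    score = np.full(n_states, -np.inf)
    score[start] = log_emis[0, start]
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans  # cand[i, j]: move from i to j
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(n_states)] + log_emis[t]
    path = [int(np.argmax(score))]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def note_segments(path):
    """Return (onset_frame, offset_frame) pairs for runs of non-silence states."""
    segments, onset = [], None
    for t, state in enumerate(path):
        if state != 0 and onset is None:
            onset = t
        elif state == 0 and onset is not None:
            segments.append((onset, t))
            onset = None
    if onset is not None:
        segments.append((onset, len(path)))
    return segments


# Toy per-frame activations for one pitch (made-up numbers).
p = np.array([0.1, 0.1, 0.9, 0.9, 0.8, 0.8, 0.7, 0.1, 0.1])
# Assumed emission model: silence scores 1 - p, every sounding state scores p.
log_emis = np.log(np.column_stack([1 - p, p, p, p, p]))
path = viterbi(log_emis, TRANS)
segments = note_segments(path)
# Assumed decision rule: keep a segment only if its mean activation is high enough.
strong = [seg for seg in segments if p[seg[0]:seg[1]].mean() >= 0.5]
```

The left-to-right transition structure is what makes this a temporal prior: a note hypothesis has to pass through every envelope phase before returning to silence, so isolated single-frame activations are costly to explain as notes.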
A Parallel Fusion Approach to Piano Music Transcription Based on Convolutional Neural Network
In this paper, a supervised approach based on Convolutional Neural Networks (CNN) for polyphonic piano transcription is presented. The system consists of a pitch detection model, an onset/offset detection model, and a note search model. The pitch detection model is a single-channel CNN that predicts the probabilities of the pitches contained in one frame of the audio. The onset/offset model, based on a dual-channel CNN, estimates the probability of each pitch's onset or offset in a frame. The note search model is rule-based; it integrates the outputs of the pitch model and the onset/offset model to determine the final onset, offset and pitch of the notes in the audio. Two experiments with different dataset conditions are conducted to compare with state-of-the-art approaches on the same datasets. Experimental results reveal that the proposed approach performs better in both frame- and note-based metrics.
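The abstract does not state the rules used by the note search model, so the following is only a sketch of what a rule-based combination of the two model outputs could look like for one pitch. The function name search_notes, the shared threshold, and the rules themselves are hypothetical: a note opens where both the onset and pitch probabilities are high, and closes at a detected offset or when the pitch probability drops.

```python
def search_notes(pitch_p, onset_p, offset_p, thresh=0.5):
    """Combine per-frame probabilities for one pitch into (onset, offset) frames.

    Hypothetical rules, not the paper's: open a note when the onset and pitch
    probabilities both reach thresh; close it at a detected offset or when
    the pitch probability falls below thresh.
    """
    notes, start = [], None
    for t in range(len(pitch_p)):
        if start is None:
            if onset_p[t] >= thresh and pitch_p[t] >= thresh:
                start = t
        elif offset_p[t] >= thresh or pitch_p[t] < thresh:
            notes.append((start, t))
            start = None
    if start is not None:
        # The note is still sounding when the audio ends.
        notes.append((start, len(pitch_p)))
    return notes


# Toy inputs (made-up numbers) for a single pitch over six frames.
pitch_p = [0.1, 0.9, 0.9, 0.8, 0.2, 0.1]
onset_p = [0.0, 0.9, 0.1, 0.0, 0.0, 0.0]
offset_p = [0.0, 0.0, 0.0, 0.7, 0.0, 0.0]
notes = search_notes(pitch_p, onset_p, offset_p)
```

Requiring agreement between the onset and pitch outputs is one simple way to suppress spurious onsets; falling back to the pitch probability for the note end covers frames where the offset detector misses.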