2 research outputs found

    Iterative Separation of Note Events from Single-Channel Polyphonic Recordings

    Get PDF
    This thesis is concerned with the separation of audio sources from single-channel polyphonic musical recordings using the iterative estimation and separation of note events. Each event is defined as a section of audio containing largely harmonic energy identified as coming from a single sound source. Multiple events can be clustered to form separated sources. This solution is a model-based algorithm that can be applied to a large variety of audio recordings without requiring previous training stages. The proposed system embraces two principal stages. The first one considers the iterative detection and separation of note events from within the input mixture. In every iteration, the pitch trajectory of the predominant note event is automatically selected from an array of fundamental frequency estimates and used to guide the separation of the event's spectral content using two different methods: time-frequency masking and time-domain subtraction. A residual signal is then generated and used as the input mixture for the next iteration. After convergence, the second stage considers the clustering of all detected note events into individual audio sources. Performance evaluation is carried out at three different levels. Firstly, the accuracy of the note-event-based multipitch estimator is compared with that of the baseline algorithm used in every iteration to generate the initial set of pitch estimates. Secondly, the performance of the semi-supervised source separation process is compared with that of another semi-automatic algorithm. Finally, a listening test is conducted to assess the audio quality and naturalness of the separated sources when they are used to create stereo mixes from monaural recordings. Future directions for this research focus on the application of the proposed system to other music-related tasks. Also, a preliminary optimisation-based approach is presented as an alternative method for the separation of overlapping partials, and as a high resolution time-frequency representation for digital signals
    corecore