    Glottal source parametrisation by multi-estimate fusion

    Glottal source information has proven useful in many applications, such as speech synthesis, speaker characterisation, voice transformation and pathological speech diagnosis. However, no single algorithm can currently extract reliable glottal source estimates across a wide range of speech signals. This thesis describes an investigation into glottal source parametrisation, including studies, proposals and evaluations on glottal waveform extraction, glottal source modelling by Liljencrants-Fant (LF) model fitting, and a new multi-estimate fusion framework. Glottal waveform extraction techniques are reviewed first, since extraction is one of the critical steps in voice source parametrisation. A performance study of three existing glottal inverse filtering approaches confirms that no single algorithm consistently outperforms the others or provides a reliable and accurate estimate for different speech signals. The next step is modelling the extracted glottal flow. To estimate the glottal source parameters more accurately, a new time-domain LF-model fitting algorithm based on the extended Kalman filter is proposed. The algorithm is evaluated by comparing it with a standard time-domain method and a spectral approach. Results show that the proposed fitting method is superior to the existing fitting methods. To obtain accurate glottal source estimates for different speech signals, a multi-estimate (ME) fusion framework is proposed. In the framework, different algorithms are applied in parallel to extract multiple sets of LF-model estimates, which are then combined by quantitative data fusion. The ME fusion approach is implemented and tested in several ways. The novel fusion framework is shown to give more reliable glottal LF-model estimates than any single algorithm.
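The abstract does not say which fusion rule the framework uses. As an illustration only, the idea of combining several LF-model estimates can be sketched with inverse-variance weighting, a common data fusion rule that is assumed here and is not taken from the thesis. The function name `fuse_estimates`, the parameter layout and all numbers below are hypothetical.

```python
import numpy as np


def fuse_estimates(estimates, variances):
    """Fuse parameter estimates from several algorithms.

    Assumption: inverse-variance weighting stands in for the thesis's
    "quantitative data fusion", whose actual rule is not given here.

    estimates: array-like, shape (n_algorithms, n_params)
    variances: array-like, same shape, hypothetical error variance of
               each estimate (smaller means more trusted)
    Returns (fused, fused_variance), each of shape (n_params,).
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    if estimates.shape != variances.shape:
        raise ValueError("estimates and variances must have the same shape")
    if np.any(variances <= 0):
        raise ValueError("variances must be strictly positive")

    weights = 1.0 / variances              # more reliable -> larger weight
    weight_sum = weights.sum(axis=0)       # per-parameter normaliser
    fused = (weights * estimates).sum(axis=0) / weight_sum
    fused_variance = 1.0 / weight_sum      # variance of the fused estimate
    return fused, fused_variance


# Hypothetical example: three algorithms, each giving two LF-style parameters.
example_estimates = [[0.30, 1.10], [0.36, 1.00], [0.33, 1.25]]
example_variances = [[0.01, 0.04], [0.02, 0.02], [0.01, 0.08]]
fused, fused_variance = fuse_estimates(example_estimates, example_variances)
```

Under this rule the fused variance is never larger than the smallest input variance, which is one way to read the claim that a fused estimate can be more reliable than any single algorithm, provided the assumed variances are meaningful and the errors are independent.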