
    Glottal waveform synthesis with Volterra shaping functions

    We recently proposed an input-output model of the glottal pulse. Mathematically, the pulse is decomposed into a cosinusoidal input signal and a pair of nonlinear shaping functions; the pulse is recovered when the cosinusoid is passed through the shapers. This article shows that the cycles of a speaker's glottal waveform can be synthesized with the shaping functions of a small number of reference cycles. Indeed, nonlinear systems are not described by a transfer function. Therefore, it may be assumed that the nonlinear shaping functions of a glottal pulse are less variable than the shape of the pulse itself. Two experiments were carried out to test this assumption. In the first, the output static waveforms from a two-mass model of the vocal folds were copied. In the second, the glottis signal obtained from a logatome [ama] spoken by a male speaker was analyzed and synthesized. Each pulse was characterized by its peak amplitude, period and form factor. In both experiments, the features of all the glottal pulses could be copied by calculating the shaper coefficients of just two reference pulses and by adjusting the control parameters of the driving cosinusoid until the output of the shaper exhibited the desired feature values. © 1992.
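
    The decomposition described above can be illustrated with a minimal sketch. The abstract does not say how the two shaping functions are built, so the sketch assumes one common waveshaping construction: the cosine and sine Fourier coefficients of a reference cycle serve as coefficients of Chebyshev polynomials of the first and second kind, using cos(kθ) = T_k(cos θ) and sin(kθ) = sin θ · U_{k-1}(cos θ). The names `fit_shapers` and `shape` are hypothetical, and only a unit-amplitude driving cosinusoid is considered.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def fit_shapers(pulse, n_harmonics):
    """Estimate shaper coefficients from one uniformly sampled cycle.

    Assumption (not from the abstract): the coefficients are the Fourier
    cosine (a) and sine (b) coefficients of the cycle, truncated to
    n_harmonics, with n_harmonics < len(pulse) / 2.
    """
    n = len(pulse)
    spectrum = np.fft.rfft(pulse) / n
    a = np.zeros(n_harmonics + 1)
    b = np.zeros(n_harmonics + 1)
    a[0] = spectrum[0].real
    a[1:] = 2.0 * spectrum[1:n_harmonics + 1].real
    b[1:] = -2.0 * spectrum[1:n_harmonics + 1].imag
    return a, b


def shape(theta, a, b):
    """Pass a unit-amplitude cosinusoid with phase theta through both shapers.

    Output = F(cos theta) + sin(theta) * G(cos theta), where
    F(c) = sum_k a[k] * T_k(c) and G(c) = sum_{k>=1} b[k] * U_{k-1}(c).
    """
    c, s = np.cos(theta), np.sin(theta)
    f = C.chebval(c, a)  # first shaper: Chebyshev series of the first kind
    # Second shaper: Chebyshev polynomials of the second kind, evaluated
    # with the recurrence U_{n+1}(c) = 2 c U_n(c) - U_{n-1}(c).
    g = np.zeros_like(c)
    u_prev, u = np.zeros_like(c), np.ones_like(c)  # U_{-1} = 0, U_0 = 1
    for bk in b[1:]:
        g += bk * u
        u_prev, u = u, 2.0 * c * u - u_prev
    return f + s * g


# Toy reference cycle (illustrative data, not taken from the article).
theta = 2.0 * np.pi * np.arange(64) / 64
reference = 0.5 + 0.4 * np.cos(theta) + 0.2 * np.sin(2 * theta) - 0.1 * np.cos(3 * theta)
a, b = fit_shapers(reference, n_harmonics=3)
resynthesized = shape(theta, a, b)
```

    Because the toy cycle contains only three harmonics, driving the shapers with the same phase values reproduces it up to rounding error. Changing the period of the driving cosinusoid rescales the cycle in time without touching the shaper coefficients, which is the sense in which the shapers are separate from the control parameters.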

    An investigation into glottal waveform based speech coding

    Coding of voiced speech by extraction of the glottal waveform has shown promise in improving the efficiency of speech coding systems. This thesis describes an investigation into the performance of such a system. The effect of reverberation on the radiation impedance at the lips is shown to be negligible under normal conditions. Also, the accuracy of the Image Method for adding artificial reverberation to anechoic speech recordings is established. A new algorithm, Pre-emphasised Maximum Likelihood Epoch Detection (PMLED), for Glottal Closure Instant detection is proposed. The algorithm is tested on natural speech and is shown to be both accurate and robust. Two techniques for glottal waveform estimation, Closed Phase Inverse Filtering (CPIF) and Iterative Adaptive Inverse Filtering (IAIF), are compared. In tandem with an LF model fitting procedure, both techniques display a high degree of accuracy. However, IAIF is found to be slightly more robust. Based on these results, a Glottal Excited Linear Predictive (GELP) coding system for voiced speech is proposed and tested. Using a differential LF parameter quantisation scheme, the system achieves speech quality similar to that of U.S. Federal Standard 1016 CELP at a lower mean bit rate while incurring no extra delay.