5 research outputs found

    Phase-Distortion-Robust Voice-Source Analysis

    Get PDF
    This work concerns itself with the analysis of voiced speech signals, in particular the analysis of the glottal source signal. Following the source-filter theory of speech, the glottal signal is produced by the vibratory behaviour of the vocal folds and is modulated by the resonances of the vocal tract and radiation characteristic of the lips to form the speech signal. As it is thought that the glottal source signal contributes much of the non-linguistic and prosodical information to speech, it is useful to develop techniques which can estimate and parameterise this signal accurately. Because of vocal tract modulation, estimating the glottal source waveform from the speech signal is a blind deconvolution problem which necessarily makes assumptions about the characteristics of both the glottal source and vocal tract. A common assumption is that the glottal signal and/or vocal tract can be approximated by a parametric model. Other assumptions include the causality of the speech signal: the vocal tract is assumed to be a minimum phase system while the glottal source is assumed to exhibit mixed phase characteristics. However, as the literature review within this thesis will show, the error criteria utilised to determine the parameters are not robust to the conditions under which the speech signal is recorded, and are particularly degraded in the common scenario where low frequency phase distortion is introduced. Those that are robust to this type of distortion are not well suited to the analysis of real-world signals. This research proposes a voice-source estimation and parameterisation technique, called the Power-spectrum-based determination of the Rd parameter (PowRd) method. Illustrated by theory and demonstrated by experiment, the new technique is robust to the time placement of the analysis frame and phase issues that are generally encountered during recording. The method assumes that the derivative glottal flow signal is approximated by the transformed Liljencrants-Fant model and that the vocal tract can be represented by an all-pole filter. Unlike many existing glottal source estimation methods, the PowRd method employs a new error criterion to optimise the parameters which is also suitable to determine the optimal vocal-tract filter order. In addition to the issue of glottal source parameterisation, nonlinear phase recording conditions can also adversely affect the results of other speech processing tasks such as the estimation of the instant of glottal closure. In this thesis, a new glottal closing instant estimation algorithm is proposed which incorporates elements from the state-of-the-art techniques and is specifically designed for operation upon speech recorded under nonlinear phase conditions. The new method, called the Fundamental RESidual Search or FRESS algorithm, is shown to estimate the glottal closing instant of voiced speech with superior precision and comparable accuracy as other existing methods over a large database of real speech signals under real and simulated recording conditions. An application of the proposed glottal source parameterisation method and glottal closing instant detection algorithm is a system which can analyse and re-synthesise voiced speech signals. This thesis describes perceptual experiments which show that, iunder linear and nonlinear recording conditions, the system produces synthetic speech which is generally preferred to speech synthesised based upon a state-of-the-art timedomain- based parameterisation technique. In sum, this work represents a movement towards flexible and robust voice-source analysis, with potential for a wide range of applications including speech analysis, modification and synthesis

    Models and analysis of vocal emissions for biomedical applications

    Get PDF
    This book of Proceedings collects the papers presented at the 4th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2005, held 29-31 October 2005, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

    Novel MRI Technologies for Structural and Functional Imaging of Tissues with Ultra-short Tâ‚‚ Values

    Get PDF
    Conventional MRI has several limitations such as long scan durations, motion artifacts, very loud acoustic noise, signal loss due to short relaxation times, and RF induced heating of electrically conducting objects. The goals of this work are to evaluate and improve the state-of-the-art methods for MRI of tissue with short Tâ‚‚, to prove the feasibility of in vivo Concurrent Excitation and Acquisition, and to introduce simultaneous electroglottography measurement during functional lung MRI

    Novel MRI Technologies for Structural and Functional Imaging of Tissues with Ultra-short T2 Values

    Get PDF
    Conventional MRI has several limitations such as long scan durations, motion artifacts, very high acoustic noise levels, signal loss due to short relaxation times, and RF induced heating of electrically conducting objects. The goals of this thesis are to evaluate state-of-the-art methods for MRI of tissue with short relaxation times, to prove the feasibility of CEA in a clinical MRI system, and to introduce a new electrophysiological measurement unit applied simultaneously with lung MRI
    corecore