502 research outputs found

    Model-based speech enhancement for hearing aids

    Get PDF

    Model-based Speech Enhancement for Intelligibility Improvement in Binaural Hearing Aids

    Get PDF
    Speech intelligibility is often severely degraded among hearing impaired individuals in situations such as the cocktail party scenario. The performance of the current hearing aid technology has been observed to be limited in these scenarios. In this paper, we propose a binaural speech enhancement framework that takes into consideration the speech production model. The enhancement framework proposed here is based on the Kalman filter that allows us to take the speech production dynamics into account during the enhancement process. The usage of a Kalman filter requires the estimation of clean speech and noise short term predictor (STP) parameters, and the clean speech pitch parameters. In this work, a binaural codebook-based method is proposed for estimating the STP parameters, and a directional pitch estimator based on the harmonic model and maximum likelihood principle is used to estimate the pitch parameters. The proposed method for estimating the STP and pitch parameters jointly uses the information from left and right ears, leading to a more robust estimation of the filter parameters. Objective measures such as PESQ and STOI have been used to evaluate the enhancement framework in different acoustic scenarios representative of the cocktail party scenario. We have also conducted subjective listening tests on a set of nine normal hearing subjects, to evaluate the performance in terms of intelligibility and quality improvement. The listening tests show that the proposed algorithm, even with access to only a single channel noisy observation, significantly improves the overall speech quality, and the speech intelligibility by up to 15%.Comment: after revisio

    <strong>Non-Gaussian, Non-stationary and Nonlinear Signal Processing Methods - with Applications to Speech Processing and Channel Estimation</strong>

    Get PDF

    Speech Modeling and Robust Estimation for Diagnosis of Parkinson’s Disease

    Get PDF

    Reconstruction-based speech enhancement from robust acoustic features

    Get PDF
    This paper proposes a method of speech enhancement where a clean speech signal is reconstructed from a sinusoidal model of speech production and a set of acoustic speech features. The acoustic features are estimated from noisy speech and comprise, for each frame, a voicing classification (voiced, unvoiced or non-speech), fundamental frequency (for voiced frames) and spectral envelope. Rather than using different algorithms to estimate each parameter, a single statistical model is developed. This comprises a set of acoustic models and has similarity to the acoustic modelling used in speech recognition. This allows noise and speaker adaptation to be applied to acoustic feature estimation to improve robustness. Objective and subjective tests compare reconstruction-based enhancement with other methods of enhancement and show the proposed method to be highly effective at removing noise

    Non-Intrusive Speech Intelligibility Prediction

    Get PDF

    Single Channel Speech Enhancement using Kalman Filter

    Get PDF
    The quality and intelligibility of speech conversation are generally degraded by the surrounding noises. The main objective of speech enhancement (SE) is to eliminate or reduce such disturbing noises from the degraded speech. Various SE methods have been proposed in literature. Among them, the Kalman filter (KF) is known to be an efficient SE method that uses the minimum mean square error (MMSE). However, most of the conventional KF based speech enhancement methods need access to clean speech and additive noise information for the state-space model parameters, namely, the linear prediction coefficients (LPCs) and the additive noise variance estimation, which is impractical in the sense that in practice, we can access only the noisy speech. Moreover, it is quite difficult to estimate these model parameters efficiently in the presence of adverse environmental noises. Therefore, the main focus of this thesis is to develop single channel speech enhancement algorithms using Kalman filter, where the model parameters are estimated in noisy conditions. Depending on these parameter estimation techniques, the proposed SE methods are classified into three approaches based on non-iterative, iterative, and sub-band iterative KF. In the first approach, a non-iterative Kalman filter based speech enhancement algorithm is presented, which operates on a frame-by-frame basis. In this proposed method, the state-space model parameters, namely, the LPCs and noise variance, are estimated first in noisy conditions. For LPC estimation, a combined speech smoothing and autocorrelation method is employed. A new method based on a lower-order truncated Taylor series approximation of the noisy speech along with a difference operation serving as high-pass filtering is introduced for the noise variance estimation. The non-iterative Kalman filter is then implemented with these estimated parameters effectively. In order to enhance the SE performance as well as parameter estimation accuracy in noisy conditions, an iterative Kalman filter based single channel SE method is proposed as the second approach, which also operates on a frame-by-frame basis. For each frame, the state-space model parameters of the KF are estimated through an iterative procedure. The Kalman filtering iteration is first applied to each noisy speech frame, reducing the noise component to a certain degree. At the end of this first iteration, the LPCs and other state-space model parameters are re-estimated using the processed speech frame and the Kalman filtering is repeated for the same processed frame. This iteration continues till the KF converges or a maximum number of iterations is reached, giving further enhanced speech frame. The same procedure will repeat for the following frames until the last noisy speech frame being processed. For further improving the speech enhancement performance, a sub-band iterative Kalman filter based SE method is also proposed as the third approach. A wavelet filter-bank is first used to decompose the noisy speech into a number of sub-bands. To achieve the best trade-off among the noise reduction, speech intelligibility and computational complexity, a partial reconstruction scheme based on consecutive mean squared error (CMSE) is proposed to synthesize the low-frequency (LF) and highfrequency (HF) sub-bands such that the iterative KF is employed only to the partially reconstructed HF sub-band speech. Finally, the enhanced HF sub-band speech is combined with the partially reconstructed LF sub-band speech to reconstruct the full-band enhanced speech. Experimental results have shown that the proposed KF based SE methods are capable of reducing adverse environmental noises for a wide range of input SNRs, and the overall performance of the proposed methods in terms of different evaluation metrics is superior to some existing state-of-the art SE methods
    corecore