Speech enhancement and psychoacoustics

Abstract

The purpose of this presentation is to describe why and how psychoacoustic models are used to design speech enhancement systems that either improve speech intelligibility in the presence of noise (for mobile communication applications, for instance) or denoise noisy speech signals well enough to improve the recognition rate of an automatic speech recognizer (for instance, robot monitoring in noisy environments, hands-free applications on board vehicles, military fast jets and helicopters). Standard methods for denoising speech signals operate in the spectral domain without taking into account the perceptual characteristics of the speech signal to be enhanced. They succeed in improving the signal-to-noise ratio (SNR) but leave an annoying and unpleasant residual noise known as musical noise. For this reason, psychoacoustic models have attracted a great deal of interest over the last few decades. The objective is to improve the perceptual quality of the enhanced speech signal: the psychoacoustic model is used to control the enhancement process in order to find the best trade-off between noise reduction, residual noise and speech distortion. Masking is the main property of the human auditory system used to design perceptually motivated speech enhancement systems.
