Robust speech/non-speech detection based on LDAderived parameter and voicing parameter for speech recognition in noisy environments

Abstract

Abstract Every speech recognition system contains a speech/non-speech detection stage. Detected speech sequences are only passed through the speech recognition stage later on. In a very noisy environment, the noise detection stage is generally responsible for most of the recognition errors. Indeed, many detected noisy periods can be recognized as a vocabulary word. This manuscript provides solutions to improve the performance of a speech/non-speech detection system in very noisy environment (for both stationary and short-time energetic noise), with an application to the France Télécom system. The improvement we propose are threefold. First, noise reduction is considered in order to reduce stationary noise effects on the speech detection system. Then, in order to decrease detections of noise characterized by brief duration and high energy, two new versions of the speech/non-speech detection stage are proposed. On the one hand, a linear discriminate analysis algorithm applied to the Mel frequency cepstrum coefficients is incorporated in the speech/nonspeech detection algorithm. On the other hand, the use of a voicing parameter is introduced in the speech/non-speech detection in order to reduce the probability of false noise detections

    Similar works