Cepstral analysis based on the Glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise

King, S.; Maia, R.; Valentini-Botinhao, C.; Yamagishi, J.; Zen, H.

Authors: S. King, R. Maia, C. Valentini-Botinhao, J. Yamagishi, H. Zen
Publication date: 1 January 2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Abstract

In this paper we introduce a new cepstral coefficient extraction method based on the Glimpse Proportion measure, an intelligibility measure for speech in noise. The method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and then show how we approximated it so that it could be integrated into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. We then demonstrate how the new method changes the modelled spectrum according to the characteristics of the noise, and report results of a listening test with vocoded and HMM-based synthetic speech. The test indicates that the proposed method can significantly improve the intelligibility of synthetic speech in speech-shaped noise.

Index Terms: cepstral coefficient extraction, objective measure for speech intelligibility, Lombard speech, HMM-based speech synthesis
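The abstract names the Glimpse Proportion measure without defining it. As a rough illustration only, the sketch below counts the fraction of time-frequency cells in which the speech level exceeds the noise level by a fixed margin, which is the usual reading of a "glimpse". The function names, the input layout (lists of frames of per-channel levels in dB) and the 3 dB default margin are assumptions made for this example, not details taken from the paper. The second function replaces the hard threshold with a logistic function, one common way to make such a count smooth enough for optimisation; whether this matches the approximation the authors used is not stated in the abstract.

```python
import math


def glimpse_proportion(speech_db, noise_db, threshold_db=3.0):
    """Fraction of time-frequency cells where speech exceeds noise by threshold_db.

    speech_db, noise_db: lists of frames, each frame a list of per-channel
    levels in dB (hypothetical inputs, e.g. auditory filterbank outputs).
    threshold_db: local SNR margin; 3 dB is an assumed default.
    """
    total = 0
    glimpsed = 0
    for s_frame, n_frame in zip(speech_db, noise_db):
        if len(s_frame) != len(n_frame):
            raise ValueError("speech and noise frames must have equal length")
        for s, n in zip(s_frame, n_frame):
            total += 1
            if s > n + threshold_db:
                glimpsed += 1
    if total == 0:
        raise ValueError("no time-frequency cells given")
    return glimpsed / total


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_glimpse_proportion(speech_db, noise_db, threshold_db=3.0, slope=1.0):
    """Smooth variant: each cell contributes a value in (0, 1) instead of 0 or 1.

    slope controls how sharply the logistic approaches the hard threshold;
    it is an illustrative parameter, not one defined in the paper.
    """
    total = 0
    score = 0.0
    for s_frame, n_frame in zip(speech_db, noise_db):
        if len(s_frame) != len(n_frame):
            raise ValueError("speech and noise frames must have equal length")
        for s, n in zip(s_frame, n_frame):
            total += 1
            score += _sigmoid(slope * (s - n - threshold_db))
    if total == 0:
        raise ValueError("no time-frequency cells given")
    return score / total


# Toy example: two frames, three channels, flat 50 dB noise.
speech = [[60.0, 50.0, 40.0], [55.0, 65.0, 45.0]]
noise = [[50.0, 50.0, 50.0], [50.0, 50.0, 50.0]]
hard = glimpse_proportion(speech, noise)       # 3 of 6 cells clear the margin
soft = soft_glimpse_proportion(speech, noise)  # close to the hard value
```

In the toy example, three of the six cells exceed the noise by more than 3 dB, so the hard measure is 0.5, and the smooth variant gives a nearby value that varies continuously with the speech levels.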