Relevant acoustic features of speech signals for natural-to-shouted voice transformation

Abstract

Humans are able to estimate the distance to a talker solely by hearing the voice. Hence, the voice of a talker carries distance information. Indeed, to ensure good communication, humans adjust their voice, mainly by adjusting their vocal effort, according to the talker-to-listener distance. This vocal effort modifies several parameters of the speech signal, especially the prosodic parameters. The main goal of this work is to show that voice transformation techniques make it possible to create a perception of distance in radio communication systems. We aim to transform conversational voices (i.e. modal voice) into whispered voices, representing close-by interlocutors, and into shouted voices, representing faraway interlocutors. The main difficulty of this approach remains finding pertinent cues that indicate the speaker's vocal effort. In this paper we describe the recording of a new database and its analysis, especially for high vocal efforts. Important cues seem to be the intensity dynamics and the fundamental frequency dynamics of the speech signal, and their absolute values.
