
Flexible Harmonic/Stochastic Modeling for HMM-Based Speech Synthesis

V Jornadas en Tecnología del Habla

By Eleftherios Banos, Daniel Erro, Antonio Bonafonte and Asuncion Moreno

Abstract

This paper presents preliminary results of a new approach to speech modeling for statistical parametric HMM-based speech synthesis. The proposed system is based on a flexible pitch-asynchronous harmonic/stochastic model (HSM) [1], in which speech is modeled as the superposition of two components: a harmonic component and a stochastic (aperiodic) component. Because the synthesis model is pitch-asynchronous, it can be integrated directly into an HMM-based synthesis system. HTS [2], an open-source software toolkit for HMM-based speech synthesis, was used. The proposed HSM method was compared with the HTS baseline system using the same configuration and database, and a number of different experiments were conducted. Results show that the synthesized utterances reach high quality. A small perceptual test compared the two systems on the quality of the synthetic voice and its similarity to the original voice. HSM outperformed the HTS baseline in the quality test: HSM 53%, HTS 35.3%, undecided 11.7%. In similarity to the original voice, HSM performed slightly better than HTS: HSM 35.3%, HTS 29.4%, undecided 35.3%.

Year: 2010
OAI identifier: oai:CiteSeerX.psu:10.1.1.161.1187
Provided by: CiteSeerX