The VUB Blizzard Challenge 2010 Entry: Towards Automatic Voice Building

By Lukas Latacz, Wesley Mattheyses and Werner Verhelst

Abstract

In this paper we describe the voices we submitted to the 2010 Blizzard Challenge, a yearly challenge to evaluate auditory speech synthesis on common data. One of the goals of a data-driven synthesizer, such as ours, is to generalize the speech database in such a way that it allows a realistic rendition of unseen input text. The two main changes to our system, compared to previous submissions, are the inclusion of an HMM-based acoustic prosody model and the automatic training of context-dependent target cost weights. These weights are estimated for each individual target during synthesis and depend on the linguistic features of these targets, which encompass their broader linguistic context. Another new aspect of our synthesizer is the ability to synthesize Mandarin Chinese speech. Its evaluation helps us assess the quality of our synthesizer for languages unfamiliar to the voice developers. Evaluation results and possible improvements to our synthesizer are also discussed.

Index Terms: speech synthesis, unit selection, weight training, evaluation

Year: 2011
OAI identifier: oai:CiteSeerX.psu:10.1.1.193.9876
Provided by: CiteSeerX