Incorporating Duration and Intonation Models in Filipino Speech Synthesis
In this paper we describe the development of an intonation model and a duration model to generate prosody for the Filipino language. Z-scores of normalized durations are used for the duration model, and the Tilt parameters are used for the intonation model. The Filipino Speech Corpus (FSC) is the source of statistical data for modeling duration and intonation. A Classification and Regression Tree (CART) generator is used to build the duration and intonation models. The Harmonic plus Noise Model (HNM) is developed for the FSC. Diphones are concatenated to produce the synthetic speech, and HNM is used to modify the prosody. The synthesized speech is evaluated using the Mean Opinion Score (MOS). Results show that the duration model and the intonation model need improvement. HNM synthesis performs slightly better than TD-PSOLA (time-domain pitch synchronous overlap-add).

APSIPA ASC 2009: Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference. 4-7 October 2009. Sapporo, Japan. Oral session: Speech and Music Processing (5 October 2009)
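The z-score duration representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the durations were normalized or grouped, so computing the statistics per phone is an assumption here, and the function names and toy durations are hypothetical, not taken from the paper or the FSC.

```python
from statistics import mean, pstdev


def phone_stats(durations_by_phone):
    """Return {phone: (mean, std)} from observed durations in seconds.

    Grouping by phone is an assumption made for illustration only.
    """
    return {p: (mean(ds), pstdev(ds)) for p, ds in durations_by_phone.items()}


def to_zscore(phone, duration, stats):
    """Convert a raw duration to a z-score relative to its phone's statistics."""
    mu, sigma = stats[phone]
    if sigma == 0:
        raise ValueError(f"no duration variation observed for phone {phone!r}")
    return (duration - mu) / sigma


def from_zscore(phone, z, stats):
    """Map a z-score (e.g. one predicted by a regression tree) back to seconds."""
    mu, sigma = stats[phone]
    return mu + z * sigma


# Hypothetical toy data: a few observed durations per phone, in seconds.
observed = {"a": [0.08, 0.10, 0.12], "k": [0.04, 0.05, 0.06]}
stats = phone_stats(observed)

z = to_zscore("a", 0.12, stats)
restored = from_zscore("a", z, stats)
```

In a setup like this, a regression tree would be trained to predict the z-score from contextual features, and `from_zscore` would turn the prediction back into a duration at synthesis time.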