Text-based and Signal-based Prediction of Break Indices and Pause Durations

Authors: Pfitzinger Hartmut R.; Reichel Uwe D.
Publication date: 01/01/2006

The relation between symbolic and signal features of prosodic boundaries is studied experimentally using prediction methods. Text-based break index prediction turns out to be fairly good, but signal-based prediction and pause duration prediction perform worse. A possible reason is that random variations in signal features, as usually produced by humans, are hard to predict.

CiteSeerX

Open Access LMU

Reducing Audible Spectral Discontinuities

Authors: Klabbers Esther; Veldhuis Raymond
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2001

In this paper, a common problem in diphone synthesis is discussed, viz., the occurrence of audible discontinuities at diphone boundaries. Informal observations show that spectral mismatch is most likely the cause of this phenomenon. We first set out to find an objective spectral measure for discontinuity. To this end, several spectral distance measures are related to the results of a listening experiment. Then, we studied the feasibility of extending the diphone database with context-sensitive diphones to reduce the occurrence of audible discontinuities. The number of additional diphones is limited by clustering consonant contexts that have a similar effect on the surrounding vowels, on the basis of the best-performing distance measure. A listening experiment has shown that the addition of these context-sensitive diphones significantly reduces the number of audible discontinuities.

Crossref

Pure OAI Repository

University of Twente Research Information

The UPC Text-to-Speech System for Spanish and Catalan

Authors: Bonafonte Cávez Antonio; Esquerra Llucià Ignasi; Febrer A; Rodríguez Fonollosa José Adrián; Vallverdú Bayés Sisco
Publication venue: 'The International Fiscal Association of Korea'
Publication date: 01/01/1998

This paper summarizes the text-to-speech system that has been developed in the Speech Group of the Universitat Politècnica de Catalunya (UPC). The system is composed of a core and different interfaces so that it is suitable for research, for telephone applications (either CTI boards or standard ISDN PC cards supporting CAPI), and for Windows applications developed using Microsoft SAPI. The paper reviews the system, emphasizing the parts that are language dependent and that allow the reading of bilingual text (Spanish and Catalan). The paper also presents new approaches to prosodic modeling (segmental duration modeling) and to the generation of the database of speech segments, which were introduced last year.

UPCommons. Portal del coneixement obert de la UPC