    Text-based and Signal-based Prediction of Break Indices and Pause Durations

    The relation between symbolic and signal features of prosodic boundaries is studied experimentally using prediction methods. Text-based break index prediction turns out to be fairly good, but signal-based prediction and pause duration prediction perform worse. A possible reason is that random variations in signal features, as usually produced by human speakers, are hard to predict.

    Reducing Audible Spectral Discontinuities

    In this paper, a common problem in diphone synthesis is discussed, viz., the occurrence of audible discontinuities at diphone boundaries. Informal observations show that spectral mismatch is most likely the cause of this phenomenon. We first set out to find an objective spectral measure for discontinuity. To this end, several spectral distance measures are related to the results of a listening experiment. Then, we studied the feasibility of extending the diphone database with context-sensitive diphones to reduce the occurrence of audible discontinuities. The number of additional diphones is limited by clustering consonant contexts that have a similar effect on the surrounding vowels, on the basis of the best-performing distance measure. A listening experiment has shown that the addition of these context-sensitive diphones significantly reduces the number of audible discontinuities.
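
    The abstract does not say which spectral distance measures were compared or which one performed best. As a minimal sketch of the general idea only, the following assumes one simple candidate: the Euclidean distance between the log-magnitude spectra of the frames on either side of a diphone boundary. The function names, the frame length, and the synthetic test tones are all hypothetical and chosen for illustration.

```python
import cmath
import math


def log_magnitude_spectrum(frame, floor=1e-6):
    """Log-magnitude spectrum of one frame via a naive DFT (bins 0..N/2).

    The floor keeps the logarithm finite for near-silent bins.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(math.log(max(abs(acc), floor)))
    return spectrum


def spectral_distance(frame_a, frame_b):
    """Euclidean distance between the log-magnitude spectra of two frames.

    Assumption: this is one illustrative measure, not necessarily one of
    the measures evaluated in the paper.
    """
    spec_a = log_magnitude_spectrum(frame_a)
    spec_b = log_magnitude_spectrum(frame_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))


def make_tone(freq_bin, n=64):
    """Synthetic stand-in for a speech frame: a sine at a given DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


# Last frame of the left diphone and two candidate right-hand frames.
left_end = make_tone(4)
matching_start = make_tone(4)      # same spectrum: no mismatch
mismatched_start = make_tone(12)   # energy in a different bin: mismatch

d_match = spectral_distance(left_end, matching_start)
d_mismatch = spectral_distance(left_end, mismatched_start)
print(d_match, d_mismatch)
```

    In a setting like the one described, such a score would be computed at each diphone join and compared against listener judgements; a measure is useful to the extent that larger distances coincide with joins that listeners hear as discontinuous.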

    The UPC Text-to-Speech System for Spanish and Catalan

    This paper summarizes the text-to-speech system that has been developed in the Speech Group of the Universitat Politècnica de Catalunya (UPC). The system is composed of a core and different interfaces so that it is suitable for research, for telephone applications (either CTI boards or standard ISDN PC cards supporting CAPI), and for Windows applications developed using Microsoft SAPI. The paper reviews the system with emphasis on the parts that are language dependent and that allow the reading of bilingual text (Spanish and Catalan). The paper also presents new approaches to prosodic modeling (segmental duration modeling) and to the generation of the database of speech segments, which were introduced last year.