3 research outputs found

    Integrated approaches to prosodic word prediction for Chinese TTS

    We focus on integrated prosodic word prediction for Chinese TTS. To avoid the problem of inconsistency between lexical words and prosodic words in Chinese, lexical word segmentation and prosodic word prediction are treated as one process instead of two independent tasks. Furthermore, two word-based approaches are proposed to drive this integrated prosodic word prediction: the first follows the notion of lexicalized hidden Markov models, and the second is borrowed from unknown word identification for Chinese. The results of our primary experiment show that these integrated approaches are effective.

    Functional timing or rhythmical timing, or both? A corpus study of English and Mandarin duration

    It has long been held that the languages of the world are divided into rhythm classes, so that they are either stress-timed, syllable-timed or mora-timed. It has also long been known that duration serves various informational functions in speech. But it is unclear whether these two uses of duration are complementary to each other or are actually one and the same. Much empirical research has raised questions about the rhythm class hypothesis, owing to the lack of evidence of the suggested isochrony in any language. Yet the alleged cross-language rhythm classification is still widely taken for granted and continues to be researched. Here we conducted a corpus study of English, an archetype of a stress-timed language, and Mandarin, an alleged syllable-timed language, to look for evidence of at least a tendency toward isochrony when much of the informational use of duration is controlled for. We examined the relationship between segment and syllable duration and the relationship between syllable and phrase duration in the two languages. The results show that syllables in English are largely too incompressible to allow stress-timing, because segment duration is too inflexible to allow syllable duration to vary beyond its functional use. Surprisingly, Mandarin does show a small tendency toward both equal syllable duration and equal phrase duration. Additionally, the duration of pre-boundary syllables in English increases linearly with break index, whereas in Mandarin the duration increase stops after break index 2 and is accompanied by the insertion of silent pauses. We conclude, therefore, that timing and duration in speech are predominantly used for encoding information rather than being controlled by a rhythmic principle, and that the residual equal-duration tendency in the two languages examined here shows exactly the opposite pattern from the predictions of the rhythm class hypothesis.