
    The interaction between articulation and tones in Cantonese

    "A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2009." Thesis (B.Sc)--University of Hong Kong, 2009. Includes bibliographical references (p. 27-30).

    The perception of intonation questions and statements in Cantonese

    In tone languages there are potential conflicts in the perception of lexical tone and intonation, as both depend mainly on differences in fundamental frequency (F0) patterns. The present study investigated the acoustic cues associated with the perception of sentences as questions or statements in Cantonese, as a function of the lexical tone in sentence-final position. Cantonese listeners performed intonation identification tasks involving complete sentences, isolated final syllables, and sentences without the final syllable (carriers). Sensitivity (d′ scores) was similar for complete sentences and final syllables but was significantly lower for carriers. Sensitivity was also affected by tone identity. These findings show that the perception of questions and statements relies primarily on the F0 characteristics of the final syllables (local F0 cues). A measure of response bias (c) provided evidence for a general bias toward the perception of statements. Logistic regression analyses showed that utterances were accurately classified as questions or statements by using average F0 and F0 interval. Average F0 of carriers (global F0 cue) was also found to be a reliable secondary cue. These findings suggest that the use of F0 cues for the perception of intonation questions in tonal languages is likely to be language-specific. © 2011 Acoustical Society of America.
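    The abstract reports sensitivity (d′) and response bias (c) without giving formulas. As a rough sketch, the standard signal detection definitions can be computed as below. This is not the study's own analysis: treating "question" as the signal category, the log-linear correction, and the function name `sdt_measures` are all assumptions made for illustration.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion_c) for a two-choice identification task.

    Assumption: "question" is the signal category, so a hit is a question
    identified as a question and a false alarm is a statement identified
    as a question.
    """
    # Log-linear correction (an assumption, not taken from the abstract):
    # adding 0.5 to each cell keeps the rates away from exactly 0 or 1,
    # where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF

    d_prime = z(hit_rate) - z(fa_rate)             # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias c
    return d_prime, criterion
```

    Under this convention a positive c means listeners respond "statement" more readily than "question", which is the direction of bias the abstract describes.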

    Pre-Low Raising in Japanese Pitch Accent

    Japanese has been observed to have 2 versions of the H tone, the higher of which is associated with an accented mora. However, the distinction of these 2 versions only surfaces in context but not in isolation, leading to a long-standing debate over whether there is 1 H tone or 2. This article reports evidence that the higher version may result from a pre-low raising mechanism rather than being inherently higher. The evidence is based on an analysis of F0 of words that varied in length, accent condition and syllable structure, produced by native speakers of Japanese at 2 speech rates. The data indicate a clear separation between effects that are due to mora-level preplanning and those that are mechanical. These results are discussed in terms of mechanisms of laryngeal control during tone production, and highlight the importance of articulation as a link between phonology and surface acoustics.

    The effects of Cantonese tones on vocal attack time (VAT)

    "A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2009." Thesis (B.Sc)--University of Hong Kong, 2009. Includes bibliographical references (p. 27-30).

    Effect of tones on voice onset time (VOT) in Cantonese aspirated stops

    "A dissertation submitted in partial fulfillment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2010." Thesis (B.Sc)--University of Hong Kong, 2010. Includes bibliographical references (p. 22-24).

    The study investigated the possible interaction between lexical tone and the VOT of aspirated stops produced at six different lexical tones (high falling, high rising, mid level, mid-low falling, mid-low rising and mid-low level) in Cantonese. A total of 27 male Cantonese speakers were recruited and instructed to read phrases containing target CV syllables formed by the aspirated Cantonese stops (/pʰ/, /tʰ/, and /kʰ/) and the vowel /a/ at the six tones. VOT analysis revealed that, across aspirated stops, tones in the upper tone register had shorter VOT values while those in the lower tone register had longer VOT values. In particular, the mid-low rising tone showed a longer VOT than all other tones. This finding confirms an interaction between VOT and tone during Cantonese stop production.

    Prosody analysis and modeling for Cantonese text-to-speech.

    Li Yu Jia. Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. Includes bibliographical references. Abstracts in English and Chinese.

    Chapter 1. Introduction
    1.1. TTS Technology
    1.2. Prosody
    1.2.1. What is Prosody
    1.2.2. Prosody from Different Perspectives
    1.2.3. Acoustical Parameters of Prosody
    1.2.4. Prosody in TTS
    1.2.4.1. Analysis
    1.2.4.2. Modeling
    1.2.4.3. Evaluation
    1.3. Thesis Objectives
    1.4. Thesis Outline
    Reference

    Chapter 2. Cantonese
    2.1. The Cantonese Dialect
    2.1.1. Phonology
    2.1.1.1. Initial
    2.1.1.2. Final
    2.1.1.3. Tone
    2.1.2. Phonological Constraints
    2.2. Tones in Cantonese
    2.2.1. Tone System
    2.2.2. Linguistic Significance
    2.2.3. Acoustical Realization
    2.3. Prosodic Variation in Continuous Cantonese Speech
    2.4. Cantonese Speech Corpus - CUProsody
    Reference

    Chapter 3. F0 Normalization
    3.1. F0 in Speech Production
    3.2. F0 Extraction
    3.3. Duration-normalized Tone Contour
    3.4. F0 Normalization
    3.4.1. Necessity and Motivation
    3.4.2. F0 Normalization
    3.4.2.1. Methodology
    3.4.2.2. Assumptions
    3.4.2.3. Estimation of Relative Tone Ratios
    3.4.2.4. Derivation of Phrase Curve
    3.4.2.5. Normalization of Absolute F0 Values
    3.4.3. Experiments and Discussion
    3.5. Conclusions
    Reference

    Chapter 4. Acoustical F0 Analysis
    4.1. Methodology of F0 Analysis
    4.1.1. Analysis-by-Synthesis
    4.1.2. Acoustical Analysis
    4.2. Acoustical F0 Analysis for Cantonese
    4.2.1. Analysis of Phrase Curves
    4.2.2. Analysis of Tone Contours
    4.2.2.1. Context-independent Single-tone Contours
    4.2.2.2. Contextual Variation
    4.2.2.3. Co-articulated Tone Contours of Disyllabic Word
    4.2.2.4. Cross-word Contours
    4.2.2.5. Phrase-initial Tone Contours
    4.3. Summary
    Reference

    Chapter 5. Prosody Modeling for Cantonese Text-to-Speech
    5.1. Parametric Model and Non-parametric Model
    5.2. Cantonese Text-to-Speech: Baseline System
    5.2.1. Sub-syllable Unit
    5.2.2. Text Analysis Module
    5.2.3. Acoustical Synthesis
    5.2.4. Prosody Module
    5.3. Enhanced Prosody Model
    5.3.1. Modeling Tone Contours
    5.3.1.1. Word-level F0 Contours
    5.3.1.2. Phrase-initial Tone Contours
    5.3.1.3. Tone Contours at Word Boundary
    5.3.2. Modeling Phrase Curves
    5.3.3. Generation of Continuous F0 Contours
    5.4. Summary
    Reference

    Chapter 6. Performance Evaluation
    6.1. Introduction to Perceptual Test
    6.1.1. Aspects of Evaluation
    6.1.2. Methods of Judgment Test
    6.1.3. Problems in Perceptual Test
    6.2. Perceptual Tests for Cantonese TTS
    6.2.1. Intelligibility Tests
    6.2.1.1. Method
    6.2.1.2. Results
    6.2.1.3. Analysis
    6.2.2. Naturalness Tests
    6.2.2.1. Word-level
    6.2.2.1.1. Method
    6.2.2.1.2. Results
    6.2.2.1.3. Analysis
    6.2.2.2. Sentence-level
    6.2.2.2.1. Method
    6.2.2.2.2. Results
    6.2.2.2.3. Analysis
    6.3. Conclusions
    6.4. Summary
    Reference

    Chapter 7. Conclusions and Future Work
    7.1. Conclusions
    7.2. Suggested Future Work

    Appendix
    Appendix 1. Linear Regression
    Appendix 2. 36 Templates of Cross-word Contours
    Appendix 3. Word List for Word-level Tests
    Appendix 4. Syllable Occurrence in Word List of Intelligibility Test
    Appendix 5. Wrongly Identified Word List
    Appendix 6. Confusion Matrix
    Appendix 7. Unintelligible Word List
    Appendix 8. Noisy Word List
    Appendix 9. Sentence List for Naturalness Test

    How tone, intonation and emotion shape the development of infants' fundamental frequency perception

    Fundamental frequency (ƒ0), perceived as pitch, is the first and arguably most salient auditory component humans are exposed to from the beginning of life. It carries multiple linguistic (e.g., word meaning) and paralinguistic (e.g., speakers’ emotion) functions in speech and communication. The mappings between these functions and ƒ0 features vary within a language and differ cross-linguistically. For instance, a rising pitch can be perceived as a question in English but as a lexical tone in Mandarin. Such variations mean that infants must learn the specific mappings based on their respective linguistic and social environments. To date, canonical theoretical frameworks and most empirical studies have not considered the multi-functionality of ƒ0, but have typically focused on individual functions. More importantly, despite the eventual mastery of ƒ0 in communication, it is unclear how infants learn to decompose and recognize these overlapping functions carried by ƒ0. In this paper, we review the symbioses and synergies of the lexical, intonational, and emotional functions that can be carried by ƒ0 and are being acquired throughout infancy. On the basis of our review, we put forward the Learnability Hypothesis that infants decompose and acquire multiple ƒ0 functions through native/environmental experiences. Under this hypothesis, we propose representative cases such as the synergy scenario, where infants use visual cues to disambiguate and decompose the different ƒ0 functions. Further, viable ways to test the scenarios derived from this hypothesis are suggested across auditory and visual modalities. Discovering how infants learn to master the diverse functions carried by ƒ0 can increase our understanding of linguistic systems, auditory processing and communication functions.

    Effect of tone on vocal attack time in Cantonese-speaking children

    Cantonese tones have been shown to have a significant effect on vocal attack time (VAT) in adult Cantonese speakers, with males producing greater VAT values than females (Ma et al., 2012). The present study aims to investigate the effect of tone on VAT in Cantonese-speaking children. Sound pressure (SP) and electroglottographic (EGG) recordings were collected from 55 native Cantonese-speaking children. Twenty-six six-year-old and 29 nine-year-old children were asked to read aloud six monosyllabic or disyllabic words which together contained all the Cantonese tones. One word was presented at a time, and children were asked to read the word immediately after the presentation. Results revealed significant differences between some contour tone (tone 2 and tone 4) and level tone (tone 1) pairs. Age and gender showed no significant effect on VAT values. Children demonstrated a clearly different VAT profile compared with adult Cantonese speakers. The results support the idea that contour tones require more complicated pre-phonatory laryngeal settings. Different VAT patterns between children and adults suggest that they adopt different laryngeal adjustment strategies during phonation onset.

    Toward invariant functional representations of variable surface fundamental frequency contours: Synthesizing speech melody via model-based stochastic learning

    Variability has been one of the major challenges for both theoretical understanding and computer synthesis of speech prosody. In this paper we show that economical representation of variability is the key to effective modeling of prosody. Specifically, we report the development of PENTAtrainer, a trainable yet deterministic prosody synthesizer based on an articulatory–functional view of speech. We show with testing results on Thai, Mandarin and English that it is possible to achieve high-accuracy predictive synthesis of fundamental frequency contours with very small sets of parameters obtained through stochastic learning from real speech data. The first key component of this system is syllable-synchronized sequential target approximation, implemented as the qTA model, which is designed to simulate, for each tonal unit, a wide range of contextual variability with a single invariant target. The second key component is the automatic learning of function-specific targets through stochastic global optimization, guided by a layered pseudo-hierarchical functional annotation scheme, which requires the manual labeling of only the temporal domains of the functional units. The results in terms of synthesis accuracy demonstrate that effective modeling of contextual variability is also the key to effective modeling of function-related variability. Additionally, we show that, being both theory-based and trainable (hence data-driven), computational systems like PENTAtrainer can serve as an effective modeling tool in basic research, with which the level of falsifiability in theory testing can be raised, and a closer link between basic and applied research in speech science can be developed.

    Crosslinguistic trends in tone change: A review of tone change studies in East and Southeast Asia

    Ground-breaking studies on how Bangkok Thai tones have changed over the past 100 years (Pittayaporn 2007, 2018; Zhu et al. 2015) reveal a pattern that Zhu et al. (2015) term the “clockwise tone shift cycle”: low > falling > high level or rising-falling > rising > falling-rising or low. The present study addresses three follow-up questions: (1) Are tone changes like those seen in Bangkok Thai also attested in other languages? (2) What other tone changes are repeated across multiple languages? (3) What phonetic biases are most likely to be the origins of the reported changes? A typological review of 52 tone change studies across 45 Sinitic, Tai-Kadai, Hmong-Mien, and Tibeto-Burman languages reveals that clockwise changes are by far the most common. The paper concludes by exploring how tonal truncation (Xu 2017) generates synchronic variation that matches the diachronic patterns; this suggests that truncation is a key mechanism in tone change.