986 research outputs found

    Tone classification of syllable-segmented Thai speech based on multilayer perceptron

    Thai is a monosyllabic and tonal language. Thai makes use of tone to convey lexical information about the meaning of a syllable. Thai has five distinctive tones, and each tone is well represented by a single F0 contour pattern. In general, a Thai syllable with a different tone has a different lexical meaning. Thus, to completely recognize a spoken Thai syllable, a speech recognition system must not only recognize the base syllable but also correctly identify its tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system.
    In this study, a tone classifier for syllable-segmented Thai speech that incorporates the effects of tonal coarticulation, stress, and intonation was developed. An automatic syllable segmentation procedure, which segments the training and test utterances into syllable units, was also developed. Acoustic features, including fundamental frequency (F0), duration, and energy, extracted from the syllable being processed and its neighboring syllables were used as the main discriminating features. A multilayer perceptron (MLP) trained by the backpropagation method was employed to classify these features. The proposed system was evaluated on 920 test utterances spoken by five male and three female Thai speakers, who also uttered the training speech. It achieved an average accuracy of 91.36%.
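The pipeline described in this abstract (F0, duration, and energy features taken from a syllable and its neighbors, classified by a backpropagation-trained MLP) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the function names, the four F0 samples per syllable, the hidden-layer size, the learning rate, and above all the synthetic contour templates, which are made-up shapes rather than real Thai tone contours.

```python
import numpy as np

def context_features(f0, duration, energy):
    """Stack [previous, current, next] syllable features into one row per syllable."""
    per_syll = np.hstack([f0, duration[:, None], energy[:, None]])
    # Repeat the first and last syllable so edge syllables still have neighbors.
    padded = np.vstack([per_syll[:1], per_syll, per_syll[-1:]])
    return np.hstack([padded[:-2], padded[1:-1], padded[2:]])

def train_mlp(X, y, n_classes=5, hidden=16, lr=0.1, epochs=200, seed=0):
    """One-hidden-layer MLP trained by full-batch backpropagation (assumed sizes)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, softmax output.
        H = np.tanh(X @ W1 + b1)
        logits = H @ W2 + b2
        logits = logits - logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P = P / P.sum(axis=1, keepdims=True)
        losses.append(-np.mean(np.log(P[np.arange(len(y)), y] + 1e-12)))
        # Backward pass: gradients of the mean cross-entropy loss.
        d_logits = (P - Y) / len(y)
        dW2 = H.T @ d_logits
        db2 = d_logits.sum(axis=0)
        dH = (d_logits @ W2.T) * (1.0 - H ** 2)
        dW1 = X.T @ dH
        db1 = dH.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return (W1, b1, W2, b2), losses

def predict(params, X):
    """Return the most probable tone index for each row of X."""
    W1, b1, W2, b2 = params
    return np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)

# Synthetic demo data: 60 syllables, 4 F0 samples each, 5 tone classes.
rng = np.random.default_rng(0)
n_syll, n_f0 = 60, 4
tones = rng.integers(0, 5, size=n_syll)
templates = np.array([            # hypothetical contour shapes, not real Thai tones
    [0.0, 0.0, 0.0, 0.0],
    [0.0, -0.3, -0.6, -1.0],
    [1.0, 0.5, -0.5, -1.0],
    [1.0, 1.0, 1.0, 1.0],
    [-1.0, -0.5, 0.5, 1.0],
])
f0 = templates[tones] + rng.normal(0.0, 0.2, (n_syll, n_f0))
duration = rng.normal(0.3, 0.05, n_syll)
energy = rng.normal(1.0, 0.1, n_syll)

X = context_features(f0, duration, energy)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each feature column
params, losses = train_mlp(X, tones)
pred = predict(params, X)
print("training accuracy:", np.mean(pred == tones))
```

The neighbor stacking in `context_features` is one simple way to expose coarticulation context to the classifier; the abstract does not say how the original system combined the neighboring-syllable features.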

    Toward an integrative model of talker normalization

    2015-2016. Academic research: refereed. Publication in refereed journal. Accepted manuscript.

    The role of lexical tone in spoken word recognition of Chinese

    The present study used a direct priming task to investigate the nature and processing of tonal information in spoken word recognition of Chinese. Two experiments were conducted. In Experiment 1, prime-target pairs contrasted in terms of tonal and segmental overlap. Experiment 1 replicated the first experiment of C.-Y. Lee's (2007) study, but with a significant modification that balanced tonal information in prime-target pairs. Forty-eight monosyllabic Mandarin target words were paired with four types of primes in which prime and target were identical (e.g., bo1-bo1), shared only segmental information (e.g., bo1-bo2), shared only tonal information (e.g., bo1-zhua1), or were unrelated (e.g., bo1-man3). Experiment 2 extended the prime-target paradigm to include minimal segmental overlap in the onset and in the offset portion. Forty-eight monosyllabic Mandarin target words were paired with four types of primes in which prime and target were identical (e.g., bo1-bo1), shared tonal and only onset segmental information (e.g., bo1-bin1), shared tonal and only offset segmental information (e.g., bo1-po1), or were unrelated (e.g., bo1-man3). The results of Experiment 1 showed facilitation when prime and target were identical or overlapped in segmental structure, relative to conditions in which they overlapped only in tone or were unrelated. Effects of tone similarity across prime-target segmental pairs were also analyzed. The results of Experiment 2 showed facilitation only when prime and target were identical. Partial segmental overlap in conjunction with tone resulted in inhibition compared to an unrelated control. Together, these data indicate that segmental information can facilitate word recognition, with segmental information carrying more weight than tonal information in the processing of spoken Chinese.

    Tones in whispered Cantonese

    Includes bibliographical references (p. 28-30). "A dissertation submitted in partial fulfillment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2010." Thesis (B.Sc)--University of Hong Kong, 2010.
    Acoustic analysis and perceptual experiments were carried out to investigate the acoustical characteristics of tones in whispered Cantonese and to identify possible perceptual cues for tone identification. The isolated vowel /a/, embedded in a framing sentence and produced by 20 (10 male and 10 female) native Cantonese speakers using modal and whispered phonation, was recorded. Formant frequencies, duration, and intensity of the vowels were measured from the samples using signal analysis software. During tone identification tasks, the speech samples were presented to 20 listeners who were native Cantonese speakers. The listeners were instructed to identify the tone of the target vowels in the presented sentences, and the percentage of correctly identified tones was calculated from their responses. Results of the study reveal the role of the second formant, duration, average intensity, and intensity contours in the perception of Cantonese whispered tones. Speakers' maneuvers in the production of whispered tones were also discussed.

    Large vocabulary Cantonese speech recognition using neural networks.

    Tsik Chung Wai Benjamin. Thesis (M.Phil.)--Chinese University of Hong Kong, 1994. Includes bibliographical references (leaves 67-70).
    Chapter 1 --- Introduction
    Chapter 1.1 --- Automatic Speech Recognition
    Chapter 1.2 --- Cantonese Speech Recognition
    Chapter 1.3 --- Neural Networks
    Chapter 1.4 --- About this Thesis
    Chapter 2 --- The Phonology of Cantonese
    Chapter 2.1 --- The Syllabic Structure of Cantonese Syllable
    Chapter 2.2 --- The Tone System of Cantonese
    Chapter 3 --- Review of Automatic Speech Recognition Systems
    Chapter 3.1 --- Hidden Markov Model Approach
    Chapter 3.2 --- Neural Networks Approach
    Chapter 3.2.1 --- Multi-Layer Perceptrons (MLP)
    Chapter 3.2.2 --- Time-Delay Neural Networks (TDNN)
    Chapter 3.2.3 --- Recurrent Neural Networks
    Chapter 3.3 --- Integrated Approach
    Chapter 3.4 --- Mandarin and Cantonese Speech Recognition Systems
    Chapter 4 --- The Speech Corpus and Database
    Chapter 4.1 --- Design of the Speech Corpus
    Chapter 4.2 --- Speech Database Acquisition
    Chapter 5 --- Feature Parameters Extraction
    Chapter 5.1 --- Endpoint Detection
    Chapter 5.2 --- Speech Processing
    Chapter 5.3 --- Speech Segmentation
    Chapter 5.4 --- Phoneme Feature Extraction
    Chapter 5.5 --- Tone Feature Extraction
    Chapter 6 --- The Design of the System
    Chapter 6.1 --- Towards Large Vocabulary System
    Chapter 6.2 --- Overview of the Isolated Cantonese Syllable Recognition System
    Chapter 6.3 --- The Primary Level: Phoneme Classifiers and Tone Classifier
    Chapter 6.4 --- The Intermediate Level: Ending Corrector
    Chapter 6.5 --- The Secondary Level: Syllable Classifier
    Chapter 6.5.1 --- Concatenation with Correction Approach
    Chapter 6.5.2 --- Fuzzy ART Approach
    Chapter 7 --- Computer Simulation
    Chapter 7.1 --- Experimental Conditions
    Chapter 7.2 --- Experimental Results of the Primary Level Classifiers
    Chapter 7.3 --- Overall Performance of the System
    Chapter 7.4 --- Discussions
    Chapter 8 --- Further Works
    Chapter 8.1 --- Enhancement on Speech Segmentation
    Chapter 8.2 --- Towards Speaker-Independent System
    Chapter 8.3 --- Towards Speech-to-Text System
    Chapter 9 --- Conclusions
    Bibliography
    Appendix A. Cantonese Syllable Full Set List

    Tone perception of Cantonese-speaking children with cochlear implant

    Also available in print. "A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, May 10, 2000." Thesis (B.Sc)--University of Hong Kong, 2000.

    An acoustic analysis of the Cantonese whispered tones

    "A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, December 31, 2004." Also available in print. Thesis (B.Sc)--University of Hong Kong, 2004.

    Stimulus presentation order and the perception of lexical tones in Cantonese

    Listeners' auditory discrimination of vowel sounds depends in part on the order in which stimuli are presented. Such presentation order effects have been argued to be language independent, and to result from psychophysical (not speech- or language-specific) factors such as the decay of memory traces over time or increased weighting of later-occurring stimuli. In the present study, native Cantonese speakers' discrimination of a linguistic tone continuum is shown to exhibit order of presentation effects similar to those shown for vowels in previous studies. When presented with two successive syllables differing in fundamental frequency by approximately 4 Hz, listeners were significantly more sensitive to this difference when the first syllable was higher in frequency than the second. However, American English-speaking listeners with no experience listening to Cantonese showed no such contrast effect when tested in the same manner using the same stimuli. Neither English nor Cantonese listeners showed any order of presentation effects in the discrimination of a nonspeech continuum in which tokens had the same fundamental frequencies as the Cantonese speech tokens but had a qualitatively non-speech-like timbre. These results suggest that tone presentation order effects, unlike vowel effects, may be language specific, possibly resulting from the need to compensate for utterance-related pitch declination when evaluating fundamental frequency for tone identification.

    Perception of nonnative tonal contrasts by Mandarin-English and English-Mandarin sequential bilinguals

    This study examined the role of acquisition order and crosslinguistic similarity in influencing transfer at the initial stage of perceptually acquiring a tonal third language (L3). Perception of tones in Yoruba and Thai was tested in adult sequential bilinguals representing three different first (L1) and second language (L2) backgrounds: L1 Mandarin-L2 English (MEBs), L1 English-L2 Mandarin (EMBs), and L1 English-L2 intonational/non-tonal (EIBs). MEBs outperformed EMBs and EIBs in discriminating L3 tonal contrasts in both languages, while EMBs showed a small advantage over EIBs on Yoruba. All groups showed better overall discrimination in Thai than in Yoruba, but group differences were more robust in Yoruba. MEBs' and EMBs' poor discrimination of certain L3 contrasts was further reflected in the L3 tones being perceived as similar to the same Mandarin tone; however, EIBs, with no knowledge of Mandarin, showed many of the same similarity judgments. These findings thus suggest that L1 tonal experience has a particularly facilitative effect in L3 tone perception, but there is also a facilitative effect of L2 tonal experience. Further, crosslinguistic perceptual similarity between L1/L2 and L3 tones, as well as acoustic similarity between different L3 tones, plays a significant role at this early stage of L3 tone acquisition.