
    Unit selection and waveform concatenation strategies in Cantonese text-to-speech.

    Oey Sai Lok. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references. Abstracts in English and Chinese.
    Chapter 1. --- Introduction
      Chapter 1.1 --- An overview of Text-to-Speech technology
        Chapter 1.1.1 --- Text processing
        Chapter 1.1.2 --- Acoustic synthesis
        Chapter 1.1.3 --- Prosody modification
      Chapter 1.2 --- Trends in Text-to-Speech technologies
      Chapter 1.3 --- Objectives of this thesis
      Chapter 1.4 --- Outline of the thesis
      References
    Chapter 2. --- Cantonese Speech
      Chapter 2.1 --- The Cantonese dialect
      Chapter 2.2 --- Phonology of Cantonese
        Chapter 2.2.1 --- Initials
        Chapter 2.2.2 --- Finals
        Chapter 2.2.3 --- Tones
      Chapter 2.3 --- Acoustic-phonetic properties of Cantonese syllables
      References
    Chapter 3. --- Cantonese Text-to-Speech
      Chapter 3.1 --- General overview
        Chapter 3.1.1 --- Text processing
        Chapter 3.1.2 --- Corpus based acoustic synthesis
        Chapter 3.1.3 --- Prosodic control
      Chapter 3.2 --- Syllable based Cantonese Text-to-Speech system
      Chapter 3.3 --- Sub-syllable based Cantonese Text-to-Speech system
        Chapter 3.3.1 --- Definition of sub-syllable units
        Chapter 3.3.2 --- Acoustic inventory
        Chapter 3.3.3 --- Determination of the concatenation points
      Chapter 3.4 --- Problems
      References
    Chapter 4. --- Waveform Concatenation for Sub-syllable Units
      Chapter 4.1 --- Previous work in concatenation methods
        Chapter 4.1.1 --- Determination of concatenation point
        Chapter 4.1.2 --- Waveform concatenation
      Chapter 4.2 --- Problems and difficulties in concatenating sub-syllable units
        Chapter 4.2.1 --- Mismatch of acoustic properties
        Chapter 4.2.2 --- Allophone problem of Initials /z/, /c/ and /s/
      Chapter 4.3 --- General procedures in concatenation strategies
        Chapter 4.3.1 --- Concatenation of unvoiced segments
        Chapter 4.3.2 --- Concatenation of voiced segments
        Chapter 4.3.3 --- Measurement of spectral distance
      Chapter 4.4 --- Detailed procedures in concatenation points determination
        Chapter 4.4.1 --- Unvoiced segments
        Chapter 4.4.2 --- Voiced segments
      Chapter 4.5 --- Selected examples in concatenation strategies
        Chapter 4.5.1 --- Concatenation at Initial segments
          Chapter 4.5.1.1 --- Plosives
          Chapter 4.5.1.2 --- Fricatives
        Chapter 4.5.2 --- Concatenation at Final segments
          Chapter 4.5.2.1 --- V group (long vowel)
          Chapter 4.5.2.2 --- D group (diphthong)
      References
    Chapter 5. --- Unit Selection for Sub-syllable Units
      Chapter 5.1 --- Basic requirements in unit selection process
        Chapter 5.1.1 --- Availability of multiple copies of sub-syllable units
          Chapter 5.1.1.1 --- Levels of "identical"
          Chapter 5.1.1.2 --- Statistics on the availability
        Chapter 5.1.2 --- Variations in acoustic parameters
          Chapter 5.1.2.1 --- Pitch level
          Chapter 5.1.2.2 --- Duration
          Chapter 5.1.2.3 --- Intensity level
      Chapter 5.2 --- Selection process: availability check on sub-syllable units
        Chapter 5.2.1 --- Multiple copies found
        Chapter 5.2.2 --- Unique copy found
        Chapter 5.2.3 --- No matched copy found
        Chapter 5.2.4 --- Illustrative examples
      Chapter 5.3 --- Selection process: acoustic analysis on candidate units
      References
    Chapter 6. --- Performance Evaluation
      Chapter 6.1 --- General information
        Chapter 6.1.1 --- Objective test
        Chapter 6.1.2 --- Subjective test
        Chapter 6.1.3 --- Test materials
      Chapter 6.2 --- Details of the objective test
        Chapter 6.2.1 --- Testing method
        Chapter 6.2.2 --- Results
        Chapter 6.2.3 --- Analysis
      Chapter 6.3 --- Details of the subjective test
        Chapter 6.3.1 --- Testing method
        Chapter 6.3.2 --- Results
        Chapter 6.3.3 --- Analysis
      Chapter 6.4 --- Summary
      References
    Chapter 7. --- Conclusions and Future Works
      Chapter 7.1 --- Conclusions
      Chapter 7.2 --- Suggested future works
      References
    Appendix 1 --- Mean pitch level of Initials and Finals stored in the inventory
    Appendix 2 --- Mean durations of Initials and Finals stored in the inventory
    Appendix 3 --- Mean intensity level of Initials and Finals stored in the inventory
    Appendix 4 --- Test word used in performance evaluation
    Appendix 5 --- Test paragraph used in performance evaluation
    Appendix 6 --- Pitch profile used in the Text-to-Speech system
    Appendix 7 --- Duration model used in Text-to-Speech system

    Effects of errorless learning on the acquisition of velopharyngeal movement control

    Session 1pSC - Speech Communication: Cross-Linguistic Studies of Speech Sound Learning of the Languages of Hong Kong (Poster Session). The implicit motor learning literature suggests a benefit for learning if errors are minimized during practice. This study investigated whether the same principle holds for learning velopharyngeal movement control. Normal speaking participants learned to produce hypernasal speech in either an errorless learning condition (in which the possibility for errors was limited) or an errorful learning condition (in which the possibility for errors was not limited). The nasality level of the participants’ speech was measured with a nasometer and expressed as nasalance scores (in %). Errorless learners practiced producing hypernasal speech with a threshold nasalance score of 10% at the beginning, which gradually increased to a threshold of 50% at the end. The same set of threshold targets was presented to errorful learners but in reverse order. Errors were defined as the proportion of speech with a nasalance score below the threshold. The results showed that, relative to errorful learners, errorless learners displayed fewer errors (17.7% vs. 50.7%) and a higher mean nasalance score (46.7% vs. 31.3%) during the acquisition phase. Furthermore, errorless learners outperformed errorful learners in both retention and novel transfer tests. Acknowledgment: Supported by The University of Hong Kong Strategic Research Theme for Sciences of Learning. © 2012 Acoustical Society of America

    An Experimental Study on Societal Factors Affecting VOT of English Plosives

    Plosives are integral components of English consonants. In phonetics, English plosives are classified into voiceless plosives /p, t, k/ and voiced plosives /b, d, g/. VOT (voice onset time) is defined as “the time interval between the burst that marks release of the stop closure and the onset of quasi-periodicity that reflects laryngeal vibration”. VOT is a significant acoustic feature and analytic parameter of plosives. Referring to Labov’s experimental model of linguistic variation analysis, this study investigates the influence that societal factors have on the VOT of English plosives. In this study, 15 English words with word-initial voiceless plosives /p, t, k/ and 15 with word-initial voiced plosives /b, d, g/ were selected as reading material; 30 subjects were randomly recruited to read them, and audio samples were collected. The two social factors selected in this experiment (gender and regional dialect) were found to influence the subjects' English plosive VOT to different degrees. The specific results are as follows. For gender, no significant difference exists between males and females, but the mean VOT of females is longer than that of males, which is basically consistent with previous research results. This paper infers that the gender differences in VOT may have physiological and sociophonetic causes. For regional dialects, the VOT of the subjects was primarily influenced by Southwest Mandarin and Min Dialect: the mean value of voiceless plosives was higher and the difference was greater for speakers of Southwest Mandarin, and the mean value of voiced plosives was higher and the difference was greater for speakers of Min Dialect. The results of this empirical study theoretically provide some reference for acoustic research and, pedagogically, provide some implications for optimizing English curricula in universities

    International Student Perceptual Challenges and Coping within Higher Education

    A capstone submitted in partial fulfillment of the requirements for the degree of Doctor of Education in the College of Education at Morehead State University by Edmund Martelli on July 24, 2020

    Evaluating English translations of ancient Chinese poetry with special reference to image schemas and foregrounding

    Poetry translation evaluation from ancient Chinese to English has been subjective in China. This is caused by the indefinable and intangible notion of ‘poetic spirit’, which is often used in influential translators’ criteria, and by the lack of a systematic investigation of translation evaluation. The problem of subjective criteria has remained unresolved for nearly a century. In order to improve the subjective criteria of poetry translation evaluation, this thesis is an attempt to make objective evaluations of the English translations of an ancient Chinese poem using stylistic theories. To make an objective criticism, it is necessary to offer evidence which is based on systematic and reliable criteria and replicable evaluation procedures. By applying stylistic theories to both the source text and the target texts, it is possible to make a judgement based on the stylistic features found in the texts themselves. Thus, objective evaluation of poetry translation from ancient Chinese to English can be made. This research is qualitative with the data consisting of one ancient Chinese poem as the source text and six English translations as the target texts. It carries out stylistic analyses on the data with two approaches based on the cognitive stylistic concept of figure and ground and the linguistic stylistic theory of foregrounding. The target texts are judged by the evidence of locative relations and foregrounding features. This research also explores and proposes a practical framework for poetry translation. The research findings suggest how to make objective poetry translation evaluations and improve translation techniques. They also point out the need to integrate stylistics with translation evaluation to make improvements in the field

    Current Perspectives on Prevention of Reading and Writing Learning Disabilities

    This chapter first analyzes the problem of identifying learning disabilities from the standpoint of competing diagnostic models. The controversy between different models for identifying learning disabilities is presented, contrasting the characteristics of diagnostic models and models based on response to intervention. Second, an analysis of the main predictive factors of reading and writing is offered, using recent results from research carried out in different languages. The most often studied predictors—phonological awareness, speech perception, the alphabetic principle, rapid automatic naming, and vocabulary—are analyzed for their relationship to reading and writing. Finally, a discussion follows on the effects of certain programs that have been developed in different countries to prevent reading and writing learning disabilities. Most of these programs have been developed in the United States or Spain; they have also been implemented in other countries such as Canada, Australia, Mexico, Chile, and Israel

    Factors influencing information service quality of the information platform of Wenzhou Municipal People's Hospital

    Along with the global trend of informatization, the Internet has become a new mainstream media form, following print media, television and broadcast media, through which people can access information services. As the country with the largest number of netizens in the world, China enjoys improving social information services based on the Internet. With such a large number of network users, it is inevitable for China's hospitals at various levels to provide patients and the public with information services by setting up their own official websites. This research investigates the factors affecting the information service quality of Wenzhou People's Hospital (WZPH) and, by means of the Delphi method, statistical analysis and other research methods, formulates the Evaluation Indicator System for the Information Service Quality of WZPH's Information Platform. The research applies this system to the empirical research on the information service quality of the hospital's website and then makes a comparative analysis between the research results and traffic data of the websites of other hospitals over the same period. Next, the research identifies the factors affecting the information service quality of WZPH's website and finds out how the hospital may increase its website users and traffic through improving its service quality. This research starts with the determination of the objectives, significance, research problems, framework, contents and methods of the research. In the following literature review, the research sorts out papers on hospital websites and theories on service quality, users' information needs and customer satisfaction in a systematic way. 
Based on the literature review as well as expert consultations and theoretical review, the research determines the approach to examining the information service quality of WZPH's information platform and works out the initial set of evaluation dimensions and indicators of the information service function and quality of the hospital's website. Then, via two rounds of expert consultations, the research determines the weights of these indicators and further assigns values to each of them. On this basis, the research establishes a research framework and a comprehensive evaluation model for the information service quality of WZPH's information platform. In the end, the research conducts two surveys on the information service quality of WZPH's information platform, one before and one after its overall revision, using the Hospital Website Information Service Evaluation Form and the Virtual User Questionnaire, and makes a correlation analysis based on the survey results and the traffic data of other hospitals' websites over the same period. The analysis draws a conclusion that the website of WZPH, as the information platform of the hospital, is the only carrier to deliver information service, thus playing a vital role in WZPH's overall service quality. In other words, the website of WZPH affects the hospital's overall service quality to a large extent. The comprehensive service functions of WZPH's website are important to the quality improvement of the hospital's information service and directly affect the information service quality, the number of users and the utilization rate of the website

    Oral application of L-menthol in the heat: From pleasure to performance

    When menthol is applied to the oral cavity it presents with a familiar refreshing sensation and cooling mint flavour. This may be deemed hedonic in some individuals, but may cause irritation in others. This variation in response is likely dependent upon trigeminal sensitivity toward cold stimuli, suggesting a need for a menthol solution that can be easily personalised. Menthol’s characteristics can also be enhanced by matching colour to qualitative outcomes; a factor which can easily be manipulated by practitioners working in athletic or occupational settings to potentially enhance intervention efficacy. This presentation will outline the efficacy of oral menthol application for improving time trial performance to date, either via swilling or via co-ingestion with other cooling strategies, with an emphasis upon how menthol can be applied in ecologically valid scenarios. Situations in which performance is not expected to be enhanced will also be discussed. An updated model by which menthol may prove hedonic, satiate thirst and affect ventilation will also be presented, with the potential performance implications of these findings discussed and modelled. Qualitative reflections from athletes that have implemented menthol mouth swilling in competition, training and maximal exercise will also be included