    Developing Speech Recognition and Synthesis Technologies to Support Computer-Aided Pronunciation Training for Chinese Learners of English

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    ERROR ANALYSIS OF CHEFS’ PRONUNCIATION OF ENGLISH CULINARY TERMS

    This study examines chefs' mispronunciation of English culinary terms. It aims to analyze which words the chefs most often mispronounce, how they mispronounce them, and the problems they face in pronouncing English words. The sample consists of five experienced chefs in Mataram and Western Lombok. The data were gathered by recording their pronunciation and were analyzed using a qualitative method. The writer found that several English words are mispronounced by the chefs. They mispronounced silent letters, vowels, and consonants. The chefs have difficulty pronouncing English words correctly.

    Enhancing the pronunciation of problematic English consonants for Spanish learners through intralingual dubbing activities

    This doctoral thesis examines the potential of intralingual dubbing activities for improving the pronunciation of problematic English consonant phonemes among Spanish learners, together with additional considerations such as the degree to which those phonemes are problematic for the research participants (n=71) and a detailed analysis of their views and opinions on the dubbing activity. To this end, an Experimental Group (EG; n=37) and a Control Group (CG; n=34) were recorded at different phases of the study (EG: pre-test phase, dubbing sessions, and post-test phase; CG: pre-test and post-test) in order to obtain relevant and useful data on their pronunciation. All collected data were analyzed with the Statistical Package for Social Sciences (SPSS; v.25), applying the Wilcoxon test for within-group comparisons and the Mann-Whitney U test for between-group comparisons. In addition, the research participants completed two questionnaires to provide further information. In conclusion, the overall pronunciation of the EG improved significantly for most of the problematic consonant phonemes during and after the dubbing activities, whereas the CG showed no significant improvement in pronunciation. Moreover, most EG participants expressed very positive opinions of the dubbing activity, highlighting its motivating and innovative value in language learning as well as its usefulness for improving oral skills.

    An online system for entering and annotating non-native Mandarin Chinese speech for language teaching

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (leaves 59-62). This thesis describes the design and implementation of an intuitive online system for the annotation of non-native Mandarin Chinese speech by native Chinese speakers. This system will allow speech recognition researchers to easily generate a corpus of labeled non-native speech. We have five native Chinese speakers test the annotation system on a sample bank of 250 Chinese utterances and observe fair to moderate inter-rater agreement scores. In addition to giving us a benchmark for inter-rater agreement, this also demonstrates the feasibility of having remote graders annotate sets of utterances. Finally, we extend our work to Chinese language instruction by creating a web-based interface for Chinese reading assignments. Our design is a simple, integrated solution for completing and correcting spoken reading assignments that also streamlines the compilation of a corpus of labeled non-native speech for use in future research. By Andrea Johanna Hawksley. M.Eng.

    Is There a Bilingual Advantage in Phonetic and Phonological Acquisition? The Initial Learning of Word-Final Coronal Stop Realization in a Novel Accent of English

    Research question: We address the question of whether the cognitive advantage of the bilingual mind, already demonstrated in the case of auditory processing or novel word acquisition, also applies to other linguistic domains, specifically to phonetic and phonological learning. Design: We compare the performance of 17 monolinguals and 25 bilinguals from Canada in a production experiment with two tasks: imitation and spontaneous reproduction of a novel foreign accent, specifically Sussex English. Data and analysis: To eliminate potential sources of variability, our focus is on a sound already existing in the subjects’ production (the glottal stop), but differently mapped to surface representations in the novel accent to which they were exposed (i.e. as an allophone of coronal stops in word-final position). We measured the glottal stop rates of our subjects in baseline, training, and post-training. Results: The two groups behaved differently, with bilinguals showing a larger increase of their glottal stop rate post-training. Our results are thus consistent with a bilingual advantage in phonetic and phonological learning. Originality: We interpret these findings in light of recent psycholinguistic work and conclude that echoic memory strategies, possibly underlain by stronger subcortical encoding of sound in bilinguals, may account for our results by facilitating the re-mapping between existing mental representations of sounds and existing articulatory command configurations. Significance: Our study adds to the body of work showing that there is an advantage of bilingualism in second dialect learning in adulthood, and provides an explanation in terms of perceptual strategies in which echoic memory is involved. We also contribute to the recent body of research suggesting that imitation of an action can result in improved understanding of that action

    A Mobile App For Practicing Finnish Pronunciation Using Wav2vec 2.0

    As Finland attracts more foreign talent, there is demand for self-learning tools that help second language (L2) speakers learn Finnish with proper feedback. However, L2 data resources for Finnish are scarce, especially at the adult beginner level. Moreover, since adult L2 learners are mostly busy studying or working in Finland, the application must allow users to practice anytime, anywhere. This thesis aims to address these issues by developing a mobile app for beginner Finnish L2 learners to practice their pronunciation. The app would evaluate the users' speech samples, give feedback on their pronunciation, and then provide them with instructions in the form of text, photos, audio, and videos to help them improve their pronunciation. Due to the limited resources available, this work explores the wav2vec 2.0 model's capability for the application. We trained our models on a corpus of native Finnish speakers and used them to provide pronunciation feedback on L2 samples without any L2 training data. The results show that the models can detect mispronunciation at the phoneme level about 60% of the time (recall) compared to a native Finnish listener. By adding regularization, selecting training datasets, and using a smaller model size, we achieved a comparable recall of approximately 63% with a slightly lower precision of around 29%. Compared to the state-of-the-art model in Finnish Automatic Speech Recognition, the trade-off resulted in a significantly faster response time.
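
The recall and precision figures quoted in this abstract are standard detection metrics. The sketch below shows one way they could be computed for phoneme-level mispronunciation detection, assuming each phoneme carries a reference label from a native listener and a label from the model. The function name `precision_recall` and the toy labels are hypothetical and are not taken from the thesis.

```python
from typing import Sequence, Tuple


def precision_recall(
    reference: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (precision, recall) for binary mispronunciation labels.

    reference[i] is True if the listener marked phoneme i as mispronounced;
    predicted[i] is True if the model flagged phoneme i.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")

    # Count flagged-and-wrong, flagged-but-fine, and missed errors.
    tp = sum(1 for r, p in zip(reference, predicted) if r and p)
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)

    # Guard against empty denominators (no flags, or no true errors).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels for eight phonemes (illustrative only, not thesis data).
listener = [True, True, False, False, True, False, False, False]
model = [True, False, True, True, True, False, True, False]

precision, recall = precision_recall(listener, model)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A detector with high recall and low precision, as in this toy example, catches most true mispronunciations but also flags many correctly pronounced phonemes.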

    Methods for pronunciation assessment in computer aided language learning

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 149-176). Learning a foreign language is a challenging endeavor that entails acquiring a wide range of new knowledge including words, grammar, gestures, sounds, etc. Mastering these skills requires extensive practice by the learner, and opportunities may not always be available. Computer Aided Language Learning (CALL) systems provide non-threatening environments where foreign language skills can be practiced wherever and whenever a student desires. These systems often have several technologies to identify the different types of errors made by a student. This thesis focuses on the problem of identifying mispronunciations made by a foreign language student using a CALL system. We make several assumptions about the nature of the learning activity: it takes place using a dialogue system, it is a task- or game-oriented activity, the student should not be interrupted by the pronunciation feedback system, and the goal of the feedback system is to identify severe mispronunciations with high reliability. Detecting mispronunciations requires a corpus of speech with human judgements of pronunciation quality. Typical approaches to collecting such a corpus use an expert phonetician to both phonetically transcribe and assign judgements of quality to each phone in a corpus. This is time consuming and expensive. It also places an extra burden on the transcriber. We describe a novel method for obtaining phone level judgements of pronunciation quality by utilizing non-expert, crowd-sourced, word level judgements of pronunciation. Foreign language learners typically exhibit high variation and pronunciation shapes distinct from native speakers that make analysis for mispronunciation difficult. We detail a simple but effective method for transforming the vowel space of non-native speakers to make mispronunciation detection more robust and accurate. We show that this transformation not only enhances performance on a simple classification task, but also results in distributions that can be better exploited for mispronunciation detection. This transformation of the vowel space is exploited to train a mispronunciation detector using a variety of features derived from acoustic model scores and vowel class distributions. We confirm that the transformation technique results in a more robust and accurate identification of mispronunciations than traditional acoustic models. By Mitchell A. Peabody. Ph.D.