Investigating English Pronunciation: Trends and Directions - Jose A. Mompean and Jonás Fouz-González (Eds.)
Corrective feedback accuracy and pronunciation improvement: Feedback that is “good enough”
It is unclear whether corrective feedback (CF) provided by L2 computer-assisted pronunciation training (CAPT) tools must be 100% accurate to promote an acceptable level of improvement in pronunciation. Using a web-based interface, 30 native speakers of Chinese completed a pretest, a computer-based training session to produce nine sound contrasts in English, and a posttest. The study manipulated feedback accuracy using a modified “Wizard of Oz” protocol in which a phonetically trained human listener in a separate room provided CF on the trainees’ productions, but the trainees thought that the computer-based system provided the CF. The computer system presented a set of three sound contrasts with 100% accuracy, three with 66% accuracy (with one of three human responses changed randomly), and three with 33% accuracy (with two of three human feedback responses being changed). The trainees’ pre- and posttest productions were rated for accuracy by native speakers of English. For trained items, productions were not significantly different when the trainees received CF with 100% or 66% accuracy, but both resulted in greater improvement than feedback with 33% accuracy. An important implication for L2 pronunciation training software is that machine feedback can be beneficial even when it is “good enough” (i.e., not 100% accurate).
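The accuracy manipulation described in this abstract can be illustrated with a short sketch. It is not the study's software: it assumes feedback is a binary correct/incorrect judgment and that responses are grouped into consecutive blocks of three, and the names `degrade_feedback` and `agreement` are hypothetical. Flipping zero, one, or two judgments per block leaves the feedback shown to the trainee in agreement with the human listener on 100%, about 66%, or about 33% of trials.

```python
import random


def degrade_feedback(human_judgments, flips_per_block, block_size=3, seed=0):
    """Return the feedback shown to the trainee.

    Assumption: judgments are booleans (True = "correct") and are
    processed in consecutive blocks of `block_size`. Within each block,
    `flips_per_block` randomly chosen judgments are inverted.
    """
    rng = random.Random(seed)
    shown = list(human_judgments)
    for start in range(0, len(shown), block_size):
        block = list(range(start, min(start + block_size, len(shown))))
        for i in rng.sample(block, min(flips_per_block, len(block))):
            shown[i] = not shown[i]
    return shown


def agreement(human_judgments, shown):
    """Proportion of trials where shown feedback matches the human listener."""
    matches = sum(h == s for h, s in zip(human_judgments, shown))
    return matches / len(human_judgments)


# Hypothetical judgments from the human listener for nine productions.
human = [True, False, True, True, True, False, False, True, True]
for flips in (0, 1, 2):
    shown = degrade_feedback(human, flips)
    print(flips, round(agreement(human, shown), 2))
# prints: 0 1.0, then 1 0.67, then 2 0.33
```

Which judgment is flipped within a block depends on the seed, but the agreement rate does not, because the number of flips per block is fixed.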
Consonant-induced pitch perturbations, domain-initial strengthening, and word learning success in a tone language
This dissertation presents three studies that examined issues related to the production and the perception of pitch in a tone language. The first study examined linguistic contexts that may modulate consonant-induced pitch perturbations (CF0) in a tone language. Previous studies have produced mixed findings regarding the role of linguistic contexts in modulating CF0. To address this issue, this study analyzed CF0 effects in Thai monosyllabic words starting with /b/, /p/, /pʰ/, /d/, /t/, /tʰ/, /k/, or /kʰ/, having /a/ or /aː/ as the vowel, and bearing the falling tone, the mid tone, or the low tone, and placed in two sentential contexts or in isolation. The results showed that linguistic contexts including tone context, sentential context, place of articulation, and, to a lesser extent, vowel length modulated CF0 effects. Different pitch perturbation patterns were also observed. The results are discussed in terms of different mechanisms that may underlie the perturbation effects. The findings have implications for previous conflicting findings and for tonogenesis.
The second study investigated whether lexical tones can undergo domain-initial strengthening (DIS) as consonants can. If so, DIS effects in a tone language would extend beyond the first segment of a prosodic domain, implying that the domain of DIS is larger than previously thought. However, previous studies have reported conflicting results. This study analyzed Thai monosyllabic words in domain-initial and domain-medial positions. The study analyzed the maximum fundamental frequency of the falling tone, the mid tone, and the low tone, as well as acoustic measures of consonants, including the Voice Onset Times of /b/, /p/, and /pʰ/, the frication duration of /f/, and the CF0 associated with these four consonants produced at the intonational phrase level and the word level. There was no evidence of DIS effects on tones. The findings contribute to the current understanding of the prosody-phonetics interface.
The last study investigated the relationship between two versions of a test that assessed the ability to identify pitch patterns and word learning success in a tone language. One version of the test, the mixed-talker Pitch-Contour Perception Test (PCPT), presented stimuli that were not blocked by talker, whereas the other version of the test, the blocked-talker PCPT, presented stimuli that were blocked by talker. Previous studies have suggested that nontonal language speakers’ ability to perceive pitch patterns is a good predictor of their success in learning words in a tone language. Previous studies, however, have not adequately assessed the ability to perceive pitch patterns because this ability has often been assessed together with the ability to cope with trial-by-trial variability. Native speakers of English with no prior experience with a tone language took the two versions of the PCPT before being trained for six sessions to learn 16 Mandarin words represented by Chinese characters. The results showed that the learners’ scores on the mixed-talker PCPT were a slightly better predictor of their success in learning Mandarin words compared to their scores on the blocked-talker PCPT. The results suggested that the ability to cope with trial-by-trial variability is not a strong predictor of word learning success in a tone language.
Given that its intended audience is theoretical researchers and applied researchers, this dissertation concludes with a discussion about how the gap between theoretical research and applied research can be bridged. Reasons why the gap should be bridged are provided, and suggestions for how the gap can be bridged are offered.
Effects of voice type and task on L2 learners’ awareness of pronunciation errors
Research suggests learners may improve their second language (L2) pronunciation by imitating voices with similar acoustic profiles. However, previously reported improvements have been in suprasegmentals (prosodic features such as intonation). It remains unclear if voice similarity applies to L2 segmentals (consonants and vowels). To address this issue, this study investigates how voice similarity facilitates awareness of pronunciation errors, a necessary step in pronunciation improvement. In two experiments, advanced L2 learners identified their pronunciation errors by comparing their production to the production of a resynthesized model voice using learners’ voices as the base (Golden Speaker voice), or to an unfamiliar resynthesized voice with the same gender as the learner (Silver Speaker voice). In Experiment 1, L2 learners identified all syllables with vowel and consonant errors when comparing their production to the model voice. Their choices were compared to identifications by expert judges. In Experiment 2, learners were told how many errors the expert judges had identified before identifying the same number of errors. Results did not support facilitative effects of Golden Speaker voices in either experiment, but Experiment 2 resulted in higher identification percentages. A discussion of the challenges in self-identification of errors in relation to voice similarity is offered.
L2-ARCTIC: A Non-Native English Speech Corpus
In this paper, we introduce L2-ARCTIC, a speech corpus of non-native English that is intended for research in voice conversion, accent conversion, and mispronunciation detection. This initial release includes recordings from ten non-native speakers of English whose first languages (L1s) are Hindi, Korean, Mandarin, Spanish, and Arabic, each L1 containing recordings from one male and one female speaker. Each speaker recorded approximately one hour of read speech from the Carnegie Mellon University ARCTIC prompts, from which we generated orthographic and forced-aligned phonetic transcriptions. In addition, we manually annotated 150 utterances per speaker to identify three types of mispronunciation errors: substitutions, deletions, and additions, making it a valuable resource not only for research in voice conversion and accent conversion but also in computer-assisted pronunciation training. The corpus is publicly accessible at https://psi.engr.tamu.edu/l2-arctic-corpus/. This article is published as Zhao, G., Sonsaat, S., Silpachai, A., Lucic, I., Chukharev-Hudilainen, E., Levis, J., Gutierrez-Osuna, R., L2-ARCTIC: A Non-Native English Speech Corpus. Perception Sensing Instrumentation Lab. 2018. Posted with permission.
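To make the three error types concrete, the sketch below labels the differences between a canonical phone sequence and a produced one as substitutions, deletions, or additions. It is an illustration only: the corpus annotations were made manually, whereas this sketch uses automatic sequence alignment from Python's standard `difflib` module. The function name `label_errors`, the ARPAbet-style phone symbols, and the example pronunciations are assumptions, not taken from the corpus.

```python
from difflib import SequenceMatcher


def label_errors(canonical, produced):
    """Label differences between two phone sequences.

    Returns a list of (error_type, expected_phone, produced_phone)
    tuples. This is an automatic approximation, not the manual
    annotation procedure used for the corpus.
    """
    errors = []
    matcher = SequenceMatcher(a=canonical, b=produced, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        expected, actual = canonical[i1:i2], produced[j1:j2]
        paired = min(len(expected), len(actual))
        # Phones that line up one-to-one count as substitutions.
        for e, a in zip(expected, actual):
            errors.append(("substitution", e, a))
        # Leftover expected phones were dropped by the speaker.
        for e in expected[paired:]:
            errors.append(("deletion", e, None))
        # Leftover produced phones were inserted by the speaker.
        for a in actual[paired:]:
            errors.append(("addition", None, a))
    return errors


# Hypothetical example: "think" with /TH/ replaced by /S/ and a final vowel added.
print(label_errors(["TH", "IH", "NG", "K"], ["S", "IH", "NG", "K", "AH"]))
# prints [('substitution', 'TH', 'S'), ('addition', None, 'AH')]
```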
Golden Speaker Builder - An interactive tool for pronunciation training
The type of voice model used in Computer Assisted Pronunciation Instruction is a crucial factor in the quality of practice and the amount of uptake by language learners. As an example, prior research indicates that second-language learners are more likely to succeed when they imitate a speaker with a voice similar to their own, a so-called “golden speaker”. This manuscript presents Golden Speaker Builder (GSB), a tool that allows learners to generate a personalized “golden-speaker” voice: one that mirrors their own voice but with a native accent. We describe the overall system design, including the web application with its user interface, and the underlying speech analysis/synthesis algorithms. Next, we present results from a series of listening tests, which show that GSB is capable of synthesizing such golden-speaker voices. Finally, we present results from a user study in a language-instruction setting, which show that practising with GSB leads to improved fluency and comprehensibility. We suggest reasons why learners improved as they did and offer recommendations for the next iteration of the training. This accepted manuscript is published as Shaojin Ding, Christopher Liberatore, Sinem Sonsaat, Ivana Lučić, Alif Silpachai, Guanlong Zhao, Evgeny Chukharev-Hudilainen, John Levis, Ricardo Gutierrez-Osuna, Golden Speaker Builder - An interactive tool for pronunciation training, Speech Communication (2019), DOI: 10.1016/j.specom.2019.10.005. Posted with permission.