    Computational Approaches to Exploring Persian-Accented English

    This work discusses methods involving phonetic speech recognition for detecting Persian-accented English; these methods offer promise for both identifying and mitigating L2 pronunciation errors. Segmental and suprasegmental pronunciation errors particular to Persian speakers of English are also examined.

    MISPRONUNCIATION DETECTION AND DIAGNOSIS IN MANDARIN ACCENTED ENGLISH SPEECH

    This work presents the development, implementation, and evaluation of a Mispronunciation Detection and Diagnosis (MDD) system, with application to pronunciation evaluation of Mandarin-accented English speech. A comprehensive detection and diagnosis of errors in the Electromagnetic Articulography corpus of Mandarin-Accented English (EMA-MAE) was performed using expert phonetic transcripts and an Automatic Speech Recognition (ASR) system. Articulatory features derived from the parallel kinematic data available in the EMA-MAE corpus were used to identify the most significant articulatory error patterns seen in L2 speakers during common mispronunciations. Using both acoustic and articulatory information, an ASR-based MDD system was built and evaluated across different feature combinations and Deep Neural Network (DNN) architectures. The MDD system captured mispronunciation errors with a detection accuracy of 82.4%, a diagnostic accuracy of 75.8%, and a false rejection rate of 17.2%. The results demonstrate the advantage of articulatory features both in revealing the significant contributors to mispronunciation and in improving the performance of MDD systems.
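    The detection and diagnosis figures above follow the usual MDD convention of scoring each phone against the expert transcript. A minimal sketch of how such metrics are commonly computed from confusion counts; the function name and all count values below are illustrative, not taken from the EMA-MAE study:

```python
def mdd_metrics(ta, fr, fa, tr, correct_diag):
    """Common mispronunciation detection/diagnosis metrics.

    ta: correctly pronounced phones accepted as correct (true acceptances)
    fr: correctly pronounced phones flagged as errors (false rejections)
    fa: mispronounced phones accepted as correct (false acceptances)
    tr: mispronounced phones detected as errors (true rejections)
    correct_diag: detected errors whose error type was identified correctly
    """
    detection_accuracy = (ta + tr) / (ta + fr + fa + tr)
    false_rejection_rate = fr / (ta + fr)    # share of good phones wrongly flagged
    diagnostic_accuracy = correct_diag / tr  # of detected errors, type named correctly
    return detection_accuracy, false_rejection_rate, diagnostic_accuracy

# Illustrative counts, not the study's data:
det, frr, diag = mdd_metrics(ta=70, fr=10, fa=5, tr=15, correct_diag=12)
```

    With these hypothetical counts, detection accuracy is 0.85, the false rejection rate 0.125, and diagnostic accuracy 0.8, mirroring how the three reported percentages trade off against one another.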

    Dealing with linguistic mismatches for automatic speech recognition

    Recent breakthroughs in automatic speech recognition (ASR) have resulted in a word error rate (WER) on par with human transcribers on the English Switchboard benchmark. However, dealing with linguistic mismatches between the training and testing data remains a significant unsolved challenge. In a monolingual setting, it is well known that the performance of ASR systems degrades significantly when presented with speech from speakers with accents, dialects, and speaking styles different from those encountered during system training. In a multilingual setting, ASR systems trained on a source language perform even worse when tested on another target language because of mismatches in the number of phonemes, lexical ambiguity, and the power of the phonotactic constraints provided by phone-level n-grams. To address these linguistic mismatches in current ASR systems, my dissertation investigates both knowledge-gnostic and knowledge-agnostic solutions. In the first part, classic theories from acoustics and articulatory phonetics that can be transferred across a dialect continuum, from local dialects to a standardized language, are revisited. Experiments demonstrate that acoustic correlates in the vicinity of landmarks could help bridge mismatches across different local or global varieties in a dialect continuum. In the second part, we design an end-to-end acoustic modeling approach based on the connectionist temporal classification loss and propose to link the training of acoustics and accent together, in a manner similar to the learning process in human speech perception. This joint model not only performed well on ASR with multiple accents but also boosted the accuracy of the accent identification task in comparison to separately trained models.

    Mispronunciation of High Front and Low Hausa Vowels among the Yorùbá Speakers

    Pronunciation in second language learning is sometimes challenging, especially for vowels. Vowels such as [i] and [a] are found in both Hausa and Yorùbá, but [i:] and [a:] are peculiar to Hausa alone. While Hausa has short and long vowels, Yorùbá has only oral and nasal vowels in its vowel inventory. Such phonemic differences constitute learning challenges, especially for Yorùbá native speakers. This cross-sectional study uses mixed methods to examine the production of the high front Hausa vowels [i] and [i:] and the low Hausa vowels [a] and [a:] by Yorùbá speakers, to identify which group performs better: group 1 (Yorùbá native speakers who learned Hausa in secondary school before entering the college of education) or group 2 (Yorùbá native speakers who learned Hausa informally before entering the college of education). The study also seeks to identify the vowel substitutions that occur in the pronunciation tasks, using 80 participants aged 18 and above from the College of Education system in Nigeria, selected by purposive sampling. The findings were discussed in line with Flege and Bohn's (2020) Revised Speech Learning Model. Eight stimuli were audio-recorded, transcribed, and rated by two independent raters, complemented by participant observation techniques. The results of the Mann-Whitney test revealed that group 2 performed better than group 1. The study also found that the short [a] in the first and second syllables had the highest frequency of substitution compared to the [i], [i:], and [a:] vowels. Such problems have pedagogical implications for learning Hausa as a second language.
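    The Mann-Whitney test used to compare the two groups is a rank-based alternative to the t-test, suited to the ordinal rater scores here. A minimal sketch of the U statistic on hypothetical rating scores; tie correction and the p-value from the normal approximation are omitted (a library routine such as scipy.stats.mannwhitneyu would supply them):

```python
def mann_whitney_u(group1, group2):
    """U1 counts, over all cross-group pairs, the 'wins' for group1
    (a tie counts as half a win); U2 is the complementary count."""
    u1 = sum(1.0 if x > y else 0.5 if x == y else 0.0
             for x in group1 for y in group2)
    u2 = len(group1) * len(group2) - u1
    return u1, u2

# Hypothetical pronunciation scores for two learner groups:
u1, u2 = mann_whitney_u([6, 7, 7, 9], [4, 5, 6, 6])
```

    The smaller of U1 and U2 is compared against a critical value (or converted to a z-score for larger samples); a very lopsided split, as in this toy example, indicates that one group's scores systematically outrank the other's.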

    English as a lingua franca: mutual intelligibility of Chinese, Dutch and American speakers of English

    English has become the language of international communication. As a result of this development, we are now confronted with a bewildering variety of ‘Englishes’, spoken with non-native accents. Research determining how intelligible non-native speakers of varying native-language backgrounds are to each other and to native speakers of English has only just started to receive attention. This thesis investigated to what extent Chinese, Dutch and American speakers of English are mutually intelligible. Intelligibility of vowels, simplex consonants and consonant clusters was tested in meaningless sound sequences, as well as in words in meaningless and meaningful short sentences. Speakers (one male, one female per language background) were selected so as to be optimally representative of their peer groups, which were made up of young academic users of English. Intelligibility was tested for all nine combinations of speaker and listener backgrounds. Results show that Chinese-accented English is less intelligible overall than Dutch-accented English, which is less intelligible than American English. Generally, the native-language background of the speaker was less important for the intelligibility than the background of the listener. Also, the results reveal a clear and consistent so-called interlanguage speech intelligibility benefit: speakers of English – whether foreign or native – are more intelligible to listeners with whom they share the native-language background than to listeners with a different native language.

    LEI Universiteit Leiden; China Scholarship Council; Leids Universiteits Fonds; Theoretical and Experimental Linguistics

    The Effect of Speech Elicitation Method on Second Language Phonemic Accuracy

    The present study, a One-Group Posttest-Only Repeated-Measures Design, examined the effect of speech elicitation method on second language (L2) phonemic accuracy of high functional load initial phonemes found in frequently occurring nouns in American English. This effect was further analyzed by including the variable of first language (L1) to determine if L1 moderated any effects found. The data consisted of audio recordings of 61 adult English learners (ELs) enrolled in English for Academic Purposes (EAP) courses at a large, public, post-secondary institution in the United States. Phonemic accuracy was judged by two independent raters as either approximating a standard American English (SAE) pronunciation of the intended phoneme or not, thus a dichotomous scale, and scores were assigned to each participant in terms of the three speech elicitation methods of word reading, word repetition, and picture naming. Results from a repeated measures ANOVA test revealed a statistically significant difference in phonemic accuracy (F(1.47, 87.93) = 25.94, p < .001) based on speech elicitation method, while the two-factor mixed design ANOVA test indicated no statistically significant differences for the moderator variable of native language. However, post-hoc analyses revealed that mean scores on picture naming tasks differed significantly from the other two elicitation methods of word reading and word repetition. The results of this study should heighten attention to the role that various speech elicitation methods, or input modalities, might play in L2 productive accuracy. Implications for practical application suggest that caution should be used when utilizing pictures to elicit specific vocabulary words, even high-frequency words, as they might result in erroneous productions or no utterance at all. These findings could inform pronunciation instructors about best teaching practices when pronunciation accuracy is the objective. Finally, the impact of L1 on L2 pronunciation accuracy might not be as important as once thought.

    The Perceptual and Production Training of /d, tap, r/ in L2 Spanish: Behavioral, Psycholinguistic, and Neurolinguistic Evidence

    When native speakers of American English begin learning Spanish, their acquisition of native-like pronunciation can be hampered by the tap-trill distinction in words like coro `choir' and corro `I run'. The trill proves difficult because it does not exist in English. Although the tap exists as an allophone of /t/ and /d/ in English words like `writer' and `rider', students of Spanish must learn to process it as a phoneme rather than an allophone. Similarly, learners have difficulty acquiring the spirantization of voiced stops, where the /d/ in codo `elbow' is produced as a voiced dental fricative or approximant, which is more like the `th' sound in English. This study investigates whether American English-speaking learners of Spanish can be trained to perceive and produce the intervocalic tap, trill, and /d/ contrasts in Spanish. Participants were trained using both perceptual and production training methods. Past research has reported that perceptual training alone improves both perception and production, and that production training alone improves both as well, but the production training studies have not been limited to production, as trainees have been able to listen to the training stimuli. This study is important because it systematically controls both training modalities so that they can be directly compared, and it introduces a third training methodology that includes both perception and production to discover whether perceptual training, production training, or a combination of the two is most effective. This study also uses cross-modal priming and ERP data in addition to traditional identification and production tasks to evaluate the effect of training, an innovative use of both tasks to determine not only whether trainees perceive and produce the trained L2 contrasts, but also whether they unconsciously process these contrasts and have built new phonemic categories for these sounds.
    All three training paradigms improved English learners' perception or production. While production trainees did not improve in their overall perception and declined in their perception of one contrast, perception trainees improved in their production and overall perception, indicating that perception training transfers more effectively than production training.

    The Effect of Shadowing in Learning L2 Segments: A Perspective from Phonetic Convergence

    This study aimed to investigate the role that phonetic convergence plays in the acquisition of L2 segments. In particular, it examined whether phonetic convergence towards native speakers could help Arabic-speaking second-language (L2) learners of English improve their pronunciation of four problematic English segments (/p, v, ɛ, oʊ/). To do so, the study went through several phases of experimental studies. Phonetic convergence was first explored in the productions of Arabic L2 learners towards five different English native model talkers in a non-interactive setting. Five XAB perceptual similarity judgments and acoustic measurements of VOT, vowel duration, F0, and F1*F2 were used to evaluate phonetic convergence. Based mainly on perceptual measures of phonetic convergence, learners were divided evenly between two groups. C-group (convergence group) received phonetic production training from the model talkers to whom they showed the highest degree of phonetic convergence, while D-group (divergence group) received training from the model talkers they showed divergence from or the least convergence to. Training lasted three consecutive days, with the target segments (i.e., /p, v, ɛ, oʊ/) presented in nonsense words. Learners were trained using the shadowing technique under a low-variability training paradigm in which each learner received training from one native model talker. Native-speaker judgments of segmental intelligibility indicated that both groups showed significant improvement on the post-test; however, no significant differences were found between groups in the overall magnitude of this change. Perceived convergence in learners' speech failed to explain the improvement. However, some patterns of acoustic convergence towards their trainers, regardless of group, predicted the overall segmental intelligibility gains. The findings suggested that the more trainees converged their vowel duration and formants to their trainers, the more their performance improved.
    At the featural level, the study examined the relationship between the preexisting phonetic distance between the Arabic L2 learners of English and the model talkers before exposure and the degree of convergence. Results indicated a direct relationship between how far Arabic L2 learners were from the native model talkers and the degree of convergence in all measured acoustic features. That is, the greater the baseline distance, the greater the degree of phonetic convergence. However, such a relationship might be due to the metric used to assess phonetic convergence: the relationship between phonetic convergence measured by difference in distance (DID) and the absolute baseline distance is always biased due to the way they are calculated (Cohen Priva & Sanker, 2019; MacLeod, 2021). This study found shadowing to be an effective technique for promoting segmental intelligibility among Arabic speakers learning English as an L2. However, this effectiveness might be increased by trainees converging more to their trainers in vowel duration and vowel spectra, or by being similar to their trainers in this regard from the beginning.
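    The difference-in-distance (DID) measure discussed above can be sketched for a single acoustic feature as follows; the feature values are illustrative, not the study's data. Note that DID can never exceed the baseline distance, which is one face of the coupling between DID and baseline distance that Cohen Priva and Sanker (2019) and MacLeod (2021) identify as a source of bias:

```python
def did(model_value, learner_baseline, learner_shadowed):
    """Difference-in-distance convergence for one acoustic feature
    (e.g., vowel duration in ms): positive values mean the learner
    moved toward the model talker after exposure, negative values
    mean divergence."""
    baseline_dist = abs(model_value - learner_baseline)
    shadowed_dist = abs(model_value - learner_shadowed)
    return baseline_dist - shadowed_dist

# A learner starting 40 ms from the model and ending 20 ms away
# converged by 20 ms; a learner starting only 10 ms away could
# never score a DID above 10 ms, however faithfully they shadowed.
convergence = did(model_value=100, learner_baseline=140, learner_shadowed=120)
```

    Because the maximum attainable DID grows with baseline distance, learners who start far from the model can post large convergence scores that learners who start close can never match, which is exactly why the abstract cautions against reading the distance-convergence correlation at face value.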

    A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

    The contributions in this Festschrift were written by Ocke's current and former PhD students, colleagues, and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language through gradually expanding objects of linguistic inquiry to the highest levels of description - all of which have formed a part of Ocke's career, in connection with his teaching and/or his academic productions: "Segments", "Perception of Accent", "Between Sounds and Graphemes", "Prosody", "Morphology and Syntax", and "Second Language Acquisition". Each one of these illustrates a sound approach to language matters.