298 research outputs found

    How to Teach English Phonetic Components to Speakers of Japanese : A Review of Previous Studies

    The purpose of this paper is to compare some salient suprasegmental and segmental differences between English and Japanese, to identify difficulties that Japanese learners of English may encounter, and to revisit effective pronunciation teaching based on previous research findings. The current tendency is to focus on integrating suprasegmental and segmental features rather than choosing either of them. Regarding priorities in teaching pronunciation, a consensus among experienced teachers on relatively important pronunciation features for Japanese learners of English was recently provided by Saito (2014). As for teachers’ roles and responsibilities, there are many factors (e.g., learners’ goals, proficiency level, and developmental stage) which pronunciation teachers need to take into account. With this in mind, teachers are responsible for selecting appropriate approaches, instructional materials, and learning activities from a wide range. Above all, phonemic distinction exercises and articulatory descriptions or diagrams seem beneficial, especially for beginners. Form-Focused Instruction (FFI) and social interaction with others are regarded as effective approaches to improving phonetic abilities in spontaneous communication

    The Effect Of Input Modality On Pronunciation Accuracy Of English Language Learners

    The issues relative to foreign accent continue to puzzle second language researchers, educators, and learners today. Although once thought to be at the root, maturational constraints have fallen short of definitively accounting for the myriad levels and rates of phonological attainment (Bialystok & Miller, 1999, p. 128). This study, a Posttest-only Control Group Design, examined how the pronunciation accuracy of adult English language learners, as demonstrated by utterance length, was related to two input stimuli: auditory-only input and auditory-orthographic input. Utterance length and input modality were further examined with the added variables of native language, specifically Arabic and Spanish, and second language proficiency as defined by unofficial TOEFL Listening Comprehension and Reading Comprehension section scores. Results from independent t tests indicated a statistically significant difference in utterance length based on input modality (t(192) = -3.285, p = .001), while with the added variable of native language, factorial ANOVA results indicated no statistically significant difference for the population studied. In addition, multiple linear regression analyses examined input modality and second language proficiency as predictors of utterance length accuracy and revealed a statistically significant relationship (R² = .108, adjusted R² = .089, F(3, 144) = 5.805, p = .001), with 11% of the utterance length variance accounted for by these two factors.
Lastly, hierarchical regressions applied to two blocks of factors revealed statistical significance: (a) input modality/native language (R² = .069, adjusted R² = .048, F(2, 87) = 3.230, p = .044) and ListenComp (R² = .101, adjusted R² = .070, F(3, 86) = 3.232, p = .026), with ListenComp increasing the predictive power by 3%; (b) input modality/native language (R² = .069, adjusted R² = .048, F(2, 87) = 3.230, p = .044) and ReadComp (R² = .112, adjusted R² = .081, F(1, 86) = 3.629, p = .016), with ReadComp increasing the predictive power by 4%; and (c) input modality/native language (R² = .069, adjusted R² = .048, F(2, 87) = 3.230, p = .044) and ListenComp/ReadComp (R² = .114, adjusted R² = .072, F(2, 85) = 2.129, p = .035), with ListenComp/ReadComp increasing the predictive power by 4%. The implications of this research are that considering input modality and second language proficiency level, especially when teaching new vocabulary to adult second language learners, maximizes the potential for improved pronunciation accuracy. Furthermore, heightened attention to the role of input modality as a cognitive factor in phonological output in second language teaching and learning may redirect the manner in which target language phonology is approached
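The fit statistics reported for the multiple regression can be cross-checked from R² and the degrees of freedom alone. The sketch below is a minimal illustration, not the study's analysis: it assumes the standard adjusted-R² and overall-F formulas, and it infers the number of predictors (k = 3) and the sample size (n = 148) from the reported F(3, 144). The function names are hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F statistic of the regression, with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Assumption: F(3, 144) means k = 3 predictors and 144 residual degrees of freedom,
# so the sample size is n = 144 + 3 + 1 = 148.
k = 3
df_residual = 144
n = df_residual + k + 1

adj = adjusted_r2(0.108, n, k)
f_value = f_statistic(0.108, n, k)
print(f"adjusted R^2 = {adj:.3f}, F({k}, {df_residual}) = {f_value:.2f}")
```

With R² rounded to .108, the adjusted value comes out at about .089, as reported, and F at about 5.81, close to the reported 5.805. The small gap is consistent with R² having been rounded to three decimals.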

    The effects of perception- vs. production-based pronunciation instruction

    While research has shown that provision of explicit pronunciation instruction (PI) is facilitative of various aspects of second language (L2) speech learning (Thomson & Derwing, 2015), a growing number of scholars have begun to examine which type of instruction can best impact on acquisition. In the current study, we explored the effects of perception- vs. production-based methods of PI among tertiary-level Japanese students of English. Participants (N = 115) received two weeks of instruction on either segmental or suprasegmental features of English, using either a perception- or a production-based method, with progress assessed in a pre/post/delayed posttest study design. Although all four treatment groups demonstrated major gains in pronunciation accuracy, performance varied considerably across groups and over time. A close examination of our findings suggested that perception-based training may be the more effective training method across both segmental and suprasegmental features

    CAPT를 위한 발음 변이 분석 및 CycleGAN 기반 피드백 생성 [Pronunciation Variation Analysis and CycleGAN-Based Feedback Generation for CAPT]

    Doctoral thesis -- Seoul National University Graduate School: College of Humanities, Interdisciplinary Program in Cognitive Science, 2020. 2. 정민화. Despite the growing popularity of learning Korean as a foreign language and the rapid development of language learning applications, the existing computer-assisted pronunciation training (CAPT) systems for Korean do not utilize linguistic characteristics of non-native Korean speech. Pronunciation variations in non-native speech are far more diverse than those observed in native speech, which may pose a difficulty in combining such knowledge in an automatic system. Moreover, most of the existing methods rely on feature extraction results from signal processing, prosodic analysis, and natural language processing techniques. Such methods entail limitations since they necessarily depend on finding the right features for the task and on the extraction accuracies. This thesis presents a new approach for corrective feedback generation in a CAPT system, in which pronunciation variation patterns and linguistic correlates of accentedness are analyzed and combined with a deep neural network approach, so that feature engineering efforts are minimized while maintaining the linguistically important factors for the corrective feedback generation task. Investigations of non-native Korean speech characteristics in contrast with those of native speakers, and their correlation with accentedness judgement, show that both segmental and prosodic variations are important factors in a Korean CAPT system. The present thesis argues that the feedback generation task can be interpreted as a style transfer problem, and proposes to evaluate the idea using a generative adversarial network. A corrective feedback generation model is trained on 65,100 read utterances by 217 non-native speakers of 27 mother tongue backgrounds. The features are automatically learnt in an unsupervised way in an auxiliary classifier CycleGAN setting, in which the generator learns to map foreign-accented speech to native speech distributions.
In order to inject linguistic knowledge into the network, an auxiliary classifier is trained so that the feedback also identifies the linguistic error types that were defined in the first half of the thesis. The proposed approach generates a corrected version of the speech using the learner's own voice, outperforming the conventional Pitch-Synchronous Overlap-and-Add method. Abstract in Korean (translated): Interest in Korean as a foreign language is rising, the number of learners of Korean is growing rapidly, and research on computer-assisted pronunciation training (CAPT) applications that apply spoken language processing technology is also being actively pursued. Nevertheless, existing Korean speaking-instruction systems do not make sufficient use of the linguistic characteristics of Korean spoken by foreigners, nor do they apply the latest language processing technology. Possible reasons are that non-native Korean speech has not been analyzed sufficiently and that, even where relevant studies exist, more advanced research is needed to incorporate them into an automated system. Moreover, CAPT technology in general relies on feature extraction through signal processing, prosodic analysis, and natural language processing techniques, so finding suitable features and extracting them accurately takes considerable time and effort. This suggests that the process has much room for improvement through the latest deep-learning-based language processing technology. This study therefore first analyzed pronunciation variation patterns and their linguistic correlates for CAPT system development. Variation patterns in read speech by non-native speakers were contrasted with those of native Korean speakers to identify the salient variations, and correlation analysis was then used to determine how strongly each affects communication. The results showed that coda deletion, confusion of the three-way contrast, and suprasegmental errors should be given priority in feedback generation. Automatically generating corrective feedback is one of the key tasks of a CAPT system. This study regarded the task as a speech style transfer problem and proposed modeling it in a cycle-consistent generative adversarial network (CycleGAN) architecture. The generator of the GAN learns a mapping between the distribution of non-native speech and that of native speech, and the cycle-consistency loss preserves the overall structure of the utterance while preventing over-correction. The necessary features are learned in an unsupervised manner within the CycleGAN framework, without a separate feature extraction step, which makes the method easy to extend to other languages. The priorities among the salient variations revealed by the linguistic analysis were proposed to be modeled in an auxiliary classifier CycleGAN architecture. This method incorporates that knowledge into the existing CycleGAN, generating the feedback speech while simultaneously classifying the type of error the feedback addresses. Its significance lies in the fact that domain knowledge is retained, and remains controllable, through to the corrective feedback generation stage. To evaluate the proposed method, an automatic feedback generation model was trained on 65,100 meaningful-word utterances by 217 speakers of 27 native languages, and a perceptual evaluation of whether and how much the speech improved was carried out. With the proposed method, speech can be converted to corrected pronunciation while preserving the learner's own voice, with a relative improvement of 16.67% over the conventional Pitch-Synchronous Overlap-and-Add algorithm.
Contents: Chapter 1. Introduction; 1.1. Motivation; 1.1.1. An Overview of CAPT Systems; 1.1.2. Survey of existing Korean CAPT Systems; 1.2. Problem Statement; 1.3. Thesis Structure. Chapter 2. Pronunciation Analysis of Korean Produced by Chinese; 2.1. Comparison between Korean and Chinese; 2.1.1. Phonetic and Syllable Structure Comparisons; 2.1.2. Phonological Comparisons; 2.2. Related Works; 2.3. Proposed Analysis Method; 2.3.1. Corpus; 2.3.2. Transcribers and Agreement Rates; 2.4. Salient Pronunciation Variations; 2.4.1. Segmental Variation Patterns; 2.4.1.1. Discussions; 2.4.2. Phonological Variation Patterns; 2.4.1.2. Discussions; 2.5. Summary. Chapter 3. Correlation Analysis of Pronunciation Variations and Human Evaluation; 3.1. Related Works; 3.1.1. Criteria used in L2 Speech; 3.1.2. Criteria used in L2 Korean Speech; 3.2. Proposed Human Evaluation Method; 3.2.1. Reading Prompt Design; 3.2.2. Evaluation Criteria Design; 3.2.3. Raters and Agreement Rates; 3.3. Linguistic Factors Affecting L2 Korean Accentedness; 3.3.1. Pearson's Correlation Analysis; 3.3.2. Discussions; 3.3.3. Implications for Automatic Feedback Generation; 3.4. Summary. Chapter 4. Corrective Feedback Generation for CAPT; 4.1. Related Works; 4.1.1. Prosody Transplantation; 4.1.2. Recent Speech Conversion Methods; 4.1.3. Evaluation of Corrective Feedback; 4.2. Proposed Method: Corrective Feedback as a Style Transfer; 4.2.1. Speech Analysis at Spectral Domain; 4.2.2. Self-imitative Learning; 4.2.3. An Analogy: CAPT System and GAN Architecture; 4.3. Generative Adversarial Networks; 4.3.1. Conditional GAN; 4.3.2. CycleGAN; 4.4. Experiment; 4.4.1. Corpus; 4.4.2. Baseline Implementation; 4.4.3. Adversarial Training Implementation; 4.4.4. Spectrogram-to-Spectrogram Training; 4.5. Results and Evaluation; 4.5.1. Spectrogram Generation Results; 4.5.2. Perceptual Evaluation; 4.5.3. Discussions; 4.6. Summary. Chapter 5. Integration of Linguistic Knowledge in an Auxiliary Classifier CycleGAN for Feedback Generation; 5.1. Linguistic Class Selection; 5.2. Auxiliary Classifier CycleGAN Design; 5.3. Experiment and Results; 5.3.1. Corpus; 5.3.2. Feature Annotations; 5.3.3. Experiment Setup; 5.3.4. Results; 5.4. Summary. Chapter 6. Conclusion; 6.1. Thesis Results; 6.2. Thesis Contributions; 6.3. Recommendations for Future Work. Bibliography. Appendix. Abstract in Korean. Acknowledgments.
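The cycle-consistency idea mentioned in this abstract can be illustrated with a toy example. This is a sketch of the standard CycleGAN cycle-consistency term (L1 reconstruction error in both directions), not the thesis's model: the mapping functions are hypothetical stand-ins for trained generators, and the "frames" are short lists of numbers rather than spectrograms.

```python
from typing import Callable, List, Sequence

Frame = List[float]
Mapping = Callable[[Frame], Frame]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    to_native: Mapping,
    to_accented: Mapping,
    accented_batch: Sequence[Frame],
    native_batch: Sequence[Frame],
) -> float:
    """L1 error of accented -> native -> accented plus native -> accented -> native."""
    forward = sum(
        l1_distance(to_accented(to_native(x)), x) for x in accented_batch
    ) / len(accented_batch)
    backward = sum(
        l1_distance(to_native(to_accented(y)), y) for y in native_batch
    ) / len(native_batch)
    return forward + backward


# Hypothetical stand-ins for the two generators (assumptions, for illustration only).
def shift_up(frame: Frame) -> Frame:
    return [v + 1.0 for v in frame]


def shift_down(frame: Frame) -> Frame:
    return [v - 1.0 for v in frame]


def shift_down_half(frame: Frame) -> Frame:
    return [v - 0.5 for v in frame]


accented = [[0.0, 1.0, 2.0]]
native = [[1.0, 2.0, 3.0]]

# Exact inverses reconstruct every frame, so the loss is zero.
consistent = cycle_consistency_loss(shift_up, shift_down, accented, native)
# A mismatched pair leaves a residual of 0.5 in each direction.
inconsistent = cycle_consistency_loss(shift_up, shift_down_half, accented, native)
print(consistent, inconsistent)
```

A low value means that converting an utterance and converting it back recovers the original, which is the property the abstract credits with preserving overall utterance structure and limiting over-correction.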

    Restructuring multimodal corrective feedback through Augmented Reality (AR)-enabled videoconferencing in L2 pronunciation teaching

    The problem of cognitive overload is particularly pertinent in multimedia L2 classroom corrective feedback (CF), which involves rich communicative tools to help the class notice the mismatch between the target input and learners’ pronunciation. Based on multimedia design principles, this study developed a new multimodal CF model through augmented reality (AR)-enabled videoconferencing to eliminate extraneous cognitive load and guide learners’ attention to the essential material. Using a quasi-experimental design, this study aims to examine the effectiveness of this new CF model in improving Chinese L2 students’ segmental production and identification of the targeted English consonants (dark /ɫ/, /ð/ and /θ/), as well as their attitudes towards this application. Results indicated that the online multimodal CF environment equipped with AR annotation and filters played a significant role in improving the participants’ production of the target segments. However, this advantage was not found in the auditory identification tests compared to the offline CF multimedia class. In addition, the learners reported that the new CF model helped to direct their attention to the articulatory gestures of the student being corrected, and to enhance class efficiency. Implications for computer-assisted pronunciation training and the construction of online/offline multimedia learning environments are also discussed

    A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

    The contributions in this Festschrift were written by Ocke’s current and former PhD-students, colleagues and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language, through gradually expanding objects of linguistic inquiry to the highest levels of description - all of which have formed a part of Ocke’s career, in connection with his teaching and/or his academic productions: “Segments”, “Perception of Accent”, “Between Sounds and Graphemes”, “Prosody”, “Morphology and Syntax” and “Second Language Acquisition”. Each one of these illustrates a sound approach to language matters

    Výuka výslovnosti angličtiny jako cizího jazyka [Teaching the Pronunciation of English as a Foreign Language]

    Pronunciation instruction in the TEFL classroom has long been a neglected area regardless of its importance for the students. The data in the literature show that teachers are generally not ready to provide pronunciation instruction for a variety of reasons: insufficient qualification and training, and a lack of theoretical and practical knowledge, time, and motivation. The present thesis explores the current situation of pronunciation instruction at a private language school in the Czech Republic using classroom observations and teacher and student surveys. The results confirm the initial hypothesis that pronunciation instruction, including pronunciation error correction, is nearly non-existent or occurs only sporadically in the classroom. Only one out of four teachers (T1) included explicit pronunciation information in his teaching. The only pronunciation error correction technique observed with the four teachers was the recast, that is, a repetition of the word or phrase with the correct pronunciation, which proved to be ineffective in most cases. Even though the teachers and students are generally aware of the importance of pronunciation in foreign language acquisition, their individual beliefs and attitudes towards pronunciation learning and teaching differ greatly. Key words: pronunciation, TEFL, explicit instruction, segmental features, suprasegmental features, teacher and student cognition. Ústav anglického jazyka a didaktiky / Department of the English Language and ELT Methodology; Filozofická fakulta / Faculty of Art

    Children's Sensitivity to Pitch Variation in Language

    Children acquire consonant and vowel categories by 12 months, but take much longer to learn to interpret perceptible variation. This dissertation considers children’s interpretation of pitch variation. Pitch operates, often simultaneously, at different levels of linguistic structure. English-learning children must disregard pitch at the lexical level—since English is not a tone language—while still attending to pitch for its other functions. Chapters 1 and 5 outline the learning problem and suggest ways children might solve it. Chapter 2 demonstrates that 2.5-year-olds know pitch cannot differentiate words in English. Chapter 3 finds that not until age 4–5 do children correctly interpret pitch cues to emotions. Chapter 4 demonstrates some sensitivity between 2.5 and 5 years to the pitch cue to lexical stress, but continuing difficulties at the older ages. These findings suggest a late trajectory for interpretation of prosodic variation; throughout, I propose explanations for this protracted time-course