6 research outputs found

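Phonetic Fluency in Finnish as a Second Language: An Acoustic Analysis of High School Students' Spontaneous Speech (Foneettinen sujuvuus suomessa toisena kielenä: Lukiolaisten spontaanin puheen akustinen analyysi)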

Author: Koivusalo Liisa
Publication venue: Helsingfors universitet
Publication date: 01/01/2022

Speaking fluently is an important goal for second language (L2) learners. In L2 research, fluency is often studied by measuring temporal features in speech. These features include speed (rate of speech), breakdown (use of silent and filled pauses), and repair (self-corrections and repetitions) phenomena. Fluent speakers generally have a higher rate of speech and fewer hesitations and interruptions than beginner language learners. In this thesis, phonetic fluency of high school students’ L2 Finnish speech is studied in relation to human ratings of fluency and overall proficiency. The topic is essential for the development of automated assessment of L2 speech, as phonetic fluency measures can be used for predicting a speaker’s fluency and proficiency level automatically. Although the effect of different fluency measures on perceived fluency level has been widely studied during the last decades, research on phonetic fluency in Finnish as L2 is still limited. Phonetic fluency in high school students’ speech in L2 Finnish has not been studied before. The speech samples and ratings used in this thesis are a part of a larger dataset collected in the DigiTala research project. The analyzed data contained spontaneous speech samples in L2 Finnish from 53 high school students of different language backgrounds. All samples were assessed by expert raters for fluency and overall proficiency. The speech samples were annotated by marking intervals containing silent pauses, filled pauses, corrections and repetitions, and individual words. Several phonetic fluency measures were calculated for each sample from the durations of the annotated intervals. The contribution of phonetic fluency measures to human ratings of fluency and proficiency was studied using simple and multiple linear regression models. Speech rate was found to be the strongest predictor for both fluency and proficiency ratings in simple linear regression. 
Articulation rate, portion of long silent pauses, mean duration of long silent pauses, mean duration of breaks between utterances, and rate of short silent pauses per minute were also statistically significant predictors of both fluency and proficiency ratings. Multiple linear regression models improved on the simple models for both fluency and proficiency: for fluency, a model combining articulation rate and the portion of long silent pauses performed best, and for proficiency, a model combining speech rate and mean duration of short silent pauses performed best. Perceived fluency level is often affected by a combination of different phonetic fluency measures, and it seems that human raters ground their assessments on this combination, although some phonetic fluency measures might be more important on their own than others. The findings of this thesis expand previous knowledge on phonetic fluency in L2 Finnish and can benefit both language learners and teachers, as well as developers of automatic assessment of L2 speech.

Helsingin yliopiston digitaalinen arkisto

Fluency-related Temporal Features and Syllable Prominence as Prosodic Proficiency Predictors for Learners of English with Different Language Backgrounds

Authors: Kallio Heini; Suni Antti; Šimko Juraj
Publication date: 01/09/2022

Prosodic features are important in achieving intelligibility, comprehensibility, and fluency in a second or foreign language (L2). However, research on the assessment of prosody as part of oral proficiency remains scarce. Moreover, the acoustic analysis of L2 prosody has often focused on fluency-related temporal measures, neglecting language-dependent stress features that can be quantified in terms of syllable prominence. Introducing the evaluation of prominence-related measures can be of use in developing both teaching and assessment of L2 speaking skills. In this study we compare temporal measures and syllable prominence estimates as predictors of prosodic proficiency in non-native speakers of English with respect to the speaker's native language (L1). The predictive power of temporal and prominence measures was evaluated for utterance-sized samples produced by language learners from four different L1 backgrounds: Czech, Slovak, Polish, and Hungarian. Firstly, the speech samples were assessed using the revised Common European Framework of Reference scale for prosodic features. The assessed speech samples were then analyzed to derive articulation rate and three fluency measures. Syllable-level prominence was estimated by a continuous wavelet transform analysis using combinations of F0, energy, and syllable duration. The results show that the temporal measures serve as reliable predictors of prosodic proficiency in the L2, with prominence measures providing a small but significant improvement to prosodic proficiency predictions. The predictive power of the individual measures varies both quantitatively and qualitatively depending on the L1 of the speaker. We conclude that the possible effects of the speaker's L1 on the production of L2 prosody in terms of temporal features as well as syllable prominence deserve more attention in applied research and developing teaching and assessment methods for spoken L2. Peer reviewed

PubMed Central

Helsingin yliopiston digitaalinen arkisto

Reading Development in Adolescent First and Second Language English Learners: A Comparison Using Age Match Design

Author: Shahidi Vahidehsadat
Publication venue: Scholars Commons @ Laurier
Publication date: 01/01/2011

Fourteen Iranian-Canadian bilingual students were tested for language ability as well as cognitive and phonological processing skills in two languages: Farsi and English. They were compared to 30 Iranian monolingual chronological-age-matched students and 30 Canadian chronological-age-matched peers. Since there were no standardized tests in Farsi, one aim of this study was to begin creating language ability measures in Farsi and to test their reliability. Of the six developed and translated Farsi tasks, three were found to be reliable. Bilingual students performed better on memory tasks than the two monolingual groups. There were no group differences on English measures of reading comprehension and word reading between the Iranian bilingual students and their English age-matched peers. Additionally, the results showed that Iranian bilinguals performed better on the measure of receptive vocabulary, knowing more English words than Canadian monolinguals. This finding could be explained by the higher socio-economic status of the Iranian bilinguals and the greater number of English books they have. The final key finding is that Iranian bilinguals performed more poorly on Farsi tasks, and better on English measures, than Iranian monolinguals.

Wilfrid Laurier University

Automatic Proficiency Evaluation of Spoken English by Japanese Learners for Dialogue-Based Language Learning System Based on Deep Learning

Author: FU JIANG
Publication date: 25/03/2020

Tohoku University; 伊藤彰則; 課

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

The development of automatic speech evaluation system for learners of English

Author: Kondo Yusuke
Publication date: 01/01/2010

System: new; Report number: 甲3183号; Degree type: Doctorate (Education); Date conferred: 2010/11/30; Waseda University diploma number: 新547

Waseda University Repository

Automatic Screening of Childhood Speech Sound Disorders and Detection of Associated Pronunciation Errors

Author: Shahin Mostafa
Publication venue: UNSW, Sydney
Publication date: 01/01/2023

Speech disorders in children can affect their fluency and intelligibility. Delay in their diagnosis and treatment increases the risk of social impairment and learning disabilities. With the significant shortage of Speech and Language Pathologists (SLPs), there is an increasing interest in Computer-Aided Speech Therapy tools with automatic detection and diagnosis capability. However, the scarcity and unreliable annotation of disordered child speech corpora, along with the high acoustic variation in child speech data, have impeded the development of reliable automatic detection and diagnosis of childhood speech sound disorders. Therefore, this thesis investigates two types of detection systems that can be achieved with minimum dependency on annotated mispronounced speech data. First, a novel approach that adopts paralinguistic features which represent the prosodic, spectral, and voice quality characteristics of the speech was proposed to perform segment- and subject-level classification of Typically Developing (TD) and Speech Sound Disordered (SSD) child speech using a binary Support Vector Machine (SVM) classifier. As paralinguistic features are both language- and content-independent, they can be extracted from an unannotated speech signal. Second, a novel Mispronunciation Detection and Diagnosis (MDD) approach was introduced to detect the pronunciation errors made due to SSDs and provide low-level diagnostic information that can be used in constructing formative feedback and a detailed diagnostic report. Unlike existing MDD methods where detection and diagnosis are performed at the phoneme level, the proposed method achieved MDD at the speech attribute level, namely the manners and places of articulation. The speech attribute features describe the involved articulators and their interactions when making a speech sound, allowing a low-level description of the pronunciation error to be provided.
Two novel methods to model speech attributes are further proposed in this thesis: a frame-based (phoneme-alignment) method leveraging the Multi-Task Learning (MTL) criterion and training a separate model for each attribute, and an alignment-free jointly-learnt method based on the Connectionist Temporal Classification (CTC) sequence-to-sequence criterion. The proposed techniques have been evaluated using standard and publicly accessible adult and child speech corpora, while the MDD method has been validated using L2 speech corpora.

UNSWorks