6 research outputs found

    Foneettinen sujuvuus suomessa toisena kielenä: Lukiolaisten spontaanin puheen akustinen analyysi

    Speaking fluently is an important goal for second language (L2) learners. In L2 research, fluency is often studied by measuring temporal features in speech. These features include speed (rate of speech), breakdown (use of silent and filled pauses), and repair (self-corrections and repetitions) phenomena. Fluent speakers generally have a higher rate of speech and fewer hesitations and interruptions than beginner language learners. In this thesis, phonetic fluency of high school students’ L2 Finnish speech is studied in relation to human ratings of fluency and overall proficiency. The topic is essential for the development of automated assessment of L2 speech, as phonetic fluency measures can be used for predicting a speaker’s fluency and proficiency level automatically. Although the effect of different fluency measures on perceived fluency level has been widely studied during the last decades, research on phonetic fluency in Finnish as L2 is still limited. Phonetic fluency in high school students’ speech in L2 Finnish has not been studied before. The speech samples and ratings used in this thesis are a part of a larger dataset collected in the DigiTala research project. The analyzed data contained spontaneous speech samples in L2 Finnish from 53 high school students of different language backgrounds. All samples were assessed by expert raters for fluency and overall proficiency. The speech samples were annotated by marking intervals containing silent pauses, filled pauses, corrections and repetitions, and individual words. Several phonetic fluency measures were calculated for each sample from the durations of the annotated intervals. The contribution of phonetic fluency measures to human ratings of fluency and proficiency was studied using simple and multiple linear regression models. Speech rate was found to be the strongest predictor for both fluency and proficiency ratings in simple linear regression. 
Articulation rate, portion of long silent pauses, mean duration of long silent pauses, mean duration of breaks between utterances, and rate of short silent pauses per minute were also statistically significant predictors of both fluency and proficiency ratings. Multiple linear regression models improved on the simple models for both fluency and proficiency: for fluency, a model combining articulation rate and the portion of long silent pauses performed best, and for proficiency, a model combining speech rate and mean duration of short silent pauses. Perceived fluency level is often affected by a combination of different phonetic fluency measures, and it seems that human raters base their assessments on this combination, although some phonetic fluency measures might be more important on their own than others. The findings of this thesis expand previous knowledge on phonetic fluency in L2 Finnish and can benefit both language learners and teachers, as well as developers of automatic assessment of L2 speech.
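The abstract above derives fluency measures from the durations of annotated intervals. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the thesis's actual procedure: the `Interval` type, the label names, the use of words as the counting unit, the 0.25 s threshold for a "long" pause, and the definition of the long-pause portion as a share of total sample duration are all hypothetical choices made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One annotated stretch of a speech sample (times in seconds)."""
    label: str   # hypothetical labels: "word", "silent_pause", "filled_pause"
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


# Assumed threshold separating short and long silent pauses; the abstract
# does not state the value used in the thesis.
LONG_PAUSE_S = 0.25


def fluency_measures(intervals: list[Interval]) -> dict[str, float]:
    """Compute three illustrative temporal measures from annotated intervals."""
    if not intervals:
        raise ValueError("at least one interval is required")

    total = max(i.end for i in intervals) - min(i.start for i in intervals)
    words = [i for i in intervals if i.label == "word"]
    speaking_time = sum(w.duration for w in words)
    if total <= 0 or speaking_time <= 0:
        raise ValueError("sample must contain speech with positive duration")

    long_pause_time = sum(
        i.duration
        for i in intervals
        if i.label == "silent_pause" and i.duration >= LONG_PAUSE_S
    )

    return {
        # Words per minute over the whole sample, pauses included.
        "speech_rate_wpm": 60.0 * len(words) / total,
        # Words per minute over time spent producing words only.
        "articulation_rate_wpm": 60.0 * len(words) / speaking_time,
        # Share of the sample taken up by long silent pauses.
        "long_pause_portion": long_pause_time / total,
    }


# Made-up 3-second sample: four words, one long and one short silent pause.
SAMPLE = [
    Interval("word", 0.0, 0.5),
    Interval("word", 0.5, 1.0),
    Interval("silent_pause", 1.0, 1.5),
    Interval("word", 1.5, 2.0),
    Interval("silent_pause", 2.0, 2.1),
    Interval("word", 2.1, 3.0),
]

if __name__ == "__main__":
    for name, value in fluency_measures(SAMPLE).items():
        print(f"{name}: {value:.3f}")
```

For the made-up sample, speech rate comes out at about 80 words per minute and articulation rate at about 100, because the latter excludes pause time from the denominator. Measures like these could then serve as predictors in a regression against rater scores, as the abstract describes.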

    Fluency-related Temporal Features and Syllable Prominence as Prosodic Proficiency Predictors for Learners of English with Different Language Backgrounds

    Prosodic features are important in achieving intelligibility, comprehensibility, and fluency in a second or foreign language (L2). However, research on the assessment of prosody as part of oral proficiency remains scarce. Moreover, the acoustic analysis of L2 prosody has often focused on fluency-related temporal measures, neglecting language-dependent stress features that can be quantified in terms of syllable prominence. Introducing the evaluation of prominence-related measures can be of use in developing both teaching and assessment of L2 speaking skills. In this study we compare temporal measures and syllable prominence estimates as predictors of prosodic proficiency in non-native speakers of English with respect to the speaker's native language (L1). The predictive power of temporal and prominence measures was evaluated for utterance-sized samples produced by language learners from four different L1 backgrounds: Czech, Slovak, Polish, and Hungarian. Firstly, the speech samples were assessed using the revised Common European Framework of Reference scale for prosodic features. The assessed speech samples were then analyzed to derive articulation rate and three fluency measures. Syllable-level prominence was estimated by a continuous wavelet transform analysis using combinations of F0, energy, and syllable duration. The results show that the temporal measures serve as reliable predictors of prosodic proficiency in the L2, with prominence measures providing a small but significant improvement to prosodic proficiency predictions. The predictive power of the individual measures varies both quantitatively and qualitatively depending on the L1 of the speaker. We conclude that the possible effects of the speaker's L1 on the production of L2 prosody in terms of temporal features as well as syllable prominence deserve more attention in applied research and in developing teaching and assessment methods for spoken L2. Peer reviewed.

    Reading Development in Adolescent First and Second Language English Learners: A Comparison Using Age Match Design

    Fourteen Iranian-Canadian bilingual students were tested for language ability as well as cognitive and phonological processing skills in two languages: Farsi and English. They were compared to 30 Iranian monolingual chronological age matched students and 30 Canadian chronological age matched peers. Since there were no standardized tests in Farsi, one of the aims of this study was to begin creating language ability measures in Farsi and to test their reliability. Of the six developed and translated Farsi tasks, three were found to be reliable. Bilingual students performed better on memory tasks than the two monolingual groups. There were no group differences on English measures of reading comprehension and word reading between Iranian bilingual students and their English age matched peers. Additionally, the results of this study showed that Iranian bilinguals performed better on the measure of receptive vocabulary, knowing more English words than Canadian monolinguals. This finding could be explained by the higher socio-economic status and greater number of English books that Iranian bilinguals have. The final key finding is that Iranian bilinguals performed more poorly on Farsi tasks, and better on English measures, than Iranian monolinguals.

    The development of automatic speech evaluation system for learners of English

    Degree system: new; report number: Kō (甲) No. 3183; degree type: Doctorate (Education); date conferred: 2010/11/30; Waseda University degree certificate number: Shin (新) 547

    Automatic Screening of Childhood Speech Sound Disorders and Detection of Associated Pronunciation Errors

    Speech disorders in children can affect their fluency and intelligibility. Delay in their diagnosis and treatment increases the risk of social impairment and learning disabilities. With the significant shortage of Speech and Language Pathologists (SLPs), there is an increasing interest in Computer-Aided Speech Therapy tools with automatic detection and diagnosis capability. However, the scarcity and unreliable annotation of disordered child speech corpora, along with the high acoustic variation in child speech data, have impeded the development of reliable automatic detection and diagnosis of childhood speech sound disorders. Therefore, this thesis investigates two types of detection systems that can be achieved with minimum dependency on annotated mispronounced speech data. First, a novel approach that adopts paralinguistic features representing the prosodic, spectral, and voice quality characteristics of the speech was proposed to perform segment- and subject-level classification of Typically Developing (TD) and Speech Sound Disordered (SSD) child speech using a binary Support Vector Machine (SVM) classifier. As paralinguistic features are both language- and content-independent, they can be extracted from an unannotated speech signal. Second, a novel Mispronunciation Detection and Diagnosis (MDD) approach was introduced to detect the pronunciation errors made due to SSDs and provide low-level diagnostic information that can be used in constructing formative feedback and a detailed diagnostic report. Unlike existing MDD methods, where detection and diagnosis are performed at the phoneme level, the proposed method achieved MDD at the speech attribute level, namely the manner and place of articulation. The speech attribute features describe the involved articulators and their interactions when making a speech sound, allowing a low-level description of the pronunciation error to be provided.
Two novel methods to model speech attributes are further proposed in this thesis: a frame-based (phoneme-alignment) method leveraging the Multi-Task Learning (MTL) criterion and training a separate model for each attribute, and an alignment-free jointly-learnt method based on the Connectionist Temporal Classification (CTC) sequence-to-sequence criterion. The proposed techniques have been evaluated using standard and publicly accessible adult and child speech corpora, while the MDD method has been validated using L2 speech corpora.