
    The Effect of Speech Elicitation Method on Second Language Phonemic Accuracy

    The present study, a one-group posttest-only repeated-measures design, examined the effect of speech elicitation method on second language (L2) phonemic accuracy of high functional load initial phonemes found in frequently occurring nouns in American English. This effect was further analyzed by including the variable of first language (L1) to determine whether L1 moderated any effects found. The data consisted of audio recordings of 61 adult English learners (ELs) enrolled in English for Academic Purposes (EAP) courses at a large, public, post-secondary institution in the United States. Phonemic accuracy was judged by two independent raters on a dichotomous scale: either approximating a standard American English (SAE) pronunciation of the intended phoneme or not. Scores were assigned to each participant for each of the three speech elicitation methods: word reading, word repetition, and picture naming. A repeated-measures ANOVA revealed a statistically significant difference in phonemic accuracy based on speech elicitation method (F(1.47, 87.93) = 25.94, p < .001), while a two-factor mixed-design ANOVA indicated no statistically significant effect for the moderator variable of native language. However, post-hoc analyses revealed that mean scores on picture-naming tasks differed significantly from those on the other two elicitation methods, word reading and word repetition. These results should heighten attention to the role that different speech elicitation methods, or input modalities, might play in L2 productive accuracy. For practical application, caution should be used when employing pictures to elicit specific vocabulary words, even high-frequency words, as they might result in erroneous productions or no utterance at all. These findings could inform pronunciation instructors about best teaching practices when pronunciation accuracy is the objective. Finally, the impact of L1 on L2 pronunciation accuracy might not be as important as once thought.
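The core analysis in this abstract is a one-way repeated-measures ANOVA over three within-subject conditions. The sketch below shows how such an F statistic is computed; the scores and subject count are invented for illustration and are not the study's data.

```python
# Illustrative one-way repeated-measures ANOVA, as used to compare
# phonemic-accuracy scores across three elicitation methods.
# All scores below are invented for demonstration only.

def rm_anova(scores):
    """scores: dict mapping condition name -> list of per-subject scores
    (same subject order in every condition). Returns (F, df1, df2)."""
    conds = list(scores)
    k = len(conds)                      # number of conditions
    n = len(scores[conds[0]])           # number of subjects
    grand = sum(sum(v) for v in scores.values()) / (n * k)

    # Sum of squares for conditions (between treatments)
    ss_cond = n * sum((sum(scores[c]) / n - grand) ** 2 for c in conds)
    # Sum of squares for subjects (removed from the error term)
    subj_means = [sum(scores[c][i] for c in conds) / k for i in range(n)]
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Total sum of squares
    ss_tot = sum((x - grand) ** 2 for c in conds for x in scores[c])
    # Error = total - conditions - subjects
    ss_err = ss_tot - ss_cond - ss_subj

    df1 = k - 1
    df2 = (n - 1) * (k - 1)
    F = (ss_cond / df1) / (ss_err / df2)
    return F, df1, df2

data = {
    "word_reading":    [0.92, 0.88, 0.95, 0.90, 0.85],
    "word_repetition": [0.90, 0.86, 0.93, 0.89, 0.84],
    "picture_naming":  [0.70, 0.65, 0.74, 0.68, 0.63],
}
F, df1, df2 = rm_anova(data)
print(f"F({df1}, {df2}) = {F:.2f}")
```

In practice a statistics package would also apply a sphericity correction (the study's fractional degrees of freedom, F(1.47, 87.93), indicate a Greenhouse-Geisser adjustment) and report post-hoc pairwise comparisons.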

    Voice onset time of Mankiyali language: an acoustic analysis

    The endangered Indo-Aryan language Mankiyali, spoken in northern Pakistan, lacks linguistic documentation and necessitates research. This study explores the voice onset time (VOT) values of Mankiyali's stop consonants to determine the duration of sound release, characterized as negative, positive, or zero VOT. The investigation aims to identify the laryngeal categories present in the language. Using a mixed-methods approach, data were collected from five native male speakers with a Zoom H6 recorder. The study employed the theoretical framework of Fant's (1970) source-filter model and analyzed each phoneme using the Praat software. Twenty-five tokens of a single phoneme were recorded across the five speakers. The results reveal that Mankiyali encompasses three laryngeal categories: voiceless unaspirated (VLUA) stops, voiceless aspirated (VLA) stops, and voiced unaspirated (VDUA) stops. The study highlights significant differences in VOT based on place of articulation and phonation. In terms of phonation, the VLUA bilabial stop /p/, alveolar stop /t/, and velar stop /k/ exhibit shorter voicing lag than their VLA counterparts /pʰ, tʰ, kʰ/. All VLUA and VLA stops display positive VOT values, while all VDUA stops exhibit negative VOT values. Regarding place of articulation, the bilabial /p/ demonstrates a longer voicing lag than the alveolar /t/ but a shorter lag than the velar /k/. Additionally, the results indicate similarities in voicing lag among the VDUA stops /b, d, ɡ/. This study offers valuable insights into the phonetic and phonological aspects of Mankiyali and holds potential significance for the language's preservation.
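The three laryngeal categories the study reports map directly onto the sign and size of the measured VOT. The sketch below shows that mapping; the aspiration threshold and the token values are illustrative, not Mankiyali measurements.

```python
# Minimal sketch of sorting measured VOT values (in ms) into the three
# laryngeal categories the study reports. The 35 ms aspiration threshold
# and the sample tokens are illustrative assumptions, not measured data.

def classify_vot(vot_ms, aspiration_threshold=35.0):
    """Negative VOT (voicing lead) -> voiced unaspirated;
    short positive lag -> voiceless unaspirated;
    long positive lag -> voiceless aspirated."""
    if vot_ms < 0:
        return "VDUA"   # voiced unaspirated: voicing starts before release
    if vot_ms < aspiration_threshold:
        return "VLUA"   # voiceless unaspirated: short voicing lag
    return "VLA"        # voiceless aspirated: long voicing lag

# Hypothetical tokens: (phoneme, measured VOT in ms)
tokens = [("b", -85.0), ("p", 18.0), ("pʰ", 62.0), ("t", 14.0), ("k", 28.0)]
for phoneme, vot in tokens:
    print(phoneme, classify_vot(vot))
```

In a real workflow the VOT values themselves would be measured in Praat (release burst to voicing onset), as the study describes.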

    Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

    Automatic speech recognition (ASR) has recently become an important application of deep learning (DL), but it requires large-scale training datasets and substantial computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, assume that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, does not hold in some real-world artificial intelligence (AI) applications. There are also situations where gathering real data is challenging, expensive, or rare, so the data requirements of DL models cannot be met. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models using real datasets that are small or slightly different from, but related to, the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and help academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to organize the state of the art. A critical analysis is then conducted to identify the limitations and advantages of each framework. A comparative study then highlights the current challenges before deriving opportunities for future research.

    Arabic Fluency Assessment: Procedures for Assessing Stuttering in Arabic Preschool Children

    The primary aim of this thesis was to screen school-aged (4+) children for two separate types of fluency issue and to distinguish both groups from fluent children. The two fluency issues are word-finding difficulty (WFD) and other speech disfluencies (primarily stuttering). The cohort examined consisted of children who spoke Arabic and English. We first designed a phonological assessment procedure that can equitably test Arabic and English children, called the Arabic English non-word repetition task (AEN_NWR). Riley’s Stuttering Severity Instrument (SSI) is the standard way of assessing fluency for speakers of English, but there is no standardized version of the SSI for Arabic speakers. Hence, we designed a scheme to measure disfluency symptoms in Arabic speech (the Arabic fluency assessment). The scheme recognizes that Arabic and English differ at all language levels (lexically, phonologically, and syntactically). After the children with WFD had been separated from those with stuttering, our second aim was to develop and deliver appropriate interventions for the different cohorts. Specifically, we aimed to develop treatments for the children with WFD using short procedures suitable for delivery in schools; children who stutter are referred to speech and language therapists (SLTs) to receive the appropriate type of intervention. To treat WFD, a further set of non-word materials was designed around phonemic patterns that do not occur in the speaker’s native language but are required in a targeted additional language (e.g. phonemic patterns that occur in English but not Arabic). The goal was to use these materials in an intervention that trains those phonemic sequences. The hypothesis is that a native Arabic speaker learning English would be expected to struggle with the phonotactic patterns not used in Arabic that are required for English.
In addition to the screening and intervention protocols designed, self-report procedures are desirable for assessing speech fluency when time for testing is limited. To that end, the last chapter discusses the importance of designing a fluency questionnaire that can assess fluency in the entire population of speakers. Together with the AEN_NWR, the brief self-report instrument forms a package of assessment procedures that facilitates screening of speech disfluencies in Arabic children (aged 4+) when they first enter school. The seven chapters, described in more detail below, together constitute a package that achieves the aims of identifying speech problems in children using Arabic and/or English and offering intervention to treat WFD.

    Acoustic Modelling for Under-Resourced Languages

    Automatic speech recognition systems have so far been developed for only a very few of the 4,000-7,000 existing languages. In this thesis we examine methods to rapidly create acoustic models for new, possibly under-resourced languages in a time- and cost-effective manner. To this end, we examine the use of multilingual models, the application of articulatory features across languages, and the automatic discovery of word-like units in unwritten languages.

    Voice Into Text: Case Studies in the History of Linguistic Transcription

    As a contribution to the field of linguistic historiography (Swiggers, 2010), this thesis offers a detailed narrative of the ‘mental worlds’ of writers tackling the task of transcribing languages both before the appearance of the International Phonetic Alphabet in 1888 and at a time when the IPA was emerging as the agreed standard for phonetic transcription. The narrative includes an account of how the cultural, historical, and political background in which these writers operated ultimately shaped their linguistic transcriptions. I argue that this approach, which also draws on observations from fields other than linguistics, provides a far richer illustration of their mental worlds, and that its omission would have rendered my analysis seriously deficient. This work also demonstrates that the writers’ own linguistic training could hinder, rather than aid, the transcription process. It therefore also focuses on how the authors mediated the tension between their pre-existing linguistic knowledge and the reality of the data they had to analyse; it is argued that success in mediating this tension also resulted in a successful transcription. The two corpora presented in this thesis are the Mohawk religious corpus held at the British Library and the phonetic transcriptions of the British recordings included in the Berliner Lautarchiv, also at the British Library. Their peculiar characteristics, the challenges they posed to the transcribers, and the factors that led to their creation are discussed at length. With regard to the Mohawk corpus, the analysis focuses on comparing the notations of Mohawk by writers belonging to the French tradition with those by English-, German-, or Dutch-speaking authors. The analysis of the Berliner Lautarchiv corpus instead focuses on the phonetic transcriptions created by Alois Brandl, an Austrian Anglicist who was also a student of Henry Sweet.

    Investigating the perception and production of the Arabic pharyngealised sounds by L2 learners of Arabic

    Pronunciation has received relatively little attention within the field of Arabic second language teaching and learning, particularly compared with the more prominent areas of morphology, syntax, psycholinguistics, and sociolinguistics. In the field of phonetics and phonology, it has been argued that the Arabic pharyngealised sounds are distinctive and unique to Arabic, and they are considered the most difficult sounds for L2 learners of Arabic to acquire. This research comprised two experiments examining the ability of a group of Arabic L2 learners from different L1 backgrounds to perceive and produce the fricative sounds /z/, /θ/, /f/, /ʃ/, /ħ/, /h/, /χ/, /ɣ/, /ʕ/, /sˤ/, /ðˤ/, /s/, /ð/, and the emphatic sounds /sˤ/, /ðˤ/, /dˤ/, and /tˤ/ in contrast with their non-pharyngealised variants /s/, /ð/, /d/, and /t/. The aims were to investigate which aspects of acquisition were difficult and to compare the effects of technology-based and traditional instruction in order to find an appropriate pronunciation teaching method that facilitates the perception and production of fricatives and emphatics. The technology-based method used in this study was adapted from Olson (2014) and Offerman and Olson (2016) to investigate the extent to which speech analysis technology (Praat) can help visualise the difference between pharyngealised and non-pharyngealised sounds in order to aid production and perception learning. The traditional method included repetition, practising minimal pairs, and reading-aloud techniques. Data were collected from forced-choice identification tasks and recordings taken under pre- and post-test conditions. The results revealed that some of the fricatives and all of the emphatic sounds posed perception and production difficulties for some L2 learners of Arabic, which is likely due to the absence of these sounds from the learners’ L1s.
The results also showed significant improvements among all participants after the traditional and technology training courses. However, no significant difference was observed between L2 learners who received the traditional method and those who received the technology-based method; both methods increased students’ awareness and understanding of the features of the sounds under investigation. The contribution of the current study is to show how Arabic fricative and emphatic sounds can be effectively taught using form-focused instruction involving different traditional and technological techniques. This research has implications for the implementation of both techniques by language teachers and researchers, as it shows how both approaches can be used to enhance students’ perceptive and productive skills.
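The pre/post improvement this abstract reports is the kind of result a paired-samples comparison establishes. The sketch below computes a paired t statistic from scratch; the accuracy scores are invented, and a real analysis would use a statistics package and report p-values alongside the improvement.

```python
# Sketch of a paired (pre/post) comparison like the one used to assess
# improvement after pronunciation training. Scores are hypothetical
# identification accuracies (%), not data from this study.
import math

def paired_t(pre, post):
    """Paired-samples t statistic and its degrees of freedom."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean / math.sqrt(var / n)
    return t, n - 1

# Invented accuracy before and after training, same six learners in order
pre  = [55, 60, 48, 62, 58, 50]
post = [70, 72, 65, 75, 71, 66]
t, df = paired_t(pre, post)
print(f"t({df}) = {t:.2f}")
```

Comparing the two instruction methods against each other, as the study also does, would instead contrast the pre-to-post gain scores between the two groups.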

    Distinguishing a phonological encoding disorder from Apraxia of Speech in individuals with aphasia by using EEG

    As we speak, various processes take place in our brains: we find the word, find and organize the speech sounds, and program the movements for speech. A stroke may cause impairment in any of these processes, and usually multiple processes are affected. Existing methods for distinguishing a disorder in finding and organizing speech sounds (phonological encoding) from an impairment in programming the articulation (Apraxia of Speech) are not optimal. This thesis studied whether EEG, which measures small changes in electrical brain activity with electrodes placed on the scalp, can be used for this purpose. A protocol was developed to trace the processes of speech production, and it was successfully tested in one group of younger and one group of older neurologically healthy adults. In the younger and older adults, the processes were registered at the same electrodes on the scalp, but the time window and the waveform of the processes differed. In individuals with a phonological encoding disorder and in those with Apraxia of Speech, the disordered processes could not be identified individually because the severity of the impairment varied within the groups; their impaired processes nevertheless differed from those in neurologically healthy individuals. Also, because of the disorder at the preceding stage, the programming of articulation differed in individuals with a phonological encoding disorder. The protocol can distinguish a phonological encoding disorder from Apraxia of Speech through differences in the EEG data (relative to neurologically healthy participants) that were observed only during the programming of movements for speech.

    Proceedings of the 2nd International Seminar on Linguistics
