18 research outputs found

    Automatic Screening of Childhood Speech Sound Disorders and Detection of Associated Pronunciation Errors

    Speech disorders in children can affect their fluency and intelligibility. Delay in their diagnosis and treatment increases the risk of social impairment and learning disabilities. With the significant shortage of Speech and Language Pathologists (SLPs), there is an increasing interest in Computer-Aided Speech Therapy tools with automatic detection and diagnosis capability. However, the scarcity and unreliable annotation of disordered child speech corpora, along with the high acoustic variation in child speech data, have impeded the development of reliable automatic detection and diagnosis of childhood speech sound disorders. Therefore, this thesis investigates two types of detection systems that can be achieved with minimal dependency on annotated mispronounced speech data. First, a novel approach was proposed that adopts paralinguistic features, which represent the prosodic, spectral, and voice quality characteristics of speech, to perform segment- and subject-level classification of Typically Developing (TD) and Speech Sound Disordered (SSD) child speech using a binary Support Vector Machine (SVM) classifier. As paralinguistic features are both language- and content-independent, they can be extracted from an unannotated speech signal. Second, a novel Mispronunciation Detection and Diagnosis (MDD) approach was introduced to detect the pronunciation errors caused by SSDs and provide low-level diagnostic information that can be used in constructing formative feedback and a detailed diagnostic report. Unlike existing MDD methods, where detection and diagnosis are performed at the phoneme level, the proposed method achieved MDD at the speech attribute level, namely the manner and place of articulation. The speech attribute features describe the articulators involved and their interactions when making a speech sound, allowing a low-level description of the pronunciation error to be provided. 
Two novel methods to model speech attributes are further proposed in this thesis: a frame-based (phoneme-alignment) method leveraging the Multi-Task Learning (MTL) criterion and training a separate model for each attribute, and an alignment-free jointly-learnt method based on the Connectionist Temporal Classification (CTC) sequence-to-sequence criterion. The proposed techniques have been evaluated using standard and publicly accessible adult and child speech corpora, while the MDD method has been validated using L2 speech corpora.

    The impact of regional accent variation on monolingual and bilingual infants’ lexical processing

    Phonetic variation is inherent in natural speech. It can be lexically relevant, differentiating words, as well as lexically irrelevant indexical variation, which gives information about the talker or context, such as the gender, mood, regional or foreign accent. Efficient communication requires perceivers to discern how lexical versus indexical sources of variation affect the phonetic form of spoken words. While ample evidence is available on how children acquiring a single language handle variability in speech, less is known about how children simultaneously acquiring two languages deal with phonetic variation. This thesis investigates how the bilingual language environment affects children’s ability to accommodate accented speech. We consider three hypotheses. One is that bilingual infants may have an advantage relative to monolinguals due to their greater experience with phonetic variability across their two phonological systems. This is because the lexical representations in bilingual children, who have more experience with accent variation than monolingual children, might be more open to phonetic variation than monolinguals. Representations that are more open to variation might lead to higher flexibility in the word recognition of children with multi-accent input (bilinguals), resulting in accommodation benefits when processing an unfamiliar accent. An alternative hypothesis, however, is that bilingual children may have less stable lexical representations than monolinguals because their vocabulary size in each language is smaller. This could lead to processing costs in accent adaptation, resulting in accommodation disadvantages for bilinguals. The third and final hypothesis is that there would be no difference between bilinguals and their monolingual peers. 
This is because the effects of greater accent experience but less stable lexical representations in bilinguals may essentially neutralise each other, resulting in equivalent accent accommodation by bilinguals and monolinguals. To evaluate these hypotheses, three experiments were conducted with 17- and 25-month-old bilingual and monolingual children. Their ability to accommodate unfamiliar accented speech was analysed based on their language experience, pre-exposure to the unfamiliar accent, the type of phonetic variation (easy versus difficult phonetic change), and the cognitive demands of the experimental procedure. Taken together, the findings of Experiments 1-3 suggest that bilingual language input neither benefits nor hampers accent adaptation in bilingual children relative to monolingual children. The results carry implications for our current understanding of bilingualism and phonological development.

    Chinese subtitles of English-language feature films in Taiwan: A systematic investigation of solution-types

    Subtitling differs from the traditional idea of translation – from a written source text to a written target text. The transference is from a source text, which consists of verbal and non-verbal information from audio and visual channels, to a written target text, which is constrained by the limited time and space on the screen. Subtitling involves not only transfer across languages but also a change of mode, from the spoken mode to the written mode and sometimes from the mode of moving images or sound effects to the written mode. Given the multimodal nature of film text, subtitlers are expected to utilise different filmic signs and produce subtitles that fit into the montage of the film, taking into consideration the technical constraints and target viewers’ processing effort. With the prevalence of translated audiovisual products, subtitling has drawn a considerable amount of scholarly attention. However, most of the research in this field focuses on the European scene, and the language pairs studied are closely related. Given the lack of research into Chinese subtitles and the fact that the Chinese language and culture are very different from the English language and culture, the present study investigates the way verbal elements in English-language feature films are translated into Mandarin subtitles in Taiwan. It looks at subtitling in general, subtitling extralinguistic cultural references and subtitling humour. Being descriptive in nature, it describes current translation practice by comparing each source-text segment with its corresponding target-text segment and explores the different types of solutions applied. By quantifying the frequency of each solution-type, some trends in subtitling are also generalised. 
The results show that subtitles of English-language feature films in Taiwan are source-text-oriented, as most of the source-text segments are closely rendered in the target text by source-language-oriented solutions, in which the source-text item undergoes minimal change. Target-language-oriented solutions are seldom applied and extreme target-language-oriented ones are rarely found. The high percentage of source-language-oriented solutions indicates that Taiwanese subtitlers are reluctant to alter the source text; subtitling, as the preferred method of film translation in Taiwan, is seen as a means to bring the exotic experience to target viewers. It also suggests that most of the source-text elements can be transferred directly, as the need to employ content-changing solutions is low. This study also compares its findings with those of other studies which are based on similar methods but focus on Scandinavian subtitling. Contrary to what might be expected, since the linguistic and cultural relatedness and the target audience’s proficiency in the source language are different in these studies, the results are very similar. The trend towards source orientation in subtitling has been observed in recent years across different languages, and it is largely due to globalisation, the influence of US popular culture and the information boom, which break down cultural and linguistic boundaries. It appears that cultural influence is a more important factor than cultural affinity in determining a subtitler’s choice of solutions.

    Modelling multimodal language processing


    Nativeness, dominance, and the flexibility of listening to spoken language


    L2 English fricative production by Thai learners

    PhD Thesis. In early research on L2 (second language) phonology, researchers mainly focussed on whether L2 learners can achieve ‘target-likeness’, which relates to whether or not a sound is perceived as the intended target or whether it fits into the expected IPA category as determined by trained phonetician(s). The popular model for this focus was the contrastive analysis hypothesis (CAH) (Lado, 1957). Later research extended the focus to judgements of ‘native-likeness’, which is the extent to which the speaker’s L2 sound production has native-like qualities. Methods such as accent rating tasks and acoustic measurements have become popular over time, together with investigations of how the results correlate with external factors which are thought to influence L2 speech learning. Well-known models such as the Speech Learning Model (SLM) (Flege, 1995) and the Perceptual Assimilation Model (PAM) (Best, 1995) have been very influential in this field, but are mainly based on assumptions regarding L2 learners in a naturalistic setting. The aim of this thesis is to investigate L2 English fricative production by Thai learners of English with a combined focus on target-likeness and native-likeness through four types of analysis: impressionistic, sound identification, accent rating, and acoustic analyses. This thesis also explores external factors which may contribute to target-likeness in L2 production, which is more important than native-likeness as it helps in communication between interlocutors. The L2 fricatives are divided into those that have a counterpart in Thai (/f, s/, henceforth ‘shared’ sounds) and those that do not (/v, θ, ð, z, ʃ/, henceforth ‘non-shared’). As CAH focuses on target-likeness, it predicts that shared sounds are easy to produce; SLM, on the other hand, focuses on native-likeness and predicts that shared sounds are difficult to produce. The four experiments in this study yield mixed results. 
In the impressionistic and sound identification analyses, CAH-based hypotheses accurately predict most results, which show that shared sounds are more frequently produced in a target-like manner and more accurately identified. In the accent rating task, SLM had to be rejected, as results showed that shared fricatives were more often produced in a native-like manner, unlike non-shared fricatives. In the acoustic investigation, differences in the realisations of L2 shared sounds supported SLM-based hypotheses in some contexts. And although SLM-based hypotheses were disconfirmed when it came to the accent rating of L2 shared and non-shared sounds, the phonetic properties of non-shared sounds in the realisations that were deemed target-like were native-like in many contexts, suggesting some L2 attainment for non-shared sounds. Taken as a whole, these results emphasise the need to focus on both target-likeness and native-likeness in investigating L2 speech production. They also imply that L1 and L2 sound comparison is context- and task-dependent. Naresuan Universit

    Proceedings of the VIIth GSCP International Conference

    The 7th International Conference of the Gruppo di Studi sulla Comunicazione Parlata, dedicated to the memory of Claire Blanche-Benveniste, chose Speech and Corpora as its main theme. The wide international origin of the 235 authors, from 21 countries and 95 institutions, led to papers on many different languages. The 89 papers of this volume reflect the themes of the conference: spoken corpora compilation and annotation, with the connected technological fields; the relation between prosody and pragmatics; speech pathologies; and various papers on phonetics, speech and linguistic analysis, pragmatics and sociolinguistics. Many papers are also dedicated to speech and second language studies. The online publication with FUP allows direct access to sound and video linked to papers (when downloaded).

    Cultures and Traditions of Wordplay and Wordplay Research

    This volume focuses on realisations of wordplay in different cultures and social and historical contexts, and brings together various research traditions of approaching wordplay. Together with the volume DWP 7, it assembles selected papers presented at the interdisciplinary conference The Dynamics of Wordplay / La dynamique du jeu de mots (Trier, 2016) and stresses the inherent dynamicity of wordplay and wordplay research.