3,893 research outputs found

    Dual Language Models for Code Switched Speech Recognition

    Full text link
    In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text. Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models. We propose a novel technique called dual language models, which involves building two complementary monolingual language models and combining them using a probabilistic model for switching between the two. We evaluate the efficacy of our approach using a conversational Mandarin-English speech corpus. We prove the robustness of our model by showing significant improvements in perplexity measures over the standard bilingual language model without the use of any external information. Similar consistent improvements are also reflected in automatic speech recognition error rates.Comment: Accepted at Interspeech 201

    Glottal Spectral Separation for Speech Synthesis

    Get PDF

    Exploring auditory-motor interactions in normal and disordered speech

    Full text link
    Auditory feedback plays an important role in speech motor learning and in the online correction of speech movements. Speakers can detect and correct auditory feedback errors at the segmental and suprasegmental levels during ongoing speech. The frontal brain regions that contribute to these corrective movements have also been shown to be more active during speech in persons who stutter (PWS) compared to fluent speakers. Further, various types of altered auditory feedback can temporarily improve the fluency of PWS, suggesting that atypical auditory-motor interactions during speech may contribute to stuttering disfluencies. To investigate this possibility, we have developed and improved Audapter, a software that enables configurable dynamic perturbation of the spatial and temporal content of the speech auditory signal in real time. Using Audapter, we have measured the compensatory responses of PWS to static and dynamic perturbations of the formant content of auditory feedback and compared these responses with those from matched fluent controls. Our findings indicate deficient utilization of auditory feedback by PWS for short-latency online control of the spatial and temporal parameters of articulation during vowel production and during running speech. These findings provide further evidence that stuttering is associated with aberrant auditory-motor integration during speech.Published versio

    Pronunciation Modeling of Foreign Words for Mandarin ASR by Considering the Effect of Language Transfer

    Full text link
    One of the challenges in automatic speech recognition is foreign words recognition. It is observed that a speaker's pronunciation of a foreign word is influenced by his native language knowledge, and such phenomenon is known as the effect of language transfer. This paper focuses on examining the phonetic effect of language transfer in automatic speech recognition. A set of lexical rules is proposed to convert an English word into Mandarin phonetic representation. In this way, a Mandarin lexicon can be augmented by including English words. Hence, the Mandarin ASR system becomes capable to recognize English words without retraining or re-estimation of the acoustic model parameters. Using the lexicon that derived from the proposed rules, the ASR performance of Mandarin English mixed speech is improved without harming the accuracy of Mandarin only speech. The proposed lexical rules are generalized and they can be directly applied to unseen English words.Comment: Published by INTERSPEECH 201
    corecore