    Dyslexic children's reading pattern as input for ASR: Data, analysis, and pronunciation model

    To realize an automatic speech recognition (ASR) model that can recognize the Bahasa Melayu reading difficulties of dyslexic children, a language corpus has to be generated beforehand. For this purpose, data collection was performed in two public schools and involved ten dyslexic children aged between seven and fourteen. A total of 114 Bahasa Melayu words, representing 23 consonant-vowel patterns in the spelling system of the language, served as the stimuli. The patterns range from simple to somewhat complex formations of consonant-vowel pairs in words listed in a level one primary school syllabus. An analysis was performed to identify the most frequent errors made by these dyslexic children when reading aloud, and to describe the emerging reading pattern of dyslexic children in general. This paper provides an overview of the entire process, from data collection through analysis to modeling the pronunciations of words that will serve as the active lexicon for the ASR model. It also highlights the challenges of collecting read-aloud data from dyslexic children, and other factors that contribute to the complex nature of the data collected.
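
The idea of grouping stimulus words by consonant-vowel pattern can be illustrated with a minimal sketch. The abstract does not say how the 23 patterns were derived, so everything here is an assumption: the function names `cv_pattern` and `group_by_pattern` are hypothetical, the word list is illustrative rather than taken from the paper, and each letter is classified on its own, which ignores Bahasa Melayu digraphs such as "ng" or "ny".

```python
# Illustrative only: letter-by-letter consonant-vowel (CV) patterns.
# Assumption: a, e, i, o, u are vowels; every other letter is a consonant.
VOWELS = set("aeiou")


def cv_pattern(word: str) -> str:
    """Map each letter of a word to 'V' (vowel) or 'C' (consonant)."""
    return "".join(
        "V" if ch in VOWELS else "C"
        for ch in word.lower()
        if ch.isalpha()
    )


def group_by_pattern(words):
    """Group words under their CV pattern, keeping input order."""
    groups = {}
    for w in words:
        groups.setdefault(cv_pattern(w), []).append(w)
    return groups


# Hypothetical sample words, not the paper's stimulus list.
words = ["buku", "makan", "api", "itu", "batu"]
groups = group_by_pattern(words)

for pattern, members in groups.items():
    print(pattern, members)
```

A real stimulus set would likely need a grapheme-aware tokenizer before classification, so that multi-letter consonants count as a single unit.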

    A study of implicit and explicit modeling of coarticulation and pronunciation variation

    In this paper, we focus on the modeling of coarticulation and pronunciation variation in automatic speech recognition (ASR) systems. Most ASR systems explicitly describe these production phenomena through context-dependent phoneme models and multiple pronunciation lexicons. Here, we explore the potential benefit of using feature spaces covering longer time segments for the implicit modeling of coarticulation and pronunciation variants. The study is based on a phonetic-level analysis of the performance of context-independent and context-dependent acoustic models, and more particularly on the impact of modeling different time contexts, ranging from 70 ms up to 310 ms, on typical cases of pronunciation variants. Results, confirmed by a word recognition experiment, bring to light some ability of generic acoustic models to implicitly handle pronunciation variation.
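
One common way to give an acoustic model a longer time context is to concatenate each feature frame with its neighbours. The sketch below shows that general technique only; the abstract does not describe how the authors built their 70 ms to 310 ms feature spaces. The 10 ms frame shift, the edge-padding choice, and the names `context_frames` and `stack_context` are all assumptions made for illustration.

```python
# Illustrative only: widening temporal context by stacking neighbouring frames.
# Assumption: one feature frame every 10 ms (a typical ASR frame shift).


def context_frames(context_ms: int, shift_ms: int = 10) -> int:
    """Number of frames spanning context_ms, forced odd so the window is centred."""
    n = max(1, context_ms // shift_ms)
    return n if n % 2 == 1 else n + 1


def stack_context(frames, n_context):
    """Concatenate each frame with its neighbours.

    Edge frames are repeated as padding so every output vector
    has the same length: n_context * len(frame).
    """
    half = n_context // 2
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        window = []
        for k in range(t - half, t + half + 1):
            # Clamp the index to the valid range (edge padding).
            window.extend(frames[min(max(k, 0), last)])
        stacked.append(window)
    return stacked


# Toy one-dimensional "features" for three frames.
frames = [[0.0], [1.0], [2.0]]
print(stack_context(frames, 3))
print(context_frames(70), context_frames(310))
```

Under the assumed 10 ms shift, the two ends of the range studied in the abstract would correspond to windows of 7 and 31 frames.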