10 research outputs found

    Computer analysis of children's non-native English speech for language learning and assessment

    Get PDF
    Children's ASR appears to be more challenging than adults' and it's even more difficult when it comes to non-native children's speech. This research investigates different techniques to compensate for the effects of non-native and children on the performance of ASR systems. The study mainly utilises hybrid DNN-HMM systems with conventional DNNs, LSTMs and more advanced TDNN models. This work uses the CALL-ST corpus and TLT-school corpus to study children's non-native English speech. Initially, data augmentation was explored on the CALL-ST corpus to address the lack of data problem using the AMI corpus and PF-STAR German corpus. Feature selection, acoustic model adaptation and selection were also investigated on CALL-ST. More aspects of the ASR system, including pronunciation modelling, acoustic modelling, language modelling and system fusion, were explored on the TLT-school corpus as this corpus has a bigger amount of data. Then, the relationships between the CALL-ST and TLT-school corpora were studied and utilised to improve ASR performance. The other part of the present work is text processing for non-native children's English speech. We focused on providing accept/reject feedback to learners based on the text generated by the ASR system from learners' spoken responses. A rule-based and a machine learning-based system were proposed for making the judgement, several aspects of the systems were evaluated. The influence of the ASR system on the text processing system was explored

    Overview of the 2018 spoken call shared task

    Get PDF
    © 2018 International Speech Communication Association. All rights reserved. We present an overview of the second edition of the Spoken CALL Shared Task. Groups competed on a prompt-response task using English-language data collected, through an online CALL game, from Swiss German teens in their second and third years of learning English. Each item consists of a written German prompt and an audio file containing a spoken response. The task is to accept linguistically correct responses and reject linguistically incorrect ones, with “linguistically correct” defined by a gold standard derived from human annotations. Scoring was performed using a metric defined as the ratio of the relative rejection rates on incorrect and correct responses. The second edition received eighteen entries and showed very substantial improvement on the first edition; all entries were better than the best entry from the first edition, and the best score was about four times higher. We present the task, the resources, the results, a discussion of the metrics used, and an analysis of what makes items challenging. In particular, we present quantitative evidence suggesting that incorrect responses are much more difficult to process than correct responses, and that the most significant factor in making a response challenging is its distance from the closest training example

    Overview of the 2018 Spoken CALL Shared Task

    Get PDF
    Contains fulltext : 199312.pdf (publisher's version ) (Open Access)The 19th Annual Conference of the International Speech Communication Association, 02 september 201
    corecore