    Impact of ASR performance on spoken grammatical error detection

    Computer assisted language learning (CALL) systems aid learners to monitor their progress by providing scoring and feedback on language assessment tasks. Free speaking tests allow assessment of what a learner has said, as well as how they said it. For these tasks, Automatic Speech Recognition (ASR) is required to generate transcriptions of a candidate’s responses, and the quality of these transcriptions is crucial for providing reliable feedback in downstream processes. This paper considers the impact of ASR performance on Grammatical Error Detection (GED) for free speaking tasks, as an example of providing feedback on a learner’s use of English. The performance of an advanced deep-learning based GED system, initially trained on written corpora, is used to evaluate the influence of ASR errors. One consequence of these errors is that detected grammatical errors can result from incorrect transcriptions as well as learner errors, which may yield confusing feedback. To mitigate the effect of these errors, and reduce erroneous feedback, ASR confidence scores are incorporated into the GED system. By additionally adapting the written-text GED system to the speech domain, using ASR transcriptions, significant gains in performance can be achieved. Analysis of GED performance for different grammatical error types and across grades is also presented.
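
    A minimal sketch may help picture the confidence-score mitigation. The function below is hypothetical, not the paper's system: it suppresses GED flags on tokens whose ASR confidence falls below a tuned threshold, so likely mis-transcriptions are not fed back to the learner as grammatical errors. All names and values are illustrative.

        # Hypothetical sketch: drop GED flags on low-confidence ASR tokens.
        def filter_ged_feedback(tokens, ged_flags, asr_confidences, threshold=0.8):
            filtered = []
            for word, flagged, conf in zip(tokens, ged_flags, asr_confidences):
                # A flag on a poorly recognised token is more likely an ASR
                # error than a learner error, so suppress it.
                filtered.append(flagged and conf >= threshold)
            return filtered

        # Example: "I has a apple", where the last token is unreliable.
        words = ["I", "has", "a", "apple"]
        flags = [False, True, False, True]      # GED output
        confs = [0.99, 0.95, 0.97, 0.55]        # illustrative ASR confidences
        print(filter_ged_feedback(words, flags, confs))
        # -> [False, True, False, False]: the flag on 'apple' is withheld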

    On Assessing and Developing Spoken ‘Grammatical Error Correction’ Systems

    Spoken ‘grammatical error correction’ (SGEC) is an important process for providing feedback in second language learning. Due to a lack of end-to-end training data, SGEC is often implemented as a cascaded, modular system consisting of speech recognition, disfluency removal, and grammatical error correction (GEC). This cascaded structure enables efficient use of training data for each module. It is, however, difficult to compare and evaluate the performance of individual modules, as preceding modules may introduce errors. For example, the GEC module's input depends on the output of non-native speech recognition and disfluency detection, both challenging tasks for learner data. This paper focuses on the assessment and development of SGEC systems. We first discuss metrics for evaluating SGEC, both for individual modules and for the overall system. The system-level metrics enable tuning for optimal system performance. A known issue in cascaded systems is error propagation between modules. To mitigate this problem, semi-supervised approaches and self-distillation are investigated. Lastly, when an SGEC system is deployed, it is important to give accurate feedback to users. Thus, we apply filtering to remove low-confidence edits, aiming to improve overall feedback precision. The performance metrics are examined on a Linguaskill multi-level data set, which includes the original non-native speech, manual transcriptions and reference grammatical error corrections, to enable system analysis and development.
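
    To make the cascaded structure concrete, here is a toy, assumption-heavy sketch of the three-module pipeline with edit-confidence filtering. The stand-in modules below are invented for illustration and bear no relation to the real models.

        # Toy cascade: ASR -> disfluency removal -> GEC -> confidence filter.
        def asr(audio):
            # Stand-in recogniser: pretend the audio is already decoded to words.
            return audio.split()

        def remove_disfluencies(tokens):
            # Toy disfluency removal: collapse immediate repetitions ("I I went").
            out = []
            for tok in tokens:
                if not out or out[-1].lower() != tok.lower():
                    out.append(tok)
            return out

        def gec(tokens):
            # Toy GEC: propose (position, replacement, confidence) edits.
            fixes = {"has": ("have", 0.9), "goed": ("went", 0.95), "a": ("an", 0.4)}
            return [(i, *fixes[t]) for i, t in enumerate(tokens) if t in fixes]

        def sgec(audio, threshold=0.7):
            tokens = asr(audio)                   # errors here propagate downstream
            fluent = remove_disfluencies(tokens)  # closer to written-text input
            # Keep only high-confidence edits to raise feedback precision.
            return [e for e in gec(fluent) if e[2] >= threshold]

        print(sgec("I I has goed to a office"))
        # -> [(1, 'have', 0.9), (2, 'went', 0.95)]; the 0.4-confidence edit is dropped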

    Impact of ASR performance on free speaking language assessment

    In free speaking tests, candidates respond to prompts with spontaneous speech. This form of test allows the spoken language proficiency of a non-native speaker of English to be assessed more fully than read-aloud tests. As the candidate's responses are unscripted, transcription by automatic speech recognition (ASR) is essential for automated assessment. ASR will never be 100% accurate, so any assessment system must seek to minimise and mitigate ASR errors. This paper considers the impact of ASR errors on the performance of free speaking test auto-marking systems. Firstly, rich linguistically related features, based on part-of-speech tags from statistical parse trees, are investigated for assessment. Then, the impact of ASR errors on how well the system can detect whether a learner's answer is relevant to the question asked is evaluated. Finally, the impact that these errors may have on the ability of the system to provide detailed feedback to the learner is analysed. In particular, pronunciation and grammatical errors are considered, as these are important in helping a learner to make progress. As feedback resulting from an ASR error would be highly confusing, an approach to mitigating this problem using confidence scores is also analysed.
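
    As an illustration of the kind of linguistically motivated features mentioned above, the sketch below turns part-of-speech tags (as a tagger or statistical parser would emit) into simple distributional features. The feature set is invented for illustration, not the paper's.

        # Illustrative POS-based features for an auto-marker (hypothetical set).
        from collections import Counter

        def pos_features(pos_tags):
            counts = Counter(pos_tags)
            total = max(len(pos_tags), 1)
            # Relative frequency of each tag.
            feats = {f"freq_{tag}": n / total for tag, n in counts.items()}
            # Content-word ratio over Penn Treebank tag prefixes.
            content = sum(n for tag, n in counts.items()
                          if tag[:2] in {"NN", "VB", "JJ", "RB"})
            feats["content_ratio"] = content / total
            return feats

        # Tags for "She opened the box in the kitchen" (illustrative input).
        print(pos_features(["PRP", "VBD", "DT", "NN", "IN", "DT", "NN"]))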

    Spoken language 'grammatical error correction'

    Spoken language ‘grammatical error correction’ (GEC) is an important mechanism to help learners of a foreign language, here English, improve their spoken grammar. GEC is challenging for non-native spoken language due to interruptions from disfluent speech events such as repetitions and false starts, and issues in strictly defining what is acceptable in spoken language. Furthermore, there is little labelled data to train models. One way to mitigate the impact of speech events is to use a disfluency detection (DD) model. Removing the detected disfluencies converts the speech transcript to be closer to written language, which has significantly more labelled training data. This paper considers two types of approaches to leveraging DD models to boost spoken GEC performance. One is sequential: a separately trained DD model acts as a pre-processing module providing a more structured input to the GEC model. The second approach is to train DD and GEC models in an end-to-end fashion, simultaneously optimising both modules. Embeddings enable end-to-end models to have a richer information flow. Experimental results show that DD effectively regulates GEC input; end-to-end training works well when fine-tuned on limited labelled in-domain data; and improving DD by incorporating acoustic information helps improve spoken GEC.
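
    The end-to-end variant can be pictured as a shared encoder with two heads whose losses are optimised together. The code below is a deliberately simplified stand-in, not the paper's architecture: per-token correction replaces full sequence-to-sequence decoding, and all dimensions and the loss weight are arbitrary (PyTorch assumed available).

        # Hypothetical joint DD + GEC training step.
        import torch
        import torch.nn as nn

        class JointDDGEC(nn.Module):
            def __init__(self, vocab=1000, dim=64):
                super().__init__()
                self.embed = nn.Embedding(vocab, dim)
                self.encoder = nn.GRU(dim, dim, batch_first=True)
                self.dd_head = nn.Linear(dim, 2)       # fluent vs disfluent tag
                self.gec_head = nn.Linear(dim, vocab)  # corrected token per position

            def forward(self, tokens):
                h, _ = self.encoder(self.embed(tokens))
                return self.dd_head(h), self.gec_head(h)

        model = JointDDGEC()
        tokens = torch.randint(0, 1000, (2, 7))   # toy batch of transcripts
        dd_gold = torch.randint(0, 2, (2, 7))
        gec_gold = torch.randint(0, 1000, (2, 7))

        dd_logits, gec_logits = model(tokens)
        ce = nn.CrossEntropyLoss()
        # Simultaneously optimise both modules; the 0.5 weight is a free choice.
        loss = ce(gec_logits.transpose(1, 2), gec_gold) \
               + 0.5 * ce(dd_logits.transpose(1, 2), dd_gold)
        loss.backward()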

    Improving the translation environment for professional translators

    When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological side. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
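
    As one concrete example from the list above, translation-memory fuzzy matching can be sketched as nearest-neighbour retrieval under a string-similarity score. The measure below (difflib's ratio), the threshold and the tiny memory are stand-ins; SCATE investigated considerably richer matching metrics.

        # Toy TM fuzzy matching; threshold and data are illustrative.
        from difflib import SequenceMatcher

        def fuzzy_match(query, memory, min_score=0.6):
            best = max(((src, tgt, SequenceMatcher(None, query, src).ratio())
                        for src, tgt in memory),
                       key=lambda m: m[2])
            return best if best[2] >= min_score else None

        tm = [("Press the start button.", "Druk op de startknop."),
              ("Close the lid firmly.", "Sluit het deksel stevig.")]
        print(fuzzy_match("Press the stop button.", tm))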

    Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions

    We address the task of automatically grading the language proficiency of spontaneous speech based on textual features from automatic speech recognition transcripts. Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of inductive transfer between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and native language (L1) identification. We encode the transcriptions with both bi-directional recurrent neural networks and with bi-directional representations from transformers, compare against a feature-rich baseline, and analyse performance at different proficiency levels and with transcriptions of varying error rates. Our best performance comes from a transformer encoder with L1 prediction as an auxiliary task. We discuss areas for improvement and potential applications for text-only speech scoring.
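
    The multi-task set-up can be pictured as a shared encoder with a main grading head and an auxiliary head. The sketch below uses a bidirectional LSTM and an L1-classification auxiliary for brevity; the architecture, dimensions and loss weight are all illustrative choices, not those of the paper (PyTorch assumed available).

        # Hypothetical multi-task grader with an auxiliary L1 objective.
        import torch
        import torch.nn as nn

        class MultiTaskGrader(nn.Module):
            def __init__(self, vocab=1000, dim=64, n_l1=10):
                super().__init__()
                self.embed = nn.Embedding(vocab, dim)
                self.encoder = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
                self.grade_head = nn.Linear(2 * dim, 1)   # main task: proficiency
                self.l1_head = nn.Linear(2 * dim, n_l1)   # auxiliary: native language

            def forward(self, tokens):
                h, _ = self.encoder(self.embed(tokens))
                pooled = h.mean(dim=1)                    # crude utterance embedding
                return self.grade_head(pooled).squeeze(-1), self.l1_head(pooled)

        model = MultiTaskGrader()
        tokens = torch.randint(0, 1000, (4, 20))          # toy transcription batch
        grades = torch.rand(4) * 6                        # illustrative score scale
        l1_gold = torch.randint(0, 10, (4,))

        pred_grade, l1_logits = model(tokens)
        # The auxiliary loss provides inductive transfer to the shared encoder.
        loss = nn.functional.mse_loss(pred_grade, grades) \
               + 0.3 * nn.functional.cross_entropy(l1_logits, l1_gold)
        loss.backward()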