Adapting Sequence Models for Sentence Correction
In a controlled experiment of sequence-to-sequence approaches for the task of
sentence correction, we find that character-based models are generally more
effective than word-based models and models that encode subword information via
convolutions, and that modeling the output data as a series of diffs improves
effectiveness over standard approaches. Our strongest sequence-to-sequence
model improves over our strongest phrase-based statistical machine translation
model, with access to the same data, by 6 M2 (0.5 GLEU) points. Additionally,
in the data environment of the standard CoNLL-2014 setup, we demonstrate that
modeling (and tuning against) diffs yields similar or better M2 scores with
simpler models and/or significantly less data than previous
sequence-to-sequence approaches. Comment: EMNLP 2017
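The abstract does not specify the diff encoding it uses, so the following is only a rough illustration of one plausible scheme: using Python's standard difflib to turn a (source, corrected) sentence pair into a tagged edit sequence that a sequence-to-sequence model could be trained to emit instead of the full corrected sentence. The KEEP/DEL/INS tag names are invented for this sketch.

```python
# Minimal sketch (not the paper's exact encoding): represent the corrected
# sentence as a sequence of diff operations against the source, using the
# standard-library difflib. KEEP/DEL/INS tags are illustrative names only.
import difflib

def to_diff_sequence(source_tokens, target_tokens):
    """Encode the target sentence as diff ops relative to the source."""
    ops = []
    matcher = difflib.SequenceMatcher(a=source_tokens, b=target_tokens)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.extend(f"KEEP|{t}" for t in source_tokens[i1:i2])
        elif tag == "delete":
            ops.extend(f"DEL|{t}" for t in source_tokens[i1:i2])
        elif tag == "insert":
            ops.extend(f"INS|{t}" for t in target_tokens[j1:j2])
        else:  # "replace": drop the old span, insert the new one
            ops.extend(f"DEL|{t}" for t in source_tokens[i1:i2])
            ops.extend(f"INS|{t}" for t in target_tokens[j1:j2])
    return ops

src = "He go to school yesterday .".split()
tgt = "He went to school yesterday .".split()
print(to_diff_sequence(src, tgt))
# ['KEEP|He', 'DEL|go', 'INS|went', 'KEEP|to', 'KEEP|school', 'KEEP|yesterday', 'KEEP|.']
```

Because most positions are simply kept, such an encoding concentrates the model's decisions on the few edited positions, which is one plausible reading of why the abstract reports diff modeling improving effectiveness.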
Automatic Grammatical Error Detection of Non-native Spoken Learner English
Automatic language assessment and learning systems are required to support the global growth in English language learning. They need to be able to provide reliable and meaningful feedback to help learners develop their skills. This paper considers the question of detecting grammatical errors in non-native spoken English as a first step to providing feedback on a learner's use of the language. A state-of-the-art deep-learning-based grammatical error detection (GED) system designed for written texts is investigated on free speaking tasks across the full range of proficiency grades with a mix of first languages (L1s). This presents a number of challenges. Free speech contains disfluencies that disrupt the spoken language flow but are not grammatical errors. The lower the level of the learner, the more frequently both of these occur, which also makes the underlying task of automatic transcription harder. The baseline written GED system is seen to perform less well on manually transcribed spoken language. When the GED model is fine-tuned to free speech data from the target domain, the spoken system is able to match the written performance. Given the current state of the art in ASR and disfluency detection, however, grammatical error feedback from automated transcriptions remains a challenge. This paper reports on research supported by Cambridge Assessment, University of Cambridge. Thanks to Cambridge English Language Assessment for supporting this research and providing access to the BULATS data.
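The abstract compares written and spoken GED performance without giving its metric; error detection systems are commonly scored at the token level with precision, recall, and F0.5, which weights precision above recall. The sketch below illustrates that scoring scheme under the assumption of binary per-token labels; it is not taken from the paper.

```python
# Illustrative only: token-level GED scoring with F0.5, which favours precision
# (unwarranted error flags are costlier than misses). The paper's exact metric
# and label scheme are not stated in the abstract.

def f_beta(precision, recall, beta=0.5):
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def ged_scores(gold_labels, pred_labels):
    """gold_labels / pred_labels: lists of 0/1 per token (1 = error)."""
    tp = sum(1 for g, p in zip(gold_labels, pred_labels) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold_labels, pred_labels) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold_labels, pred_labels) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_beta(precision, recall)

# "He go to school yesterday" with a gold error on "go"
gold = [0, 1, 0, 0, 0]
pred = [0, 1, 0, 0, 1]   # system also flags "yesterday", a false positive
print(ged_scores(gold, pred))  # (0.5, 1.0, 0.555...)
```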
Detecting grammatical errors with treebank-induced, probabilistic parsers
Today's grammar checkers often use hand-crafted rule systems that define acceptable language. The development of such rule systems is labour-intensive and has to be repeated for each language. At the same time, grammars automatically induced from syntactically annotated corpora (treebanks) are successfully employed in other applications, for example, text understanding and machine translation. At first glance, treebank-induced grammars seem to be unsuitable for grammar checking as they massively over-generate and fail to reject ungrammatical input due to their high robustness. We present three new methods for judging the grammaticality of a sentence with probabilistic, treebank-induced grammars, demonstrating that such grammars can be successfully applied to automatically judge the grammaticality of an input string. Our best-performing method exploits the differences between parse results for grammars trained on grammatical and ungrammatical treebanks. The second approach builds an estimator of the probability of the most likely parse using grammatical training data that has previously been parsed and annotated with parse probabilities. If the estimated probability of an input sentence (whose grammaticality is to be judged by the system) is higher by a certain amount than the actual parse probability, the sentence is flagged as ungrammatical. The third approach extracts discriminative parse tree fragments in the form of CFG rules from parsed grammatical and ungrammatical corpora and trains a binary classifier to distinguish grammatical from ungrammatical sentences. The three approaches are evaluated on a large test set of grammatical and ungrammatical sentences. The ungrammatical test set is generated automatically by inserting common grammatical errors into the British National Corpus. The results are compared to two traditional approaches, one that uses a hand-crafted, discriminative grammar, the XLE ParGram English LFG, and one based on part-of-speech n-grams. In addition, the baseline methods and the new methods are combined in a machine learning-based framework, yielding further improvements.
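The abstract names, but does not detail, the third approach (discriminative CFG-rule fragments plus a binary classifier). The sketch below is one plausible reading of it, assuming parses are already available as bracketed trees and using NLTK and scikit-learn as stand-ins for whatever tooling the authors actually used; the toy corpus is invented.

```python
# Hypothetical reading of the third approach: count the CFG productions used in
# each sentence's parse tree and train a binary grammatical/ungrammatical
# classifier over those counts. Parser, feature set, and classifier are assumptions.
from collections import Counter

from nltk import Tree
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def rule_features(bracketed_parse: str) -> Counter:
    """Map a bracketed parse to counts of the CFG rules it uses."""
    tree = Tree.fromstring(bracketed_parse)
    return Counter(str(prod) for prod in tree.productions())

# Toy parsed corpus: 1 = grammatical, 0 = ungrammatical
parses = [
    ("(S (NP (PRP He)) (VP (VBZ runs)))", 1),
    ("(S (NP (PRP He)) (VP (VBP run)))", 0),
    ("(S (NP (PRP She)) (VP (VBZ sings)))", 1),
    ("(S (NP (PRP She)) (VP (VBP sing)))", 0),
]

vectorizer = DictVectorizer()
X = vectorizer.fit_transform([rule_features(p) for p, _ in parses])
y = [label for _, label in parses]

clf = LogisticRegression().fit(X, y)
test = vectorizer.transform([rule_features("(S (NP (PRP He)) (VP (VBZ sings)))")])
print(clf.predict(test))  # expected: [1] (grammatical)
```

The intuition behind treating rules as features is that ungrammatical input tends to force the parser into unusual productions, so the presence of those fragments is itself discriminative.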
Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods
Large-scale pre-trained language models such as GPT-3 have shown remarkable
performance across various natural language processing tasks. However, applying
prompt-based methods with GPT-3 for Grammatical Error Correction (GEC) tasks
and their controllability remains underexplored. Controllability in GEC is
crucial for real-world applications, particularly in educational settings,
where the ability to tailor feedback according to learner levels and specific
error types can significantly enhance the learning process. This paper
investigates the performance and controllability of prompt-based methods with
GPT-3 for GEC tasks using zero-shot and few-shot settings. We explore the impact
of task instructions and examples on GPT-3's output, focusing on controlling
aspects such as minimal edits, fluency edits, and learner levels. Our findings
demonstrate that GPT-3 can effectively perform GEC tasks, outperforming
existing supervised and unsupervised approaches. We also show that GPT-3
can achieve controllability when appropriate task instructions and examples
are given. Comment: Accepted in BEA 2023
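The abstract does not reproduce its prompts, so the following is only a sketch of how zero-shot and few-shot prompts with control instructions (minimal edits vs. fluency edits) might be assembled; the instruction wording and examples are invented, and the actual model call is omitted because API details vary.

```python
# Illustrative prompt construction for controllable GEC; the instruction text
# and few-shot examples below are invented for this sketch, not the paper's.

MINIMAL = "Correct only clear grammatical errors. Keep the original wording wherever possible."
FLUENCY = "Rewrite the sentence so it is grammatical and reads fluently, even if that requires rephrasing."

FEW_SHOT_EXAMPLES = [
    ("I have went to the store.", "I have gone to the store."),
    ("She dont like apples.", "She doesn't like apples."),
]

def build_prompt(sentence: str, instruction: str = MINIMAL, few_shot: bool = False) -> str:
    lines = [f"Instruction: {instruction}"]
    if few_shot:
        for src, tgt in FEW_SHOT_EXAMPLES:
            lines += [f"Input: {src}", f"Output: {tgt}"]
    lines += [f"Input: {sentence}", "Output:"]
    return "\n".join(lines)

prompt = build_prompt("He go to school every days.", instruction=FLUENCY, few_shot=True)
print(prompt)
# The resulting string would then be sent to a completion endpoint
# (e.g. via an OpenAI client); that call is deliberately left out here.
```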
XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates
Text editing is a crucial task that involves modifying text to better align
with user intents. However, existing text editing benchmark datasets have
limitations in providing only coarse-grained instructions. Consequently,
although the edited output may seem reasonable, it often deviates from the
intended changes outlined in the gold reference, resulting in low evaluation
scores. To comprehensively investigate the text editing capabilities of large
language models, this paper introduces XATU, the first benchmark specifically
designed for fine-grained instruction-based explainable text editing. XATU
covers a wide range of topics and text types, incorporating lexical, syntactic,
semantic, and knowledge-intensive edits. To enhance interpretability, we
leverage high-quality data sources and human annotation, resulting in a
benchmark that includes fine-grained instructions and gold-standard edit
explanations. By evaluating existing open and closed large language models
against our benchmark, we demonstrate the effectiveness of instruction tuning
and the impact of underlying architecture across various editing tasks.
Furthermore, extensive experimentation reveals the significant role of
explanations in fine-tuning language models for text editing tasks. The
benchmark will be open-sourced to support reproduction and facilitate future
research. Comment: Work in progress
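The abstract does not publish XATU's data schema; the record shape below (source text, fine-grained instruction, gold edit, gold explanation) is a guess used only to illustrate how such a benchmark could be iterated over for evaluation, with exact match standing in for whatever metrics the benchmark actually uses.

```python
# Guessed record shape and a trivial evaluation loop; field names and the
# exact-match metric are assumptions, not XATU's published format.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EditExample:
    source: str
    instruction: str          # fine-grained, e.g. "fix the agreement error"
    gold_target: str
    gold_explanation: str

def evaluate(examples: list[EditExample], edit_fn: Callable[[str, str], str]) -> float:
    """Exact-match accuracy of an editing system against gold targets.
    (Real text-editing benchmarks typically also use overlap metrics such as SARI.)"""
    hits = sum(edit_fn(ex.source, ex.instruction) == ex.gold_target for ex in examples)
    return hits / len(examples) if examples else 0.0

examples = [
    EditExample(
        source="The results was significant.",
        instruction="Fix the subject-verb agreement error.",
        gold_target="The results were significant.",
        gold_explanation="'results' is plural, so the verb must be 'were'.",
    ),
]
print(evaluate(examples, lambda src, instr: src.replace("was", "were")))  # 1.0
```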