Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods
Large-scale pre-trained language models such as GPT-3 have shown remarkable
performance across various natural language processing tasks. However, applying
prompt-based methods with GPT-3 for Grammatical Error Correction (GEC) tasks
and their controllability remains underexplored. Controllability in GEC is
crucial for real-world applications, particularly in educational settings,
where the ability to tailor feedback according to learner levels and specific
error types can significantly enhance the learning process. This paper
investigates the performance and controllability of prompt-based methods with
GPT-3 for GEC tasks in zero-shot and few-shot settings. We explore the impact
of task instructions and examples on GPT-3's output, focusing on controlling
aspects such as minimal edits, fluency edits, and learner levels. Our findings
demonstrate that GPT-3 could effectively perform GEC tasks, outperforming
existing supervised and unsupervised approaches. We also show that GPT-3
can achieve controllability when appropriate task instructions and examples
are given.
Comment: Accepted in BEA 202
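The zero-shot and few-shot prompting described above can be sketched as plain prompt construction. This is a hypothetical prompt format for illustration, not the paper's actual prompts; the instruction wording and example layout are assumptions.

```python
# Sketch of zero-shot and few-shot prompt construction for GEC.
# The instruction text and "Input:/Output:" layout are illustrative
# assumptions, not the prompts used in the paper.

def build_gec_prompt(sentence, examples=None,
                     instruction="Correct the grammatical errors in the "
                                 "following sentence. Make minimal edits."):
    """Build an LLM prompt; `examples` is a list of (erroneous, corrected)
    pairs for the few-shot setting."""
    parts = [instruction]
    for src, tgt in (examples or []):
        parts.append(f"Input: {src}\nOutput: {tgt}")
    parts.append(f"Input: {sentence}\nOutput:")
    return "\n\n".join(parts)

# Zero-shot: instruction only.
zero_shot = build_gec_prompt("She go to school yesterday.")

# Few-shot: examples can also steer edit style (minimal vs. fluency edits)
# or target a learner level, as the paper investigates.
few_shot = build_gec_prompt(
    "She go to school yesterday.",
    examples=[("He have two cat.", "He has two cats.")],
)
```

Varying the instruction (e.g., "rewrite fluently" instead of "make minimal edits") is one way the controllability aspects discussed above could be exercised.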
Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
Text error correction aims to correct the errors in text sequences such as
those typed by humans or generated by speech recognition models. Previous error
correction methods usually take the source (incorrect) sentence as encoder
input and generate the target (correct) sentence through the decoder. Since the
error rate of the incorrect sentence is usually low (e.g., 10\%), the
correction model can only learn to correct on limited error tokens but
trivially copy on most tokens (correct tokens), which harms the effective
training of error correction. In this paper, we argue that the correct tokens
should be better utilized to facilitate effective training and then propose a
simple yet effective masking strategy to achieve this goal. Specifically, we
randomly mask out a part of the correct tokens in the source sentence and let
the model learn to not only correct the original error tokens but also predict
the masked tokens based on their context information. Our method enjoys several
advantages: 1) it alleviates trivial copy; 2) it leverages effective training
signals from correct tokens; 3) it is a plug-and-play module and can be applied
to different models and tasks. Experiments on spelling error correction and
speech recognition error correction on Mandarin datasets and grammar error
correction on English datasets with both autoregressive and non-autoregressive
generation models show that our method improves the correction accuracy
consistently.
Comment: main track of EMNLP 202
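The masking strategy above can be illustrated with a minimal sketch. It assumes position-aligned source/target token sequences for simplicity (real GEC data would need an alignment step), and the mask ratio and mask token are illustrative choices.

```python
import random

def mask_correct_tokens(src_tokens, tgt_tokens, mask_ratio=0.15,
                        mask_token="[MASK]", seed=0):
    """Randomly mask a fraction of the correct tokens (those identical to
    the target at the same position), leaving error tokens intact, so a
    model trained on the result must both correct errors and predict the
    masked tokens from context.
    Assumes equal-length, position-aligned sequences for simplicity."""
    rng = random.Random(seed)
    masked = []
    for s, t in zip(src_tokens, tgt_tokens):
        if s == t and rng.random() < mask_ratio:
            masked.append(mask_token)   # correct token: candidate for masking
        else:
            masked.append(s)            # error token (or unmasked correct token)
    return masked
```

Because the transformation touches only the input sequence, it can be applied as a plug-in preprocessing step for different correction models, matching the plug-and-play claim above.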
Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction
In Grammatical Error Correction (GEC), it is crucial to ensure the user's
comprehension of a reason for correction. Existing studies present tokens,
examples, and hints as to the basis for correction but do not directly explain
the reasons for corrections. Although methods that use Large Language Models
(LLMs) to provide direct explanations in natural language have been proposed
for various tasks, no such method exists for GEC. Generating explanations for
GEC corrections involves aligning input and output tokens, identifying
correction points, and presenting corresponding explanations consistently.
However, it is not straightforward to specify a complex format to generate
explanations, because explicit control of generation is difficult with prompts.
This study introduces a method called controlled generation with Prompt
Insertion (PI) so that LLMs can explain the reasons for corrections in natural
language. In PI, LLMs first correct the input text, and then we automatically
extract the correction points based on the rules. The extracted correction
points are sequentially inserted into the LLM's explanation output as prompts,
guiding the LLMs to generate explanations for the correction points. We also
create an Explainable GEC (XGEC) dataset of correction reasons by annotating
NUCLE, CoNLL2013, and CoNLL2014. Although generations from GPT-3 and ChatGPT
using original prompts miss some correction points, generation control with PI
explicitly guides the models to describe explanations for all correction
points, contributing to improved performance in generating correction reasons.
Comment: Work in progress
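The correction-point extraction step above can be approximated with a standard sequence alignment. This is a simple difflib-based stand-in for the paper's rule-based extraction, shown only to make the pipeline concrete.

```python
import difflib

def extract_correction_points(source, corrected):
    """Align source and corrected token sequences and return
    (source_span, corrected_span) pairs for each edit.
    A difflib-based approximation of rule-based correction-point
    extraction, for illustration only."""
    src, tgt = source.split(), corrected.split()
    sm = difflib.SequenceMatcher(a=src, b=tgt)
    points = []
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op != "equal":  # replace / insert / delete all mark a correction point
            points.append((" ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return points

points = extract_correction_points("He go to school yesterday .",
                                   "He went to school yesterday .")
```

Each extracted point can then be inserted into the explanation prompt in order, which is the role Prompt Insertion plays above.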
Interactive Rewriting for Non-Native English Speakers
Tohoku University doctoral (Information Sciences) thesis
Beyond Hard Samples: Robust and Effective Grammatical Error Correction with Cycle Self-Augmenting
Recent studies have revealed that grammatical error correction methods in the
sequence-to-sequence paradigm are vulnerable to adversarial attack, and simply
utilizing adversarial examples in the pre-training or post-training process can
significantly enhance the robustness of GEC models to certain types of attack
without suffering too much performance loss on clean data. In this paper, we
further conduct a thorough robustness evaluation of cutting-edge GEC methods
for four different types of adversarial attacks and propose a simple yet very
effective Cycle Self-Augmenting (CSA) method accordingly. By leveraging the
augmenting data from the GEC models themselves in the post-training process and
introducing regularization data for cycle training, our proposed method can
effectively improve the model robustness of well-trained GEC models with only a
few more training epochs as an extra cost. More concretely, further training on
the regularization data can prevent the GEC models from over-fitting on
easy-to-learn samples and thus can improve the generalization capability and
robustness towards unseen data (adversarial noise/samples). Meanwhile, the
self-augmented data can provide more high-quality pseudo pairs to improve model
performance on the original testing data. Experiments on four benchmark
datasets and seven strong models indicate that our proposed training method can
significantly enhance robustness against four types of attacks without using
purposely built adversarial examples in training. Evaluation results on clean
data further confirm that our proposed CSA method significantly improves the
performance of four baselines and yields nearly comparable results with other
state-of-the-art models. Our code is available at
https://github.com/ZetangForward/CSA-GEC
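One plausible reading of the cycle step above can be sketched as a data split driven by the model's own outputs: samples the current model already corrects become regularization data, while its hypotheses on the rest serve as self-augmented pseudo pairs. The `correct_fn` stand-in and this split are assumptions for illustration; the actual CSA training schedule is more involved.

```python
# Hedged sketch of one Cycle Self-Augmenting (CSA) data-cycling step.
# `correct_fn` stands in for a trained GEC model's decoding function;
# the split criterion here is an illustrative assumption, not the
# paper's exact procedure.

def csa_cycle(pairs, correct_fn):
    """pairs: list of (source, gold) sentence pairs.
    Returns (regularization_data, pseudo_pairs):
      - regularization_data: easy-to-learn samples the model already solves,
        reused to curb over-fitting on them during further training;
      - pseudo_pairs: (source, model hypothesis) pairs used as
        self-augmented training data."""
    regularization, pseudo = [], []
    for src, gold in pairs:
        hyp = correct_fn(src)
        if hyp == gold:                  # model already corrects this sample
            regularization.append((src, gold))
        else:                            # keep the hypothesis as a pseudo target
            pseudo.append((src, hyp))
    return regularization, pseudo
```

Repeating this step for a few post-training epochs matches the paper's claim that robustness improves at only a small extra training cost.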