Model ensembling has been widely used for Grammatical Error Correction
(GEC) to boost model performance. We hypothesize that model ensembling based on
the perplexity (PPL) computed by pre-trained language models (PLMs) should
benefit the GEC system. To this end, we explore several ensemble strategies
based on strong PLMs with four sophisticated single models. However,
performance does not improve, and even degrades, after the PLM-based ensemble.
This surprising result prompts us to analyze the data in detail, which yields
several insights on GEC. The human references of correct sentences are far
from sufficient in the test data, and the gap between a correct sentence and an
idiomatic one is worth our attention. Moreover, the PLM-based ensemble
strategies provide an effective way to extend and improve GEC benchmark data.
Our source code is available at
https://github.com/JamyDon/PLM-based-CGEC-Model-Ensemble.
Comment: 7 pages, 1 figure. Accepted by ACL 2023 (main conference, short
paper).
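
The abstract does not spell out the ensemble strategies, so the following is only a minimal sketch of one simple possibility: sentence-level selection, where the candidate output with the lowest PLM perplexity is kept. The function names (`perplexity`, `select_lowest_ppl`) and the per-token log-probabilities are hypothetical; in practice the log-probabilities would be produced by a PLM scoring each candidate.

```python
import math
from typing import Dict, Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """PPL = exp(-(1/N) * sum of natural-log token probabilities)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def select_lowest_ppl(candidates: Dict[str, Sequence[float]]) -> str:
    """Return the candidate whose token log-probs give the lowest PPL.

    `candidates` maps each single model's output (or an identifier for it)
    to the per-token log-probabilities assigned by a PLM (assumed given).
    """
    return min(candidates, key=lambda c: perplexity(candidates[c]))


# Hypothetical scores for the outputs of two single GEC models.
scores = {
    "output_model_a": [math.log(0.5)] * 4,   # PPL of about 2
    "output_model_b": [math.log(0.25)] * 4,  # PPL of about 4
}
best = select_lowest_ppl(scores)
```

Selecting purely by PPL favors whatever the PLM finds most fluent, which fits the abstract's observation about the gap between a correct sentence and an idiomatic one.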