We present a new parallel corpus, JHU FLuency-Extended GUG corpus (JFLEG) for
developing and evaluating grammatical error correction (GEC). Unlike other
corpora, it represents a broad range of language proficiency levels and uses
holistic fluency edits to not only correct grammatical errors but also make the
original text more native sounding. We describe the types of corrections made
and benchmark four leading GEC systems on this corpus, identifying specific
areas in which they do well and how they can improve. JFLEG fulfills the need
for a new gold standard to properly assess the current state of GEC.Comment: To appear in EACL 2017 (short papers