Adapting Sequence Models for Sentence Correction
In a controlled experiment of sequence-to-sequence approaches for the task of
sentence correction, we find that character-based models are generally more
effective than word-based models and models that encode subword information via
convolutions, and that modeling the output data as a series of diffs improves
effectiveness over standard approaches. Our strongest sequence-to-sequence
model improves over our strongest phrase-based statistical machine translation
model, with access to the same data, by 6 M2 (0.5 GLEU) points. Additionally,
in the data environment of the standard CoNLL-2014 setup, we demonstrate that
modeling (and tuning against) diffs yields similar or better M2 scores with
simpler models and/or significantly less data than previous
sequence-to-sequence approaches. Comment: EMNLP 201
Foreebank: Syntactic Analysis of Customer Support Forums
We present a new treebank of English and French technical forum content which has been annotated for grammatical errors and phrase structure. This double annotation allows us to empirically measure the effect of errors on parsing performance. While it is slightly easier to parse the corrected versions of the forum sentences, the errors are not the main factor in making this kind of text hard to parse.
Auxiliary Objectives for Neural Error Detection Models
We investigate the utility of different auxiliary objectives and training
strategies within a neural sequence labeling approach to error detection in
learner writing. Auxiliary costs provide the model with additional linguistic
information, allowing it to learn general-purpose compositional features that
can then be exploited for other objectives. Our experiments show that a joint
learning approach trained with parallel labels on in-domain data improves
performance over the previous best error detection system. While the resulting
model has the same number of parameters, the additional objectives allow it to
be optimised more efficiently and achieve better performance.