Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation
Non-Autoregressive Neural Machine Translation (NAT) achieves significant
decoding speedup through generating target words independently and
simultaneously. However, in the non-autoregressive setting, the
word-level cross-entropy loss cannot properly model the target-side sequential
dependency, so it correlates only weakly with translation quality. As a
result, NAT tends to generate disfluent translations with
over-translation and under-translation errors. In this paper, we propose to
train NAT to minimize the Bag-of-Ngrams (BoN) difference between the model
output and the reference sentence. The BoN training objective is
differentiable and can be calculated efficiently; it encourages NAT to
capture the target-side sequential dependency and correlates well with
translation quality. We validate our approach on three translation tasks and
show that it outperforms the NAT baseline by a large margin: about 5.0 BLEU
on WMT14 En-De and about 2.5 BLEU on WMT16 En-Ro.

Comment: AAAI 2020
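
To make the objective concrete, below is a minimal PyTorch sketch of a BoN-L1
loss under simplifying assumptions: the model output and reference have equal
length, and reference n-grams are matched with a naive loop rather than the
paper's efficient vectorized computation. The function bon_l1_loss and its
signature are illustrative, not the authors' released code.

    import torch

    def bon_l1_loss(probs, reference, n=2):
        # probs: (T, V) per-position token probabilities from the NAT decoder
        # reference: (T,) reference token ids; equal length assumed for simplicity
        T = probs.size(0)
        num_ngrams = T - n + 1

        # Count each distinct n-gram in the reference: BoN_ref(g)
        ref_counts = {}
        for t in range(num_ngrams):
            g = tuple(reference[t:t + n].tolist())
            ref_counts[g] = ref_counts.get(g, 0) + 1

        # Expected model count of each reference n-gram:
        # BoN_theta(g) = sum_t prod_i p_{t+i}(g_i),
        # which factorizes because NAT predicts positions independently.
        match = probs.new_zeros(())
        for g, c in ref_counts.items():
            expected = probs.new_zeros(())
            for t in range(num_ngrams):
                p = probs[t, g[0]]
                for i in range(1, n):
                    p = p * probs[t + i, g[i]]
                expected = expected + p
            # clipped match, i.e. min(BoN_theta(g), BoN_ref(g))
            match = match + expected.clamp(max=float(c))

        # Both bags sum to T - n + 1 n-grams in total, so the L1 distance
        # reduces to 2 * (number of n-grams - matched mass).
        return 2.0 * (num_ngrams - match)

    # Toy usage: T=5 positions, V=8 vocabulary entries
    logits = torch.randn(5, 8, requires_grad=True)
    probs = torch.softmax(logits, dim=-1)
    ref = torch.tensor([1, 2, 3, 2, 3])
    loss = bon_l1_loss(probs, ref)
    loss.backward()

The sketch shows why the objective is differentiable: the expected n-gram
count is a polynomial in the per-position probabilities. Restricting the
match to n-grams that actually appear in the reference is what keeps the
computation tractable, since n-grams absent from the reference contribute a
constant to the L1 distance.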