36,049 research outputs found

    Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation

    Non-Autoregressive Neural Machine Translation (NAT) achieves a significant decoding speedup by generating target words independently and simultaneously. However, in the non-autoregressive setting the word-level cross-entropy loss cannot properly model the target-side sequential dependency, so it correlates only weakly with translation quality. As a result, NAT tends to generate disfluent translations with over-translation and under-translation errors. In this paper, we propose to train NAT to minimize the Bag-of-Ngrams (BoN) difference between the model output and the reference sentence. The BoN training objective is differentiable and can be calculated efficiently; it encourages NAT to capture the target-side sequential dependency and correlates well with translation quality. We validate our approach on three translation tasks and show that it outperforms the NAT baseline by about 5.0 BLEU on WMT14 En↔De and about 2.5 BLEU on WMT16 En↔Ro.
    Comment: AAAI 2020
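
    The L1 variant of the BoN objective has a convenient closed form: the distance between the model's expected bag-of-ngrams and the reference's bag-of-ngrams reduces to the total n-gram count on both sides minus twice the clipped soft match. The sketch below is a minimal PyTorch illustration of that idea, not the authors' released implementation; the function name `bon_l1_loss` and the dense `probs` matrix of per-position word distributions are expository assumptions.

```python
import torch

def bon_l1_loss(probs: torch.Tensor, ref: list, n: int = 2) -> torch.Tensor:
    """Illustrative L1 bag-of-ngrams loss for NAT (a sketch, not the
    authors' code). `probs` is a (T, V) matrix of per-position word
    distributions from the NAT decoder; `ref` is the reference token ids."""
    T, V = probs.shape
    # Hard n-gram counts in the reference sentence.
    ref_counts = {}
    for i in range(len(ref) - n + 1):
        g = tuple(ref[i:i + n])
        ref_counts[g] = ref_counts.get(g, 0) + 1
    # Expected (soft) model count of each reference n-gram: sum over start
    # positions of the product of the n positional word probabilities.
    match = probs.new_zeros(())
    for g, c in ref_counts.items():
        expected = probs.new_zeros(())
        for t in range(T - n + 1):
            p = probs.new_ones(())
            for i, w in enumerate(g):
                p = p * probs[t + i, w]
            expected = expected + p
        # Clip by the reference count, as in BLEU-style matching.
        match = match + torch.clamp(expected, max=float(c))
    # L1 BoN distance = total n-grams on both sides - 2 * clipped match,
    # so minimizing it maximizes the soft n-gram overlap.
    total = (T - n + 1) + (len(ref) - n + 1)
    return total - 2.0 * match
```

    Because every operation above is differentiable in `probs`, the loss can be backpropagated through the NAT decoder without sampling discrete translations, which is what makes the objective efficient to train with.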

    An Analysis of Source Context Dependency in Neural Machine Translation

    The attention-based encoder-decoder model has become the state of the art for machine translation. However, further investigation is still needed to understand the internal mechanism of this end-to-end model. In this paper, we focus on how neural machine translation (NMT) models use source information while decoding. We propose a numerical measurement of source context dependency in NMT models and analyze the behavior of the NMT decoder with this measurement under several circumstances. Experimental results show that the measurement is an appropriate estimate of source context dependency and is consistent across different domains.
    This work was partially supported by the IARPA MATERIAL program.
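
    The abstract does not spell out the measurement itself, but one way to make "numerical source context dependency" concrete is to ask how much the decoder's next-token distribution shifts when the source context is taken away. The sketch below is a hypothetical illustration of that idea, not the paper's definition: the `model` interface, the function name `source_dependency`, and the choice of KL divergence are all assumptions.

```python
import torch
import torch.nn.functional as F

def source_dependency(model, src, src_masked, tgt_prefix) -> torch.Tensor:
    """Hypothetical proxy for source context dependency (an assumption,
    not the paper's metric): the KL divergence between the decoder's
    next-token distributions given the real source and a masked source.
    `model(source, target_prefix)` is assumed to return next-token logits."""
    with torch.no_grad():
        log_p_full = F.log_softmax(model(src, tgt_prefix), dim=-1)
        log_p_masked = F.log_softmax(model(src_masked, tgt_prefix), dim=-1)
    # A larger divergence means the decoder leans more on the source
    # (rather than its own target-side history) at this decoding step.
    return F.kl_div(log_p_masked, log_p_full,
                    log_target=True, reduction="sum")
```

    Tracking such a score per decoding step would let one compare, for example, how source reliance evolves over long sentences or differs across domains, in the spirit of the analyses the paper describes.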