4 research outputs found

Levenshtein Transformer

Authors: Gu Jiatao, Wang Changhan, Zhao Jake
Publication date: 28/10/2019

Modern neural sequence generation models are built to either generate tokens step-by-step from scratch or (iteratively) modify a sequence of tokens bounded by a fixed length. In this work, we develop Levenshtein Transformer, a new partially autoregressive model devised for more flexible and amenable sequence generation. Unlike previous approaches, the atomic operations of our model are insertion and deletion. The combination of them facilitates not only generation but also sequence refinement, allowing dynamic length changes. We also propose a set of new training techniques dedicated to them, effectively exploiting one as the other's learning signal thanks to their complementary nature. Experiments applying the proposed model achieve comparable performance but much-improved efficiency on both generation (e.g. machine translation, text summarization) and refinement tasks (e.g. automatic post-editing). We further confirm the flexibility of our model by showing a Levenshtein Transformer trained by machine translation can straightforwardly be used for automatic post-editing.

Comment: 17 pages (6 pages appendix). Camera ready, accepted by NeurIPS 2019

arXiv.org e-Print Archive

On The Evaluation of Machine Translation Systems Trained With Back-Translation

Authors: Auli Michael, Edunov Sergey, Ott Myle, Ranzato Marc'Aurelio
Publication date: 18/08/2020

Back-translation is a widely used data augmentation technique which leverages target monolingual data. However, its effectiveness has been challenged since automatic metrics such as BLEU only show significant improvements for test examples where the source itself is a translation, or translationese. This is believed to be due to translationese inputs better matching the back-translated training data. In this work, we show that this conjecture is not empirically supported and that back-translation improves translation quality of both naturally occurring text and translationese according to professional human translators. We provide empirical evidence to support the view that back-translation is preferred by humans because it produces more fluent outputs. BLEU cannot capture human preferences because references are translationese when source sentences are natural text. We recommend complementing BLEU with a language model score to measure fluency.

Comment: ACL 2020

arXiv.org e-Print Archive

Tagged Back-Translation

Authors: Caswell Isaac, Chelba Ciprian, Grangier David
Publication date: 14/06/2019

Recent work in Neural Machine Translation (NMT) has shown significant quality gains from noised-beam decoding during back-translation, a method to generate synthetic parallel data. We show that the main role of such synthetic noise is not to diversify the source side, as previously suggested, but simply to indicate to the model that the given source is synthetic. We propose a simpler alternative to noising techniques, consisting of tagging back-translated source sentences with an extra token. Our results on WMT outperform noised back-translation in English-Romanian and match performance on English-German, re-defining state-of-the-art in the former.

Comment: Accepted as oral presentation in WMT 2019; 9 pages; 9 tables; 1 figure
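The tagging idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the tag string `<BT>`, the function names, and the toy sentence pairs are all assumptions made for illustration, since the abstract only says that back-translated source sentences receive "an extra token".

```python
# Hypothetical tag token; the abstract does not name the actual token used.
BT_TAG = "<BT>"


def tag_back_translated(pairs, tag=BT_TAG):
    """Prepend the tag to the source side of each (source, target) pair."""
    return [(f"{tag} {src}", tgt) for src, tgt in pairs]


def build_training_set(genuine_pairs, back_translated_pairs, tag=BT_TAG):
    """Combine untagged genuine pairs with tagged synthetic pairs."""
    return list(genuine_pairs) + tag_back_translated(back_translated_pairs, tag)


# Toy data (invented): one genuine pair and one pair whose source side
# is assumed to have been produced by back-translation.
genuine = [("Guten Morgen", "Good morning")]
synthetic = [("Wie geht es dir", "How are you")]

training_set = build_training_set(genuine, synthetic)
for source, target in training_set:
    print(source, "=>", target)
```

Only the synthetic source sentences are marked, so a model trained on the combined set can, in principle, tell genuine from back-translated input.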

arXiv.org e-Print Archive

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges

Authors: Arivazhagan Naveen, Bapna Ankur, Cao Yuan, Chen Mia Xu, Chen Zhifeng, Cherry Colin, Firat Orhan, Foster George, Johnson Melvin, Krikun Maxim, Lepikhin Dmitry, Macherey Wolfgang, Wu Yonghui
Publication date: 11/07/2019

We introduce our efforts towards building a universal neural machine translation (NMT) system capable of translating between any language pair. We set a milestone towards this goal by building a single massively multilingual NMT model handling 103 languages trained on over 25 billion examples. Our system demonstrates effective transfer learning ability, significantly improving translation quality of low-resource languages, while keeping high-resource language translation quality on par with competitive bilingual baselines. We provide in-depth analysis of various aspects of model building that are crucial to achieving quality and practicality in universal NMT. While we prototype a high-quality universal translation system, our extensive empirical analysis exposes issues that need to be further addressed, and we suggest directions for future research.

arXiv.org e-Print Archive