Domain Robustness in Neural Machine Translation
Translating text that diverges from the training domain is a key challenge
for machine translation. Domain robustness---the generalization of models to
unseen test domains---is low for both statistical (SMT) and neural machine
translation (NMT). In this paper, we study the performance of SMT and NMT
models on out-of-domain test sets. We find that in unknown domains, SMT and NMT
suffer from very different problems: SMT systems are mostly adequate but not
fluent, while NMT systems are mostly fluent, but not adequate. For NMT, we
identify such hallucinations (translations that are fluent but unrelated to the
source) as a key reason for low domain robustness. To mitigate this problem, we
empirically compare methods that are reported to improve adequacy or in-domain
robustness in terms of their effectiveness at improving domain robustness. In
experiments on German-to-English OPUS data and German-to-Romansh (a
low-resource setting), we find that several methods improve domain robustness.
While those methods do lead to higher BLEU scores overall, they only slightly
increase the adequacy of translations compared to SMT.
Comment: V2: AMTA camera-ready
The University of Edinburgh’s Neural MT Systems for WMT17
This paper describes the University of Edinburgh's submissions to the WMT17
shared news translation and biomedical translation tasks. We participated in 12
translation directions for news, translating between English and Czech, German,
Latvian, Russian, Turkish and Chinese. For the biomedical task we submitted
systems for English to Czech, German, Polish and Romanian. Our systems are
neural machine translation systems trained with Nematus, an attentional
encoder-decoder. We follow our setup from last year and build BPE-based models
with parallel and back-translated monolingual training data. Novelties this
year include the use of deep architectures, layer normalization, and more
compact models due to weight tying and improvements in BPE segmentations. We
perform extensive ablative experiments, reporting on the effectiveness of layer
normalization, deep architectures, and different ensembling techniques.
Comment: WMT 2017 shared task track; for BibTeX, see
http://homepages.inf.ed.ac.uk/rsennric/bib.html#uedin-nmt:201
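The BPE segmentation mentioned above builds a subword vocabulary by repeatedly merging the most frequent adjacent symbol pair in the training corpus. The following is a minimal sketch of the merge-learning step; the function name and toy word counts are my own illustration, not the Nematus or subword-nmt implementation:

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge operations from a word-frequency dict.

    `words` maps whitespace-tokenized words to corpus counts,
    e.g. {"lower": 5, "newer": 6}. Returns the ordered list of
    learned merges as (symbol_a, symbol_b) pairs.
    """
    # Represent each word as a tuple of symbols (initially characters).
    vocab = {tuple(w): c for w, c in words.items()}
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, count in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the most frequent merge everywhere it occurs.
        merged_vocab = {}
        for symbols, count in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_vocab[tuple(out)] = count
        vocab = merged_vocab
    return merges

merges = learn_bpe({"lower": 5, "lowest": 2, "newer": 6, "wider": 3}, 10)
```

At test time, the learned merges are replayed in order on each new word, so rare and unseen words decompose into known subword units rather than a single out-of-vocabulary token.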
Aligning Neural Machine Translation Models: Human Feedback in Training and Inference
Reinforcement learning from human feedback (RLHF) is a recent technique to
improve the quality of the text generated by a language model, making it closer
to what humans would generate. A core ingredient in RLHF's success in aligning
and improving large language models (LLMs) is its reward model, trained using
human feedback on model outputs. In machine translation (MT), where metrics
trained from human annotations can readily be used as reward models, recent
methods using minimum Bayes risk decoding and reranking have succeeded in
improving the final quality of translation. In this study, we comprehensively
explore and compare techniques for integrating quality metrics as reward models
into the MT pipeline. This includes using the reward model for data filtering,
during training through RL, and at inference time via reranking; we also assess
the effects of combining these in a unified
approach. Our experimental results, conducted across multiple translation
tasks, underscore the crucial role of effective data filtering, based on
estimated quality, in harnessing the full potential of RL in enhancing MT
quality. Furthermore, our findings demonstrate the effectiveness of combining
RL training with reranking techniques, showcasing substantial improvements in
translation quality.
Comment: 14 pages, work-in-progress
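The minimum Bayes risk decoding mentioned above selects, from a set of sampled translations, the candidate with the highest expected utility under the model's own distribution, using a quality metric as the utility. A minimal sketch under uniform sample weights; `token_f1` is a toy stand-in for a learned metric such as COMET, and all names here are illustrative assumptions:

```python
def mbr_decode(candidates, utility):
    """Return the candidate with highest total utility against the
    other samples: a minimal MBR sketch with uniform weights.
    `utility(hyp, ref)` is any reference-based metric, higher = better."""
    best, best_score = None, float("-inf")
    for hyp in candidates:
        score = sum(utility(hyp, ref) for ref in candidates if ref is not hyp)
        if score > best_score:
            best, best_score = hyp, score
    return best

def token_f1(hyp, ref):
    # Toy utility: F1 over unique tokens (stand-in for a trained metric).
    h, r = set(hyp.split()), set(ref.split())
    overlap = len(h & r)
    if overlap == 0:
        return 0.0
    p, rec = overlap / len(h), overlap / len(r)
    return 2 * p * rec / (p + rec)

samples = ["the cat sat", "a cat sat", "the dog ran"]
choice = mbr_decode(samples, token_f1)  # "the cat sat": closest to the others
```

The consensus effect is the point: a fluent but hallucinated sample tends to disagree with the other samples and therefore scores a low expected utility, which is why metric-based MBR and reranking can improve final translation quality.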