Domain Robustness in Neural Machine Translation
Translating text that diverges from the training domain is a key challenge
for machine translation. Domain robustness---the generalization of models to
unseen test domains---is low for both statistical (SMT) and neural machine
translation (NMT). In this paper, we study the performance of SMT and NMT
models on out-of-domain test sets. We find that in unknown domains, SMT and NMT
suffer from very different problems: SMT systems are mostly adequate but not
fluent, while NMT systems are mostly fluent, but not adequate. For NMT, we
identify hallucinations (translations that are fluent but unrelated to the
source) as a key reason for low domain robustness. To mitigate this problem, we
empirically compare methods that are reported to improve adequacy or in-domain
robustness in terms of their effectiveness at improving domain robustness. In
experiments on German to English OPUS data and German to Romansh (a
low-resource setting), we find that several methods improve domain robustness.
While those methods do lead to higher BLEU scores overall, they only slightly
increase the adequacy of translations compared to SMT.
Comment: V2: AMTA camera-ready
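The hallucination failure mode described above (fluent output unrelated to the source) can be screened for with a crude adequacy proxy; a minimal sketch, where the bilingual lexicon and the 0.2 threshold are illustrative assumptions and not the paper's method:

```python
def lexicon_coverage(src_tokens, hyp_tokens, lexicon):
    """Fraction of source tokens with at least one lexicon translation in the hypothesis."""
    hyp = set(hyp_tokens)
    covered = sum(1 for s in src_tokens if lexicon.get(s, set()) & hyp)
    return covered / len(src_tokens) if src_tokens else 0.0

def looks_hallucinated(src_tokens, hyp_tokens, lexicon, threshold=0.2):
    # A fluent output that covers almost none of the source content
    # is a hallucination candidate (toy heuristic, not the paper's detector).
    return lexicon_coverage(src_tokens, hyp_tokens, lexicon) < threshold

# Toy German-English lexicon for illustration only.
lex = {"hund": {"dog"}, "bellt": {"barks"}}
looks_hallucinated(["hund", "bellt"], ["the", "dog", "barks"], lex)  # False
looks_hallucinated(["hund", "bellt"], ["i", "love", "cats"], lex)    # True
```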
Neural Monkey: The Current State and Beyond
Neural Monkey is an open-source toolkit for sequence-to-sequence learning. The focus of this paper is to present the current state of the toolkit to the intended audience, which includes students and researchers, both those active in the deep learning community and newcomers. For each of these target groups, we describe the most relevant features of the toolkit, including the simple configuration scheme, methods of model inspection that promote useful intuitions, and a modular design for easy prototyping. We summarize relevant contributions to the research community which were made using this toolkit and discuss the characteristics of our toolkit with respect to other existing systems. We conclude with a set of proposals for future development.
Subword Segmentation and a Single Bridge Language Affect Zero-Shot Neural Machine Translation
Zero-shot neural machine translation is an attractive goal because of the
high cost of obtaining data and building translation systems for new
translation directions. However, previous papers have reported mixed success in
zero-shot translation. It is hard to predict in which settings it will be
effective, and what limits performance compared to a fully supervised system.
In this paper, we investigate zero-shot performance of a multilingual
EN ↔ {FR,CS,DE,FI} system trained on WMT data. We find that
zero-shot performance is highly unstable and can vary by more than 6 BLEU
between training runs, making it difficult to reliably track improvements. We
observe a bias towards copying the source in zero-shot translation, and
investigate how the choice of subword segmentation affects this bias. We find
that language-specific subword segmentation results in less subword copying at
training time, and leads to better zero-shot performance compared to jointly
trained segmentation. A recent trend in multilingual models is to not train on
parallel data between all language pairs, but have a single bridge language,
e.g. English. We find that this negatively affects zero-shot translation and
leads to a failure mode where the model ignores the language tag and instead
produces English output in zero-shot directions. We show that this bias towards
English can be effectively reduced with even a small amount of parallel data in
some of the non-English pairs.
Comment: Accepted at WMT 202
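The multilingual setup the abstract describes relies on prepending a target-language tag to the source sentence, so that one model can serve many translation directions, including zero-shot ones. A minimal sketch (the `<2xx>` tag format is a common convention, assumed here for illustration):

```python
def tag_source(tokens, target_lang):
    """Prepend a target-language tag so a single multilingual model
    knows which output language is requested."""
    return [f"<2{target_lang}>"] + tokens

# Zero-shot direction FR->DE: training may only have covered FR<->EN and
# DE<->EN, but at test time we simply request German output via the tag.
tag_source("le chat dort".split(), "de")
# -> ['<2de>', 'le', 'chat', 'dort']
```

The failure mode reported above corresponds to the model ignoring this tag and emitting English anyway; the paper's remedy is adding a small amount of non-English parallel data rather than changing the tagging scheme.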
Identifying Semantic Divergences in Parallel Text without Annotations
Recognizing that even correct translations are not always semantically
equivalent, we automatically detect meaning divergences in parallel sentence
pairs with a deep neural model of bilingual semantic similarity which can be
trained for any parallel corpus without any manual annotation. We show that our
semantic model detects divergences more accurately than models based on surface
features derived from word alignments, and that these divergences matter for
neural machine translation.
Comment: Accepted as a full paper to NAACL 201
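The paper's detector is a trained deep bilingual similarity model; as a much cruder stand-in, a divergence score can be computed as one minus the cosine similarity of averaged cross-lingual word vectors. A minimal sketch, where the embedding table is a toy assumption:

```python
import math

def avg_vector(tokens, emb):
    """Mean of the word vectors for all in-vocabulary tokens."""
    dim = len(next(iter(emb.values())))
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv) if nu and nv else 0.0

def divergence_score(src_tokens, tgt_tokens, emb):
    """Higher means more divergent: 1 - similarity of averaged embeddings."""
    return 1.0 - cosine(avg_vector(src_tokens, emb), avg_vector(tgt_tokens, emb))

# Toy 2-dimensional cross-lingual embeddings, for illustration only.
emb = {"katze": [1.0, 0.0], "cat": [1.0, 0.0], "hund": [0.0, 1.0]}
divergence_score(["katze"], ["cat"], emb)   # ~0.0: equivalent pair
divergence_score(["katze"], ["hund"], emb)  # 1.0: divergent pair
```

Unlike the surface-feature baselines mentioned in the abstract, the paper's model is trained end-to-end from parallel data; this sketch only conveys the scoring interface.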
Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation
Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words, and shows poor robustness to copy noise in training data or domain shift. Recent work has tied these shortcomings to beam search – the de facto standard inference algorithm in NMT – and Eikema & Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead. In this paper, we empirically investigate the properties of MBR decoding on a number of previously reported biases and failure cases of beam search. We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases robustness against copy noise in the training data and domain shift.
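The core of MBR decoding is selecting, from a set of sampled translations, the candidate with the highest expected utility against the other samples. A minimal sketch, using a toy unigram-F1 utility in place of the MT metrics (e.g. BLEU or ChrF) that the paper studies:

```python
def unigram_f1(hyp, ref):
    """Toy utility: unigram overlap F1. Real MBR uses MT metrics as utility."""
    h, r = hyp.split(), ref.split()
    common = len(set(h) & set(r))
    if common == 0:
        return 0.0
    p, rec = common / len(h), common / len(r)
    return 2 * p * rec / (p + rec)

def mbr_decode(samples, utility):
    """Return the sample with the highest average utility against all
    other samples (assumes at least two samples)."""
    best, best_score = None, float("-inf")
    for i, cand in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        score = sum(utility(cand, ref) for ref in others) / len(others)
        if score > best_score:
            best, best_score = cand, score
    return best

# The pseudo-reference set is the sample pool itself: candidates that agree
# with many samples win, which is how MBR avoids beam search's mode-seeking.
mbr_decode(["the cat sat", "the cat sat", "a dog ran"], unigram_f1)
# -> 'the cat sat'
```

The abstract's finding that MBR inherits a length and frequency bias follows directly from this structure: whatever bias the utility metric has is what the selection optimizes for.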