Findings of the 2019 Conference on Machine Translation (WMT19)
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019.
Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites that probe specific aspects of translation.
Multilingual NMT with a language-independent attention bridge
In this paper, we propose a multilingual encoder-decoder architecture capable
of obtaining multilingual sentence representations by means of incorporating an
intermediate "attention bridge" that is shared across all languages. That
is, we train the model with language-specific encoders and decoders that are
connected via self-attention with a shared layer that we call attention bridge.
This layer exploits the semantics from each language for performing translation
and develops into a language-independent meaning representation that can
efficiently be used for transfer learning. We present a new framework for the
efficient development of multilingual NMT using this model and scheduled
training. We have tested the approach in a systematic way with a multi-parallel
data set. We show that the model achieves substantial improvements over strong
bilingual models and that it also works well for zero-shot translation,
demonstrating its capacity for abstraction and transfer learning.
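The shared bridge described above can be thought of as attention pooling with a fixed set of shared query vectors, so every language-specific encoder maps into a representation of the same shape. The numpy sketch below is a minimal illustration under that reading; the query count k, dimensions, and function names are assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_bridge(encoder_states, bridge_queries):
    """Pool variable-length encoder states into a fixed-size representation.

    encoder_states: (seq_len, d) hidden states from a language-specific encoder
    bridge_queries: (k, d) learned query vectors shared across all languages
    returns: (k, d) sentence representation, independent of input length
    """
    scores = bridge_queries @ encoder_states.T   # (k, seq_len) attention scores
    weights = softmax(scores, axis=-1)           # normalize over positions
    return weights @ encoder_states              # (k, d) weighted pooling

rng = np.random.default_rng(0)
d, k = 8, 4
shared_queries = rng.normal(size=(k, d))  # one set of queries for every language

# Sentences of different lengths (or languages) yield the same output shape,
# which is what lets any decoder consume the bridge output.
rep_a = attention_bridge(rng.normal(size=(5, d)), shared_queries)
rep_b = attention_bridge(rng.normal(size=(9, d)), shared_queries)
assert rep_a.shape == rep_b.shape == (k, d)
```

Because the bridge output has a fixed shape, any decoder can be attached to it, which is what enables the zero-shot and transfer-learning results the abstract reports.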
Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English
The necessity of using a fixed-size word vocabulary in order to control the
model complexity in state-of-the-art neural machine translation (NMT) systems
is an important bottleneck on performance, especially for morphologically rich
languages. Conventional methods that aim to overcome this problem by using
sub-word or character-level representations solely rely on statistics and
disregard the linguistic properties of words, which leads to interruptions in
the word structure and causes semantic and syntactic losses. In this paper, we
propose a new vocabulary reduction method for NMT, which can reduce the
vocabulary of a given input corpus at any rate while also considering the
morphological properties of the language. Our method is based on unsupervised
morphology learning and can be, in principle, used for pre-processing any
language pair. We also present an alternative word segmentation method based on
supervised morphological analysis, which aids us in measuring the accuracy of
our model. We evaluate our method in Turkish-to-English NMT task where the
input language is morphologically rich and agglutinative. We analyze different
representation methods in terms of translation accuracy as well as the semantic
and syntactic properties of the generated output. Our method obtains a
significant improvement of 2.3 BLEU points over the conventional vocabulary
reduction technique, showing that it can provide better accuracy in open
vocabulary translation of morphologically rich languages.
Comment: The 20th Annual Conference of the European Association for Machine Translation (EAMT), Research Paper, 12 pages
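As a rough illustration of the contrast with purely statistical subword units, a morphology-aware reduction can be pictured as segmenting words along morpheme boundaries rather than at frequency-driven cut points. The toy greedy matcher below, with a hand-written morpheme inventory, is a hypothetical simplification; the paper's actual method learns the morphology unsupervised:

```python
def segment(word, morphemes):
    """Greedy longest-match segmentation against a morpheme inventory.

    A toy stand-in for unsupervised morphology learning; unknown material
    falls back to single characters rather than being dropped.
    """
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in morphemes or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces

# Toy Turkish example: "evlerden" = ev (house) + ler (plural) + den (ablative).
inventory = {"ev", "ler", "den", "de", "e"}
assert segment("evlerden", inventory) == ["ev", "ler", "den"]
```

Segmenting at morpheme boundaries keeps each piece semantically and syntactically meaningful, which is the property the abstract argues statistical subword methods can break.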
Trivial Transfer Learning for Low-Resource Neural Machine Translation
Transfer learning has been proven as an effective technique for neural
machine translation under low-resource conditions. Existing methods require a
common target language, language relatedness, or specific training tricks and
regimes. We present a simple transfer learning method, where we first train a
"parent" model for a high-resource language pair and then continue the training
on a low-resource pair only by replacing the training corpus. This "child" model
performs significantly better than the baseline trained on the low-resource pair
only. We are the first to show this when the parent and child target different languages, and we
observe the improvements even for unrelated languages with different alphabets.
Comment: Accepted as a WMT18 research paper, Proceedings of the 3rd Conference on Machine Translation 2018
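The training regime the abstract describes amounts to one continued-training loop with a corpus swap: no architecture change, no parameter freezing, no special tricks. The sketch below uses placeholder training logic and hypothetical corpus labels to show the shape of the recipe, not the authors' actual codebase:

```python
def train(model, corpus, steps):
    """Placeholder for standard NMT training: updates the same parameters
    regardless of which corpus is supplied."""
    for _ in range(steps):
        model["updates"] += 1  # stand-in for a gradient step
    model["last_corpus"] = corpus
    return model

# 1) Train the "parent" model on a high-resource pair (label is hypothetical).
model = {"updates": 0, "last_corpus": None}
model = train(model, "high-resource-pair", steps=1000)

# 2) Continue training the SAME parameters, only swapping in the
#    low-resource corpus; this continued run yields the "child" model.
model = train(model, "low-resource-pair", steps=100)

assert model["updates"] == 1100
assert model["last_corpus"] == "low-resource-pair"
```

The point of the sketch is that the child inherits everything from the parent checkpoint and differs only in the data it continues on, which is why the abstract calls the method trivial.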