Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
We propose a method to transfer knowledge across neural machine translation
(NMT) models by means of a shared dynamic vocabulary. Our approach makes it
possible to extend an initial model for a given language pair to cover new
languages by adapting its vocabulary as new data become available (i.e.,
introducing new vocabulary items if they are not included in the initial
model). The parameter transfer mechanism is evaluated in two scenarios: i)
adapting a trained single-language-pair NMT system to work with a new language
pair and ii) incrementally adding new language pairs to grow into a
multilingual NMT system. In both scenarios our goal is to improve translation
performance while minimizing the training convergence time. Preliminary
experiments spanning five languages with different training data sizes (i.e.,
5k and 50k parallel sentences) show significant performance gains, ranging
from +3.85 to +13.63 BLEU across language directions. Moreover, compared with
training an NMT model from scratch, our transfer-learning approach reaches
higher performance using at most 4% of the total training steps.
Comment: Published at the International Workshop on Spoken Language
Translation (IWSLT), 2018.
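
The transfer mechanism lends itself to a compact illustration. Below is a
minimal PyTorch sketch of the dynamic-vocabulary idea, assuming the simplest
possible setup: embedding rows for tokens shared with the parent model are
copied over, while rows for newly introduced tokens keep a fresh random
initialization. All names (transfer_embeddings, the toy vocabularies) are
hypothetical and not taken from the authors' code.

# Minimal sketch of dynamic-vocabulary transfer: reuse embedding rows for
# tokens shared with the parent model, let new tokens start from scratch.
# Hypothetical names; not the authors' implementation.
import torch
import torch.nn as nn


def transfer_embeddings(old_emb: nn.Embedding,
                        old_vocab: dict,
                        new_vocab: dict,
                        dim: int) -> nn.Embedding:
    """Build an embedding for new_vocab, copying rows for tokens that
    also appear in old_vocab (the parent model's vocabulary)."""
    new_emb = nn.Embedding(len(new_vocab), dim)
    with torch.no_grad():
        for token, new_idx in new_vocab.items():
            old_idx = old_vocab.get(token)
            if old_idx is not None:
                # Shared token: inherit the trained representation.
                new_emb.weight[new_idx] = old_emb.weight[old_idx]
            # Unseen token: keep the fresh random initialization.
    return new_emb


# Usage: extend a parent model's vocabulary before fine-tuning on a new pair.
old_vocab = {"<pad>": 0, "<s>": 1, "</s>": 2, "haus": 3}
new_vocab = {"<pad>": 0, "<s>": 1, "</s>": 2, "haus": 3, "huis": 4}
old_emb = nn.Embedding(len(old_vocab), 512)
new_emb = transfer_embeddings(old_emb, old_vocab, new_vocab, dim=512)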
From Bilingual to Multilingual Neural Machine Translation by Incremental Training
Current multilingual Neural Machine Translation approaches are based on
task-specific models, so adding one more language requires retraining the
whole system. In this work, we propose a new training schedule, based on
joint training and language-independent encoder/decoder modules, that allows
the system to scale to more languages without modifying previously trained
components and that enables zero-shot translation. This work in progress
achieves results close to the state of the art on the WMT task.
Comment: Accepted paper at the ACL 2019 Student Research Workshop. arXiv admin
note: substantial text overlap with arXiv:1905.0683
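
As a rough illustration of the incremental schedule, the sketch below freezes
all previously trained language-specific encoder/decoder modules and registers
trainable modules only for the newly added language, which must map into the
same shared representation. The LSTM modules, dimensions, and the
add_language helper are illustrative assumptions, not the paper's
architecture or code.

# Sketch of incremental training with language-independent modules: each
# language owns an encoder and a decoder; adding a language freezes every
# existing module and trains only the new pair. Illustrative assumption.
import torch.nn as nn

encoders = nn.ModuleDict({
    "en": nn.LSTM(512, 512, batch_first=True),
    "es": nn.LSTM(512, 512, batch_first=True),
})
decoders = nn.ModuleDict({
    "en": nn.LSTM(512, 512, batch_first=True),
    "es": nn.LSTM(512, 512, batch_first=True),
})


def add_language(lang: str) -> None:
    """Register encoder/decoder for a new language without touching the
    previously trained components."""
    # Freeze every module trained so far.
    for module in (encoders, decoders):
        for p in module.parameters():
            p.requires_grad = False
    # New modules stay trainable; because all encoders target the same
    # shared space, any encoder/decoder pairing allows zero-shot use.
    encoders[lang] = nn.LSTM(512, 512, batch_first=True)
    decoders[lang] = nn.LSTM(512, 512, batch_first=True)


add_language("de")
trainable = [p for p in list(encoders.parameters()) +
             list(decoders.parameters()) if p.requires_grad]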
Adapting Multilingual Neural Machine Translation to Unseen Languages
Multilingual Neural Machine Translation (MNMT) for low-resource languages
(LRL) can be enhanced by the presence of related high-resource languages
(HRL), but choosing a related HRL usually relies on predefined linguistic
assumptions about language similarity. Recently, adapting MNMT to an LRL has
been shown to greatly improve performance. In this work, we explore the
problem of adapting an MNMT model to an unseen LRL using data selection and
model adaptation. In order to improve NMT for the LRL, we employ perplexity
to select the HRL data that are most similar to the LRL on the basis of
language distance. We extensively explore data selection in popular
multilingual NMT settings, namely in (zero-shot) translation and in
adaptation from a multilingual pre-trained model, for both translation
directions (LRL-en). We further show that dynamic adaptation of the model's
vocabulary results in a more favourable segmentation for the LRL than direct
adaptation. Experiments show reductions in training time and significant
performance gains over LRL baselines, even with zero LRL data (+13.0 BLEU),
and up to +17.0 BLEU for dynamic adaptation of a pre-trained multilingual
model with related data selection. Our method outperforms current
approaches, such as massively multilingual models and data augmentation, on
four LRLs.
Comment: Accepted at the 16th International Workshop on Spoken Language
Translation (IWSLT), November 2019.
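
A minimal sketch of the perplexity-based selection step, under the assumption
that a simple character n-gram model stands in for whatever language model
the authors actually use: train it on the available LRL text, then keep the
HRL sentences it scores as most LRL-like (lowest perplexity). All names and
the smoothing constant are hypothetical.

# Hedged sketch of perplexity-based data selection: score each HRL sentence
# with an LM trained on LRL text; low perplexity = close to the LRL.
import math
from collections import Counter


def train_char_lm(text: str, n: int = 3):
    """Collect character n-gram counts and their (n-1)-gram contexts."""
    padded = " " * (n - 1) + text
    ngrams = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    contexts = Counter(padded[i:i + n - 1] for i in range(len(padded) - n + 2))
    return ngrams, contexts, n


def perplexity(sentence: str, lm) -> float:
    """Per-character perplexity under the n-gram model, add-one smoothed
    over a nominal 10k-event space (an arbitrary assumption)."""
    ngrams, contexts, n = lm
    padded = " " * (n - 1) + sentence
    log_prob, count = 0.0, 0
    for i in range(len(padded) - n + 1):
        gram, ctx = padded[i:i + n], padded[i:i + n - 1]
        p = (ngrams[gram] + 1) / (contexts[ctx] + 10_000)
        log_prob += math.log(p)
        count += 1
    return math.exp(-log_prob / max(count, 1))


lrl_lm = train_char_lm("small monolingual LRL corpus goes here")
hrl_corpus = ["first candidate HRL sentence", "second candidate sentence"]
# Keep the HRL sentences the LRL model finds most familiar.
selected = sorted(hrl_corpus, key=lambda s: perplexity(s, lrl_lm))[:1]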
Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank
Pretrained multilingual contextual representations have shown great success,
but due to the limits of their pretraining data, their benefits do not apply
equally to all language varieties. This presents a challenge for language
varieties unfamiliar to these models, whose labeled and unlabeled data
is too limited to train a monolingual model effectively. We propose the use of
additional language-specific pretraining and vocabulary augmentation to adapt
multilingual models to low-resource settings. Using dependency parsing of four
diverse low-resource language varieties as a case study, we show that these
methods significantly improve performance over baselines, especially in the
lowest-resource cases, and demonstrate the importance of the relationship
between such models' pretraining data and target language varieties.
Comment: In Findings of EMNLP 2020.
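
The vocabulary-augmentation step maps naturally onto the Hugging Face
transformers API; the sketch below is one plausible realization, not the
authors' code. It adds placeholder target-variety tokens to multilingual
BERT's tokenizer and resizes the embedding matrix so pretrained rows are
preserved while new rows start fresh; continued masked-LM pretraining on the
small unlabeled corpus and parser fine-tuning on the treebank would follow.

# Sketch of vocabulary augmentation for a pretrained multilingual model.
# The new-token list is a placeholder, not from the paper.
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# Frequent target-variety tokens that mBERT would otherwise over-segment
# (placeholder examples).
new_tokens = ["exampletoken1", "exampletoken2"]
num_added = tokenizer.add_tokens(new_tokens)

# Grow the embedding matrix: pretrained rows are kept, the num_added new
# rows are freshly initialized.
model.resize_token_embeddings(len(tokenizer))

# ...continue masked-LM pretraining on unlabeled target-variety text,
# then fine-tune a dependency parser on the small treebank.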