Neural Machine Translation into Language Varieties
Both research and commercial machine translation have so far neglected the
importance of properly handling the spelling, lexical, and grammatical divergences
occurring among language varieties. Notable cases are standard national
varieties such as Brazilian and European Portuguese, and Canadian and European
French, which popular online machine translation services are not keeping
distinct. We show that an evident side effect of modeling such varieties as
a single class is the generation of inconsistent translations. In this work, we
investigate the problem of training neural machine translation from English to
specific pairs of language varieties, assuming both labeled and unlabeled
parallel texts, and low-resource conditions. We report experiments from English
to two pairs of dialects, European-Brazilian Portuguese and European-Canadian
French, and two pairs of standardized varieties, Croatian-Serbian and
Indonesian-Malay. We show significant BLEU score improvements over baseline
systems when translation into similar languages is learned as a multilingual
task with shared representations.
Comment: Published at EMNLP 2018, Third Conference on Machine Translation (WMT 2018).
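The shared-representation setup referenced above is often implemented by prepending a target-variety token to each source sentence, so that a single model learns all varieties jointly. The sketch below illustrates only this tagging step; the tag format and variety codes are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the target-token trick for multilingual/variety-aware NMT.
# Tag format and variety codes (e.g. "pt-BR") are illustrative assumptions.

def tag_source(src_sentence: str, target_variety: str) -> str:
    """Prepend a pseudo-token that tells the decoder which variety to produce."""
    return f"<2{target_variety}> {src_sentence}"

# Labeled parallel data: use the known variety tag.
print(tag_source("The truck is parked outside.", "pt-BR"))
# -> <2pt-BR> The truck is parked outside.

# Unlabeled parallel data can be kept under a generic tag so the shared
# encoder/decoder still benefits from it during joint training.
print(tag_source("The truck is parked outside.", "pt-any"))
```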
Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
We propose a method to transfer knowledge across neural machine translation
(NMT) models by means of a shared dynamic vocabulary. Our approach allows us to
extend an initial model for a given language pair to cover new languages by
adapting its vocabulary as new data become available (i.e., introducing
new vocabulary items if they are not included in the initial model). The
parameter transfer mechanism is evaluated in two scenarios: i) adapting a
trained single-language-pair NMT system to work with a new language pair, and
ii) continuously adding new language pairs to grow into a multilingual NMT
system. In both scenarios, our goal is to improve translation performance while
minimizing the training convergence time. Preliminary experiments spanning five
languages with different training data sizes (i.e., 5k and 50k parallel
sentences) show a significant performance gain ranging from +3.85 up to +13.63
BLEU in different language directions. Moreover, when compared with training an
NMT model from scratch, our transfer-learning approach allows us to reach
higher performance after at most 4% of the total training steps.
Comment: Published at the International Workshop on Spoken Language Translation (IWSLT), 2018.
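A minimal sketch of the dynamic-vocabulary idea, under the assumption that transfer amounts to copying embedding rows for tokens shared with the parent model and freshly initializing the rest (function and variable names below are hypothetical):

```python
import numpy as np

def transfer_embeddings(old_vocab, old_emb, new_vocab, dim, seed=0):
    """Build an embedding matrix for new_vocab: rows for tokens already in the
    parent model are copied, new tokens get a small random initialization."""
    rng = np.random.default_rng(seed)
    old_index = {tok: i for i, tok in enumerate(old_vocab)}
    new_emb = rng.normal(0.0, 0.01, size=(len(new_vocab), dim))
    for j, tok in enumerate(new_vocab):
        if tok in old_index:                 # shared item: reuse the trained row
            new_emb[j] = old_emb[old_index[tok]]
    return new_emb

old_vocab = ["<pad>", "<s>", "</s>", "the", "house"]
old_emb = np.ones((len(old_vocab), 4))               # stand-in for trained weights
new_vocab = ["<pad>", "<s>", "</s>", "the", "casa"]  # "casa" is a new item
print(transfer_embeddings(old_vocab, old_emb, new_vocab, dim=4))
```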
Low Resource Neural Machine Translation: A Benchmark for Five African Languages
Recent advances in Neural Machine Translation (NMT) have shown improvements in
low-resource language (LRL) translation tasks. In this work, we benchmark NMT
between English and five African LRLs (Swahili, Amharic, Tigrigna, Oromo,
Somali [SATOS]). We collected the available resources on the SATOS languages to
evaluate the current state of NMT for LRLs. Our evaluation, comparing a
baseline single-language-pair NMT model against semi-supervised learning,
transfer learning, and multilingual modeling, shows significant performance
improvements both in the En-LRL and LRL-En directions. In terms of averaged
BLEU score, the multilingual approach shows the largest gains, up to +5 points,
in six out of ten translation directions. To demonstrate the generalization
capability of each model, we also report results on multi-domain test sets. We
release the standardized experimental data and the test sets for future work
addressing the challenges of NMT in under-resourced settings, in particular for
the SATOS languages.
Comment: Accepted for the AfricaNLP workshop at ICLR 2020.
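The averaged BLEU comparison reported above can be reproduced with sacrebleu; the file layout below ("<direction>.hyp"/"<direction>.ref", one detokenized sentence per line) is an assumption for illustration, not the released data format:

```python
import sacrebleu

# Hypothetical layout: one hypothesis/reference file pair per direction,
# e.g. "sw-en.hyp" and "sw-en.ref" for Swahili->English.
directions = ["en-sw", "sw-en", "en-am", "am-en"]  # illustrative subset

scores = {}
for d in directions:
    with open(f"{d}.hyp", encoding="utf-8") as h, open(f"{d}.ref", encoding="utf-8") as r:
        hyps, refs = h.read().splitlines(), r.read().splitlines()
    scores[d] = sacrebleu.corpus_bleu(hyps, [refs]).score

print(scores, "avg BLEU = %.2f" % (sum(scores.values()) / len(scores)))
```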
Adapting Multilingual Neural Machine Translation to Unseen Languages
Multilingual Neural Machine Translation (MNMT) for low-resource languages
(LRL) can be enhanced by the presence of related high-resource languages (HRL),
but the relatedness of HRL usually relies on predefined linguistic assumptions
about language similarity. Recently, adapting MNMT to an LRL has been shown to
greatly improve performance. In this work, we explore the problem of adapting
an MNMT model to an unseen LRL using data selection and model adaptation. In
order to improve NMT for LRL, we employ perplexity to select HRL data that are
most similar to the LRL on the basis of language distance. We extensively
explore data selection in popular multilingual NMT settings, namely in
(zero-shot) translation, and in adaptation from a multilingual pre-trained
model, for both directions (LRL↔en). We further show that dynamic adaptation of
the model's vocabulary results in a more favourable segmentation for the LRL in
comparison with direct adaptation. Experiments show reductions in training time
and significant performance gains over LRL baselines, even with zero LRL data
(+13.0 BLEU), up to +17.0 BLEU for pre-trained multilingual model dynamic
adaptation with related data selection. Our method outperforms current
approaches, such as massively multilingual models and data augmentation, on
four LRLs.
Comment: Accepted at the 16th International Workshop on Spoken Language Translation (IWSLT), November 2019.
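The perplexity-based selection step can be sketched with a toy character-bigram language model standing in for the LRL language model (the model and the sample sentences are illustrative assumptions, not the paper's actual setup):

```python
import math
from collections import Counter

def char_bigram_ppl(lrl_text, alpha=0.5):
    """Train a tiny character-bigram LM on LRL text and return a perplexity
    function; a stand-in for the LM used to rank HRL sentences."""
    bigrams = Counter(zip(lrl_text, lrl_text[1:]))
    unigrams = Counter(lrl_text)
    vocab = len(set(lrl_text)) + 1
    def perplexity(sentence):
        lp = sum(math.log((bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab))
                 for a, b in zip(sentence, sentence[1:]))
        return math.exp(-lp / max(len(sentence) - 1, 1))
    return perplexity

# Rank candidate HRL sentences: lower perplexity = closer to the LRL.
ppl = char_bigram_ppl("ciao come stai oggi amico mio")   # pretend LRL sample
pool = ["como estas hoy amigo", "the weather forecast says rain"]
print(sorted(pool, key=ppl)[0])   # the candidate scored most similar to the LRL
```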
Multilingual Neural Machine Translation for Low-Resource Languages
In recent years, Neural Machine Translation (NMT) has been shown to be more effective than phrase-based statistical methods, thus quickly becoming the state of the art in machine translation (MT). However, NMT systems are limited in translating low-resource languages, due to the significant amount of parallel data required to learn useful mappings between languages. In this work, we show how so-called multilingual NMT can help to tackle the challenges associated with low-resource language translation. The underlying principle of multilingual NMT is to force the creation of hidden representations of words in a shared semantic space across multiple languages, thus enabling positive parameter transfer across languages. Along this direction, we present multilingual translation experiments with three languages (English, Italian, Romanian) covering six translation directions, utilizing both recurrent neural networks and transformer (or self-attentive) neural networks. We then focus on the zero-shot translation problem, that is, how to leverage multilingual data in order to learn translation directions that are not covered by the available training material. To this end, we introduce our recently proposed iterative self-training method, which incrementally improves a multilingual NMT model on a zero-shot direction by relying only on monolingual data. Our results on TED talks data show that multilingual NMT outperforms conventional bilingual NMT, that transformer NMT outperforms recurrent NMT, and that zero-shot NMT outperforms conventional pivoting methods and even matches the performance of a fully-trained bilingual system.
Improving Zero-Shot Translation of Low-Resource Languages
Recent work on multilingual neural machine translation reported competitive performance with respect to bilingual models and surprisingly good performance even on (zero-shot) translation directions not observed at training time. Here we investigate zero-shot translation in a particularly low-resource multilingual setting. We propose a simple iterative training procedure that leverages a duality of translations directly generated by the system for the zero-shot directions. The translations produced by the system (sub-optimal, since they contain mixed language from the shared vocabulary) are then used together with the original parallel data to iteratively re-train the multilingual network. Over time, this allows the system to learn from its own increasingly better output. Our approach proves effective in improving the two zero-shot directions of our multilingual model. In particular, we observed gains of about 9 BLEU points over a baseline multilingual model and up to 2.08 BLEU over a pivoting mechanism using two bilingual models. Further analysis shows that there is also a slight improvement in the non-zero-shot language directions.
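A schematic version of the iterative procedure described above, with placeholder train()/translate() functions (both hypothetical; any NMT toolkit could fill these roles):

```python
# Schematic self-training loop for a zero-shot direction; train() and
# translate() are hypothetical placeholders, not a real toolkit API.

def train(model, parallel_pairs):
    """Placeholder: fine-tune the multilingual model on the given pairs."""
    return model  # a real implementation updates parameters here

def translate(model, sentences, direction):
    """Placeholder: decode the zero-shot direction with the current model."""
    return [f"<hyp:{direction}> {s}" for s in sentences]

def self_train(model, parallel_pairs, zero_shot_mono, direction, rounds=3):
    for _ in range(rounds):
        # 1) the model translates the zero-shot direction itself
        hyps = translate(model, zero_shot_mono, direction)
        synthetic = list(zip(zero_shot_mono, hyps))
        # 2) retrain on the original parallel data plus its own (improving) output
        model = train(model, parallel_pairs + synthetic)
    return model

model = self_train("init", [("hello", "ciao")], ["goedemorgen"], "nl-ro")
```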
FBK’s Multilingual Neural Machine Translation System for IWSLT 2017
Neural Machine Translation has been shown to enable inference and cross-lingual knowledge transfer across multiple language directions using a single multilingual model. Focusing on this multilingual translation scenario, this work summarizes FBK's participation in the IWSLT 2017 shared task. Our submissions rely on two multilingual systems trained on five languages (English, Dutch, German, Italian, and Romanian). The first one is a 20 language-direction model, which handles all possible combinations of the five languages. The second multilingual system is trained only on 16 directions, leaving the others as zero-shot translation directions (i.e. representing a more complex inference task on language pairs not seen at training time). More specifically, our zero-shot directions are Dutch↔German and Italian↔Romanian (resulting in four language combinations). Despite the small amount of parallel data used for training these systems, the resulting multilingual models are effective, even in comparison with models trained separately for every language pair (i.e. in more favorable conditions). We compare and show the results of the two multilingual models against baseline single-language-pair systems. In particular, we focus on the four zero-shot directions and show how a multilingual model trained with small data can provide reasonable results. Furthermore, we investigate how pivoting (i.e. using a bridge/pivot language for inference in source→pivot→target translations) with a multilingual model can be an alternative to enable zero-shot translation in a low-resource setting.
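For contrast with the zero-shot setup, pivoting chains two translation steps through a bridge language; the sketch below uses a dummy translate() placeholder (hypothetical, standing in for one multilingual model or two bilingual ones):

```python
# Pivoting sketch: source -> pivot -> target; translate() is a hypothetical
# placeholder for a trained NMT system handling the given direction.

def translate(sentence, direction):
    return f"[{direction}] {sentence}"   # dummy output for illustration

def pivot_translate(sentence, src="nl", pivot="en", tgt="ro"):
    intermediate = translate(sentence, f"{src}-{pivot}")   # source -> pivot
    return translate(intermediate, f"{pivot}-{tgt}")       # pivot -> target

print(pivot_translate("goedemorgen"))   # [en-ro] [nl-en] goedemorgen
```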
Findings of the IWSLT 2022 Evaluation Campaign
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech-to-speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, and (viii) Isometric speech translation. A total of 27 teams participated in at least one of the shared tasks. This paper details, for each shared task, the purpose of the task, the data that were released, the evaluation metrics that were applied, the submissions that were received, and the results that were achieved.