Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
We propose a method to transfer knowledge across neural machine translation
(NMT) models by means of a shared dynamic vocabulary. Our approach makes it
possible to extend an initial model for a given language pair to cover new
languages by adapting its vocabulary as new data become available (i.e.,
introducing new vocabulary items if they are not included in the initial
model). The
parameter transfer mechanism is evaluated in two scenarios: i) adapting a
trained single-language-pair NMT system to work with a new language pair and
ii) continuously adding new language pairs to grow into a multilingual NMT
system. In both scenarios, our goal is to improve translation performance while
minimizing the training convergence time. Preliminary experiments spanning five
languages with different training data sizes (i.e., 5k and 50k parallel
sentences) show a significant performance gain ranging from +3.85 up to +13.63
BLEU in different language directions. Moreover, when compared with training an
NMT model from scratch, our transfer-learning approach allows us to reach
higher performance after training for only up to 4% of the total training
steps.

Comment: Published at the International Workshop on Spoken Language
Translation (IWSLT), 2018
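To make the vocabulary-adaptation step concrete, here is a minimal sketch of how shared embedding rows could be transferred when the vocabulary grows; it assumes a PyTorch model, and the function name grow_embedding and the toy vocabularies are hypothetical illustrations, not taken from the paper.

```python
import torch

def grow_embedding(old_emb: torch.nn.Embedding,
                   old_vocab: dict,
                   new_vocab: dict) -> torch.nn.Embedding:
    """Build an embedding table for new_vocab: rows for tokens already
    in old_vocab keep their trained parameters; new tokens are randomly
    initialized (the dynamic-vocabulary transfer idea sketched above)."""
    new_emb = torch.nn.Embedding(len(new_vocab), old_emb.embedding_dim)
    with torch.no_grad():
        for token, new_idx in new_vocab.items():
            old_idx = old_vocab.get(token)
            if old_idx is not None:
                # parameter transfer for shared vocabulary items
                new_emb.weight[new_idx] = old_emb.weight[old_idx]
    return new_emb

# Hypothetical usage: grow an existing model's vocabulary before
# continuing training on a new language pair.
old_vocab = {"<pad>": 0, "<s>": 1, "casa": 2}
new_vocab = {"<pad>": 0, "<s>": 1, "casa": 2, "casă": 3}
emb = grow_embedding(torch.nn.Embedding(len(old_vocab), 512),
                     old_vocab, new_vocab)
```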
Zero-shot Neural Transfer for Cross-lingual Entity Linking
Cross-lingual entity linking maps an entity mention in a source language to
its corresponding entry in a structured knowledge base that is in a different
(target) language. While previous work relies heavily on bilingual lexical
resources to bridge the gap between the source and the target languages, these
resources are scarce or unavailable for many low-resource languages. To address
this problem, we investigate zero-shot cross-lingual entity linking, in which
we assume no bilingual lexical resources are available in the source
low-resource language. Specifically, we propose pivot-based entity linking,
which leverages information from a high-resource "pivot" language to train
character-level neural entity linking models that are transferred to the source
low-resource language in a zero-shot manner. With experiments on 9 low-resource
languages and transfer through a total of 54 languages, we show that our
proposed pivot-based framework improves entity linking accuracy by 17%
(absolute) on average over the baseline systems in the zero-shot scenario.
Further, we investigate the use of language-universal phonological
representations, which improves average accuracy by 36% (absolute) when
transferring between languages that use different scripts.

Comment: To appear in AAAI 2019
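The character-level modeling is what makes the zero-shot transfer possible: a model trained on a high-resource pivot language can score mentions in a related low-resource language without any bilingual resources. Below is a minimal sketch of that idea, assuming PyTorch; the class and function names are hypothetical, and in practice the encoder would first be trained on pivot-language mention–entity pairs.

```python
import torch
import torch.nn as nn

class CharEncoder(nn.Module):
    """Encodes a string from its characters, so the same weights apply
    to any language sharing the pivot language's script."""
    def __init__(self, n_chars: int = 256, dim: int = 128):
        super().__init__()
        self.emb = nn.Embedding(n_chars, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, text: str) -> torch.Tensor:
        # clamp code points into the embedding range (toy simplification)
        ids = torch.tensor([[min(ord(c), 255) for c in text]])
        _, (h, _) = self.rnn(self.emb(ids))
        return torch.nn.functional.normalize(h[-1, 0], dim=0)

def link(mention: str, kb_entries: list, enc: CharEncoder) -> str:
    """Zero-shot linking: return the KB entry whose encoding is most
    similar to the mention's encoding."""
    m = enc(mention)
    scores = [torch.dot(m, enc(e)).item() for e in kb_entries]
    return kb_entries[max(range(len(scores)), key=scores.__getitem__)]
```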
Contextual Parameter Generation for Universal Neural Machine Translation
We propose a simple modification to existing neural machine translation (NMT)
models that enables using a single universal model to translate between
multiple languages while allowing for language-specific parameterization, and
that can also be used for domain adaptation. Our approach requires no changes
to the model architecture of a standard NMT system, but instead introduces a
new component, the contextual parameter generator (CPG), that generates the
parameters of the system (e.g., weights in a neural network). This parameter
generator accepts source and target language embeddings as input, and generates
the parameters for the encoder and the decoder, respectively. The rest of the
model remains unchanged and is shared across all languages. We show how this
simple modification enables the system to use monolingual data for training and
also perform zero-shot translation. We further show it is able to surpass
state-of-the-art performance for both the IWSLT-15 and IWSLT-17 datasets and
that the learned language embeddings are able to uncover interesting
relationships between languages.

Comment: Published in the proceedings of Empirical Methods in Natural Language
Processing (EMNLP), 2018
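As a rough illustration of the CPG idea, the sketch below generates the weights of a single linear layer from a language embedding; the paper applies this to the full encoder and decoder, and the name CPGLinear and the dimensions here are hypothetical.

```python
import torch
import torch.nn as nn

class CPGLinear(nn.Module):
    """A linear layer whose weight and bias are not stored directly but
    generated from a language embedding by a shared generator."""
    def __init__(self, lang_dim: int, in_dim: int, out_dim: int):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # one generator shared across languages; only the embedding varies
        self.gen = nn.Linear(lang_dim, in_dim * out_dim + out_dim)

    def forward(self, x: torch.Tensor, lang_emb: torch.Tensor) -> torch.Tensor:
        params = self.gen(lang_emb)
        w = params[: self.in_dim * self.out_dim].view(self.out_dim, self.in_dim)
        b = params[self.in_dim * self.out_dim:]
        return torch.nn.functional.linear(x, w, b)

# Hypothetical usage: layer parameters conditioned on a language embedding.
langs = nn.Embedding(4, 8)                 # 4 languages, 8-dim embeddings
layer = CPGLinear(lang_dim=8, in_dim=16, out_dim=16)
y = layer(torch.randn(2, 16), langs(torch.tensor(0)))
```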