Semi-Supervised Learning for Neural Machine Translation
While end-to-end neural machine translation (NMT) has made remarkable
progress recently, NMT systems only rely on parallel corpora for parameter
estimation. Since parallel corpora are usually limited in quantity, quality,
and coverage, especially for low-resource languages, it is appealing to exploit
monolingual corpora to improve NMT. We propose a semi-supervised approach for
training NMT models on the concatenation of labeled (parallel corpora) and
unlabeled (monolingual corpora) data. The central idea is to reconstruct the
monolingual corpora using an autoencoder, in which the source-to-target and
target-to-source translation models serve as the encoder and decoder,
respectively. Our approach can not only exploit the monolingual corpora of the
target language, but also of the source language. Experiments on the
Chinese-English dataset show that our approach achieves significant
improvements over state-of-the-art SMT and NMT systems. Comment: Corrected a typ
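
The abstract above describes the objective only in words. One way to write it down is sketched here, under assumed notation that does not appear in the listing: $\theta_{\rightarrow}$ and $\theta_{\leftarrow}$ for the source-to-target and target-to-source models, $N$ parallel pairs, $S$ and $T$ monolingual source and target sentences, and weights $\lambda_1, \lambda_2$ balancing the two reconstruction terms. It may differ in detail from the formulation in the paper itself.

```latex
\begin{align}
J(\theta_{\rightarrow}, \theta_{\leftarrow})
  &= \sum_{n=1}^{N} \log P\big(y^{(n)} \mid x^{(n)}; \theta_{\rightarrow}\big)
   + \sum_{n=1}^{N} \log P\big(x^{(n)} \mid y^{(n)}; \theta_{\leftarrow}\big) \nonumber \\
  &\quad + \lambda_1 \sum_{s=1}^{S} \log P\big(x' \mid x^{(s)}; \theta_{\rightarrow}, \theta_{\leftarrow}\big)
   + \lambda_2 \sum_{t=1}^{T} \log P\big(y' \mid y^{(t)}; \theta_{\rightarrow}, \theta_{\leftarrow}\big), \\
P\big(x' \mid x; \theta_{\rightarrow}, \theta_{\leftarrow}\big)
  &= \sum_{y} P\big(y \mid x; \theta_{\rightarrow}\big)\, P\big(x' \mid y; \theta_{\leftarrow}\big).
\end{align}
```

In the source-language reconstruction term, the source-to-target model acts as the encoder and the target-to-source model as the decoder, with the translation $y$ as the latent variable. The target-language term is the mirror image, with the roles of the two models swapped.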
Joint Training for Neural Machine Translation Models with Monolingual Data
Monolingual data have been demonstrated to be helpful in improving
translation quality of both statistical machine translation (SMT) systems and
neural machine translation (NMT) systems, especially in resource-poor or domain
adaptation tasks where parallel data are not rich enough. In this paper, we
propose a novel approach to better leveraging monolingual data for neural
machine translation by jointly learning source-to-target and target-to-source
NMT models for a language pair with a joint EM optimization method. The
training process starts with two initial NMT models pre-trained on parallel
data for each direction, and these two models are iteratively updated by
incrementally decreasing translation losses on training data. In each iteration
step, both NMT models are first used to translate monolingual data from one
language to the other, forming pseudo-training data of the other NMT model.
Then two new NMT models are learnt from parallel data together with the pseudo
training data. Both NMT models are expected to be improved and better
pseudo-training data can be generated in the next step. Experimental results on
Chinese-English and English-German translation tasks show that our approach can
simultaneously improve translation quality of source-to-target and
target-to-source models, significantly outperforming strong baseline systems
which are enhanced with monolingual data for model training including
back-translation. Comment: Accepted by AAAI 201
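
The iterative procedure in the abstract above can be sketched as follows. This is a toy illustration, not the authors' implementation: word-for-word lookup tables stand in for the NMT models, the weighting of pseudo pairs used by the joint EM method is omitted, and the names `train`, `translate`, and `joint_train` are hypothetical.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Fit a toy word-for-word model from (input, output) sentence pairs.

    Tokens are aligned by position; each input token maps to the output
    token it was most often aligned with. This stands in for NMT training.
    """
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            counts[s][t] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def translate(model, sentence):
    """Translate token by token; unknown tokens are copied unchanged."""
    return " ".join(model.get(tok, tok) for tok in sentence.split())


def joint_train(parallel, mono_src, mono_tgt, iterations=2):
    """Jointly refine both translation directions with monolingual data."""
    reverse_parallel = [(t, s) for s, t in parallel]

    # Initial models are trained on the parallel data only.
    s2t = train(parallel)
    t2s = train(reverse_parallel)

    for _ in range(iterations):
        # Each model translates monolingual data into the other language,
        # producing pseudo-training pairs for the opposite direction.
        pseudo_for_s2t = [(translate(t2s, y), y) for y in mono_tgt]
        pseudo_for_t2s = [(translate(s2t, x), x) for x in mono_src]

        # Retrain both models on parallel plus pseudo-training data.
        s2t = train(parallel + pseudo_for_s2t)
        t2s = train(reverse_parallel + pseudo_for_t2s)

    return s2t, t2s
```

Calling `joint_train([("a b", "A B")], ["a c"], ["B D"])` returns the two lookup tables. In a real system each `train` call would be a full NMT training run, and the pseudo pairs would be weighted rather than treated as equal to genuine parallel data.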
Domain adaptation strategies in statistical machine translation: a brief overview
© Cambridge University Press, 2015. Statistical machine translation (SMT) is gaining interest given that it can easily be adapted to any pair of languages. One of the main challenges in SMT is domain adaptation because the performance in translation drops when testing conditions deviate from training conditions. Many research works are arising to face this challenge. Research is focused on trying to exploit all kinds of material, if available. This paper provides an overview of research, which copes with the domain adaptation challenge in SMT. Peer Reviewed. Postprint (author's final draft
The TALP–UPC Spanish–English WMT biomedical task: bilingual embeddings and char-based neural language model rescoring in a phrase-based system
This paper describes the TALP–UPC system in the Spanish–English WMT 2016 biomedical shared task. Our system is a standard phrase-based system enhanced with vocabulary expansion using bilingual word embeddings and a character-based neural language model with rescoring. The former focuses on resolving out-of-vocabulary words, while the latter enhances the fluency of the system. The two modules progressively improve the final translation as measured by a combination of several lexical metrics. Postprint (published version
LIUM Machine Translation Systems for WMT17 News Translation Task
This paper describes LIUM submissions to WMT17 News Translation Task for
English-German, English-Turkish, English-Czech and English-Latvian language
pairs. We train BPE-based attentive Neural Machine Translation systems with and
without factored outputs using the open source nmtpy framework. Competitive
scores were obtained by ensembling various systems and exploiting the
availability of target monolingual corpora for back-translation. The impact of
back-translation quantity and quality is also analyzed for English-Turkish
where our post-deadline submission surpassed the best entry by +1.6 BLEU. Comment: News Translation Task System Description paper for WMT1
Generative Neural Machine Translation
We introduce Generative Neural Machine Translation (GNMT), a latent variable
architecture which is designed to model the semantics of the source and target
sentences. We modify an encoder-decoder translation model by adding a latent
variable as a language agnostic representation which is encouraged to learn the
meaning of the sentence. GNMT achieves competitive BLEU scores on pure
translation tasks, and is superior when there are missing words in the source
sentence. We augment the model to facilitate multilingual translation and
semi-supervised learning without adding parameters. This framework
significantly reduces overfitting when there is limited paired data available,
and is effective for translating between pairs of languages not seen during
training