XNMT: The eXtensible Neural Machine Translation Toolkit
This paper describes XNMT, the eXtensible Neural Machine Translation toolkit.
XNMT distinguishes itself from other open-source NMT toolkits by its focus on
modular code design, with the purpose of enabling fast iteration in research
and replicable, reliable results. In this paper we describe the design of XNMT
and its experiment configuration system, and demonstrate its utility on the
tasks of machine translation, speech recognition, and multi-tasked machine
translation/parsing. XNMT is available open-source at
https://github.com/neulab/xnmt
Comment: To be presented at AMTA 2018 Open Source Software Showcase
Neural System Combination for Machine Translation
Neural machine translation (NMT) has emerged as a new approach to machine
translation and generates much more fluent results than statistical machine
translation (SMT).
However, SMT is usually better than NMT in translation adequacy. It is
therefore a promising direction to combine the advantages of both NMT and SMT.
In this paper, we propose a neural system combination framework leveraging
multi-source NMT, which takes as input the outputs of NMT and SMT systems and
produces the final translation.
Extensive experiments on the Chinese-to-English translation task show that
our model achieves a significant improvement of 5.3 BLEU points over the best
single system output and 3.4 BLEU points over the state-of-the-art traditional
system combination methods.
Comment: Accepted as a short paper by ACL 2017
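The multi-source combination idea in this abstract can be illustrated with a minimal sketch: the combination decoder attends separately over the encoded NMT and SMT hypotheses and concatenates the resulting context vectors. This is an assumption-laden toy (dot-product attention, small random vectors), not the paper's actual architecture:

```python
import numpy as np

def attention(query, keys):
    # Dot-product attention: softmax weights over hypothesis
    # tokens, then a weighted-sum context vector.
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ keys

def combine_contexts(query, nmt_states, smt_states):
    # Attend independently over the NMT and the SMT hypothesis
    # encodings; the combination decoder consumes both contexts.
    c_nmt = attention(query, nmt_states)
    c_smt = attention(query, smt_states)
    return np.concatenate([c_nmt, c_smt])

rng = np.random.default_rng(0)
query = rng.standard_normal(4)            # decoder state (toy dim 4)
nmt_states = rng.standard_normal((5, 4))  # 5 tokens from the NMT hypothesis
smt_states = rng.standard_normal((7, 4))  # 7 tokens from the SMT hypothesis
ctx = combine_contexts(query, nmt_states, smt_states)
print(ctx.shape)  # (8,): one 4-dim context per input system, concatenated
```

The key design point the abstract highlights is that hypotheses of differing length and quality can be consumed uniformly: each system contributes one fixed-size context regardless of how many tokens its output has.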
Doubly-Attentive Decoder for Multi-modal Neural Machine Translation
We introduce a Multi-modal Neural Machine Translation model in which a
doubly-attentive decoder naturally incorporates spatial visual features
obtained using pre-trained convolutional neural networks, bridging the gap
between image description and translation. Our decoder learns to attend to
source-language words and parts of an image independently by means of two
separate attention mechanisms as it generates words in the target language. We
find that our model can efficiently exploit not just back-translated in-domain
multi-modal data but also large general-domain text-only MT corpora. We also
report state-of-the-art results on the Multi30k data set.
Comment: 8 pages (11 including references), 2 figures
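The two independent attention mechanisms described in this abstract can be sketched as a single decoder step that attends over source-word annotations and over spatial image features separately, then concatenates both contexts with the decoder state. The dimensions, dot-product scoring, and 7x7 feature grid below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def attend(query, annotations):
    # Standard softmax attention over a set of annotation vectors.
    scores = annotations @ query
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ annotations

def decoder_step(state, src_words, img_regions):
    # Doubly-attentive step: one attention over source-language
    # words, one over spatial CNN features, computed independently.
    c_src = attend(state, src_words)
    c_img = attend(state, img_regions)
    # Both contexts feed the next-word prediction alongside the state.
    return np.concatenate([state, c_src, c_img])

rng = np.random.default_rng(1)
state = rng.standard_normal(6)               # toy decoder hidden state
src_words = rng.standard_normal((10, 6))     # 10 source-token annotations
img_regions = rng.standard_normal((49, 6))   # e.g. a 7x7 conv feature map
out = decoder_step(state, src_words, img_regions)
print(out.shape)  # (18,): state + source context + image context
```

Because the two mechanisms are independent, the model can weight words and image regions differently at every target position, which is what lets it exploit text-only corpora (where the image attention simply contributes less).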