51 research outputs found
Results of the WMT17 Neural MT Training Task
This paper presents the results of the WMT17 Neural MT Training Task.
The objective of this task is to explore the methods of training a fixed neural architecture, aiming primarily at the best translation quality and, as a secondary goal, shorter training time.
Task participants were provided with a complete neural machine translation system, fixed training data and the configuration of the network.
The translation was performed in the English-to-Czech direction and the task was divided into two subtasks of different configurations - one scaled to fit on a 4GB and another on an 8GB GPU card.
We received 3 submissions for the 4GB variant and 1 submission for the 8GB variant; we also provided our own run for each of the sizes, as well as two baselines.
We translated the test set with the trained models and evaluated the outputs using several automatic metrics.
We also report the results of the human evaluation of the submitted systems.
Findings of the 2017 Conference on Machine Translation
This paper presents the results of the WMT17 shared tasks, which included three machine translation (MT) tasks (news, biomedical, and multimodal), two evaluation tasks (metrics and run-time estimation of MT quality), an automatic post-editing task, a neural MT training task, and a bandit learning task.
The QT21 Combined Machine Translation System for English to Latvian
This paper describes the joint submission of the QT21 projects for the English→Latvian translation task of the EMNLP 2017 Second Conference on Machine Translation (WMT 2017). The submission is a system combination which combines seven different statistical machine translation systems provided by the different groups. The systems are combined using either RWTH's system combination approach or USFD's consensus-based system-selection approach. The final submission shows an improvement of 0.5 BLEU compared to the best single system on newstest2017.
Neural Monkey: The Current State and Beyond
Neural Monkey is an open-source toolkit for sequence-to-sequence learning. The focus of this paper is to present the current state of the toolkit to the intended audience, which includes students and researchers, both active in the deep learning community and newcomers. For each of these target groups, we describe the most relevant features of the toolkit, including the simple configuration scheme, methods of model inspection that promote useful intuitions, or a modular design for easy prototyping. We summarize relevant contributions to the research community which were made using this toolkit and discuss the characteristics of our toolkit with respect to other existing systems. We conclude with a set of proposals for future development
Findings of the 2017 Conference on Machine Translation (WMT17)
This paper presents the results of the WMT17 shared tasks, which included three machine translation (MT) tasks (news, biomedical, and multimodal), two evaluation tasks (metrics and run-time estimation of MT quality), an automatic post-editing task, a neural MT training task, and a bandit learning task.
Latent Variable Model for Multi-modal Translation
In this work, we propose to model the interaction between visual and textual
features for multi-modal neural machine translation (MMT) through a latent
variable model. This latent variable can be seen as a multi-modal stochastic
embedding of an image and its description in a foreign language. It is used in
a target-language decoder and also to predict image features. Importantly, our
model formulation utilises visual and textual inputs during training but does
not require that images be available at test time. We show that our latent
variable MMT formulation improves considerably over strong baselines, including
a multi-task learning approach (Elliott and Kádár, 2017) and a conditional
variational auto-encoder approach (Toyama et al., 2016). Finally, we show
improvements due to (i) predicting image features in addition to only
conditioning on them, (ii) imposing a constraint on the minimum amount of
information encoded in the latent variable, and (iii) training on additional
target-language image descriptions (i.e. synthetic data).
Comment: Paper accepted at ACL 2019. Contains 8 pages (11 including
references, 13 including appendix), 6 figures