42 research outputs found
Doubly-Attentive Decoder for Multi-modal Neural Machine Translation
We introduce a Multi-modal Neural Machine Translation model in which a
doubly-attentive decoder naturally incorporates spatial visual features
obtained using pre-trained convolutional neural networks, bridging the gap
between image description and translation. Our decoder learns to attend to
source-language words and parts of an image independently by means of two
separate attention mechanisms as it generates words in the target language. We
find that our model can efficiently exploit not just back-translated in-domain
multi-modal data but also large general-domain text-only MT corpora. We also
report state-of-the-art results on the Multi30k data set.
Comment: 8 pages (11 including references), 2 figure
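The two separate attention mechanisms described in this abstract can be illustrated with a minimal sketch. It assumes plain dot-product scoring and toy two-dimensional vectors; the abstract does not specify the scoring function, and the names `attend` and `doubly_attentive_step` are hypothetical, not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys):
    """Dot-product attention (an assumption): return (weights, context)."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(keys[0])
    context = [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(dim)]
    return weights, context

def doubly_attentive_step(decoder_state, source_annotations, image_regions):
    """One decoding step: attend to source words and image regions independently."""
    src_weights, src_context = attend(decoder_state, source_annotations)
    img_weights, img_context = attend(decoder_state, image_regions)
    return src_weights, src_context, img_weights, img_context

# Toy inputs (illustrative values only).
decoder_state = [1.0, 0.0]
source_annotations = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # one vector per source word
image_regions = [[0.0, 2.0], [2.0, 0.0]]                   # one vector per image region

src_w, src_c, img_w, img_c = doubly_attentive_step(
    decoder_state, source_annotations, image_regions
)
```

Each modality gets its own weight distribution and its own context vector; how a real decoder combines the two contexts with its hidden state is left out of this sketch.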
CUNI System for the WMT18 Multimodal Translation Task
We present our submission to the WMT18 Multimodal Translation Task. The main
feature of our submission is applying a self-attentive network instead of a
recurrent neural network. We evaluate two methods of incorporating the visual
features in the model: first, we include the image representation as another
input to the network; second, we train the model to predict the visual features
and use it as an auxiliary objective. For our submission, we acquired both
textual and multimodal additional data. Both of the proposed methods yield
significant improvements over recurrent networks and self-attentive textual
baselines.
Comment: Published at WMT1
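The second method, predicting the visual features as an auxiliary objective, amounts to adding a second term to the training loss. The sketch below assumes a mean-squared-error term and a fixed weight `aux_weight`; neither is stated in the abstract, and the function names are hypothetical.

```python
def mean_squared_error(predicted, target):
    """Average squared difference between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def multitask_loss(translation_loss, predicted_visual, actual_visual, aux_weight=0.1):
    """Translation loss plus a weighted visual-feature prediction loss.

    aux_weight and the use of MSE are assumptions made for illustration.
    """
    aux_loss = mean_squared_error(predicted_visual, actual_visual)
    return translation_loss + aux_weight * aux_loss

# Toy values: a translation loss of 2.0 and two-dimensional visual features.
loss = multitask_loss(2.0, [0.5, 0.5], [1.0, 0.0], aux_weight=0.1)
```

With this setup the image is needed only to compute the auxiliary term during training, so a model trained this way can translate from text alone.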
Findings of the 2015 Workshop on Statistical Machine Translation
This paper presents the results of the
WMT15 shared tasks, which included a
standard news translation task, a metrics
task, a tuning task, a task for run-time
estimation of machine translation quality,
and an automatic post-editing task. This
year, 68 machine translation systems from
24 institutions were submitted to the ten
translation directions in the standard translation
task. An additional 7 anonymized
systems were included, and were then
evaluated both automatically and manually.
The quality estimation task had three
subtasks, with a total of 10 teams, submitting
34 entries. The pilot automatic post-editing
task had a total of 4 teams, submitting
7 entries.
Findings of the 2022 Conference on Machine Translation (WMT22)
This paper presents the results of the General Machine Translation Task organised as part of the Conference on Machine Translation (WMT) 2022. In the general MT task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting of four different domains. We evaluate system outputs with human annotators using two different techniques: reference-based direct assessment (DA) and a combination of DA and scalar quality metric (DA+SQM).
Using images to improve machine-translating E-commerce product listings
In this paper we study the impact of using
images to machine-translate user-generated e-commerce product listings. We study how
a multi-modal Neural Machine Translation
(NMT) model compares to two text-only approaches: a conventional state-of-the-art attentional NMT and a Statistical Machine Translation (SMT) model. User-generated product
listings often do not constitute grammatical
or well-formed sentences. More often than
not, they consist of the juxtaposition of short
phrases or keywords. We train our models
end-to-end as well as use text-only and multi-modal NMT models for re-ranking n-best lists
generated by an SMT model. We qualitatively evaluate our user-generated training data
and also analyse how adding synthetic data impacts the results. We evaluate our models
quantitatively using BLEU and TER and find
that (i) additional synthetic data has a general
positive impact on text-only and multi-modal
NMT models, and that (ii) using a multi-modal
NMT model for re-ranking n-best lists improves TER significantly across different n-best list sizes.
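Re-ranking an n-best list with a second model can be sketched as follows. The abstract does not say how the SMT and NMT scores are combined, so the linear interpolation, the `weight` parameter, and the toy scoring function are all assumptions for illustration.

```python
def rerank(nbest, rescoring_fn, weight=0.5):
    """Re-rank (hypothesis, smt_score) pairs by an interpolated score.

    The combined score is (1 - weight) * smt_score + weight * rescoring_fn(hyp);
    higher is better. Linear interpolation is an assumption, not the paper's method.
    """
    rescored = [
        (hyp, (1 - weight) * smt_score + weight * rescoring_fn(hyp))
        for hyp, smt_score in nbest
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

def toy_nmt_score(hypothesis):
    """Hypothetical stand-in for an NMT model score: prefers shorter outputs."""
    return -float(len(hypothesis.split()))

# A 2-best list from a hypothetical SMT system (scores are log-probability-like).
nbest = [
    ("red leather bag women", -1.0),
    ("bag red leather for the women", -0.9),
]

reranked = rerank(nbest, toy_nmt_score, weight=0.5)
smt_only = rerank(nbest, toy_nmt_score, weight=0.0)
```

Setting `weight` to 0 reproduces the SMT ranking, which is a convenient sanity check before plugging in a real text-only or multi-modal scorer.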
CUNI System for WMT16 Automatic Post-Editing and Multimodal Translation Tasks
Neural sequence to sequence learning recently became a very promising
paradigm in machine translation, achieving competitive results with statistical
phrase-based systems. In this system description paper, we attempt to utilize
several recently published methods used for neural sequential learning in order
to build systems for WMT 2016 shared tasks of Automatic Post-Editing and
Multimodal Machine Translation.
Comment: Accepted to the First Conference of Machine Translation (WMT16
Region-Attentive Multimodal Neural Machine Translation
We propose a multimodal neural machine translation (MNMT) method with semantic image regions called region-attentive multimodal neural machine translation (RA-NMT). Existing studies on MNMT have mainly focused on employing global visual features or equally sized grid local visual features extracted by convolutional neural networks (CNNs) to improve translation performance. However, they neglect the effect of semantic information captured inside the visual features. This study utilizes semantic image regions extracted by object detection for MNMT and integrates visual and textual features using two modality-dependent attention mechanisms. The proposed method was implemented and verified on two neural machine translation (NMT) architectures: recurrent neural network (RNN) and self-attention network (SAN). Experimental results on different language pairs of the Multi30k dataset show that our proposed method improves over baselines and outperforms most of the state-of-the-art MNMT methods. Further analysis demonstrates that the proposed method achieves better translation performance because it makes better use of visual features.