
    DCU System Report on the WMT 2017 Multi-modal Machine Translation Task

    We report experiments with multi-modal neural machine translation models that incorporate global visual features in different parts of the encoder and decoder, using the VGG19 network to extract features for all images. In our experiments, we explore both different strategies for including global image features and how ensembling different models at inference time impacts translations. Our submissions ranked 3rd best for translating from English into French, always improving considerably over a neural machine translation baseline across all language pairs evaluated, e.g. an increase of 7.0–9.2 METEOR points.
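    As one illustrative possibility (not the authors' actual code), the sketch below extracts a global image feature from VGG19's penultimate fully connected layer and uses it to initialise the hidden state of a GRU decoder; all module names and dimensions are assumptions.

        # Minimal sketch: projecting a VGG19 global image feature into the initial
        # decoder state of an RNN NMT model. Names/dimensions are illustrative only.
        import torch
        import torch.nn as nn
        from torchvision import models

        class ImageInitDecoder(nn.Module):
            def __init__(self, vocab_size, emb_dim=256, hid_dim=512, img_dim=4096):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.img_to_h0 = nn.Linear(img_dim, hid_dim)   # global feature -> h0
                self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
                self.out = nn.Linear(hid_dim, vocab_size)

            def forward(self, prev_tokens, img_feat):
                # img_feat: (batch, 4096) global feature from VGG19's penultimate FC layer
                h0 = torch.tanh(self.img_to_h0(img_feat)).unsqueeze(0)  # (1, batch, hid)
                emb = self.embed(prev_tokens)                           # (batch, T, emb)
                out, _ = self.rnn(emb, h0)
                return self.out(out)                                    # token logits

        # Global feature extraction with VGG19 (penultimate fully connected layer).
        # weights=None keeps the sketch offline; in practice pretrained ImageNet
        # weights would be loaded.
        vgg = models.vgg19(weights=None).eval()
        feature_extractor = nn.Sequential(vgg.features, vgg.avgpool, nn.Flatten(),
                                          *list(vgg.classifier.children())[:-1])
        with torch.no_grad():
            img = torch.randn(2, 3, 224, 224)        # a batch of preprocessed images
            global_feat = feature_extractor(img)     # (2, 4096)

        decoder = ImageInitDecoder(vocab_size=10000)
        logits = decoder(torch.randint(0, 10000, (2, 7)), global_feat)
        print(logits.shape)  # torch.Size([2, 7, 10000])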

    The MeMAD Submission to the WMT18 Multimodal Translation Task

    This paper describes the MeMAD project entry to the WMT Multimodal Machine Translation Shared Task. We propose adapting the Transformer neural machine translation (NMT) architecture to a multi-modal setting. In this paper, we also describe the preliminary experiments with text-only translation systems that led us to this choice. We have the top-scoring system for both English-to-German and English-to-French, according to the automatic metrics for flickr18. Our experiments show that the effect of the visual features in our system is small; our largest gains come from the quality of the underlying text-only NMT system. We find that appropriate use of additional data is effective.
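    A minimal sketch of one way to adapt a Transformer encoder to a multi-modal setting (not the MeMAD system itself): a projected global image feature is prepended to the source embeddings as an extra "visual token". Feature size, dimensions, and module names are illustrative assumptions, and positional encoding is omitted for brevity.

        # Illustrative sketch: a visual pseudo-token in a standard Transformer encoder.
        import torch
        import torch.nn as nn

        class VisualTokenEncoder(nn.Module):
            def __init__(self, vocab_size, d_model=512, img_dim=2048, nhead=8, layers=6):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, d_model)
                self.img_proj = nn.Linear(img_dim, d_model)   # image feature -> model dim
                layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

            def forward(self, src_tokens, img_feat):
                # src_tokens: (batch, T), img_feat: (batch, img_dim)
                tok = self.embed(src_tokens)                       # (batch, T, d_model)
                vis = self.img_proj(img_feat).unsqueeze(1)         # (batch, 1, d_model)
                return self.encoder(torch.cat([vis, tok], dim=1))  # (batch, T+1, d_model)

        enc = VisualTokenEncoder(vocab_size=32000)
        memory = enc(torch.randint(0, 32000, (2, 10)), torch.randn(2, 2048))
        print(memory.shape)  # torch.Size([2, 11, 512])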

    Region-Attentive Multimodal Neural Machine Translation

    We propose a multimodal neural machine translation (MNMT) method with semantic image regions called region-attentive multimodal neural machine translation (RA-NMT). Existing studies on MNMT have mainly focused on employing global visual features or equally sized grid-based local visual features extracted by convolutional neural networks (CNNs) to improve translation performance. However, they neglect the semantic information captured inside the visual features. This study utilizes semantic image regions extracted by object detection for MNMT and integrates visual and textual features using two modality-dependent attention mechanisms. The proposed method was implemented and verified on two NMT architectures: the recurrent neural network (RNN) and the self-attention network (SAN). Experimental results on different language pairs of the Multi30k dataset show that our proposed method improves over the baselines and outperforms most state-of-the-art MNMT methods. Further analysis demonstrates that the proposed method achieves better translation performance because of its better use of visual features.
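    A minimal sketch of the idea behind two modality-dependent attention mechanisms, not the RA-NMT implementation: the decoder state attends separately over textual encoder states and over object-detection region features, and the two contexts are fused. All names, dimensions, and the number of regions are assumptions.

        # Illustrative sketch: separate attentions over text states and image regions.
        import torch
        import torch.nn as nn

        class DualModalityAttention(nn.Module):
            def __init__(self, hid=512, txt_dim=512, vis_dim=2048):
                super().__init__()
                self.txt_attn = nn.Linear(hid + txt_dim, 1)   # scores over source tokens
                self.vis_attn = nn.Linear(hid + vis_dim, 1)   # scores over image regions
                self.fuse = nn.Linear(txt_dim + vis_dim, hid)

            def attend(self, query, keys, scorer):
                # query: (batch, hid); keys: (batch, N, dim)
                q = query.unsqueeze(1).expand(-1, keys.size(1), -1)
                scores = scorer(torch.cat([q, keys], dim=-1)).squeeze(-1)  # (batch, N)
                weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
                return (weights * keys).sum(dim=1)                         # (batch, dim)

            def forward(self, dec_state, txt_states, region_feats):
                ctx_txt = self.attend(dec_state, txt_states, self.txt_attn)
                ctx_vis = self.attend(dec_state, region_feats, self.vis_attn)
                return torch.tanh(self.fuse(torch.cat([ctx_txt, ctx_vis], dim=-1)))

        attn = DualModalityAttention()
        ctx = attn(torch.randn(2, 512),       # decoder hidden state
                   torch.randn(2, 12, 512),   # textual encoder states (12 source tokens)
                   torch.randn(2, 36, 2048))  # 36 detected region features per image
        print(ctx.shape)  # torch.Size([2, 512])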

    Findings of the 2015 Workshop on Statistical Machine Translation

    This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included and evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams submitting 7 entries.

    Findings of the 2014 Workshop on Statistical Machine Translation

    This paper presents the results of the WMT14 shared tasks, which included a standard news translation task, a separate medical translation task, a task for run-time estimation of machine translation quality, and a metrics task. This year, 143 machine translation systems from 23 institutions were submitted to the ten translation directions in the standard translation task. An additional 6 anonymized systems were included and evaluated both automatically and manually. The quality estimation task had four subtasks, with a total of 10 teams submitting 57 entries.

    Incorporating visual information into neural machine translation

    In this work, we study different ways to enrich Machine Translation (MT) models using information obtained from images. Specifically, we propose different models to incorporate images into MT by transferring learning from pre-trained convolutional neural networks (CNNs) trained for classifying images. We use these pre-trained CNNs for image feature extraction, and use two different types of visual features: global visual features, which encode an entire image into one single real-valued feature vector; and local visual features, which encode different areas of an image into separate real-valued vectors, therefore also encoding spatial information. We first study how to train embeddings that are both multilingual and multi-modal, using global visual features and multilingual sentences for training. Second, we propose different models to incorporate global visual features into state-of-the-art Neural Machine Translation (NMT): (i) as words in the source sentence, (ii) to initialise the encoder hidden state, and (iii) as additional data to initialise the decoder hidden state. Finally, we put forward one model to incorporate local visual features into NMT: an NMT model with an independent visual attention mechanism integrated into the same decoder Recurrent Neural Network (RNN) as the source-language attention mechanism. We evaluate our models on Multi30k, a publicly available, general-domain data set, and also on a proprietary data set of product listings and images built by eBay Inc., which was made available for the purpose of this research. We report state-of-the-art results on the publicly available Multi30k data set. Our best models also significantly improve on comparable phrase-based Statistical MT (PBSMT) models trained on the same data set, according to widely adopted MT metrics.
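    As an illustration of the local visual features described above (assuming VGG19's last convolutional map as the backbone, which is only one of the possible pre-trained CNNs), the sketch below flattens the spatial feature map into a grid of location vectors that a decoder-side visual attention mechanism could attend over.

        # Illustrative sketch: local (spatial) feature extraction for visual attention.
        import torch
        from torchvision import models

        # weights=None keeps the sketch offline; pretrained ImageNet weights would be
        # used in practice.
        vgg = models.vgg19(weights=None).eval()
        conv_extractor = vgg.features                      # last conv block, before pooling/FC

        with torch.no_grad():
            img = torch.randn(2, 3, 224, 224)              # batch of preprocessed images
            fmap = conv_extractor(img)                     # (2, 512, 7, 7): spatial feature map
            local_feats = fmap.flatten(2).transpose(1, 2)  # (2, 49, 512): 49 location vectors

        # Each of the 49 vectors describes one image area; a visual attention mechanism
        # in the decoder can weight them at every step, unlike a single global vector,
        # which discards spatial layout.
        print(local_feats.shape)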