Multimodal machine translation is the task
of translating a source text into the
target language using information from
other modalities. Existing multimodal
datasets have been restricted to only highly
resourced languages. Moreover,
these datasets were collected by manual
translation of English descriptions from
the Flickr30K dataset. In this work, we
introduce MMDravi, a Multilingual Multimodal dataset for under-resourced Dravidian languages. It comprises 30,000 sentences created using the outputs of several
machine translation systems. Using data
from MMDravi and a phonetic transcription of the corpus, we build a Multilingual
Multimodal Neural Machine Translation
system (MMNMT) for closely related Dravidian languages to take advantage of the multilingual corpus and other modalities. We
evaluate the translations generated by the
proposed approach against a human-annotated
evaluation dataset using the BLEU, METEOR, and TER metrics. Relying on
multilingual corpora, phonetic transcription, and image features, our approach improves translation quality for the under-resourced languages.

This work is supported by a research grant from Science Foundation Ireland, co-funded by the European Regional Development Fund, for the Insight Centre under Grant Number SFI/12/RC/2289, and by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 731015 (ELEXIS - European Lexical Infrastructure) and grant agreement No 825182 (Prêt-à-LLOD).