Neural Machine Translation into Language Varieties

S. M. Lakew; A. Aerofeeva; M. Federico

oai:cris.fbk.eu:11582/316279

Publication date: 1 January 2018
Abstract

Both research and commercial machine translation have so far neglected the importance of properly handling the spelling, lexical and grammar divergences occurring among language varieties. Notable cases are standard national varieties such as Brazilian and European Portuguese, and Canadian and European French, which popular online machine translation services are not keeping distinct. We show that an evident side effect of modeling such varieties as unique classes is the generation of inconsistent translations. In this work, we investigate the problem of training neural machine translation from English to specific pairs of language varieties, assuming both labeled and unlabeled parallel texts, and low-resource conditions. We report experiments from English to two pairs of dialects, European-Brazilian Portuguese and European-Canadian French, and two pairs of standardized varieties, Croatian-Serbian and Indonesian-Malay. We show significant BLEU score improvements over baseline systems when translation into similar languages is learned as a multilingual task with shared representations.

info:eu-repo/semantics/conferenceObject

Archivio della ricerca - Fondazione Bruno Kessler

Last time updated on 03/09/2019

This paper was published in Archivio della ricerca - Fondazione Bruno Kessler.
