40,967 research outputs found

    Extending Multilingual Machine Translation through Imitation Learning

    Despite the growing variety of languages supported by existing multilingual neural machine translation (MNMT) models, most of the world's languages are still being left behind. We aim to extend large-scale MNMT models to a new language, allowing for translation between the newly added language and all of the already supported languages, in a challenging scenario: using only a parallel corpus between the new language and English. Previous approaches, such as continued training on parallel data including the new language, suffer from catastrophic forgetting (i.e., performance on other languages is reduced). Our novel approach, Imit-MNMT, treats the task as an imitation learning process that mimics the behavior of an expert, a technique widely used in computer vision but not well explored in NLP. More specifically, we construct a pseudo multi-parallel corpus of the new and the original languages by pivoting through English, and imitate the output distribution of the original MNMT model. Extensive experiments show that our approach significantly improves translation performance between the new and the original languages without severe catastrophic forgetting. We also demonstrate that our approach is capable of solving the copy and off-target problems, two issues common in current large-scale MNMT models.
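    The abstract gives no implementation details; a minimal sketch of the distribution-imitation step it describes, matching a trainable student against a frozen expert with a KL objective on pivoted pseudo-parallel batches, might look as follows. The `expert`/`student` model interface and the batch fields are assumptions for illustration, not the paper's released code.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch: the student imitates a frozen expert MNMT model's
# token-level output distribution on pseudo multi-parallel data obtained
# by pivoting through English. Model interfaces are assumed, not the
# paper's actual code; both models are assumed to share one vocabulary.

def imitation_loss(student_logits, expert_logits, temperature=1.0):
    """KL divergence between student and expert token distributions."""
    student_logp = F.log_softmax(student_logits / temperature, dim=-1)
    expert_p = F.softmax(expert_logits / temperature, dim=-1)
    return F.kl_div(student_logp, expert_p, reduction="batchmean")

def train_step(student, expert, batch, optimizer):
    # The expert is frozen: it only supplies target distributions.
    with torch.no_grad():
        expert_logits = expert(batch.src_ids, batch.tgt_ids).logits
    student_logits = student(batch.src_ids, batch.tgt_ids).logits
    loss = imitation_loss(student_logits, expert_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

    Matching full distributions rather than hard labels is what lets the student keep the original languages' behavior while acquiring the new one.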

    Knowledge Bases and Neural Network Synthesis

    We describe and motivate our project to build systems using both a knowledge-based and a neural network approach. The two approaches are used at different stages in the solution of a problem, instead of using knowledge bases exclusively on some problems and neural nets exclusively on others. The knowledge base (KB) is defined first in a declarative, symbolic language that is easy to use. It is then compiled into an efficient neural network (NN) representation, run, and the results from run time and (eventually) from learning are decompiled into a symbolic description of the knowledge contained in the network. After inspecting this recovered knowledge, a designer would be able to modify the KB and go through the whole cycle of compiling, running, and decompiling again. The central question with which this project is concerned is therefore: how do we go from a KB to an NN, and back again? We are investigating this question by building tools consisting of a repertoire of language/translation/network types and trying them on problems in a variety of domains.
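    As a toy illustration of the compile step only, assuming propositional rules and a classic threshold-unit encoding in which a neuron fires exactly when every body atom of its rule is true; the rule syntax and encoding here are illustrative assumptions, not the project's actual tool chain.

```python
import numpy as np

# Illustrative KB-to-NN compilation: each rule head becomes one
# threshold unit computing an AND of its body atoms. This is a toy
# stand-in for the symbolic language and compiler the paper describes.

RULES = {
    "mammal": ["has_fur", "warm_blooded"],   # mammal :- has_fur, warm_blooded.
    "bird":   ["has_feathers", "lays_eggs"], # bird :- has_feathers, lays_eggs.
}
FEATURES = ["has_fur", "warm_blooded", "has_feathers", "lays_eggs"]

def compile_kb(rules, features):
    """Unit weights on body atoms; bias tuned so the unit fires
    only when all body atoms are active."""
    W = np.zeros((len(rules), len(features)))
    b = np.zeros(len(rules))
    for i, body in enumerate(rules.values()):
        for atom in body:
            W[i, features.index(atom)] = 1.0
        b[i] = -(len(body) - 0.5)
    return W, b

def run(W, b, x):
    return (W @ x + b > 0).astype(float)

W, b = compile_kb(RULES, FEATURES)
x = np.array([1.0, 1.0, 0.0, 0.0])        # has_fur, warm_blooded
print(dict(zip(RULES, run(W, b, x))))     # {'mammal': 1.0, 'bird': 0.0}
```

    Decompilation would run in the other direction: reading the weights and biases back into symbolic rules, which is straightforward for this encoding but becomes the hard part once learning has perturbed the weights.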

    Neural Machine Translation into Language Varieties

    Both research and commercial machine translation have so far neglected the importance of properly handling the spelling, lexical, and grammatical divergences occurring among language varieties. Notable cases are standard national varieties such as Brazilian and European Portuguese, and Canadian and European French, which popular online machine translation services do not keep distinct. We show that an evident side effect of modeling such varieties as a single class is the generation of inconsistent translations. In this work, we investigate the problem of training neural machine translation from English into specific pairs of language varieties, assuming both labeled and unlabeled parallel texts and low-resource conditions. We report experiments from English to two pairs of dialects, European-Brazilian Portuguese and European-Canadian French, and two pairs of standardized varieties, Croatian-Serbian and Indonesian-Malay. We show significant BLEU score improvements over baseline systems when translation into similar languages is learned as a multilingual task with shared representations. Comment: Published at EMNLP 2018: Third Conference on Machine Translation (WMT 2018).
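    The "multilingual task with shared representations" in the abstract is commonly realized by prepending a target-variety tag to each source sentence, so one shared model learns to steer its output toward a specific variety. A minimal sketch of that preprocessing step follows; the tag strings and variety codes are illustrative assumptions, not the paper's exact tokens.

```python
# Tag-based multilingual setup: a target-variety token prepended to the
# source sentence selects the output variety at inference time. Tags
# here are hypothetical placeholders.

VARIETY_TAGS = {
    "pt-BR": "<2pt-BR>", "pt-PT": "<2pt-PT>",
    "fr-CA": "<2fr-CA>", "fr-FR": "<2fr-FR>",
}

def tag_source(src_sentence: str, target_variety: str) -> str:
    """Prepend the variety tag so a single shared model can be
    trained on, and decoded toward, any supported variety."""
    return f"{VARIETY_TAGS[target_variety]} {src_sentence}"

print(tag_source("The truck is parked outside.", "pt-BR"))
# <2pt-BR> The truck is parked outside.
```

    Because the encoder and decoder are shared across varieties, even unlabeled parallel text (no variety tag known) still contributes to the common representation, which is consistent with the labeled-plus-unlabeled setting the abstract describes.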