Leveraging rule-based machine translation knowledge for under-resourced neural machine translation models
Rule-based machine translation is a machine translation paradigm where linguistic knowledge is encoded by an expert in the form of rules that translate from source to target language. While this approach grants total control over the output of the system, the cost of formalising the needed linguistic knowledge is much higher than training a corpus-based system, where a machine learning approach is used to automatically learn to translate from examples. In this paper, we describe different approaches to leverage the information contained in rule-based machine translation systems to improve a corpus-based one, namely, a neural machine translation model, with a focus on a low-resource scenario. Our results suggest that adding morphological information to the source language is as effective as using subword units in this particular setting.

This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289, co-funded by the European Regional Development Fund, and the Enterprise Ireland (EI) Innovation Partnership Programme under grant agreement No IP20180729, NURS – Neural Machine Translation for Under-Resourced Scenarios.

peer-reviewed
2019-08-1