Findings of the 2019 Conference on Machine Translation (WMT19)
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019.
Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation
Findings of the 2018 Conference on Machine Translation (WMT18)
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2018.
Participants were asked to build machine translation systems for any of 7 language pairs in both directions, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. This year, we also opened up the task to additional test suites to probe specific aspects of translation
The WMT'18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English
Peer reviewed
Why don't people use character-level machine translation?
We present a literature and empirical survey that critically assesses the
state of the art in character-level modeling for machine translation (MT).
Despite evidence in the literature that character-level systems are comparable
with subword systems, they are virtually never used in competitive setups in
WMT competitions. We empirically show that even with recent modeling
innovations in character-level natural language processing, character-level MT
systems still struggle to match their subword-based counterparts.
Character-level MT systems show neither better domain robustness, nor better
morphological generalization, despite often being motivated by these properties. However, we are
able to show robustness towards source-side noise and that translation quality
does not degrade with increasing beam size at decoding time.
Comment: 16 pages, 4 figures; Findings of ACL 2022, camera-ready
The TALP-UPC machine translation systems for WMT18 news translation shared task
In this article we describe the TALP-UPC research group participation in the WMT18 news shared translation task for Finnish-English and Estonian-English within the multi-lingual subtrack. All of our primary submissions implement an attention-based Neural Machine Translation architecture. Given that Finnish and Estonian belong to the same language family and are similar, we use as training data the combination of the datasets of both language pairs to palliate the data scarcity of each individual pair. We also report the translation quality of systems trained on individual language pair data to serve as baseline and comparison reference.
Peer reviewed. Postprint (published version)