Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English–French and English–German language pairs, in both directions. Eleven teams participated in the shared task: nine for the English–French subtask, five for French–English, nine for English–German, and six for German–English. Most of the submissions outperformed two strong language-model-based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs.
Discourse Structure in Machine Translation Evaluation
In this article, we explore the potential of using sentence-level discourse
structure for machine translation evaluation. We first design discourse-aware
similarity measures, which use all-subtree kernels to compare discourse parse
trees in accordance with the Rhetorical Structure Theory (RST). Then, we show
that a simple linear combination with these measures can help improve various
existing machine translation evaluation metrics regarding correlation with
human judgments both at the segment- and at the system-level. This suggests
that discourse information is complementary to the information used by many of
the existing evaluation metrics, and thus it could be taken into account when
developing richer evaluation metrics, such as the WMT-14 winning combined
metric DiscoTKparty. We also provide a detailed analysis of the relevance of
various discourse elements and relations from the RST parse trees for machine
translation evaluation. In particular, we show that: (i) all aspects of the RST
tree are relevant, (ii) nuclearity is more useful than relation type, and (iii)
the similarity of the translation RST tree to the reference tree is positively
correlated with translation quality.
Comment: machine translation, machine translation evaluation, discourse
analysis. Computational Linguistics, 201
Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction
We describe the design, the setup, and the evaluation results of the DiscoMT
2017 shared task on cross-lingual pronoun prediction. The task asked
participants to predict a target-language pronoun given a source-language
pronoun in the context of a sentence. We further provided a lemmatized
target-language human-authored translation of the source sentence, and
automatic word alignments between the source sentence words and the
target-language lemmata. The aim of the task was to predict, for each
target-language pronoun placeholder, the word that should replace it from a
small, closed set of classes, using any type of information that can be
extracted from the entire document. We offered four subtasks, each for a
different language pair and translation direction: English-to-French,
English-to-German, German-to-English, and Spanish-to-English. Five teams
participated in the shared task, making submissions for all language pairs.
The evaluation results show that all participating teams outperformed two
strong n-gram-based language-model baseline systems by a sizable margin
- …