22 research outputs found

    Findings of the 2019 Conference on Machine Translation (WMT19)

    Get PDF
    This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation

    The Effect of Translationese in Machine Translation Test Sets

    Get PDF

    The Effect of Translationese in Machine Translation Test Sets

    Get PDF

    Evaluating conjunction disambiguation on English-to-German and French-to-German WMT 2019 translation hypotheses

    Get PDF
    We present a test set for evaluating an MT system’s capability to translate ambiguous conjunctions depending on the sentence structure. We concentrate on the English conjunction ”but” and its French equivalent ”mais” which can be translated into two different German conjunctions. We evaluate all English-to-German and French-to-German submissions to the WMT 2019 shared translation task. The evaluation is done mainly automatically, with additional fast manual inspection of unclear cases. All systems almost perfectly recognise the ta-get conjunction ”aber”, whereas accuracies fo rthe other target conjunction ”sondern” range from 78% to 97%, and the errors are mostly caused by replacing it with the alternative cojjunction ”aber”. The best performing system for both language pairs is a multilingual Transformer TartuNLP system trained on all WMT2019 language pairs which use the Latin script, indicating that the multilingual approach is beneficial for conjunction disambiguation. As for other system features, such as using synthetic back-translated data, context-aware, hybrid, etc., no particular (dis)advantages can be observed. Qualitative manual inspection of translation hypotheses shown that highly ranked systems generally produce translations with high adequacy and fluency, meaning that these systems are not only capable of capturing the right conjunction whereas the rest of the translation hypothesis is poor. On the other hand, the low ranked systems generally exhibit lower fluency and poor adequacy

    Findings of the 2018 Conference on Machine Translation (WMT18)

    Get PDF
    This paper presents the results of the premier shared task organized alongside the Confer- ence on Machine Translation (WMT) 2018. Participants were asked to build machine translation systems for any of 7 language pairs in both directions, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. This year, we also opened up the task to additional test suites to probe specific aspects of transla- tion

    The Effect of Translationese in Machine Translation Test Sets

    Get PDF

    Findings of the WMT 2022 Biomedical Translation Shared Task: Monolingual Clinical Case Reports

    Get PDF
    International audienceIn the seventh edition of the WMT Biomedical Task, we addressed a total of seven language pairs, namely English/German, English/French, English/Spanish, English/Portuguese, English/Chinese, English/Russian, English/Italian. This year’s test sets covered three types of biomedical text genre. In addition to scientific abstracts and terminology items used in previ- ous editions, we released test sets of clinical cases. The evaluation of clinical cases translations were given special attention by involving clinicians in the preparation of reference translations and manual evaluation. For the main MEDLINE test sets, we received a total of 609 submissions from 37 teams. For the ClinSpEn sub-task, we had the participation of five teams
    corecore