
26 research outputs found

The QT21/HimL Combined Machine Translation System

Author: Alkhouli Tamer
Allauzen Alexandre
Aufrant Lauriane
Blain Frédéric
Bojar Ondrej
Braune Fabienne
Burlot Franck
Frank Stella
Fraser Alexander
Haddow Barry
Huck Matthias
Knyazeva Elena
Lavergne Thomas
Ney Hermann
Niehues Jan
Peter Jan-Thorsten
Pinnis Marcis
Sennrich Rico
Specia Lucia
Tamchyna Ales
Waibel Alex
Yvon François
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

This paper describes the joint submission of the QT21 and HimL projects for the English→Romanian translation task of the ACL 2016 First Conference on Machine Translation (WMT 2016). The submission is a system combination which combines twelve different statistical machine translation systems provided by the different groups (RWTH Aachen University, LMU Munich, Charles University in Prague, University of Edinburgh, University of Sheffield, Karlsruhe Institute of Technology, LIMSI, University of Amsterdam, Tilde). The systems are combined using RWTH’s system combination approach. The final submission shows an improvement of 1.0 BLEU compared to the best single system on newstest2016.

Edinburgh Research Explorer

Publikationsserver der RWTH Aachen University

Biblio at Institute of Formal and Applied Linguistics

Wolverhampton Intellectual Repository and E-theses

Results of the WMT17 metrics shared task

Author: Bojar Ondřej
Graham Yvette
Kamran Amir
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

This paper presents the results of the WMT17 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT17 news translation task and Neural MT training task. We collected scores of 14 metrics from 8 research groups. In addition, we computed scores of 7 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with the WMT17 official manual ranking of systems) and in terms of segment-level correlation (how often a metric agrees with humans in judging the quality of a particular sentence). This year, we build upon two types of manual judgements: direct assessment (DA) and HUME manual semantic judgements.

Crossref

Irish Universities

DCU Online Research Access Service

Biblio at Institute of Formal and Applied Linguistics

ParaCrawl: Web-Scale Acquisition of Parallel Corpora

Author: Bañón Marta
Chen Pinzhen
Esplà-Gomis Miquel
Forcada Mikel
Haddow Barry
Heafield Kenneth
Hoang Hieu
Kamran Amir
Kirefu Faheem
Koehn Philipp
Ortiz-Rojas Sergio
Pla Leopoldo
Ramírez-Sánchez Gema
Sarrías Elsa
Strelec Marek
Thompson Brian
Waites William
Wiggins Dion
Zaragoza Jaume
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020

We report on methods to create the largest publicly available parallel corpora by crawling the web, using open source software. We empirically compare alternative methods and publish benchmark data sets for sentence alignment and sentence pair filtering. We also describe the parallel corpora released and evaluate their quality and their usefulness to create machine translation systems.

Crossref

University of Strathclyde Institutional Repository

Edinburgh Research Explorer

Results of the WMT16 Metrics Shared Task

Author: Bojar Ondřej
Graham Yvette
Kamran Amir
Stanojević Miloš
Publication date: 01/01/2016

This paper presents the results of the WMT16 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT16 Shared Translation Task. We collected scores of 16 metrics from 9 research groups. In addition, we computed scores of 9 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with the WMT16 official manual ranking of systems) and in terms of segment-level correlation (how often a metric agrees with humans in comparing two translations of a particular sentence). This year there are several additions to the setup: a large number of language pairs (18 in total), datasets from different domains (news, IT and medical), and different kinds of judgments: relative ranking (RR), direct assessment (DA) and HUME manual semantic judgments. Finally, generation of a large number of hybrid systems was trialed to provide more conclusive system-level metric rankings.

Crossref

Biblio at Institute of Formal and Applied Linguistics

Findings of the 2016 Conference on Machine Translation (WMT16)

Author: Bojar Ondrej
Chatterjee Rajen
Federmann Christian
Graham Yvette
Haddow Barry
Huck Matthias
Jimeno Yepes Antonio
Koehn Philipp
Logacheva Varvara
Monz Christof
Negri Matteo
Neveol Aurelie
Neves Mariana
Popel Martin
Post Matt
Rubino Raphael
Scarton Carolina
Specia Lucia
Turchi Marco
Verspoor Karin
Zampieri Marcos
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the Biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments).

Archivio della ricerca - Fondazione Bruno Kessler

Edinburgh Research Explorer

Publikationsserver der RWTH Aachen University

Biblio at Institute of Formal and Applied Linguistics

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Overview of the IWSLT 2017 Evaluation Campaign

Author: Bentivogli L.
Cettolo M.
Federico M.
Federmann C.
Niehues J.
Stüker S.
Sudoh K.
Yoshino K.
Publication venue: Association for Computational Linguistics
Publication date: 03/01/2024

The IWSLT 2017 evaluation campaign organised three tasks: the Multilingual task, which is about training machine translation systems handling many-to-many language directions, including so-called zero-shot directions; the Dialogue task, which calls for the integration of context information in machine translation in order to resolve anaphoric references that typically occur in human-human dialogue turns; and the Lecture task, which offers the challenge of automatically transcribing and translating real-life university lectures. Following the tradition of these reports, we describe all tasks in detail and present the results of all runs submitted by their participants.

KITopen

This paper presents the results of the WMT17 shared tasks, which included three machine translation (MT) tasks (news, biomedical, and multimodal), two evaluation tasks (metrics and run-time estimation of MT quality), an automatic post-editing task, a neural MT training task, and a bandit learning task.

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Irish Universities

Edinburgh Research Explorer

Biblio at Institute of Formal and Applied Linguistics

DCU Online Research Access Service