Findings of the WMT 2016 Bilingual Document Alignment Shared Task

Authors: Buck, Christian; Koehn, Philipp
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016
Crossref

Edinburgh Research Explorer

ParaCrawl: Web-Scale Acquisition of Parallel Corpora

Authors: Bañón, Marta; Chen, Pinzhen; Esplà-Gomis, Miquel; Forcada, Mikel; Haddow, Barry; Heafield, Kenneth; Hoang, Hieu; Kamran, Amir; Kirefu, Faheem; Koehn, Philipp; Ortiz-Rojas, Sergio; Pla, Leopoldo; Ramírez-Sánchez, Gema; Sarrías, Elsa; Strelec, Marek; Thompson, Brian; Waites, William; Wiggins, Dion; Zaragoza, Jaume
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020
We report on methods to create the largest publicly available parallel corpora by crawling the web, using open source software. We empirically compare alternative methods and publish benchmark data sets for sentence alignment and sentence pair filtering. We also describe the parallel corpora released and evaluate their quality and their usefulness to create machine translation systems.

Crossref

University of Strathclyde Institutional Repository

Edinburgh Research Explorer

Findings of the 2019 Conference on Machine Translation (WMT19)

Authors: Barrault, Loïc; Bojar, Ondřej; Costa-Jussà, Marta R.; Federmann, Christian; Fishel, Mark; Graham, Yvette
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/08/2019
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation.

Irish Universities

DCU Online Research Access Service