
14 research outputs found

SMT and Hybrid systems of the QTLeap project in the WMT16 IT-task

Author: Agirre Eneko
Branco António
Gaudio Rosa
Gomes Luís
Labaka Gorka
Neale Steven
Oele Dieke
Osenova Petya
Popel Martin
Querido Andreia
Rendeiro Nuno
Rodrigues João
Silva João
Simov Kiril
van Noord Gertjan
Publication date: 01/01/2016

This paper describes 12 systems submitted to the WMT16 IT-task, covering six languages: Basque, Bulgarian, Dutch, Czech, Portuguese and Spanish. All of these systems were developed within the QTLeap project and follow a common strategy. For each language, two systems were submitted: a phrase-based MT system built using Moses, and a system exploiting deep language engineering approaches, which in all languages but Bulgarian was implemented using TectoMT. For 4 of the 6 languages, the TectoMT-based system performs better than the Moses-based one.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Biblio at Institute of Formal and Applied Linguistics

Dissertations of the University of Groningen

Dictionary-based Domain Adaptation of MT Systems without Retraining

Author: Bojar Ondřej
Novák Michal
Popel Martin
Rosa Rudolf
Sudarikov Roman
Publication date: 01/01/2016

We describe our submission to the IT-domain translation task of WMT 2016. We perform domain adaptation with dictionary data on already trained MT systems with no further retraining. We apply our approach to two conceptually different systems developed within the QTLeap project: TectoMT and Moses, as well as Chimera, their combination. In all settings, our method improves the translation quality. Moreover, the basic variant of our approach is applicable to any MT system, including a black-box one

Biblio at Institute of Formal and Applied Linguistics

Automated Translation with Interlingual Word Representations

Author: Oele Dieke Merel
Publication venue: Rijksuniversiteit Groningen
Publication date: 01/01/2018

University of Groningen

Findings of the 2016 Conference on Machine Translation (WMT16)

Author: Bojar Ondrej
Chatterjee Rajen
Federmann Christian
Graham Yvette
Haddow Barry
Huck Matthias
Jimeno Yepes Antonio
Koehn Philipp
Logacheva Varvara
Monz Christof
Negri Matteo
Neveol Aurelie
Neves Mariana
Popel Martin
Post Matt
Rubino Raphael
Scarton Carolina
Specia Lucia
Turchi Marco
Verspoor Karin
Zampieri Marcos
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the Biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments).

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Edinburgh Research Explorer

Publikationsserver der RWTH Aachen University

Biblio at Institute of Formal and Applied Linguistics

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Findings of the 2016 Conference on Machine Translation.

Author: Bojar Ondřej
Chatterjee Rajen
Federmann Christian
Graham Yvette
Haddow Barry
Huck Matthias
Koehn Philipp
Logacheva Varvara
Monz Christof
Negri Matteo
Neveol Aurelie
Neves Mariana
Popel Martin
Post Matt
Rubino Raphael
Scarton Carolina
Specia Lucia
Turchi Marco
Verspoor Karin
Yepes Antonio Jimeno
Zampieri Marcos
Publication venue: The Association for Computational Linguistics

Archivio della ricerca - Fondazione Bruno Kessler

Findings of the 2017 Conference on Machine Translation

Author: Bojar Ondřej
Chatterjee Rajen
Federmann Christian
Graham Yvette
Haddow Barry
Huang Shujian
Huck Matthias
Koehn Philipp
Liu Qun
Logacheva Varvara
Monz Christof
Negri Matteo
Post Matt
Rubino Raphael
Specia Lucia
Turchi Marco
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

This paper presents the results of the WMT17 shared tasks, which included three machine translation (MT) tasks (news, biomedical, and multimodal), two evaluation tasks (metrics and run-time estimation of MT quality), an automatic post-editing task, a neural MT training task, and a bandit learning task

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Irish Universities

Edinburgh Research Explorer

Biblio at Institute of Formal and Applied Linguistics

DCU Online Research Access Service

Automated Translation with Interlingual Word Representations

Author: Oele Dieke Merel
Publication venue: Rijksuniversiteit Groningen
Publication date: 01/01/2018

In this thesis we investigate the use of translation systems that employ a transfer phase with interlingual representations of words. In this way we approach the problem of lexical ambiguity in machine translation systems as two separate tasks: word sense disambiguation and lexical selection. First, the words in the source language are disambiguated on the basis of their meaning, resulting in interlingual representations of words. Next, a lexical selection module selects the most suitable word in the target language. We give a detailed description of the development and evaluation of translation systems for Dutch-English. This provides a background for the experiments in the second and third parts of this thesis. We then describe a method that determines the meaning of words. It is similar to the classic Lesk algorithm in that it relies on the idea that words shared between the context of a word and its definition provide information about its meaning. Instead of counting shared words, however, we use word and sense vectors to compute the similarity between the definition of a sense and the context of a word. We additionally use our method for locating and interpreting puns. Finally, we present a model for lexical choice that selects lemmas given the abstract representations of words. We do this by converting the grammatical trees into hidden Markov trees. In this way the optimal combination of lemmas and their context can be computed.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Findings of the 2017 Conference on Machine Translation (WMT17)

Author: Barry Haddow
Christian Federmann
Christof Monz
Lucia Specia
Marco Turchi
Matt Post
Matteo Negri
Matthias Huck
Ondřej Bojar
Philipp Koehn
Qun Liu
Rajen Chatterjee
Raphael Rubino
Shujian Huang
Varvara Logacheva
Yvette Graham
Publication venue: The Association for Computational Linguistics

This paper presents the results of the WMT17 shared tasks, which included three machine translation (MT) tasks (news, biomedical, and multimodal), two evaluation tasks (metrics and run-time estimation of MT quality), an automatic post-editing task, a neural MT training task, and a bandit learning task.

Archivio della ricerca - Fondazione Bruno Kessler

English-to-Czech MT: Large Data and Beyond

Author: Bojar Ondřej
Publication date: 06/12/2018

CU Digital Repository

Itzulpen automatiko gainbegiratu gabea

Author: Artexe Zurutuza Mikel
Publication date: 29/07/2020

192 p. Modern machine translation relies on strong supervision in the form of parallel corpora. Such a requirement greatly departs from the way in which humans acquire language, and poses a major practical problem for low-resource language pairs. In this thesis, we develop a new paradigm that removes the dependency on parallel data altogether, relying on nothing but monolingual corpora to train unsupervised machine translation systems. For that purpose, our approach first aligns separately trained word representations in different languages based on their structural similarity, and uses them to initialize either a neural or a statistical machine translation system, which is further trained through iterative back-translation. While previous attempts at learning machine translation systems from monolingual corpora had strong limitations, our work (along with other contemporaneous developments) is the first to report positive results in standard, large-scale settings, establishing the foundations of unsupervised machine translation and opening exciting opportunities for future research.

Archivo Digital para la Docencia y la Investigación