Findings of the 2016 Conference on Machine Translation (WMT16)

Author: Bojar Ondrej
Chatterjee Rajen
Federmann Christian
Graham Yvette
Haddow Barry
Huck Matthias
Jimeno Yepes Antonio
Koehn Philipp
Logacheva Varvara
Monz Christof
Negri Matteo
Neveol Aurelie
Neves Mariana
Popel Martin
Post Matt
Rubino Raphael
Scarton Carolina
Specia Lucia
Turchi Marco
Verspoor Karin
Zampieri Marcos
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2016

This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the biomedical task received 15 system submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments).

Archivio della ricerca - Fondazione Bruno Kessler

Edinburgh Research Explorer

Publikationsserver der RWTH Aachen University

Biblio at Institute of Formal and Applied Linguistics

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

Author: Edman Lukas
van Noord Gertjan
Toral Ruiz Antonio
Publication date: 01/01/2020

Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard WORD2VEC when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embedding.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

European Language Grid

Publication venue: Springer Science and Business Media LLC
Publication date: 18/11/2022

This open access book provides an in-depth description of the EU project European Language Grid (ELG). Its motivation lies in the fact that Europe is a multilingual society with 24 official European Union Member State languages and dozens of additional languages including regional and minority languages. The only meaningful way to enable multilingualism and to benefit from this rich linguistic heritage is through Language Technologies (LT) including Natural Language Processing (NLP), Natural Language Understanding (NLU), Speech Technologies and language-centric Artificial Intelligence (AI) applications. The European Language Grid provides a single umbrella platform for the European LT community, including research and industry, effectively functioning as a virtual home, marketplace, showroom, and deployment centre for all services, tools, resources, products and organisations active in the field. Today the ELG cloud platform already offers access to more than 13,000 language processing tools and language resources. It enables all stakeholders to deposit, upload and deploy their technologies and datasets. The platform also supports the long-term objective of establishing digital language equality in Europe by 2030, that is, creating a situation in which all European languages enjoy equal technological support. This is the very first book dedicated to Language Technology and NLP platforms. Cloud technology has only recently matured enough to make the development of a platform like ELG feasible on a larger scale. The book comprehensively describes the results of the ELG project. Following an introduction, the content is divided into four main parts: (I) ELG Cloud Platform; (II) ELG Inventory of Technologies and Resources; (III) ELG Community and Initiative; and (IV) ELG Open Calls and Pilot Projects.

Directory of Open Access Books (DOAB)

Proceedings of the 17th Annual Conference of the European Association for Machine Translation

Publication venue: Hrvatsko društvo za jezične tehnologije
Publication date: 01/01/2014

Proceedings of the 17th Annual Conference of the European Association for Machine Translation (EAMT

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

New perspectives on cohesion and coherence: Implications for translation

Author: Kerremans Koen
Kunilovskaya Maria
Kunz Kerstin
Kutuzov Andrey
Lapshinova-Koltunski Ekaterina
Menzel Katrin
Rysová Kateřina
Rysová Magdaléna
Sim Smith Karin
Specia Lucia
Publication venue: Language Science Press
Publication date: 13/06/2017

The contributions to this volume investigate relations of cohesion and coherence as well as instantiations of discourse phenomena and their interaction with information structure in multilingual contexts. Some contributions concentrate on procedures to analyze cohesion and coherence from a corpus-linguistic perspective. Others have a particular focus on textual cohesion in parallel corpora that include both originals and translated texts. Additionally, the papers in the volume discuss the nature of cohesion and coherence with implications for human and machine translation. The contributors are experts on discourse phenomena and textuality who address these issues from an empirical perspective. The chapters in this volume are grounded in the latest research, making this book useful to experts in discourse studies and computational linguistics as well as to advanced students with an interest in these disciplines. We hope that this volume will serve as a catalyst to other researchers and will facilitate further advances in the development of cost-effective annotation procedures, the application of statistical techniques for the analysis of linguistic phenomena and the elaboration of new methods for data interpretation in multilingual corpus linguistics and machine translation.

Language Science Press
