Overview of the IWSLT 2012 Evaluation Campaign
We report on the ninth evaluation campaign organized by the IWSLT workshop. This year, the evaluation offered multiple tracks on lecture translation based on the TED corpus, and one track on dialog translation from Chinese to English based on the Olympic trilingual corpus. In particular, the TED tracks included a speech transcription track in English, a speech translation track from English to French, and text translation tracks from English to French and from Arabic to English. In addition to the official tracks, ten unofficial MT tracks were offered that required translating TED talks into English from either Chinese, Dutch, German, Polish, Portuguese (Brazilian), Romanian, Russian, Slovak, Slovene, or Turkish. 16 teams participated in the evaluation and submitted a total of 48 primary runs. All runs were evaluated with objective metrics, while runs of the official translation tracks were also ranked by crowd-sourced judges. In particular, subjective ranking for the TED task was performed on a progress test, which permitted direct comparison of this year's results against the best results from the 2011 round of the evaluation campaign.
Marcello Federico; Mauro Cettolo; Luisa Bentivogli; Michael Paul; Sebastian Stüker
Human Feedback in Statistical Machine Translation
The thesis addresses the challenge of improving Statistical Machine Translation (SMT) systems via feedback given by humans on translation quality.
The amount of human feedback available to systems is inherently low due to cost and time limitations. One of our goals is to simulate such information by automatically generating pseudo-human feedback.
This is performed using Quality Estimation (QE) models. QE is a technique for predicting the quality of automatic translations without comparing them to oracle (human) translations, traditionally at the sentence or word levels.
QE models are trained on a small collection of automatic translations manually labelled for quality, and then can predict the quality of any number of unseen translations.
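The abstract leaves the QE model itself unspecified. As a minimal illustrative sketch only (the two features, the toy training pairs, and the least-squares fit below are all invented for illustration, not the thesis's actual features or data), sentence-level QE can be framed as regression from features of a (source, translation) pair to a quality score:

```python
# Minimal sketch of sentence-level quality estimation (QE) as regression.
# Features, data, and model choice are illustrative assumptions only.
import numpy as np

def features(source: str, translation: str) -> list:
    # Two toy features plus a bias term: source length in words,
    # and the target/source length ratio.
    src_len = len(source.split())
    tgt_len = len(translation.split())
    return [src_len, tgt_len / max(src_len, 1), 1.0]

# A small manually labelled training set: (source, MT output, quality score).
train = [
    ("the cat sat", "le chat est assis", 0.9),
    ("a very long and convoluted sentence", "une phrase", 0.3),
    ("good morning", "bonjour", 0.8),
    ("this is fine", "c'est vraiment beaucoup trop long", 0.4),
]

X = np.array([features(s, t) for s, t, _ in train])
y = np.array([q for _, _, q in train])
# Fit a least-squares linear regressor: this plays the role of the QE model.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict_quality(source: str, translation: str) -> float:
    # Predict a quality score for an unseen translation.
    return float(np.array(features(source, translation)) @ w)
```

Once trained on the small labelled set, such a model can score any number of unseen translations, which is what makes QE usable as pseudo-human feedback.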
We propose a number of improvements for QE models in order to increase the reliability of pseudo-human feedback.
These include strategies to artificially generate instances for settings where QE training data is scarce.
We also introduce a new level of granularity for QE: the level of phrases. This level aims to improve the quality of QE predictions by better modelling inter-dependencies among errors at word level, and in ways that are tailored to phrase-based SMT, where the basic unit of translation is a phrase. This can thus facilitate work on incorporating human feedback during the translation process.
Finally, we introduce approaches to incorporate pseudo-human feedback, in the form of QE predictions, into SMT systems. More specifically, we use quality predictions to select the best translation from a number of alternative suggestions produced by SMT systems, and we integrate QE predictions into an SMT decoder in order to guide the translation generation process.
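The selection idea can be sketched in a few lines. The scoring function below is an invented stand-in (a real system would plug in a trained QE model); the point is only the argmax over an n-best list:

```python
# Sketch: use QE predictions to pick the best hypothesis from an n-best list.
# predict_quality is a toy stand-in for a trained QE model (assumption).
def predict_quality(source: str, hypothesis: str) -> float:
    # Toy heuristic: prefer hypotheses whose word count is close to the source's.
    return -abs(len(hypothesis.split()) - len(source.split()))

def rerank(source: str, nbest: list) -> str:
    # Return the hypothesis the QE model scores highest.
    return max(nbest, key=lambda hyp: predict_quality(source, hyp))

best = rerank("the cat sat on the mat",
              ["chat tapis",
               "le chat s'est assis sur le tapis",
               "le chat est assis sur le tapis la la la la"])
```

Integrating QE into the decoder itself works on the same principle, but scores partial hypotheses during search rather than complete ones afterwards.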
Findings of the 2015 Workshop on Statistical Machine Translation
This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included, and were then evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams, submitting 7 entries.
Findings of the 2014 Workshop on Statistical Machine Translation
This paper presents the results of the WMT14 shared tasks, which included a standard news translation task, a separate medical translation task, a task for run-time estimation of machine translation quality, and a metrics task. This year, 143 machine translation systems from 23 institutions were submitted to the ten translation directions in the standard translation task. An additional 6 anonymized systems were included, and were then evaluated both automatically and manually. The quality estimation task had four subtasks, with a total of 10 teams submitting 57 entries.
POSTECH Machine Translation System for IWSLT 2008 Evaluation Campaign
In this paper, we describe the POSTECH system for the IWSLT 2008 evaluation campaign. The system is based on phrase-based statistical machine translation. We set up a baseline system using well-known, freely available software, and applied a preprocessing method and a language modeling method to the baseline in order to improve translation quality. The preprocessing method identifies and removes useless tokens in source texts, while the language modeling method models phrase-level n-grams. We participated in the BTEC tasks to assess the effects of our methods.
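The abstract does not detail the phrase-level n-gram model. As a minimal sketch, assuming sentences already segmented into phrases and add-alpha smoothing (both are illustrative assumptions, not the paper's method), a phrase-level bigram model counts n-grams whose units are phrases rather than words:

```python
# Sketch of a phrase-level bigram language model: the n-gram units are
# multi-word phrases. Segmentation is given by hand here (assumption);
# smoothing is simple add-alpha (assumption).
from collections import Counter

def phrase_bigram_counts(segmented_sentences):
    # Count phrase unigrams and adjacent phrase pairs.
    unigrams, bigrams = Counter(), Counter()
    for phrases in segmented_sentences:
        unigrams.update(phrases)
        bigrams.update(zip(phrases, phrases[1:]))
    return unigrams, bigrams

def bigram_prob(prev, cur, unigrams, bigrams, alpha=1.0):
    # Add-alpha smoothed conditional probability P(cur | prev).
    vocab = len(unigrams)
    return (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab)

corpus = [
    ["i would like", "a ticket", "to tokyo"],
    ["i would like", "a room", "for tonight"],
]
uni, bi = phrase_bigram_counts(corpus)
```

With this corpus, P("a ticket" | "i would like") = (1 + 1) / (2 + 5) = 2/7, since "i would like" occurs twice and the phrase vocabulary has five entries.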