LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task
This paper presents the LIG-CRIStAL submission to the shared Automatic Post-Editing task of WMT 2017. We propose two neural post-editing models: a monosource model with a task-specific attention mechanism, which performs particularly well in a low-resource scenario; and a chained architecture which makes use of the source sentence to provide extra context. The latter architecture slightly improves our results when more training data is available. We present and discuss our results on two datasets (en-de and de-en) that are made available for the task.
Keywords: neural post-editing, attention model
Improving Translations by Combining Fuzzy-Match Repair with Automatic Post-Editing
Two of the more predominant technologies that professional translators have at their disposal for improving productivity are machine translation (MT) and computer-aided translation (CAT) tools based on translation memories (TM). When translators use MT, they can use automatic post-editing (APE) systems to automate part of the post-editing work and get further productivity gains. When they use TM-based CAT tools, productivity may improve if they rely on fuzzy-match repair (FMR) methods. In this paper we combine FMR and APE: first an FMR proposal is produced from the translation unit proposed by the TM, then this proposal is further improved by an APE system specially tuned for this purpose. Experiments conducted on the translation of English texts into German show that, with the two combined technologies, the quality of the translations improves by up to 23% compared to a pure MT system. The improvement over a pure FMR system is 16%, showing the effectiveness of our joint solution.
Instance Selection for Online Automatic Post-Editing in a Multi-Domain Scenario
In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback into the translation workflow. The performance of these systems in a multi-domain translation environment (involving different text genres, post-editing styles, and machine translation systems) within the automatic post-editing (APE) task has not been thoroughly investigated yet. In this work, we show that when used in the APE framework the existing online systems are not robust to domain changes in the incoming data stream. In particular, these systems lack the capability to learn and use domain-specific post-editing rules from a pool of multi-domain data sets. To cope with this problem, we propose an online learning framework that generates more reliable translations with significantly better quality than the existing online and batch systems. Our framework includes: i) an instance selection technique based on information retrieval that helps to build domain-specific APE systems, and ii) an optimization procedure to tune the feature weights of the log-linear model, which allows the decoder to improve the post-editing quality.
The FBK Participation in the WMT 2016 Automatic Post-editing Shared Task
In this paper, we present a novel approach to combine the two variants of phrase-based APE (monolingual and context-aware) by a factored machine translation model that is able to leverage benefits from both. Our factored APE models include part-of-speech-tag and class-based neural language models (LM) along with a statistical word-based LM to improve the fluency of the post-edits. These models are built upon a data augmentation technique which helps to mitigate the problem of over-correction in phrase-based APE systems. Our primary APE system further incorporates a quality estimation (QE) model, which aims to select the best translation between the MT output and the automatic post-edit. According to the shared task results, our primary and contrastive (which does not include the QE module) submissions have similar performance and achieved significant improvements of 3.31% TER and 4.25% BLEU (relative) over the baseline MT system on the English-German evaluation set.
Findings of the 2016 Conference on Machine Translation.
This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments). The quality estimation task had three subtasks, with a total of 14 teams submitting 39 entries. The automatic post-editing task had a total of 6 teams submitting 11 entries.