LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task
This paper presents the LIG-CRIStAL submission to the shared Automatic Post-Editing task of WMT 2017. We propose two neural post-editing models: a monosource model with a task-specific attention mechanism, which performs particularly well in a low-resource scenario; and a chained architecture which makes use of the source sentence to provide extra context. The latter architecture slightly improves our results when more training data is available. We present and discuss our results on two datasets (en-de and de-en) that are made available for the task.
Keywords: neural post-editing, attention model
Improving Translations by Combining Fuzzy-Match Repair with Automatic Post-Editing
Two of the more predominant technologies that professional translators have at their disposal for improving productivity are machine translation (MT) and computer-aided translation (CAT) tools based on translation memories (TM). When translators use MT, they can use automatic post-editing (APE) systems to automate part of the post-editing work and get further productivity gains. When they use TM-based CAT tools, productivity may improve if they rely on fuzzy-match repair (FMR) methods. In this paper we combine FMR and APE: first an FMR proposal is produced from the translation unit proposed by the TM, then this proposal is further improved by an APE system specially tuned for this purpose. Experiments conducted on the translation of English texts into German show that, with the two combined technologies, the quality of the translations improves by up to 23% compared to a pure MT system. The improvement over a pure FMR system is 16%, showing the effectiveness of our joint solution.
Instance Selection for Online Automatic Post-Editing in a Multi-Domain Scenario
In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback into the translation workflow. The performance of these systems in a multi-domain translation environment (involving different text genres, post-editing styles, and machine translation systems) within the automatic post-editing (APE) task has not been thoroughly investigated yet. In this work, we show that when used in the APE framework the existing online systems are not robust to domain changes in the incoming data stream. In particular, these systems lack the capability to learn and use domain-specific post-editing rules from a pool of multi-domain data sets. To cope with this problem, we propose an online learning framework that generates more reliable translations with significantly better quality than the existing online and batch systems. Our framework includes: i) an instance selection technique based on information retrieval that helps to build domain-specific APE systems, and ii) an optimization procedure to tune the feature weights of the log-linear model, which allows the decoder to improve the post-editing quality.
The FBK Participation in the WMT 2016 Automatic Post-editing Shared Task
In this paper, we present a novel approach to combine the two variants of phrase-based APE (monolingual and context-aware) by a factored machine translation model that is able to leverage benefits from both. Our factored APE models include part-of-speech-tag and class-based neural language models (LM) along with a statistical word-based LM to improve the fluency of the post-edits. These models are built upon a data augmentation technique which helps to mitigate the problem of over-correction in phrase-based APE systems. Our primary APE system further incorporates a quality estimation (QE) model, which aims to select the best translation between the MT output and the automatic post-edit. According to the shared task results, our primary and contrastive (which does not include the QE module) submissions have similar performance and achieved significant improvements of 3.31% TER and 4.25% BLEU (relative) over the baseline MT system on the English-German evaluation set.
Findings of the 2016 Conference on Machine Translation.
This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments). The quality estimation task had three subtasks, with a total of 14 teams submitting 39 entries. The automatic post-editing task had a total of 6 teams submitting 11 entries.