11,943 research outputs found
Eye-tracking as a measure of cognitive effort for post-editing of machine translation
The three measurements for post-editing effort as proposed by Krings (2001) have been adopted by many researchers in subsequent studies and publications. These measurements comprise temporal effort (the speed or productivity rate of post-editing, often measured in words per second or per minute at the segment level), technical effort (the number of actual edits performed by the post-editor, sometimes approximated using the Translation Edit Rate metric (Snover et al. 2006), again usually at the segment level), and cognitive effort. Cognitive effort has been measured using Think-Aloud Protocols, pause measurement, and, increasingly, eye-tracking. This chapter provides a review of studies of post-editing effort using eye-tracking, noting the influence of publications by Danks et al. (1997) and O'Brien (2006, 2008), before describing a single study in detail.
The detailed study examines whether predicted effort indicators affect post-editing effort; its results were previously published as Moorkens et al. (2015). Most of the eye-tracking data analysed were unused in the previous publication.
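The technical-effort measurement mentioned above, often approximated with the Translation Edit Rate (TER), can be illustrated with a simplified sketch. Note that real TER (Snover et al. 2006) also counts block shifts as single edits; the version below uses only word-level insertions, deletions, and substitutions, and the function names and example strings are illustrative, not drawn from the study.

```python
# A minimal sketch of approximating "technical effort" with a TER-like
# score: word-level edit distance between the raw MT output and the
# post-edited segment, divided by the length of the post-edited text.
# (Real TER also counts block shifts; this simplified version does not.)

def edit_distance(a, b):
    """Word-level Levenshtein distance between token lists a and b."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

def ter_like(mt_output, post_edit):
    """Simplified edit rate: edits per post-edited (reference) word."""
    hyp, ref = mt_output.split(), post_edit.split()
    return edit_distance(hyp, ref) / max(len(ref), 1)

score = ter_like("the cat sat on mat", "the cat sat on the mat")
print(round(score, 3))  # one insertion over six reference words: 0.167
```

A lower score means the post-editor changed less of the raw MT output, i.e. less technical effort.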
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
"Bilingual Expert" Can Find Translation Errors
Recent advances in statistical machine translation via the adoption of neural sequence-to-sequence models have enabled end-to-end systems to achieve state-of-the-art results on many WMT benchmarks. The performance of such machine translation (MT) systems is usually evaluated with the automatic metric BLEU when gold references are available for validation. However, at inference time or in production deployment, gold references are rarely available, or require expensive human annotation with bilingual expertise. To address the issue of quality estimation (QE) without references, we propose a general framework for automatic evaluation of translation output covering most WMT quality estimation tasks. We first build a conditional target language model with a novel bidirectional transformer, named the neural bilingual expert model, which is pre-trained on large parallel corpora for feature extraction. For QE inference, the bilingual expert model simultaneously produces a joint latent representation of the source and the translation, together with real-valued indicators of possibly erroneous tokens, based on the prior knowledge learned from parallel data. These features are then fed into a simple Bi-LSTM predictive model for quality estimation. Experimental results show that our approach achieves state-of-the-art performance in the quality estimation track of WMT 2017/2018.

Comment: Accepted to AAAI 2019
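The final prediction stage described in this abstract — per-token features fed into a bidirectional LSTM that outputs quality scores — can be sketched as follows. This is not the paper's model: the dimensions, random weights, and sigmoid output head are illustrative assumptions, and in the paper the input features come from the pre-trained bilingual expert model rather than random vectors.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(xs, W, U, b, H):
    """Run a single-direction LSTM over a sequence of feature vectors."""
    h, c, hs = np.zeros(H), np.zeros(H), []
    for x in xs:
        z = W @ x + U @ h + b                       # all four gates at once
        i, f, o = (sigmoid(z[k * H:(k + 1) * H]) for k in range(3))
        g = np.tanh(z[3 * H:])
        c = f * c + i * g
        h = o * np.tanh(c)
        hs.append(h)
    return hs

def bilstm_qe(xs, params, H):
    """Concatenate forward/backward states, map each token to an error score."""
    fwd = lstm_forward(xs, *params["fwd"], H)
    bwd = lstm_forward(xs[::-1], *params["bwd"], H)[::-1]
    w_out, b_out = params["out"]
    return [sigmoid(w_out @ np.concatenate([hf, hb]) + b_out)
            for hf, hb in zip(fwd, bwd)]

rng = np.random.default_rng(0)
D, H, T = 8, 4, 5                                   # feature dim, hidden size, tokens
params = {
    "fwd": (rng.normal(0, 0.1, (4 * H, D)), rng.normal(0, 0.1, (4 * H, H)), np.zeros(4 * H)),
    "bwd": (rng.normal(0, 0.1, (4 * H, D)), rng.normal(0, 0.1, (4 * H, H)), np.zeros(4 * H)),
    "out": (rng.normal(0, 0.1, 2 * H), 0.0),
}
token_features = [rng.normal(size=D) for _ in range(T)]  # stand-in for expert features
scores = bilstm_qe(token_features, params, H)
print(len(scores))  # one quality score per token: 5
```

The bidirectional pass matters here because whether a token is an error often depends on context on both sides of it.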
What Level of Quality can Neural Machine Translation Attain on Literary Text?
Given the rise of a new approach to MT, Neural MT (NMT), and its promising performance on different text types, we assess the translation quality it can attain on what is perceived to be the greatest challenge for MT: literary text. Specifically, we target novels, arguably the most popular type of literary text. We build a literary-adapted NMT system for the English-to-Catalan translation direction and evaluate it against a system pertaining to the previously dominant paradigm in MT: statistical phrase-based MT (PBSMT). To this end, for the first time we train MT systems, both NMT and PBSMT, on large amounts of literary text (over 100 million words) and evaluate them on a set of twelve widely known novels spanning from the 1920s to the present day. According to the BLEU automatic evaluation metric, NMT is significantly better than PBSMT (p < 0.01) on all the novels considered. Overall, NMT yields an 11% relative improvement (3 points absolute) over PBSMT. A complementary human evaluation on three of the books shows that, depending on the book, between 17% and 34% of the translations produced by NMT (versus between 8% and 20% with PBSMT) are perceived by native speakers of the target language to be of equivalent quality to translations produced by a professional human translator.

Comment: Chapter for the forthcoming book "Translation Quality Assessment: From Principles to Practice" (Springer)
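The BLEU metric underlying the automatic comparison above can be sketched in a few lines. This is a simplified, unsmoothed single-sentence version for illustration only; published evaluations use smoothed, corpus-level implementations such as sacreBLEU, and the example sentences are invented.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Simplified BLEU: clipped n-gram precisions (geometric mean)
    times a brevity penalty. Unsmoothed and sentence-level, so any
    n-gram order with zero matches zeroes the whole score."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matches = sum(min(count, r[g]) for g, count in h.items())  # clipped
        total = max(sum(h.values()), 1)
        if matches == 0:
            return 0.0
        log_prec += math.log(matches / total) / max_n
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return bp * math.exp(log_prec)

# A 3-point absolute BLEU gain over a PBSMT baseline of roughly 27 BLEU
# corresponds to roughly the 11% relative improvement reported above.
print(round(bleu("the cat sat on the mat", "the cat sat on the mat"), 2))  # identical -> 1.0
```

Since BLEU only rewards n-gram overlap with a reference, the abstract's complementary human evaluation is what supports the claim about equivalence to professional human translation.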