Findings of the 2015 Workshop on Statistical Machine Translation
This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included, and were then evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams, submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams, submitting 7 entries.
Findings of the 2014 Workshop on Statistical Machine Translation
This paper presents the results of the WMT14 shared tasks, which included a standard news translation task, a separate medical translation task, a task for run-time estimation of machine translation quality, and a metrics task. This year, 143 machine translation systems from 23 institutions were submitted to the ten translation directions in the standard translation task. An additional 6 anonymized systems were included, and were then evaluated both automatically and manually. The quality estimation task had four subtasks, with a total of 10 teams, submitting 57 entries.
Findings of the 2016 Conference on Machine Translation
This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments). The quality estimation task had three subtasks, with a total of 14 teams, submitting 39 entries. The automatic post-editing task had a total of 6 teams, submitting 11 entries.
Findings of the 2016 Conference on Machine Translation (WMT16)
This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments).
UGENT-LT3 SCATE system for machine translation quality estimation
This paper describes the submission of the UGENT-LT3 SCATE system to the WMT15 Shared Task on Quality Estimation (QE), viz. English-Spanish word and sentence-level QE. We conceived QE as a supervised Machine Learning (ML) problem, designed additional features, and combined these with the baseline feature set to estimate quality. The sentence-level QE system re-uses the word-level predictions of the word-level QE system. We experimented with different learning methods and observed improvements over the baseline system for word-level QE with the use of the new features and by combining learning methods into ensembles. For sentence-level QE we show that using a single feature based on word-level predictions can perform better than the baseline system, and that using this in combination with additional features led to further improvements in performance.
UGENT-LT3 SCATE Submission for WMT16 Shared Task on Quality Estimation
This paper describes the submission of the UGENT-LT3 SCATE system to the WMT16 Shared Task on Quality Estimation (QE), viz. English-German word and sentence-level QE. Based on the observation that the data set is homogeneous (all sentences belong to the IT domain), we performed bilingual terminology extraction and added features derived from the resulting term list to the well-performing features of the word-level QE task of last year. For sentence-level QE, we analyzed the importance of the features and, based on those insights, extended the feature set of last year. We also experimented with different learning methods and ensembles. We present our observations from the different experiments we conducted and our submissions for both tasks.
An Unsupervised Method for Automatic Translation Memory Cleaning
We address the problem of automatically cleaning a large-scale Translation Memory (TM) in a fully unsupervised fashion, i.e. without human-labelled data. We approach the task by: i) designing a set of features that capture the similarity between two text segments in different languages, ii) using them to induce reliable training labels for a subset of the translation units (TUs) contained in the TM, and iii) using the automatically labelled data to train an ensemble of binary classifiers. We apply our method to clean a test set composed of 1,000 TUs randomly extracted from the English-Italian version of MyMemory, the world's largest public TM. Our results show competitive performance not only against a strong baseline that exploits machine translation, but also against a state-of-the-art method that relies on human-labelled data.
TranscRater: a Tool for Automatic Speech Recognition Quality Estimation
We present TranscRater, an open-source tool for automatic speech recognition (ASR) quality estimation (QE). The tool allows users to perform ASR evaluation without the reference transcripts and confidence information that current assessment protocols commonly require. TranscRater includes: i) methods to extract a variety of quality indicators from (signal, transcription) pairs and ii) machine learning algorithms that make it possible to build ASR QE models exploiting the extracted features. Confirming the positive results of previous evaluations, new experiments with TranscRater indicate its effectiveness both in WER prediction and transcription ranking tasks.
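For readers unfamiliar with the quantity being predicted, the following is a minimal sketch of the standard reference-based word error rate (WER): word-level edit distance divided by the reference length. It is not taken from TranscRater; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration. A QE model of the kind described above would aim to estimate this value when no reference transcript is available.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Reference-based WER: word-level Levenshtein distance / reference length.

    Illustrative only: tokenization is plain whitespace splitting,
    which is an assumption and not TranscRater's behaviour.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```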
TMop: a Tool for Unsupervised Translation Memory Cleaning
We present TMop, the first open-source tool for automatic Translation Memory (TM) cleaning. The tool implements a fully unsupervised approach to the task, which allows spotting unreliable translation units (sentence pairs in different languages, which are supposed to be translations of each other) without requiring labeled training data. TMop includes a highly configurable and extensible set of filters capturing different aspects of translation quality. It has been evaluated on a test set composed of 1,000 translation units (TUs) randomly extracted from the English-Italian version of MyMemory, a large-scale public TM. Results indicate its effectiveness in automatically removing "bad" TUs, with comparable performance to a state-of-the-art supervised method (76.3 vs. 77.7 balanced accuracy).
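The balanced accuracy figures quoted above follow the usual definition for a binary task: the mean of the recall on the positive class and the recall on the negative class. The sketch below illustrates that definition only; the function name `balanced_accuracy`, the boolean label encoding (True for a "good" TU), and the toy labels are assumptions, not part of TMop or its evaluation data.

```python
from typing import Sequence


def balanced_accuracy(gold: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Mean of per-class recall for a binary task.

    Labels are hypothetical: True marks a "good" translation unit,
    False a "bad" one.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    positives = sum(1 for g in gold if g)
    negatives = len(gold) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must appear in the gold labels")

    true_pos = sum(1 for g, p in zip(gold, predicted) if g and p)
    true_neg = sum(1 for g, p in zip(gold, predicted) if not g and not p)

    # Averaging the two recalls keeps a majority-class guesser at 0.5.
    return 0.5 * (true_pos / positives + true_neg / negatives)


# Toy labels (made up): 2 of 3 good TUs kept, the single bad TU flagged.
toy_gold = [True, True, True, False]
toy_pred = [True, True, False, False]
toy_score = balanced_accuracy(toy_gold, toy_pred)
```

Unlike plain accuracy, this measure does not reward a classifier for simply predicting the more frequent class, which matters when "bad" TUs are a minority of the memory.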