What Level of Quality can Neural Machine Translation Attain on Literary Text?
Given the rise of a new approach to MT, Neural MT (NMT), and its promising
performance on different text types, we assess the translation quality it can
attain on what is perceived to be the greatest challenge for MT: literary text.
Specifically, we target novels, arguably the most popular type of literary
text. We build a literary-adapted NMT system for the English-to-Catalan
translation direction and evaluate it against a system pertaining to the
previous dominant paradigm in MT: statistical phrase-based MT (PBSMT). To this
end, for the first time we train MT systems, both NMT and PBSMT, on large
amounts of literary text (over 100 million words) and evaluate them on a set of
twelve widely known novels spanning from the 1920s to the present day.
According to the BLEU automatic evaluation metric, NMT is significantly better
than PBSMT (p < 0.01) on all the novels considered. Overall, NMT results in an
11% relative improvement (3 points absolute) over PBSMT. A complementary human
evaluation on three of the books shows that between 17% and 34% of the
translations, depending on the book, produced by NMT (versus 8% and 20% with
PBSMT) are perceived by native speakers of the target language to be of
equivalent quality to translations produced by a professional human translator.
Comment: Chapter for the forthcoming book "Translation Quality Assessment:
From Principles to Practice" (Springer)
An Italian to Catalan RBMT system reusing data from existing language pairs
This paper presents an Italian→Catalan RBMT system automatically built by combining the linguistic data of the
existing pairs Spanish–Catalan and Spanish–Italian. A lightweight manual postprocessing is carried out in order to
fix inconsistencies in the automatically derived dictionaries and to add very frequent words that are missing according to a corpus analysis. The system is
evaluated on the KDE4 corpus and outperforms Google Translate by approximately ten absolute points in terms of
both TER and GTM.
Topic modeling-based domain adaptation for system combination
This paper describes the system submitted by the domain adaptation team of Dublin City University to the system combination task of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). We used the results of unsupervised document classification as meta-information for the system combination module. For the Spanish-English data, our strategy achieved 26.33 BLEU points, an absolute improvement of 0.33 BLEU points over the standard confusion-network-based system combination. This was the best BLEU score among the six participants in ML4HMT-12.
Reassessing Claims of Human Parity and Super-Human Performance in Machine Translation at WMT 2019
We reassess the claims of human parity and super-human performance made at
the news shared task of WMT 2019 for three translation directions:
English-to-German, English-to-Russian and German-to-English. First we identify
three potential issues in the human evaluation of that shared task: (i) the
limited amount of intersentential context available, (ii) the limited
translation proficiency of the evaluators and (iii) the use of a reference
translation. We then conduct a modified evaluation taking these issues into
account. Our results indicate that all the claims of human parity and
super-human performance made at WMT 2019 should be refuted, except the claim of
human parity for English-to-German. Based on our findings, we put forward a set
of recommendations and open questions for future assessments of human parity in
machine translation.
Comment: Accepted at the 22nd Annual Conference of the European Association
for Machine Translation (EAMT 2020)
Ordering dynamics in the voter model with aging
The voter model with memory-dependent dynamics is theoretically and
numerically studied at the mean-field level. The `internal age', or time an
individual spends holding the same state, is added to the set of binary states
of the population, such that the probability of changing state (or activation
probability $p_i$) depends on this age. A closed set of integro-differential
equations describing the time evolution of the fraction of individuals with a
given state and age is derived, and from it analytical results are obtained
characterizing the behavior of the system close to the absorbing states. In
general, different age-dependent activation probabilities have different
effects on the dynamics. When the activation probability $p_i$ is an increasing
function of the age $i$, the system reaches a steady state with coexistence of
opinions. In the case of aging, with $p_i$ being a decreasing function, either
the system reaches consensus or it gets trapped in a frozen state, depending on
the value of $p_\infty$ (zero or not) and the velocity of $p_i$ approaching
$p_\infty$. Moreover, when the system reaches consensus, the time ordering of
the system can be exponential ($p_\infty > 0$) or power-law like ($p_\infty = 0$).
Exact conditions for having one or another behavior, together with the
equations and explicit expressions for the exponents, are provided.
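
The age-dependent dynamics summarised in this abstract can be illustrated with a small all-to-all simulation. This is only a sketch and is not taken from the paper. The function name `simulate_aging_voter`, the `activation` callable and the exact update rule (age reset to zero on a change of state, incremented by one otherwise) are assumptions made for illustration.

```python
import random


def simulate_aging_voter(n, activation, max_sweeps, seed=0):
    """Sketch of a mean-field (all-to-all) voter model with internal ages.

    `activation(age)` returns the probability that an agent of that age
    tries to copy another agent. Assumed rule: the age resets to 0 when
    the agent changes state and grows by 1 otherwise.

    Returns (sweeps_run, fraction_of_agents_in_state_plus_one).
    """
    rng = random.Random(seed)
    states = [rng.choice((-1, 1)) for _ in range(n)]
    ages = [0] * n
    n_up = sum(1 for s in states if s == 1)

    for sweep in range(max_sweeps):
        # Stop once an absorbing (consensus) state has been reached.
        if n_up == 0 or n_up == n:
            return sweep, n_up / n
        for _ in range(n):
            a = rng.randrange(n)
            changed = False
            if rng.random() < activation(ages[a]):
                # Pick a different agent uniformly at random and copy it.
                b = rng.randrange(n - 1)
                if b >= a:
                    b += 1
                if states[b] != states[a]:
                    n_up += states[b]  # +1 if a flips up, -1 if it flips down
                    states[a] = states[b]
                    changed = True
            ages[a] = 0 if changed else ages[a] + 1
    return max_sweeps, n_up / n
```

For example, `simulate_aging_voter(200, lambda age: 1.0 / (age + 2), 5000)` runs the sketch with one possible decreasing (aging) activation probability, while `lambda age: 0.5` gives an age-independent rate for comparison. Both functional forms are illustrative choices, not the ones analysed in the paper.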