TermEval: an automatic metric for evaluating terminology translation in MT
Terminology translation plays a crucial role in domain-specific machine translation (MT). Preservation of domain knowledge from source to target is arguably the most pressing concern for customers in the translation industry, especially in critical domains such as medicine, transportation, the military, law and aerospace. However, despite its importance to the translation industry, the evaluation of terminology translation has been a less examined area of MT research. Term translation quality in MT is usually assessed by domain experts, in either academia or industry. To the best of our knowledge, there is as yet no publicly available solution for automatically evaluating terminology translation in MT. In particular, manual intervention is often needed to evaluate terminology translation in MT, which is by nature a time-consuming and expensive task. This is impractical in an industrial setting, where customised MT systems often need to be updated for many reasons (e.g. the availability of new training data or of better MT techniques). Hence, there is a genuine need for a faster and less expensive solution to this problem, which could help end-users to identify term translation problems in MT instantly.
In this study, we propose an automatic evaluation metric, TermEval, for evaluating terminology translation in MT. To the best of our knowledge, there is no gold-standard dataset available for measuring terminology translation quality in MT. In the absence of such a test set, we semi-automatically create a gold-standard dataset from an English--Hindi judicial-domain parallel corpus.
We trained state-of-the-art phrase-based SMT (PB-SMT) and neural MT (NMT) models in two translation directions, English-to-Hindi and Hindi-to-English, and used TermEval to evaluate their performance on terminology translation over the created gold-standard test set. To measure the correlation between TermEval scores and human judgements, the translation of each source term in the gold-standard test set was validated by a human evaluator. The high correlation between TermEval and human judgements demonstrates the effectiveness of the proposed terminology translation evaluation metric. We also carry out a comprehensive manual evaluation of terminology translation and present our observations.
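The abstract does not spell out how TermEval scores a hypothesis, so the following is only a minimal sketch of automatic terminology evaluation under one simple assumption: a term translation counts as correct when the reference target term appears verbatim in the MT output. The function name, data layout, and the transliterated Hindi examples are all illustrative, not taken from the paper.

```python
def term_accuracy(term_pairs, hypotheses):
    """Fraction of source terms whose reference target term
    appears in the corresponding MT hypothesis.

    term_pairs: one (source_term, reference_target_term) per sentence
    hypotheses: MT output sentences, in the same order
    """
    if not term_pairs:
        return 0.0
    hits = 0
    for (_src_term, tgt_term), hyp in zip(term_pairs, hypotheses):
        # Exact, case-insensitive substring match of the reference term
        if tgt_term.lower() in hyp.lower():
            hits += 1
    return hits / len(term_pairs)

pairs = [("bail", "zamaanat"), ("appeal", "apeel")]
outs = ["adaalat ne zamaanat manzoor ki", "unhone faisle ke khilaf arji di"]
print(term_accuracy(pairs, outs))  # 0.5: "zamaanat" found, "apeel" missing
```

A real metric would also have to handle morphological variants and multiple valid term translations; exact matching is only the simplest baseline.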
Sheffield University CLEF 2000 submission - bilingual track: German to English
We investigated dictionary-based cross-language information retrieval using lexical triangulation. Lexical triangulation combines the results of different transitive translations. Transitive translation uses a pivot language to translate between two languages when no direct translation resource is available. We took German queries and translated them via Spanish or Dutch into English. We compared retrieval results using these queries with other versions created by combining the transitive translations or by direct translation. Direct dictionary translation of a query introduces considerable ambiguity that damages retrieval, giving average precision 79% below monolingual retrieval in this research. Transitive translation introduces still more ambiguity, giving results more than 88% below direct translation. We have shown that lexical triangulation between two transitive translations can eliminate much of the additional ambiguity introduced by transitive translation.
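The triangulation idea described above can be sketched with toy dictionaries: translate a German word into English through two different pivot languages and keep only the candidates both routes agree on. The dictionary entries below are illustrative, not taken from the paper's lexicons.

```python
# Toy bilingual dictionaries (illustrative entries only)
de_es = {"Bank": ["banco", "orilla"]}
es_en = {"banco": ["bank", "bench"], "orilla": ["shore", "bank"]}
de_nl = {"Bank": ["bank", "oever"]}
nl_en = {"bank": ["bank", "couch", "sofa"], "oever": ["bank", "shore"]}

def transitive(word, d1, d2):
    """All target candidates reachable through one pivot language."""
    out = set()
    for pivot in d1.get(word, []):
        out.update(d2.get(pivot, []))
    return out

def triangulate(word, route_a, route_b):
    """Keep only candidates produced by BOTH transitive routes."""
    return transitive(word, *route_a) & transitive(word, *route_b)

print(sorted(triangulate("Bank", (de_es, es_en), (de_nl, nl_en))))
# ['bank', 'shore']
```

Each single route yields three or four candidates ("bench", "couch" and "sofa" among them); intersecting the two routes discards the pivot-specific noise, which is exactly the ambiguity reduction the experiments measure.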
European English terms for Italian legal concepts: the case of the Italian Code of Criminal Procedure
The translation of the Italian Code of Criminal Procedure into English, published in 2014, represents a way of explaining the functioning of Italian criminal procedure to a wide English-speaking audience. Given the different varieties of English available, the translation team chose European English as the target language of the translation. After a brief overview of the central role played by English in the European supranational and international context, the paper presents a classification of translation equivalents used for the translation of the Code and illustrates it with concrete examples. This classification is based on two criteria, namely the availability of European English translation equivalents in the reference corpus of European documents used by the translation team and the degree of embeddedness of the underlying concept in the national legal system. The resulting classification is threefold and comprises European English translation equivalents for Italian terms designating legal concepts shared by both national and supranational/international legal systems, European English translation equivalents for Italian terms designating legal concepts embedded in the national legal system only, and Italian terms designating legal concepts embedded in the national legal system with no European English translation equivalent.
Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation
Recent works in spoken language translation (SLT) have attempted to build end-to-end speech-to-text translation without using source-language transcription during learning or decoding. However, while large quantities of parallel texts (such as Europarl and OpenSubtitles) are available for training machine translation systems, there are no large (100 h), open-source parallel corpora that include speech in a source language aligned to text in a target language. This paper tries to fill this gap by augmenting an existing (monolingual) corpus: LibriSpeech. This corpus, used for automatic speech recognition, is derived from read audiobooks from the LibriVox project and has been carefully segmented and aligned. After gathering French e-books corresponding to the English audiobooks from LibriSpeech, we align speech segments at the sentence level with their respective translations and obtain 236 h of usable parallel data. This paper presents the details of the processing as well as a manual evaluation conducted on a small subset of the corpus. This evaluation shows that the automatic alignment scores are reasonably correlated with human judgements of bilingual alignment quality. We believe that this corpus (which is made available online) is useful for replicable experiments in direct speech translation or more general spoken language translation.
Comment: LREC 2018, Japan
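The corpus creation hinges on sentence-level alignment of English speech segments with French text. The paper's actual alignment pipeline is more sophisticated; the sketch below only illustrates the classic length-ratio heuristic that such pipelines typically start from. The function names and threshold are assumptions, not from the paper.

```python
def length_ratio(src, tgt):
    """Character-length ratio in [0, 1]; 1.0 means equal lengths."""
    if not src or not tgt:
        return 0.0
    return min(len(src), len(tgt)) / max(len(src), len(tgt))

def filter_pairs(pairs, threshold=0.6):
    """Keep only (english, french) pairs with compatible lengths."""
    return [(s, t) for s, t in pairs if length_ratio(s, t) >= threshold]

pairs = [
    ("He opened the door.", "Il ouvrit la porte."),   # plausible pair
    ("Yes.", "Il y avait longtemps qu'il dormait."),  # length mismatch
]
print(filter_pairs(pairs))  # keeps only the first pair
```

In practice such a score is combined with lexical evidence before human validation, which is what the paper's manual evaluation of alignment quality then checks.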
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus
Much research effort is devoted to semantic role labeling (SRL), which is crucial for natural language understanding. Supervised approaches have achieved impressive performance when large-scale corpora are available, as for resource-rich languages such as English. For low-resource languages with no annotated SRL dataset, however, it is still challenging to obtain competitive performance. Cross-lingual SRL is one promising way to address the problem, and it has achieved great advances with the help of model transfer and annotation projection. In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. Experimental results on the Universal Proposition Bank show that the translation-based method is highly effective, and the automatic pseudo datasets can significantly improve target-language SRL performance.
Comment: Accepted at ACL 2020
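Annotation projection, one of the prior approaches the abstract contrasts with, transfers SRL labels through word alignments. A minimal sketch of that projection step, with token-level labels and a toy alignment (all names and example sentences are illustrative, not from the paper):

```python
def project_srl(src_labels, alignment, tgt_len):
    """Project per-token SRL labels from source to target through
    a word alignment given as (src_idx, tgt_idx) pairs."""
    tgt_labels = ["O"] * tgt_len
    for s, t in alignment:
        if src_labels[s] != "O":
            tgt_labels[t] = src_labels[s]
    return tgt_labels

# "She bought books" -> "Elle a acheté des livres" (toy alignment)
src = ["ARG0", "V", "ARG1"]
align = [(0, 0), (1, 2), (2, 4)]
print(project_srl(src, align, 5))  # ['ARG0', 'O', 'V', 'O', 'ARG1']
```

Projection noise (alignment errors, unaligned tokens left as "O") is precisely what motivates constructing higher-quality translated training corpora instead.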
Audacious Translation: On Being Haunted and Getting Lost on the Way to Translating Spivak. A Reflection on Spivak’s “Translating into English”
In “Translating Into English”, within An Aesthetic Education in the Era of Globalization (2012), Spivak eludes apprehension, spurns comprehension, and resists neat translation as I, an American educator, attempt to make sense of what is meant by an aesthetic education as Spivak translates the act of translation. Caught and othered as a language broker in learning the double bind of translation, I find no answers, only new questions as I grope toward ways to conceptualize and to name this moment for translators and language educators: (1) What does it mean to be a translator? (2) Can and should the convenient genie of English as the language of power and globalization be pushed back into the bottle to make room for linguistic diversity? (3) What is essentially lost in translation when indigenous languages are abandoned and no longer nuanced with meaning, when “lingual memory” is no longer available? This paper then examines the ethics and struggle of honoring Spivak’s call to be haunted in light of the double bind faced by immigrant K-12 students encountering the power of English in U.S. K-12 schools.
Machine Translation of Arabic Dialects
This thesis discusses different approaches to machine translation (MT) from Dialectal Arabic (DA) to English. These approaches address Arabic dialects at varying stages of resource availability, in terms of both the types of resources and the amounts of training data. The overall theme of this work revolves around building dialectal resources and MT systems, or enriching existing ones, using the currently available resources (dialectal or standard), in order to scale quickly and cheaply to more dialects without the need to spend years and millions of dollars creating such resources for every dialect.
Unlike for Modern Standard Arabic (MSA), DA-English parallel corpora are scarce and available for only a few dialects. Dialects differ from each other and from MSA in orthography, morphology, phonology and, to a lesser degree, syntax. This means that combining all available parallel data, from dialects and MSA, to train DA-to-English statistical machine translation (SMT) systems might not provide the desired results. Similarly, translating dialectal sentences with an SMT system trained on that dialect only is also challenging, owing to various factors that affect the sentence's word choices relative to those of the SMT training data. Such factors include the level of dialectness (e.g., code-switching to MSA versus dialectal training data), topic (sports versus politics), genre (tweets versus newspaper), script (Arabizi versus Arabic), and the timespan of the test data relative to the training data. The work we present utilizes any available Arabic resource, such as a preprocessing tool or a parallel corpus, whether MSA or DA, to improve DA-to-English translation and expand to more dialects and sub-dialects.
The majority of Arabic dialects have no parallel data to English or to any other foreign language. They also have no preprocessing tools such as normalizers, morphological analyzers, or tokenizers. For such dialects, we present an MSA-pivoting approach where DA sentences are translated to MSA first, then the MSA output is translated to English using the wealth of MSA-English parallel data. Since there is virtually no DA-MSA parallel data to train an SMT system, we build a rule-based DA-to-MSA MT system, ELISSA, that uses morpho-syntactic translation rules along with dialect identification and language modeling components. We also present a rule-based approach to quickly and cheaply build a dialectal morphological analyzer, ADAM, which provides ELISSA with dialectal word analyses.
Other Arabic dialects have relatively small DA-English parallel corpora, amounting to a few million words on the DA side. Some of these dialects have dialect-dependent preprocessing tools that can be used to prepare the DA data for SMT systems. We present techniques to generate synthetic parallel data from the available DA-English and MSA-English data. We use this synthetic data to build statistical and hybrid versions of ELISSA, as well as to improve our rule-based, ELISSA-based MSA-pivoting approach. We evaluate our best MSA-pivoting MT pipeline against three direct SMT baselines trained on three parallel corpora: DA-English data only, MSA-English data only, and the combination of DA-English and MSA-English data. Furthermore, we leverage these four MT systems (the three baselines along with our MSA-pivoting system) in two system combination approaches that benefit from their strengths while avoiding their weaknesses.
Finally, we propose an approach to modeling dialects from monolingual data and limited DA-English parallel data, without the need for any language-dependent preprocessing tools. We learn DA preprocessing rules using word embeddings and expectation maximization. We test this approach by building a morphological segmentation system, and we evaluate its performance on MT against the state-of-the-art dialectal tokenization tool.
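The MSA-pivoting architecture described in this thesis chains two MT systems. The sketch below shows only the plumbing, with toy dictionary look-ups standing in for ELISSA (rule-based DA-to-MSA) and for an MSA-to-English SMT system; the transliterated vocabulary entries are illustrative, not from the thesis.

```python
def translate_da_to_en(sentence, da_to_msa, msa_to_en):
    """Pivot pipeline: Dialectal Arabic -> MSA -> English.
    Both arguments are callables wrapping MT systems."""
    return msa_to_en(da_to_msa(sentence))

# Toy stand-ins (transliterated, illustrative vocabulary only)
da_msa_rules = {"mish": "laysa", "3ayiz": "urid"}
def toy_da_to_msa(s):
    return " ".join(da_msa_rules.get(w, w) for w in s.split())

msa_en_lex = {"ana": "I", "laysa": "not", "urid": "want"}
def toy_msa_to_en(s):
    return " ".join(msa_en_lex.get(w, w) for w in s.split())

print(translate_da_to_en("ana mish 3ayiz", toy_da_to_msa, toy_msa_to_en))
# -> "I not want"
```

Because each stage is just a callable, the same pipeline shape can host the rule-based, statistical, or hybrid ELISSA variants that the thesis compares.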
Investigating the cross-lingual translatability of VerbNet-style classification.
VerbNet, the most extensive online verb lexicon currently available for English, has proved useful in supporting a variety of NLP tasks. However, its exploitation in multilingual NLP has been limited by the fact that such classifications are available for only a few languages. Since manual development of VerbNet is a major undertaking, researchers have recently translated VerbNet classes from English to other languages. However, no systematic investigation has been conducted into the applicability and accuracy of such a translation approach across different, typologically diverse languages. Our study aims to fill this gap. We develop a systematic method for translating VerbNet classes from English to other languages, which we first apply to Polish and subsequently to Croatian, Mandarin, Japanese, Italian, and Finnish. Our results on Polish demonstrate high translatability for all the classes (96% of English member verbs successfully translated into Polish) and strong inter-annotator agreement, revealing a promising degree of overlap in the resultant classifications. The results on the other languages are equally promising. This demonstrates that VerbNet classes have strong cross-lingual potential, and the proposed method could be applied to obtain gold standards for automatic verb classification in different languages. We make our annotation guidelines and the six language-specific verb classifications available with this paper.