Integrating source-language context into log-linear models of statistical machine translation
The translation features typically used in state-of-the-art statistical machine translation (SMT) model dependencies between the source and target phrases, but not among the phrases of the source language itself. A body of research has demonstrated that integrating source-context modelling directly into log-linear phrase-based SMT (PB-SMT) and hierarchical PB-SMT (HPB-SMT) can positively influence the weighting and selection of target phrases, and thus improve translation quality. In this thesis we present novel approaches to incorporating source-language contextual modelling into state-of-the-art SMT models in order to enhance the quality of lexical selection. We investigate the effectiveness of a range of contextual features, including lexical features of neighbouring words, part-of-speech tags, supertags, sentence-similarity features, dependency information, and semantic roles. We explore a series of language pairs featuring typologically different languages, and examine the scalability of our approach to larger amounts of training data.
While our results are mixed across feature selections, language pairs, and learning curves, we observe that including contextual features of the source sentence generally produces improvements. The most significant improvements involve the integration of long-distance contextual features, such as dependency relations in combination with part-of-speech tags in Dutch-to-English subtitle translation, the combination of dependency parse and semantic-role information in English-to-Dutch parliamentary debate translation, supertag features in English-to-Chinese translation, and the combination of supertag and lexical features in English-to-Dutch subtitle translation. Furthermore, we investigate the applicability of our lexical contextual model to another closely related NLP problem, namely machine transliteration.
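For reference, the framework these approaches extend is the standard log-linear model of SMT (Och and Ney, 2002). The following is a minimal sketch in which the added feature h_ctx and the context extractor C are illustrative names rather than notation from the thesis:

```latex
\hat{e} = \operatorname*{arg\,max}_{e}
  \left[ \sum_{m=1}^{M} \lambda_m\, h_m(f, e)
         + \lambda_{\mathrm{ctx}}\, h_{\mathrm{ctx}}\bigl(f, e, C(f)\bigr) \right]
```

Here each $h_m$ is a standard feature function (translation model, language model, etc.) with weight $\lambda_m$, while $C(f)$ collects contextual information about the source sentence $f$ (neighbouring words, POS tags, supertags, dependency relations, semantic roles) that the added feature can use to score competing target phrases.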
TermEval: an automatic metric for evaluating terminology translation in MT
Terminology translation plays a crucial role in domain-specific machine translation (MT). Preserving domain knowledge from source to target is arguably the most pressing concern for customers in the translation industry, especially in critical domains such as medicine, transportation, the military, law and aerospace. However, the evaluation of terminology translation, despite its importance to the translation industry, has received comparatively little attention in MT research. Term translation quality in MT is usually assessed by domain experts, whether in academia or industry. To the best of our knowledge, there is as yet no publicly available solution for automatically evaluating terminology translation in MT. In particular, manual intervention is often needed to evaluate terminology translation in MT, which is by nature a time-consuming and highly expensive task. This is impractical in an industrial setting, where customised MT systems often need to be updated for many reasons (e.g. the availability of new training data or of better MT techniques). Hence, there is a genuine need for a faster and less expensive solution to this problem, one which could help end-users instantly identify term translation problems in MT.
In this study, we propose an automatic evaluation metric, TermEval, for evaluating terminology translation in MT. To the best of our knowledge, there is no gold-standard dataset available for measuring terminology translation quality in MT. In the absence of such a test set, we semi-automatically create a gold-standard dataset from an English-Hindi judicial-domain parallel corpus.
We train state-of-the-art phrase-based SMT (PB-SMT) and neural MT (NMT) models in two translation directions, English-to-Hindi and Hindi-to-English, and use TermEval to evaluate their terminology translation performance on the created gold-standard test set. To measure the correlation between TermEval scores and human judgements, the translation of each source term in the gold-standard test set is validated by a human evaluator. The high correlation between TermEval and human judgements demonstrates the effectiveness of the proposed terminology translation evaluation metric. We also carry out a comprehensive manual evaluation of terminology translation and present our observations.
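The abstract does not reproduce the TermEval formula, so the following is only a rough sketch of the underlying idea, scoring the proportion of gold-standard target terms that appear in the corresponding MT hypotheses; the function name and the plain substring matching are assumptions, not the published metric:

```python
def term_translation_score(hypotheses, term_pairs):
    """Illustrative term-match rate: fraction of gold target terms
    found in the corresponding MT hypotheses (NOT the published
    TermEval definition, which this listing does not reproduce)."""
    matched = 0
    total = 0
    for hyp, terms in zip(hypotheses, term_pairs):
        hyp_lower = hyp.lower()
        for _src_term, tgt_term in terms:
            total += 1
            if tgt_term.lower() in hyp_lower:
                matched += 1
    return matched / total if total else 0.0

# Example: one hypothesis with two gold term pairs.
hyps = ["the court dismissed the appeal"]
terms = [[("nyayalaya", "court"), ("apeel", "appeal")]]
print(term_translation_score(hyps, terms))  # 1.0
```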
Using supertags as source language context in SMT
Recent research has shown that Phrase-Based Statistical Machine Translation (PB-SMT) systems can benefit from two enhancements: (i) using words and POS tags as context-informed features on the source side; and (ii) incorporating lexical syntactic descriptions in the form of supertags on the target side. In this work we present a novel PB-SMT model that combines these two aspects by using supertags as source-language context-informed features. These features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. In our experiments two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and those from Combinatory Categorial Grammar. We use a memory-based classification framework that enables the estimation of these features while avoiding problems of sparseness. Despite the differences between the two approaches, the supertaggers give similar improvements. We evaluate our approach on an English-to-Chinese translation task using a state-of-the-art phrase-based SMT system, and report an improvement of 7.88% in BLEU score when adding supertags as context-informed features.
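As a rough illustration of the classification set-up, the sketch below predicts a target phrase from source-context features (neighbouring words, POS tag and supertag of the focus word). scikit-learn's 1-nearest-neighbour classifier stands in for a memory-based learner such as TiMBL, and the feature layout is assumed for illustration, not taken from the paper:

```python
# Predict a target phrase from source-context features
# (word, POS tag, supertag in a +/-1 window around the focus word).
from sklearn.feature_extraction import DictVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

train_contexts = [
    {"w-1": "the", "w0": "bank", "w+1": "of", "pos0": "NN", "stag0": "NP/N"},
    {"w-1": "river", "w0": "bank", "w+1": ".", "pos0": "NN", "stag0": "N"},
]
train_targets = ["银行", "河岸"]  # candidate Chinese target phrases

model = make_pipeline(DictVectorizer(), KNeighborsClassifier(n_neighbors=1))
model.fit(train_contexts, train_targets)

test = {"w-1": "the", "w0": "bank", "w+1": "of", "pos0": "NN", "stag0": "NP/N"}
print(model.predict([test])[0])  # -> 银行
```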
Experiments on domain adaptation for English-Hindi SMT
Statistical Machine Translation (SMT) systems are usually trained on large amounts of bilingual text and monolingual target-language text. If a significant amount of out-of-domain data is added to the training data, translation quality can drop. On the other hand, training an SMT system on a small amount of in-domain training material leads to narrow lexical coverage, which again results in low translation quality. In this paper, (i) we explore domain-adaptation techniques to combine large out-of-domain training data with small-scale in-domain training data for English-Hindi statistical machine translation, and (ii) we cluster the large out-of-domain training data to extract sentences similar to in-domain sentences and apply adaptation techniques to combine the clustered sub-corpora with the in-domain training data in a unified framework, achieving a 0.44 absolute (4.03% relative) improvement in BLEU over the baseline.
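This abstract does not spell out the clustering set-up, so the sketch below only illustrates the general idea: cluster the out-of-domain sentences and keep the cluster whose centre lies closest to the in-domain data. The TF-IDF representation and k-means are assumptions for illustration:

```python
# Cluster out-of-domain sentences; select the cluster nearest the
# in-domain centroid as pseudo-in-domain training data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

in_domain = ["the court adjourned the hearing", "the judge passed the order"]
out_of_domain = ["the match ended in a draw", "the court heard the petition",
                 "stock prices fell sharply", "the order was stayed on appeal"]

vec = TfidfVectorizer().fit(in_domain + out_of_domain)
X_out = vec.transform(out_of_domain)
centroid = np.asarray(vec.transform(in_domain).mean(axis=0))

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_out)
# Rank clusters by the distance of their centre to the in-domain centroid.
dists = np.linalg.norm(km.cluster_centers_ - centroid, axis=1)
best = int(np.argmin(dists))
selected = [s for s, c in zip(out_of_domain, km.labels_) if c == best]
print(selected)  # sentences from the most in-domain-like cluster
```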
Fine-tuning Large Language Models for Adaptive Machine Translation
This paper presents the outcomes of fine-tuning Mistral 7B, a general-purpose large language model (LLM), for adaptive machine translation (MT). The fine-tuning process uses a combination of zero-shot and one-shot translation prompts within the medical domain. The primary objective is to enhance the real-time adaptive MT capabilities of Mistral 7B, enabling it to adapt translations to the required domain at inference time. The results, particularly for Spanish-to-English MT, showcase the efficacy of the fine-tuned model, demonstrating quality improvements in both zero-shot and one-shot translation scenarios and surpassing Mistral 7B's baseline performance. Notably, the fine-tuned Mistral outperforms ChatGPT "gpt-3.5-turbo" in zero-shot translation while achieving comparable one-shot translation quality. Moreover, the zero-shot translation of the fine-tuned Mistral matches NLLB 3.3B's performance, and its one-shot translation quality surpasses that of NLLB 3.3B. These findings emphasise the significance of fine-tuning efficient LLMs like Mistral 7B to yield high-quality zero-shot translations comparable to those of task-oriented models like NLLB 3.3B. Additionally, the adaptive gains achieved in one-shot translation are comparable to those of commercial LLMs such as ChatGPT. Our experiments demonstrate that, with a relatively small dataset of 20,000 segments incorporating a mix of zero-shot and one-shot prompts, fine-tuning significantly enhances Mistral's in-context learning ability, especially for real-time adaptive MT.
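The exact prompt templates are not given in this abstract; the sketch below illustrates plausible zero-shot and one-shot formats, where the one-shot prompt prepends a translation-memory ("fuzzy") match so the model can adapt at inference time. The wording of the templates is an assumption, not the authors' actual prompts:

```python
def zero_shot_prompt(src: str) -> str:
    return f"Spanish: {src}\nEnglish:"

def one_shot_prompt(fuzzy_src: str, fuzzy_tgt: str, src: str) -> str:
    # A translation-memory match is prepended so the model can adapt
    # its output to the domain at inference time.
    return (f"Spanish: {fuzzy_src}\nEnglish: {fuzzy_tgt}\n"
            f"Spanish: {src}\nEnglish:")

print(one_shot_prompt(
    "El paciente presenta fiebre alta.",
    "The patient presents with a high fever.",
    "El paciente presenta dolor abdominal.",
))
```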
MaTrEx: the DCU machine translation system for ICON 2008
In this paper, we describe the machine translation system developed at DCU that was used for our participation in the NLP Tools Contest of the International Conference on Natural Language Processing (ICON 2008). This was our first attempt at working on an Indian language. In this participation, we focus on various techniques for word and phrase alignment to improve system quality. For the English-Hindi translation task we exploit source-language reordering. We also carry out experiments combining in-domain and out-of-domain data to improve system performance and, as a post-processing step, we transliterate out-of-vocabulary items.
Dependency relations as source context in phrase-based SMT
The Phrase-Based Statistical Machine Translation (PB-SMT) model has recently begun to include source-context modelling, under the assumption that the correct lexical choice for an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features, such as words, parts-of-speech and supertags, have been explored as effective source context in SMT. In this paper, we show that position-independent syntactic dependency relations of the head of a source phrase can be modelled as useful source context to improve target phrase selection and thereby improve the overall performance of PB-SMT. On a Dutch-English translation task, by combining dependency relations with syntactic contextual features (part-of-speech), we achieve a 1.0 BLEU (Papineni et al., 2002) point improvement (3.1% relative) over the baseline.
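As an illustration of the kind of feature involved, the sketch below extracts the dependency relation of a source phrase's head word with spaCy; English and the chosen feature names stand in for the paper's actual Dutch pipeline and are assumptions for illustration:

```python
# Extract position-independent dependency features for a phrase head.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The old man fed the ducks by the river bank.")

span = doc[4:6]           # the source "phrase": "the ducks"
head = span.root          # syntactic head of the phrase
features = {
    "head_word": head.text,        # e.g. "ducks"
    "head_dep": head.dep_,         # relation to its governor, e.g. "dobj"
    "governor": head.head.lemma_,  # e.g. "feed"
    "head_pos": head.tag_,         # POS of the head, e.g. "NNS"
}
print(features)
```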
English-Hindi transliteration using context-informed PB-SMT: the DCU system for NEWS 2009
This paper presents English-Hindi transliteration in the NEWS 2009 Machine Transliteration Shared Task, adding source-context modelling to state-of-the-art log-linear phrase-based statistical machine translation (PB-SMT). Source-context features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. We use a memory-based classification framework that enables efficient estimation of these features while avoiding data-sparseness problems. We carried out experiments at both the character and the transliteration-unit (TU) level. Position-dependent source-context features produce significant improvements in terms of all evaluation metrics.
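As a small illustration of character-level source context, the sketch below builds fixed-width context windows around each source character; the window size of +/-2 and the padding symbol are assumptions:

```python
def char_context_windows(word, size=2, pad="_"):
    """Yield (focus_char, context) pairs for each character of `word`,
    using `pad` to fill positions beyond the word boundaries."""
    padded = pad * size + word + pad * size
    for i, ch in enumerate(word):
        j = i + size  # index of the focus character in the padded string
        yield ch, padded[j - size:j] + padded[j + 1:j + size + 1]

for ch, ctx in char_context_windows("delhi"):
    print(ch, ctx)
# d __el / e _dlh / l dehi / h eli_ / i lh__
```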
- …