Machine translation evaluation resources and methods: a survey
We present a survey of Machine Translation (MT) evaluation covering both manual and automatic evaluation methods. Traditional human evaluation criteria include intelligibility, fidelity, fluency, adequacy, comprehension, and informativeness; more advanced human assessments include task-oriented measures, post-editing, segment ranking, and extended criteria. We classify automatic evaluation methods into two categories: lexical similarity and the application of linguistic features. Lexical similarity methods cover edit distance, precision, recall, F-measure, and word order. Linguistic features divide into syntactic features, such as part-of-speech tags, phrase types, and sentence structures, and semantic features, such as named entities, synonyms, textual entailment, paraphrase, semantic roles, and language models. Deep learning models for evaluation have only recently been proposed. Finally, we introduce meta-evaluation methods for MT evaluation, including different correlation scores, and the recent quality estimation (QE) tasks for MT.
This paper differs from existing works \cite{GALEprogram2009, EuroMatrixProject2007} in several respects: it introduces recent developments in MT evaluation measures, offers a different classification from manual to automatic evaluation measures, covers the recent QE tasks of MT, and presents the content concisely.
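The lexical similarity metrics the survey lists (precision, recall, F-measure over surface tokens) can be sketched in a few lines. This is a minimal illustration over unigrams, not any specific metric from the survey:

```python
from collections import Counter

def lexical_prf(candidate, reference):
    """Unigram precision, recall and F-measure between a candidate
    translation and a single reference (both given as token lists)."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())  # clipped token matches
    precision = overlap / len(candidate) if candidate else 0.0
    recall = overlap / len(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

p, r, f = lexical_prf("the cat sat on the mat".split(),
                      "the cat is on the mat".split())
```

Metrics such as BLEU extend this idea to higher-order n-grams with a brevity penalty; the F-measure family (e.g. METEOR) additionally folds in the synonym and paraphrase matching mentioned under semantic features.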
Accuracy-based scoring for phrase-based statistical machine translation
Although the scoring features of state-of-the-art Phrase-Based Statistical Machine Translation (PB-SMT) models are weighted so as to optimise an objective function measuring translation quality, the estimation of the features themselves bears no relation to such quality metrics. In this paper, we introduce a translation quality-based feature into PB-SMT in a bid to improve the translation quality of the system. Our feature is estimated by averaging the edit distance between phrase pairs involved in the translation of oracle sentences, chosen by automatic evaluation metrics from the N-best outputs of a baseline system, and phrase pairs occurring in the N-best list. Using our method, we report a statistically significant 2.11% relative improvement in BLEU score for the WMT 2009 Spanish-to-English translation task. We also report statistically significant improvements over the baseline on many other MT evaluation metrics, as well as a substantial increase in speed and reduction in memory use (due to an 87% reduction in phrase-table size) while maintaining significant gains in translation quality.
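The averaged edit-distance feature described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the word-level Levenshtein distance and the pairing of oracle phrases with N-best phrases are assumptions about the setup:

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two phrases (token lists),
    computed with a single rolling row of the DP table."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # delete
                                     dp[j - 1] + 1,    # insert
                                     prev + (x != y))  # substitute/match
    return dp[-1]

def avg_pair_distance(pairs):
    """Average edit distance over (oracle_phrase, nbest_phrase) pairs."""
    return sum(edit_distance(a, b) for a, b in pairs) / len(pairs)

pairs = [("the house".split(), "a house".split()),
         ("green tea".split(), "green tea".split())]
avg = avg_pair_distance(pairs)  # (1 + 0) / 2 = 0.5
```

A low average distance indicates that a phrase pair tends to appear in translations close to the oracle, which is what the quality-based feature rewards.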
Neural Automatic Post-Editing Using Prior Alignment and Reranking
We present a second-stage machine translation (MT) system based on a neural machine translation (NMT) approach to automatic post-editing (APE) that improves the translation quality provided by a first-stage MT system. Our APE system (APESym) is an extended version of an attention-based NMT model with bilingual symmetry, employing bidirectional models mt → pe and pe → mt. APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE, and the best reported score on the WMT 2016 APE dataset by a previous neural APE system. Re-ranking (APERerank) of the n-best translations from the phrase-based APE and APESym systems provides further substantial improvements over the symmetric neural APE model. Human evaluation confirms that the APERerank-generated PE translations improve on the previous best neural APE system at WMT 2016.
Santanu Pal is supported by the People Programme (Marie Curie Actions) of the European Union's Framework Programme (FP7/2007-2013) under REA grant agreement no 317471. Sudip Kumar Naskar is supported by Media Lab Asia, MeitY, Government of India, under the Young Faculty Research Fellowship of the Visvesvaraya PhD Scheme for Electronics & IT. Qun Liu and Josef van Genabith are supported by funding from the European Union Horizon 2020 research and innovation programme under grant agreement no 645452 (QT21).
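Re-ranking an n-best list, as the APERerank stage does, amounts to re-scoring each candidate with a richer feature set and picking the top scorer. This is a generic linear re-ranker sketch; the feature names (`ape_score`, `lm`) and weights are hypothetical, not the paper's:

```python
def rerank(nbest, weights):
    """Return the translation whose weighted feature sum is highest.
    `nbest` is a list of (translation, feature_dict) pairs."""
    def score(feats):
        return sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return max(nbest, key=lambda cand: score(cand[1]))[0]

# Hypothetical candidates and features, for illustration only.
nbest = [("hypo A", {"ape_score": 0.4, "lm": -2.0}),
         ("hypo B", {"ape_score": 0.6, "lm": -2.5})]
best = rerank(nbest, {"ape_score": 1.0, "lm": 0.1})  # picks "hypo B"
```

In practice the n-best lists from both the phrase-based APE and APESym systems would be pooled before re-scoring, and the weights tuned on held-out data.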
Machine Translation Using Automatically Inferred Construction-Based Correspondence and Language Models
PACLIC 23 / City University of Hong Kong / 3-5 December 2009
Coupling hierarchical word reordering and decoding in phrase-based statistical machine translation
In this paper, we start from the existing idea of taking reordering rules automatically derived from syntactic representations and applying them in a preprocessing step before translation, to make the source sentence structurally more similar to the target; we then propose a new approach to hierarchically extracting these rules. We evaluate this approach, combined with lattice-based decoding, and show improvements over state-of-the-art distortion models.
Syntactic discriminative language model rerankers for statistical machine translation
This article describes a method that successfully exploits syntactic features for n-best translation candidate reranking using perceptrons. We motivate the utility of syntax by demonstrating the superior performance of parsers over n-gram language models in differentiating between Statistical Machine Translation output and human translations. Our approach uses discriminative language modelling to rerank the n-best translations generated by a statistical machine translation system. The performance is evaluated for Arabic-to-English translation using NIST's MT-Eval benchmarks. While deep features extracted from parse trees do not consistently help, we show how features extracted from a shallow part-of-speech annotation layer outperform a competitive baseline and a state-of-the-art comparative reranking approach, leading to significant BLEU improvements on three different test sets.
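The perceptron training used for such discriminative rerankers can be sketched as a structured-perceptron update over n-best feature vectors. This is a minimal sketch under assumed conventions (sparse feature dicts, an oracle candidate chosen by BLEU); the feature name `pos_bigram` is illustrative, not from the paper:

```python
def perceptron_update(weights, nbest_feats, oracle_idx, lr=1.0):
    """One structured-perceptron step: if the model's current top
    candidate differs from the oracle (best-BLEU) candidate, move the
    weights toward the oracle's features and away from the prediction's."""
    def score(feats):
        return sum(weights.get(k, 0.0) * v for k, v in feats.items())
    pred_idx = max(range(len(nbest_feats)),
                   key=lambda i: score(nbest_feats[i]))
    if pred_idx != oracle_idx:
        for k, v in nbest_feats[oracle_idx].items():
            weights[k] = weights.get(k, 0.0) + lr * v
        for k, v in nbest_feats[pred_idx].items():
            weights[k] = weights.get(k, 0.0) - lr * v
    return weights

# Toy n-best list with one hypothetical POS-bigram feature.
weights = {}
nbest_feats = [{"pos_bigram": 1.0}, {"pos_bigram": 2.0}]
perceptron_update(weights, nbest_feats, oracle_idx=1)
```

After the update the oracle candidate outscores the previous prediction; iterating this over many n-best lists yields the reranking weights.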
Coverage model for character-based neural machine translation
In collaboration with the Universitat de Barcelona (UB) and the Universitat Rovira i Virgili (URV).
In recent years, Neural Machine Translation (NMT) has achieved state-of-the-art performance in translating from a source language to a target language. However, many of the proposed methods use word embedding techniques to represent a sentence in the source or target language. Character embedding techniques have been suggested to represent the words in a sentence better. Moreover, recent NMT models use an attention mechanism, in which the most relevant words in a source sentence are used to generate each target word. The problem with this approach is that while some words are translated multiple times, others are not translated at all. To address this problem, a coverage model has been integrated into NMT to keep track of already-translated words and focus on the untranslated ones. In this research, we present a new architecture that uses character embeddings to represent the source and target words, together with a coverage model to ensure that all words are translated. Compared with previous models, ours shows comparable improvements: it achieves an improvement of 2.87 BLEU (BiLingual Evaluation Understudy) points over the attention-model baseline for German-English translation, and of 0.34 BLEU points for Catalan-Spanish translation.
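The coverage idea can be sketched independently of any particular architecture: keep a running sum of past attention weights per source position and penalise positions that have already been attended to. This is a simplified sketch, not the paper's model; the additive penalty with coefficient `beta` is one common formulation, assumed here for illustration:

```python
import math

def attend_with_coverage(scores, coverage, beta=1.0):
    """One decoding step: subtract a penalty proportional to the
    accumulated coverage from the raw attention scores, renormalise
    with a softmax, then add the new attention weights back into
    the coverage vector."""
    adjusted = [s - beta * c for s, c in zip(scores, coverage)]
    m = max(adjusted)                          # stable softmax
    exps = [math.exp(a - m) for a in adjusted]
    total = sum(exps)
    weights = [e / total for e in exps]
    coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, coverage

# Position 0 has already received attention once, so it is down-weighted
# relative to the untranslated positions 1 and 2.
weights, coverage = attend_with_coverage([1.0, 1.0, 1.0], [1.0, 0.0, 0.0])
```

Over a full decode, the coverage vector approaches one per source position, which is what discourages both over-translation and omission.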