Bayesian reordering model with feature selection

Authors: Alrajeh Abdullah; Niranjan Mahesan
Publication date: 01/01/2014

In phrase-based statistical machine translation systems, differences in grammatical structure between the source and target languages can cause large movements of phrases. Modeling such movements is crucial for producing translations of long sentences that read naturally in the target language. We explore a generative learning approach to phrase reordering in Arabic-to-English translation. By formulating reordering as a classification problem and using naive Bayes with feature selection, we achieve an improvement in BLEU score over a lexicalized reordering model. The proposed model is compact, fast, and scalable to a large corpus.
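As a rough illustration of the idea in this abstract, the sketch below treats reordering as classification with a multinomial naive Bayes model and a feature-selection step. Everything specific here is an assumption made for illustration: the orientation labels (`monotone`, `swap`), the feature strings, the toy data, and the frequency cutoff used for selection. The abstract does not state which features or selection criterion the authors use.

```python
import math
from collections import Counter, defaultdict


def select_features(examples, min_count=2):
    """Keep features seen at least min_count times.

    A plain frequency cutoff, used here as a stand-in for the
    (unspecified) selection criterion in the paper.
    """
    counts = Counter(f for feats, _ in examples for f in feats)
    return {f for f, c in counts.items() if c >= min_count}


def train(examples, selected, alpha=1.0):
    """Collect class and per-class feature counts over selected features."""
    class_counts = Counter()
    feat_counts = defaultdict(Counter)
    for feats, label in examples:
        class_counts[label] += 1
        for f in feats:
            if f in selected:
                feat_counts[label][f] += 1
    return class_counts, feat_counts, selected, alpha


def predict(model, feats):
    """Return the orientation with the highest log posterior score."""
    class_counts, feat_counts, selected, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        # Add-alpha smoothing over the selected feature vocabulary.
        denom = sum(feat_counts[label].values()) + alpha * len(selected)
        score = math.log(n / total)
        for f in feats:
            if f in selected:  # unselected features are ignored
                score += math.log((feat_counts[label][f] + alpha) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training pairs: (features of a phrase pair, orientation).
examples = [
    (["src:kitab", "tgt:book"], "monotone"),
    (["src:kitab", "tgt:the"], "monotone"),
    (["src:kabir", "tgt:big"], "swap"),
    (["src:kabir", "tgt:large"], "swap"),
]

selected = select_features(examples)
model = train(examples, selected)
prediction = predict(model, ["src:kabir", "tgt:huge"])
print(prediction)
```

Because the model is only a table of counts over the retained features, pruning rare features shrinks it directly, which is one plausible reading of the "compact" and "fast" claims above.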

Southampton (e-Prints Soton)

Gap between theory and practice: noise sensitive word alignment in machine translation

Authors: Graham Yvette; Okita Tsuyoshi; Way Andy
Publication venue: Journal of Machine Learning Research
Publication date: 01/01/2010

Word alignment is the task of estimating a lexical translation probability p(e|f), or a correspondence g(e, f), where the function g outputs either 0 or 1, between a source word f and a target word e in given bilingual sentences. In practice, this formulation does not account for ‘noise’ (or outliers), which may cause problems depending on the corpus. N-to-m mapping objects, such as paraphrases, non-literal translations, and multiword expressions, may appear both as noise and as valid training data. From this perspective, this paper tries to answer two questions: 1) how to detect stable patterns where noise seems legitimate, and 2) how to reduce such noise, where applicable, by supplying extra information as prior knowledge to a word aligner.
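To make the quantity p(e|f) concrete, here is a minimal sketch of estimating it with expectation-maximization in the style of IBM Model 1. This is a standard textbook baseline chosen for illustration, not the method of the paper: it has no NULL word, no noise handling, and no prior knowledge, and the three sentence pairs are toy data.

```python
from collections import defaultdict


def ibm_model1(pairs, iterations=10):
    """Estimate t[(e, f)] ~ p(e|f) from (source_words, target_words) pairs."""
    target_vocab = {e for _, es in pairs for e in es}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialization

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) links
        total = defaultdict(float)  # expected count of f being linked
        # E-step: distribute each target word over the source words.
        for fs, es in pairs:
            for e in es:
                z = sum(t[(e, f)] for f in fs)
                for f in fs:
                    c = t[(e, f)] / z
                    count[(e, f)] += c
                    total[f] += c
        # M-step: renormalize so that p(.|f) sums to one for each f.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    return t


# Toy parallel corpus (source side German, target side English).
pairs = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]

t = ibm_model1(pairs)
print(round(t[("the", "das")], 3), round(t[("book", "Buch")], 3))
```

Every co-occurring word pair receives some probability mass under this estimator, so a non-literal translation or paraphrase in the corpus shifts the estimates in the same way a valid pair does. That is the sensitivity to noise the abstract describes.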

DCU Online Research Access Service