Low-resource machine translation using MATREX: The DCU machine translation system for IWSLT 2009

Author: Cetinoglu Ozlem
Du Jinhua
Ma Yanjun
Okita Tsuyoshi
Way Andy
Publication date: 01/01/2009

In this paper, we describe the Machine Translation (MT) system developed at DCU that was used for our fourth participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2009). Two techniques are deployed in our system to improve translation quality in a low-resource scenario. The first technique is to use multiple segmentations in MT training and to utilise word lattices in the decoding stage. The second technique selects the optimal training data for building MT systems. In this year’s participation, we use three different prototype SMT systems, and the outputs of the three systems are combined using a standard system combination method. Our system is the top system for the Chinese–English CHALLENGE task in terms of BLEU score.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Using TERp to augment the system combination for SMT

Author: Du Jinhua
Way Andy
Publication venue: Association for Machine Translation in the Americas
Publication date: 01/01/2010

TER-Plus (TERp) is an extended TER evaluation metric incorporating morphology, synonymy and paraphrases. There are three new edit operations in TERp: Stem Matches, Synonym Matches and Phrase Substitutions (Paraphrases). In this paper, we propose a TERp-based augmented system combination in terms of the backbone selection and consensus decoding network. Combining the new properties of TERp, we also propose a two-pass decoding strategy for the lattice-based phrase-level confusion network (CN) to generate the final result. The experiments conducted on the NIST2008 Chinese-to-English test set show that our TERp-based augmented system combination framework achieves significant improvements in terms of BLEU and TERp scores compared to the state-of-the-art word-level system combination framework and a TER-based combination strategy.
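
Backbone selection, as mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the authors' TERp-based method: it swaps in plain word-level Levenshtein distance (no shifts, stems, synonyms, or paraphrases) as a stand-in for TER/TERp, and the function names `word_edit_distance` and `select_backbone` are hypothetical. The idea shown is only that the backbone is the hypothesis closest, on aggregate, to all the others.

```python
def word_edit_distance(hyp: str, ref: str) -> int:
    """Word-level Levenshtein distance (a simplified stand-in for TER/TERp)."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))  # distances from the empty hypothesis prefix
    for i, hw in enumerate(h, 1):
        curr = [i]
        for j, rw in enumerate(r, 1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def select_backbone(hypotheses: list[str]) -> int:
    """Return the index of the hypothesis with the lowest total distance to the rest."""
    def total_distance(i: int) -> int:
        return sum(word_edit_distance(hypotheses[j], hypotheses[i])
                   for j in range(len(hypotheses)) if j != i)
    return min(range(len(hypotheses)), key=total_distance)


# Illustrative system outputs (invented for this sketch, not from the paper).
hypotheses = [
    "the cat sat on the mat",
    "the cat sits on the mat",
    "a cat sat on mat",
]
backbone_index = select_backbone(hypotheses)
print(hypotheses[backbone_index])  # prints "the cat sat on the mat"
```

A real TER or TERp implementation would also normalise by reference length and allow block shifts, so the ranking it produces can differ from this simplified distance.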

Irish Universities

DCU Online Research Access Service

System combination with extra alignment information

Author: Liu Qun
Okita Tsuyoshi
van Genabith Josef
Wu Xiaofeng
Publication date: 09/12/2012

This paper provides the system description of the IHMM team of Dublin City University for our participation in the system combination task in the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). Our work is based on a confusion network-based approach to system combination. We propose a new method to build a confusion network for this: (1) incorporate extra alignment information extracted from the given metadata, treating it as sure alignments, into the results from IHMM, and (2) decode together with this information. We also heuristically set one of the system outputs as the default backbone. Our results show that this backbone, which is the RBMT system output, achieves a 0.11% improvement in BLEU over the backbone chosen by TER, while the extra information we added in the decoding part does not improve the results.
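
The confusion-network idea in the abstract above can be sketched in a few lines. This is a toy illustration under strong assumptions, not the IHMM team's system: the hypotheses are assumed to be already aligned to the backbone column by column, an empty string marks a gap, and decoding is plain majority voting with no language model or system weights. The names `EPSILON` and `decode_confusion_network` are hypothetical.

```python
from collections import Counter

EPSILON = ""  # assumed marker for "no word" in an aligned column


def decode_confusion_network(aligned: list[list[str]]) -> str:
    """Pick the most frequent word in each column of pre-aligned hypotheses."""
    output = []
    for column in zip(*aligned):
        word, _count = Counter(column).most_common(1)[0]
        if word != EPSILON:  # a winning gap contributes no word
            output.append(word)
    return " ".join(output)


# Illustrative, hand-aligned system outputs (invented for this sketch).
aligned_hypotheses = [
    ["the", "cat", "sat",  "on", "the",   "mat"],
    ["the", "cat", "sits", "on", "the",   "mat"],
    ["a",   "cat", "sat",  "on", EPSILON, "mat"],
]
combined = decode_confusion_network(aligned_hypotheses)
print(combined)  # prints "the cat sat on the mat"
```

In practice the alignment step is the hard part, which is where an aligner such as IHMM and any extra alignment information would come in; the voting step shown here assumes that work is already done.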

DCU Online Research Access Service

Filling Knowledge Gaps in a Broad-Coverage Machine Translation System

Author: Chander Ishwar
Haines Matthew
Hatzivassiloglou Vasileios
Hovy Eduard
Iida Masayo
Knight Kevin
Luk Steve K.
Whitney Richard
Yamada Kenji
Publication date: 01/01/1995

Knowledge-based machine translation (KBMT) techniques yield high quality in domains with detailed semantic models, limited vocabulary, and controlled input grammar. Scaling up along these dimensions means acquiring large knowledge resources. It also means behaving reasonably when definitive knowledge is not yet available. This paper describes how we can fill various KBMT knowledge gaps, often using robust statistical techniques. We describe quantitative and qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT system. Comment: 7 pages, compressed and uuencoded PostScript. To appear: IJCAI-9

arXiv.org e-Print Archive

CiteSeerX

Exploring different representational units in English-to-Turkish statistical machine translation

Author: Durgar El-Kahlout İlknur
Oflazer Kemal
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2007

We investigate different representational granularities for sub-lexical representation in statistical machine translation from English to Turkish. We find that (i) representing both Turkish and English at the morpheme level but with some selective morpheme grouping on the Turkish side of the training data, (ii) augmenting the training data with “sentences” comprising only the content words of the original training data to bias root word alignment, (iii) reranking the n-best morpheme-sequence outputs of the decoder with a word-based language model, and (iv) using model iteration all provide a non-trivial improvement over a fully word-based baseline. Despite our very limited training data, we improve from 20.22 BLEU points for our simplest model to 25.08 BLEU points, an improvement of 4.86 points or 24% relative.
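
The absolute and relative gains quoted in the abstract above follow directly from the two BLEU scores. The short check below uses only those two figures; the variable names are chosen for illustration.

```python
baseline_bleu = 20.22  # simplest model, as reported in the abstract
best_bleu = 25.08      # best model, as reported in the abstract

absolute_gain = best_bleu - baseline_bleu
relative_gain = absolute_gain / baseline_bleu * 100  # percent over the baseline

print(f"{absolute_gain:.2f} BLEU points, {relative_gain:.1f}% relative")
# prints "4.86 BLEU points, 24.0% relative"
```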

CiteSeerX

Crossref

Sabanci University Research Database

Neural System Combination for Machine Translation

Author: Hu Wenpeng
Zhang Jiajun
Zhou Long
Zong Chengqing
Publication date: 01/01/2017

Neural machine translation (NMT) has become a new approach to machine translation and generates much more fluent results than statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. Extensive experiments on the Chinese-to-English translation task show that our model achieves a significant improvement of 5.3 BLEU points over the best single system output and 3.4 BLEU points over the state-of-the-art traditional system combination methods. Comment: Accepted as a short paper by ACL-201

arXiv.org e-Print Archive

Crossref