
170 research outputs found

MATREX: the DCU MT system for WMT 2009

Author: Du Jinhua
He Yifan
Penkale Sergio
Way Andy
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2009

In this paper, we describe the machine translation system in the evaluation campaign of the Fourth Workshop on Statistical Machine Translation at EACL 2009. We describe the modular design of our multi-engine MT system with particular focus on the components used in this participation. We participated in the translation task for the following translation directions: French–English and English–French, in which we employed our multi-engine architecture to translate. We also participated in the system combination task which was carried out by the MBR decoder and Confusion Network decoder. We report results on the provided development and test sets.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Syntactic phrase-based statistical machine translation

Author: Hassan Hany
Hearne Mary
Sima'an Khalil
Way Andy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Phrase-based statistical machine translation (PBSMT) systems represent the dominant approach in MT today. However, unlike systems in other paradigms, it has proven difficult to date to incorporate syntactic knowledge in order to improve translation quality. This paper improves on recent research which uses 'syntactified' target language phrases, by incorporating supertags as constraints to better resolve parse tree fragments. In addition, we do not impose any sentence-length limit, and using a log-linear decoder, we outperform a state-of-the-art PBSMT system by over 1.3 BLEU points (or 3.51% relative) on the NIST 2003 Arabic-English test corpus.

Crossref

Irish Universities

DCU Online Research Access Service

International Migration, Integration and Social Cohesion online publications

Towards Neural Machine Translation with Latent Tree Attention

Author: Bradbury James
Socher Richard
Publication date: 01/01/2017

Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing. We introduce such a model for the task of machine translation, pairing a recurrent neural network grammar encoder with a novel attentional RNNG decoder and applying policy gradient reinforcement learning to induce unsupervised tree structures on both the source and target. When trained on character-level datasets with no explicit segmentation or parse annotation, the model learns a plausible segmentation and shallow parse, obtaining performance close to an attentional baseline. Comment: Presented at SPNLP 201

arXiv.org e-Print Archive

Crossref

Neural System Combination for Machine Translation

Author: Hu Wenpeng
Zhang Jiajun
Zhou Long
Zong Chengqing
Publication date: 01/01/2017

Neural machine translation (NMT) has become a new approach to machine translation and generates much more fluent results than statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. Extensive experiments on the Chinese-to-English translation task show that our model achieves a significant improvement of 5.3 BLEU points over the best single system output and 3.4 BLEU points over the state-of-the-art traditional system combination methods. Comment: Accepted as a short paper by ACL-201

arXiv.org e-Print Archive

Crossref

Chunk-Based Bi-Scale Decoder for Neural Machine Translation

Author: Chen Jiajun
Huang Shujian
Li Hang
Liu Xiaohua
Tu Zhaopeng
Zhou Hao
Publication date: 01/01/2017

In typical neural machine translation (NMT), the decoder generates a sentence word by word, packing all linguistic granularities in the same time-scale of RNN. In this paper, we propose a new type of decoder for NMT, which splits the decode state into two parts and updates them in two different time-scales. Specifically, we first predict a chunk time-scale state for phrasal modeling, on top of which multiple word time-scale states are generated. In this way, the target sentence is translated hierarchically from chunks to words, with information in different granularities being leveraged. Experiments show that our proposed model significantly improves the translation performance over the state-of-the-art NMT model. Comment: Accepted as a short paper by ACL 201

arXiv.org e-Print Archive

Crossref

A syntactic skeleton for statistical machine translation

Author: Groves Declan
Mellebeek Bart
Owczarzak Karolina
van Genabith Josef
Way Andy
Publication date: 01/01/2006

We present a method for improving statistical machine translation performance by using linguistically motivated syntactic information. Our algorithm recursively decomposes source language sentences into syntactically simpler and shorter chunks, and recomposes their translation to form target language sentences. This improves both the word order and lexical selection of the translation. We report statistically significant relative improvements of 3.3% BLEU score in an experiment (English→Spanish) carried out on an 800-sentence test set extracted from the Europarl corpus.

CiteSeerX

Irish Universities

DCU Online Research Access Service

The impact of morphological errors in phrase-based statistical machine translation from German and English into Swedish

Author: Täckström Oscar
Publication date: 01/01/2009

We have investigated the potential for improvement in target language morphology when translating into Swedish from English and German, by measuring the errors made by a state-of-the-art phrase-based statistical machine translation system. Our results show that there is indeed a performance gap to be filled by better modelling of inflectional morphology and compounding, and that the gap is not filled by simply feeding the translation system with more training data.

Publikationer från Uppsala Universitet

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive