170 research outputs found
MATREX: the DCU MT system for WMT 2009
In this paper, we describe the machine translation system in the evaluation campaign of the Fourth Workshop on Statistical Machine Translation at EACL 2009.
We describe the modular design of our multi-engine MT system with particular focus on the components used in this participation. We participated in the translation task
for the following translation directions: French→English and English→French, in which we employed our multi-engine architecture to translate. We also participated in the system combination task which was carried out by the MBR decoder and Confusion Network decoder.
We report results on the provided development and test sets
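As an illustration of the minimum Bayes risk (MBR) idea mentioned for system combination, the following is a minimal sketch, not the MaTrEx implementation. It assumes a uniform weight over the candidate translations and uses a simple unigram F1 overlap as the gain function; the abstract specifies neither. The names `unigram_f1` and `mbr_select` are hypothetical.

```python
from collections import Counter


def unigram_f1(hyp: str, ref: str) -> float:
    """Unigram F1 overlap between two whitespace-tokenised strings.

    Assumption: stands in for a real gain function such as sentence-level BLEU.
    """
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(hypotheses: list[str]) -> str:
    """Return the hypothesis with the highest total gain against all others.

    Assumption: every hypothesis is equally likely, so the expected gain
    reduces to an unweighted sum of pairwise similarities.
    """
    def expected_gain(i: int) -> float:
        return sum(
            unigram_f1(hypotheses[i], other)
            for j, other in enumerate(hypotheses)
            if j != i
        )

    best = max(range(len(hypotheses)), key=expected_gain)
    return hypotheses[best]


if __name__ == "__main__":
    # Hypothetical outputs from three MT engines for one source sentence.
    candidates = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "a dog stood there",
    ]
    print(mbr_select(candidates))
```

The selected output is the candidate that agrees most with the others, which is the consensus behaviour MBR-style combination aims for.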
Syntactic phrase-based statistical machine translation
Phrase-based statistical machine translation (PBSMT) systems represent the dominant approach in MT today. However, unlike systems in other paradigms, it has proven difficult to date to incorporate syntactic knowledge in order to improve translation quality. This paper improves on recent research which uses 'syntactified' target language phrases, by incorporating supertags as constraints to better resolve parse tree fragments. In addition, we do not impose any sentence-length limit, and using a log-linear decoder, we outperform a state-of-the-art PBSMT system by over 1.3 BLEU points (or 3.51% relative) on the NIST 2003 Arabic-English test corpus
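The absolute and relative figures quoted above can be related with a short calculation. This is a back-of-the-envelope sketch only: the abstract says "over 1.3" BLEU points, so the baseline score derived here is approximate and is not a number reported in the abstract. The function name `implied_baseline` is hypothetical.

```python
def implied_baseline(absolute_gain: float, relative_gain: float) -> float:
    """Baseline score implied by an absolute gain and the matching relative gain.

    relative_gain = absolute_gain / baseline, so baseline = absolute_gain / relative_gain.
    """
    return absolute_gain / relative_gain


# Figures taken from the abstract: 1.3 BLEU points, 3.51% relative.
baseline = implied_baseline(1.3, 0.0351)
improved = baseline + 1.3

print(f"implied baseline BLEU: {baseline:.2f}")
print(f"implied improved BLEU: {improved:.2f}")
```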
Towards Neural Machine Translation with Latent Tree Attention
Building models that take advantage of the hierarchical structure of language
without a priori annotation is a longstanding goal in natural language
processing. We introduce such a model for the task of machine translation,
pairing a recurrent neural network grammar encoder with a novel attentional
RNNG decoder and applying policy gradient reinforcement learning to induce
unsupervised tree structures on both the source and target. When trained on
character-level datasets with no explicit segmentation or parse annotation, the
model learns a plausible segmentation and shallow parse, obtaining performance
close to an attentional baseline. Comment: Presented at SPNLP 201
Neural System Combination for Machine Translation
Neural machine translation (NMT) has become a new approach to machine
translation and generates much more fluent results compared to statistical
machine translation (SMT).
However, SMT is usually better than NMT in translation adequacy. It is
therefore a promising direction to combine the advantages of both NMT and SMT.
In this paper, we propose a neural system combination framework leveraging
multi-source NMT, which takes as input the outputs of NMT and SMT systems and
produces the final translation.
Extensive experiments on the Chinese-to-English translation task show that
our model achieves a significant improvement of 5.3 BLEU points over the best
single system output and 3.4 BLEU points over the state-of-the-art traditional
system combination methods. Comment: Accepted as a short paper by ACL-201
Chunk-Based Bi-Scale Decoder for Neural Machine Translation
In typical neural machine translation (NMT), the decoder generates a sentence
word by word, packing all linguistic granularities in the same time-scale of
RNN. In this paper, we propose a new type of decoder for NMT, which splits the
decode state into two parts and updates them in two different time-scales.
Specifically, we first predict a chunk time-scale state for phrasal modeling,
on top of which multiple word time-scale states are generated. In this way, the
target sentence is translated hierarchically from chunks to words, with
information in different granularities being leveraged. Experiments show that
our proposed model significantly improves the translation performance over the
state-of-the-art NMT model. Comment: Accepted as a short paper by ACL 201
A syntactic skeleton for statistical machine translation
We present a method for improving statistical machine translation performance by using linguistically motivated syntactic information. Our algorithm recursively decomposes source language sentences into syntactically simpler and shorter chunks, and recomposes their translation to form target language sentences. This improves both the word order and lexical selection of the translation. We report statistically significant relative improvements of 3.3% BLEU score in an experiment (English→Spanish) carried out on
an 800-sentence test set extracted from the Europarl corpus
The impact of morphological errors in phrase-based statistical machine translation from German and English into Swedish
We have investigated the potential for improvement in target language morphology when translating into Swedish from English and German, by measuring the errors made by a state-of-the-art phrase-based statistical machine translation system. Our results show that there is indeed a performance gap to be filled by better modelling of inflectional morphology and compounding, and that the gap is not filled by
simply feeding the translation system with more training data