Learning Bilingual Word Representations by Marginalizing Alignments
We present a probabilistic model that simultaneously learns alignments and
distributed representations for bilingual data. By marginalizing over word
alignments the model captures a larger semantic context than prior work relying
on hard alignments. The advantage of this approach is demonstrated in a
cross-lingual classification task, where we outperform the prior published
state of the art. Comment: Proceedings of ACL 2014 (Short Papers)
Tuning syntactically enhanced word alignment for statistical machine translation
We introduce a syntactically enhanced word alignment model that is more flexible than state-of-the-art generative word
alignment models and can be tuned according to different end tasks. First, the model combines the advantages of
unsupervised and supervised word alignment approaches by obtaining anchor alignments from unsupervised generative
models and seeding them into a supervised discriminative model. Second, the model can be tuned according to different
optimisation criteria. Our experiments show that using our word alignment in a phrase-based statistical machine translation system yields a 5.38% relative increase
in BLEU score on the IWSLT 2007 task.
Tracking relevant alignment characteristics for machine translation
In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. In this paper we compare alignments tuned directly according to alignment F-score with alignments tuned according to BLEU score, in order to investigate
the alignment characteristics that are helpful in translation. We report results for two different SMT systems (a phrase-based and an n-gram-based system) on Chinese-to-English IWSLT data and Spanish-to-English
European Parliament data. We give alignment hints to improve BLEU score, depending on the SMT system used and the type of corpus.
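As an illustration of the alignment F-score mentioned in the abstract above, the following is a minimal sketch of a balanced F-score over word alignment links. It assumes alignments are represented as sets of (source index, target index) pairs and that precision and recall are weighted equally; the abstract does not state which F-score variant the paper uses, and the function name and toy data are hypothetical.

```python
def alignment_f_score(predicted, gold):
    """Balanced F-score between two sets of (source_index, target_index) links.

    Assumption: equal weighting of precision and recall. The paper may use a
    different weighting or distinguish sure and possible links.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    if correct == 0:
        # Covers empty inputs and the case of no overlapping links.
        return 0.0
    precision = correct / len(predicted)  # share of predicted links that are in gold
    recall = correct / len(gold)          # share of gold links that were predicted
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy alignment for a four-word sentence pair.
gold = {(0, 0), (1, 2), (2, 1), (3, 3)}
predicted = {(0, 0), (1, 2), (2, 2)}
score = alignment_f_score(predicted, gold)
print(f"alignment F-score: {score:.3f}")
```

Here two of the three predicted links are correct and two of the four gold links are recovered, so precision is 2/3 and recall is 1/2.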
The impact of morphological errors in phrase-based statistical machine translation from German and English into Swedish
We have investigated the potential for improvement in target language morphology when translating into Swedish from English and German, by measuring the errors made by a state-of-the-art phrase-based statistical machine translation system. Our results show that there is indeed a performance gap to be filled by better modelling of inflectional morphology and compounding, and that the gap is not filled by
simply feeding the translation system with more training data.