Search CORE

23,997 research outputs found

Learning Semantic Representations for the Phrase Translation Model

Author: Deng Li
Gao Jianfeng
He Xiaodong
Yih Wen-tau
Publication venue
Publication date: 27/11/2013
Field of study

This paper presents a novel semantic-based phrase translation model. A pair of source and target phrases are projected into continuous-valued vector representations in a low-dimensional latent semantic space, where their translation score is computed by the distance between the pair in this new space. The projection is performed by a multi-layer neural network whose weights are learned on parallel training data. The learning is aimed to directly optimize the quality of end-to-end machine translation results. Experimental evaluation has been performed on two Europarl translation tasks, English-French and German-English. The results show that the new semantic-based phrase translation model significantly improves the performance of a state-of-the-art phrase-based statistical machine translation sys-tem, leading to a gain of 0.7-1.0 BLEU points

arXiv.org e-Print Archive

CiteSeerX

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

Author: Bahdanau Dzmitry
Bengio Yoshua
Bougares Fethi
Cho Kyunghyun
Gulcehre Caglar
Schwenk Holger
van Merrienboer Bart
Publication venue
Publication date: 01/01/2014
Field of study

In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder-Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.Comment: EMNLP 201

arXiv.org e-Print Archive

Crossref