79,299 research outputs found
Learning Semantic Representations for the Phrase Translation Model
This paper presents a novel semantic-based phrase translation model. A pair
of source and target phrases are projected into continuous-valued vector
representations in a low-dimensional latent semantic space, where their
translation score is computed by the distance between the pair in this new
space. The projection is performed by a multi-layer neural network whose
weights are learned on parallel training data. The learning is aimed to
directly optimize the quality of end-to-end machine translation results.
Experimental evaluation has been performed on two Europarl translation tasks,
English-French and German-English. The results show that the new semantic-based
phrase translation model significantly improves the performance of a
state-of-the-art phrase-based statistical machine translation sys-tem, leading
to a gain of 0.7-1.0 BLEU points
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
In this paper, we propose a novel neural network model called RNN
Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN
encodes a sequence of symbols into a fixed-length vector representation, and
the other decodes the representation into another sequence of symbols. The
encoder and decoder of the proposed model are jointly trained to maximize the
conditional probability of a target sequence given a source sequence. The
performance of a statistical machine translation system is empirically found to
improve by using the conditional probabilities of phrase pairs computed by the
RNN Encoder-Decoder as an additional feature in the existing log-linear model.
Qualitatively, we show that the proposed model learns a semantically and
syntactically meaningful representation of linguistic phrases.Comment: EMNLP 201
Learning an Artificial Language for Knowledge-Sharing in Multilingual Translation
The cornerstone of multilingual neural translation is shared representations across languages. Given the theoretically infinite representation power of neural networks, semantically identical sentences are likely represented differently. While representing sentences in the continuous latent space ensures expressiveness, it introduces the risk of capturing of irrelevant features which hinders the learning of a common representation. In this work, we discretize the encoder output latent space of multilingual models by assigning encoder states to entries in a codebook, which in effect represents source sentences in a new artificial language. This discretization process not only offers a new way to interpret the otherwise black-box model representations, but, more importantly, gives potential for increasing robustness in unseen testing conditions. We validate our approach on large-scale experiments with realistic data volumes and domains. When tested in zero-shot conditions, our approach is competitive with two strong alternatives from the literature. We also use the learned artificial language to analyze model behavior, and discover that using a similar bridge language increases knowledge-sharing among the remaining languages
- …