18,041 research outputs found
Multi-channel Encoder for Neural Machine Translation
Attention-based Encoder-Decoder has the effective architecture for neural
machine translation (NMT), which typically relies on recurrent neural networks
(RNN) to build the blocks that will be lately called by attentive reader during
the decoding process. This design of encoder yields relatively uniform
composition on source sentence, despite the gating mechanism employed in
encoding RNN. On the other hand, we often hope the decoder to take pieces of
source sentence at varying levels suiting its own linguistic structure: for
example, we may want to take the entity name in its raw form while taking an
idiom as a perfectly composed unit. Motivated by this demand, we propose
Multi-channel Encoder (MCE), which enhances encoding components with different
levels of composition. More specifically, in addition to the hidden state of
encoding RNN, MCE takes 1) the original word embedding for raw encoding with no
composition, and 2) a particular design of external memory in Neural Turing
Machine (NTM) for more complex composition, while all three encoding strategies
are properly blended during decoding. Empirical study on Chinese-English
translation shows that our model can improve by 6.52 BLEU points upon a strong
open source NMT system: DL4MT1. On the WMT14 English- French task, our single
shallow system achieves BLEU=38.8, comparable with the state-of-the-art deep
models.Comment: Accepted by AAAI-201
- …