Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

Authors: Yoshua Bengio, Francis Dutil, Caglar Gulcehre, Adam Trischler
Publication date: 01/01/2017

We investigate the integration of a planning mechanism into an encoder-decoder architecture with explicit alignment for character-level machine translation. We develop a model that plans ahead when it computes alignments between the source and target sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the strategic attentive reader and writer (STRAW) model. Our proposed model is end-to-end trainable with fully differentiable operations. We show that it outperforms a strong baseline on three neural machine translation tasks with a character-level decoder on the WMT'15 corpus. Our analysis demonstrates that our model computes qualitatively intuitive alignments and achieves superior performance with fewer parameters.

Comment: Accepted to the Rep4NLP 2017 Workshop at the ACL 2017 Conference
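The follow-or-recompute idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`decode_with_plan`, `propose_plan`, `propose_commitment`), the 0/1 commitment flags, and the padding rules are assumptions made for illustration, and the plan and commitment here come from fixed toy functions rather than learned, differentiable components.

```python
import math


def softmax(scores):
    """Turn raw alignment scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_with_plan(steps, src_len, propose_plan, propose_commitment):
    """Toy follow-or-recompute loop (illustrative, not the paper's code).

    propose_plan(t)       -> list of rows, each row holding src_len scores
    propose_commitment(t) -> list of 0/1 flags, one per planned step;
                             flag i refers to step t + i, 1 means "recompute"
    Both callables are hypothetical stand-ins for learned components.
    """
    plan = []
    commitment = [1]  # force a plan to be computed at the first step
    trace = []
    for t in range(steps):
        recompute = commitment[0] == 1
        if recompute:
            plan = propose_plan(t)
            commitment = propose_commitment(t)
        # The first row of the plan is the alignment used at this step.
        alignment = softmax(plan[0])
        trace.append((t, recompute, alignment))
        # Shift both structures forward by one step. Padding the commitment
        # with 1 forces a recompute once the plan has been used up.
        plan = plan[1:] + [[0.0] * src_len]
        commitment = commitment[1:] + [1]
    return trace


def toy_plan(t):
    # Three planned steps over a three-token source; scores are arbitrary.
    return [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]


def toy_commitment(t):
    # Follow the plan for one step after computing it, then recompute.
    return [0, 0, 1]


trace = decode_with_plan(5, 3, toy_plan, toy_commitment)
for t, recompute, alignment in trace:
    action = "recompute" if recompute else "follow"
    print(t, action, [round(a, 2) for a in alignment])
```

With this toy commitment the plan is recomputed at every other step and followed in between. In a trained model the decision would depend on the decoder state rather than on a fixed pattern.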

arXiv.org e-Print Archive

Crossref