Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder
We investigate the integration of a planning mechanism into an
encoder-decoder architecture with an explicit alignment for character-level
machine translation. We develop a model that plans ahead when it computes
alignments between the source and target sequences, constructing a matrix of
proposed future alignments and a commitment vector that governs whether to
follow or recompute the plan. This mechanism is inspired by the strategic
attentive reader and writer (STRAW) model. Our proposed model is end-to-end
trainable with fully differentiable operations. We show that it outperforms a
strong baseline on three character-level neural machine translation tasks on
the WMT'15 corpus. Our analysis demonstrates that our model can compute
qualitatively intuitive alignments and achieves superior performance with fewer
parameters.
Comment: Accepted to the Rep4NLP 2017 Workshop at the ACL 2017 Conference
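
The follow-or-recompute behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the binary commitment values, the choice to read the alignment from the first row of the plan, and the padding used when the plan is shifted are all assumptions made for illustration. The plan is a small matrix with one row per future step and one score per source position; the commitment vector holds one flag per future step, and a leading flag of 1 is taken here to mean "recompute the plan now".

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def step_plan(plan, commit, recompute_plan, recompute_commit):
    """Advance the alignment plan by one decoding step (illustrative only).

    plan:   list of k rows, each with one score per source position.
    commit: list of k flags in {0, 1}; commit[0] == 1 means "recompute".
    recompute_plan / recompute_commit: hypothetical callables standing in
    for the learned networks that would produce a fresh plan and vector.
    """
    src_len = len(plan[0])
    if commit[0] == 1:
        # Abandon the current plan and build a new one.
        plan = recompute_plan()
        commit = recompute_commit()
    else:
        # Follow the plan: shift it forward by one step.
        # Assumption: pad the plan with zeros and the commitment vector
        # with 1, so an exhausted plan forces a recompute.
        plan = plan[1:] + [[0.0] * src_len]
        commit = commit[1:] + [1]
    # Assumption: the current alignment is read from the first plan row.
    alignment = softmax(plan[0])
    return alignment, plan, commit
```

Calling `step_plan` repeatedly with a commitment vector that starts with 0 consumes the stored rows one at a time; once a 1 reaches the front, the next call replaces the whole plan. In the model described above these decisions are made with fully differentiable operations, whereas this sketch uses a hard branch purely for readability.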