
18,509 research outputs found

What does Attention in Neural Machine Translation Pay Attention to?

Author: Ghader Hamidreza
Monz Christof
Publication date: 01/01/2017

Attention in neural machine translation provides the possibility to encode relevant parts of the source sentence at each translation step. As a result, attention is considered to be an alignment model as well. However, there is no work that specifically studies attention and analyses what is being learned by attention models. Thus, the question remains of how attention is similar to or different from traditional alignment. In this paper, we provide a detailed analysis of attention and compare it to traditional alignment. We answer the question of whether attention is only capable of modelling translational equivalents or whether it captures more information. We show that attention differs from alignment in some cases and captures useful information other than alignments. Comment: To appear in IJCNLP 201

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications
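
The relationship between soft attention and a hard word alignment mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: the dot-product scoring, the toy two-dimensional vectors, and the function names `attention_weights` and `hard_alignment` are all assumptions made for illustration. Attention yields a distribution over source positions, while an alignment in the traditional sense picks positions outright, here by taking the most heavily weighted one.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(decoder_state, encoder_states):
    """Soft attention over source positions.

    Assumption: plain dot-product scoring; the abstract does not say
    which scoring function the analysed systems use.
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    return softmax(scores)


def hard_alignment(weights):
    """Index of the source position with the largest attention weight."""
    return max(range(len(weights)), key=weights.__getitem__)


# Toy example: three source positions, one decoding step (made-up vectors).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [0.0, 2.0]

weights = attention_weights(decoder_state, encoder_states)
aligned = hard_alignment(weights)
print(weights, aligned)
```

The weights spread probability mass over all three source positions, whereas the hard alignment keeps only one of them. That gap is one way to see how attention might carry information beyond a one-to-one alignment.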

Memory-augmented Neural Machine Translation

Author: Abel Andrew
Feng Yang
Wang Dong
Zhang Andi
Zhang Shiyue
Publication date: 01/01/2017

Neural machine translation (NMT) has achieved notable success in recent times, however it is also widely recognized that this approach has limitations with handling infrequent words and word pairs. This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then utilizes them to assist the neural model. We use this memory mechanism to combine the knowledge learned from a conventional statistical machine translation system and the rules learned by an NMT system, and also propose a solution for out-of-vocabulary (OOV) words based on this framework. Our experiments on two Chinese-English translation tasks demonstrated that the M-NMT architecture outperformed the NMT baseline by 9.0 and 2.7 BLEU points on the two tasks, respectively. Additionally, we found this architecture resulted in a much more effective OOV treatment compared to competitive methods.

arXiv.org e-Print Archive

Crossref

University of Strathclyde Institutional Repository

Automatic Translating Between Ancient Chinese and Contemporary Chinese with Limited Aligned Corpora

Author: Li Wei
Su Qi
Zhang Zhiyuan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/06/2020

The Chinese language has evolved considerably over its long history. As a result, native speakers today have trouble reading sentences written in ancient Chinese. In this paper, we propose to build an end-to-end neural model to automatically translate between ancient and contemporary Chinese. However, the existing ancient-contemporary Chinese parallel corpora are not aligned at the sentence level and sentence-aligned corpora are limited, which makes it difficult to train the model. To build sentence-level parallel training data for the model, we propose an unsupervised algorithm that constructs sentence-aligned ancient-contemporary pairs by exploiting the fact that an aligned sentence pair shares many tokens. Based on the aligned corpus, we propose an end-to-end neural model with a copying mechanism and local attention to translate between ancient and contemporary Chinese. Experiments show that the proposed unsupervised algorithm achieves a 99.4% F1 score for sentence alignment, and the translation model achieves 26.95 BLEU from ancient to contemporary and 36.34 BLEU from contemporary to ancient. Comment: Accepted by NLPCC 201

arXiv.org e-Print Archive
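
The idea in the abstract above, that aligned sentence pairs share many tokens, can be sketched as follows. This is not the authors' algorithm, which the abstract does not spell out. The Dice-style overlap score, the greedy best-match strategy, the `threshold` value, and the names `overlap_score` and `align` are all assumptions made for illustration. Characters stand in for tokens, and the toy strings are Latin letters rather than real ancient or contemporary Chinese text.

```python
from collections import Counter


def overlap_score(a, b):
    """Dice-style overlap between the character multisets of two sentences."""
    shared = sum((Counter(a) & Counter(b)).values())
    total = len(a) + len(b)
    return 2 * shared / total if total else 0.0


def align(source, target, threshold=0.3):
    """Greedily pair each source sentence with its best-overlapping target.

    Assumption: a pair is kept only if its score reaches `threshold`;
    both the greedy strategy and the threshold are hypothetical choices.
    """
    if not target:
        return []
    pairs = []
    for i, src in enumerate(source):
        j, score = max(
            ((j, overlap_score(src, tgt)) for j, tgt in enumerate(target)),
            key=lambda item: item[1],
        )
        if score >= threshold:
            pairs.append((i, j))
    return pairs


# Toy data: each "sentence" is a string of character tokens.
source = ["abcd", "wxyz"]
target = ["wxy q", "abce"]
pairs = align(source, target)
print(pairs)
```

A greedy matcher like this can map two source sentences to the same target and ignores sentence order, so a practical aligner would likely add ordering constraints.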

Word Representation Models for Morphologically Rich Languages in Neural Machine Translation

Author: Cohn Trevor
Haffari Gholamreza
He Xuanli
Vylomova Ekaterina
Publication date: 14/06/2016

Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural translation systems, which discard the identity of rare words, in this paper we propose several architectures for learning word representations from character- and morpheme-level word decompositions. We incorporate these representations in a novel machine translation model which jointly learns word alignments and translations via a hard attention mechanism. Evaluating on translation from several morphologically rich languages into English, we show consistent improvements over strong baseline methods of between 1 and 1.5 BLEU points.

arXiv.org e-Print Archive

Monash University Research Portal