Memory-augmented Neural Machine Translation
Neural machine translation (NMT) has achieved notable success in recent
times; however, it is also widely recognized that this approach has limitations
in handling infrequent words and word pairs. This paper presents a novel
memory-augmented NMT (M-NMT) architecture, which stores knowledge about how
words (usually infrequently encountered ones) should be translated in a memory
and then utilizes them to assist the neural model. We use this memory mechanism
to combine the knowledge learned from a conventional statistical machine
translation system and the rules learned by an NMT system, and also propose a
solution for out-of-vocabulary (OOV) words based on this framework. Our
experiments on two Chinese-English translation tasks demonstrated that the
M-NMT architecture outperformed the NMT baseline by and BLEU points
on the two tasks, respectively. Additionally, we found this architecture
resulted in a much more effective OOV treatment compared to competitive
methods.
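The abstract above does not state how the memory output is combined with the neural model's prediction. As a rough illustration only, the sketch below assumes a simple linear interpolation between the NMT word distribution and a memory-derived distribution; the function name `blend_with_memory`, the weight `beta`, and the toy vocabulary are all hypothetical and are not taken from the paper.

```python
def blend_with_memory(p_nmt, p_mem, beta=0.5):
    """Interpolate two word distributions (dicts mapping word -> probability).

    Assumption: a fixed interpolation weight `beta`; the paper may use a
    different, learned combination.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    vocab = set(p_nmt) | set(p_mem)
    blended = {
        w: (1.0 - beta) * p_nmt.get(w, 0.0) + beta * p_mem.get(w, 0.0)
        for w in vocab
    }
    total = sum(blended.values())
    if total == 0.0:
        raise ValueError("both distributions are empty or all zero")
    # Renormalize so the result is a proper distribution.
    return {w: p / total for w, p in blended.items()}


# Toy example: the NMT model is unsure and leans on <unk>, while the
# memory proposes a translation for a rare word.
p_nmt = {"the": 0.6, "<unk>": 0.4}
p_mem = {"Zhongguancun": 1.0}
blended = blend_with_memory(p_nmt, p_mem, beta=0.5)
best_word = max(blended, key=blended.get)
```

With these toy numbers the memory's candidate receives the largest blended probability, which is the kind of behaviour a memory for rare or OOV words is meant to provide.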
Multi-channel Encoder for Neural Machine Translation
The attention-based encoder-decoder is an effective architecture for neural
machine translation (NMT), which typically relies on recurrent neural networks
(RNN) to build the blocks that will later be called by the attentive reader during
the decoding process. This encoder design yields relatively uniform
composition on the source sentence, despite the gating mechanism employed in
the encoding RNN. On the other hand, we often want the decoder to take pieces of
the source sentence at varying levels suiting its own linguistic structure: for
example, we may want to take an entity name in its raw form while taking an
idiom as a perfectly composed unit. Motivated by this demand, we propose
Multi-channel Encoder (MCE), which enhances encoding components with different
levels of composition. More specifically, in addition to the hidden state of
encoding RNN, MCE takes 1) the original word embedding for raw encoding with no
composition, and 2) a particular design of external memory in Neural Turing
Machine (NTM) for more complex composition, while all three encoding strategies
are properly blended during decoding. An empirical study on Chinese-English
translation shows that our model improves by 6.52 BLEU points over a strong
open-source NMT system, DL4MT. On the WMT14 English-French task, our single
shallow system achieves BLEU=38.8, comparable with the state-of-the-art deep
models.
Comment: Accepted by AAAI-201
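The blending of the three encoding channels described above (raw word embedding, RNN hidden state, external memory read) can be pictured as a gated weighted sum. The sketch below is a minimal illustration under that assumption: it uses one scalar gate per channel, normalized with a softmax, whereas the actual MCE gating may be element-wise and learned. The names `blend_channels` and `gate_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blend_channels(embedding, rnn_state, memory_read, gate_scores):
    """Mix three same-sized encoding vectors using softmax-normalized gates.

    Assumption: one scalar gate per channel; this is a simplification,
    not the paper's exact formulation.
    """
    channels = (embedding, rnn_state, memory_read)
    if len({len(c) for c in channels}) != 1:
        raise ValueError("all channels must have the same dimension")
    if len(gate_scores) != len(channels):
        raise ValueError("need exactly one gate score per channel")
    weights = softmax(gate_scores)
    dim = len(embedding)
    return [
        sum(w * c[i] for w, c in zip(weights, channels))
        for i in range(dim)
    ]


# Equal gate scores give a plain average of the three channels.
mixed = blend_channels([3.0, 0.0], [0.0, 3.0], [0.0, 0.0], [0.0, 0.0, 0.0])

# A large score on the first gate makes the raw embedding dominate,
# as one might want for an entity name.
mostly_raw = blend_channels([3.0, 0.0], [0.0, 3.0], [0.0, 0.0], [10.0, 0.0, 0.0])
```

Raising the gate score for one channel shifts the mixture toward that level of composition, which mirrors the abstract's example of keeping an entity name raw while treating an idiom as a composed unit.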
Neural fuzzy repair : integrating fuzzy matches into neural machine translation
We present a simple yet powerful data augmentation method for boosting Neural Machine Translation (NMT) performance by leveraging information retrieved from a Translation Memory (TM). We propose and test two methods for augmenting NMT training data with fuzzy TM matches. Tests on the DGT-TM data set for two language pairs show consistent and substantial improvements over a range of baseline systems. The results suggest that this method is promising for any translation environment in which a sizeable TM is available and a certain amount of repetition across translations is to be expected, especially considering its ease of implementation.
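To make the idea of augmenting training data with fuzzy TM matches concrete, the following sketch retrieves the most similar TM source sentence and appends its target side to the input. It is an illustration under stated assumptions, not the authors' implementation: similarity is computed with Python's standard `difflib.SequenceMatcher` over whitespace tokens, and the threshold of 0.5, the separator token `@@@`, and the function names are all hypothetical choices.

```python
from difflib import SequenceMatcher


def best_fuzzy_match(source, tm, threshold=0.5):
    """Return (tm_source, tm_target, score) for the best match, or None.

    `tm` is a list of (source, target) sentence pairs. Similarity is the
    token-level SequenceMatcher ratio, an assumed stand-in for whatever
    fuzzy-match metric a real TM tool would use.
    """
    src_tokens = source.split()
    best = None
    for tm_source, tm_target in tm:
        score = SequenceMatcher(None, src_tokens, tm_source.split()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (tm_source, tm_target, score)
    return best


def augment_source(source, tm, threshold=0.5, sep="@@@"):
    """Append the target side of the best fuzzy match to the source.

    If no match reaches the threshold, the source is returned unchanged.
    """
    match = best_fuzzy_match(source, tm, threshold)
    if match is None:
        return source
    return f"{source} {sep} {match[1]}"


tm = [
    ("the cat sat on the mat", "le chat est sur le tapis"),
    ("hello world", "bonjour le monde"),
]
augmented = augment_source("the cat sat on the rug", tm)
unchanged = augment_source("completely different sentence", tm)
```

The augmented source can then be paired with the reference translation as an ordinary training example, so no change to the NMT architecture is needed.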
Integrating Transformer and Paraphrase Rules for Sentence Simplification
Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopted ideas from machine translation studies and implicitly learned simplification mapping rules from normal-simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture and we propose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature; (2) through analysis of the rule utilization, the model seeks to select more accurate simplification rules. The code and models used in the paper are available at https://github.com/Sanqiang/text_simplification