Joint Training for Neural Machine Translation Models with Monolingual Data
Monolingual data have been demonstrated to be helpful in improving
translation quality of both statistical machine translation (SMT) systems and
neural machine translation (NMT) systems, especially in resource-poor or domain
adaptation tasks where parallel data are not rich enough. In this paper, we
propose a novel approach to better leverage monolingual data for neural
machine translation by jointly learning source-to-target and target-to-source
NMT models for a language pair with a joint EM optimization method. The
training process starts with two initial NMT models pre-trained on parallel
data for each direction, and these two models are iteratively updated by
incrementally decreasing translation losses on training data. In each iteration
step, both NMT models are first used to translate monolingual data from one
language to the other, forming pseudo-training data of the other NMT model.
Then two new NMT models are learnt from parallel data together with the
pseudo-training data. Both NMT models are expected to improve, so that better
pseudo-training data can be generated in the next step. Experimental results on
Chinese-English and English-German translation tasks show that our approach can
simultaneously improve the translation quality of source-to-target and
target-to-source models, significantly outperforming strong baseline systems
that are enhanced with monolingual data during training, including
back-translation.
Comment: Accepted by AAAI 201
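The iterative procedure described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: `ToyTranslator` is a hypothetical word-for-word lookup model standing in for an NMT model, `joint_train` and `pseudo_weight` are names invented for illustration, and the fixed down-weighting of pseudo pairs is an assumption rather than the paper's EM weighting. Only the loop structure (translate monolingual data in both directions, then retrain both models on parallel plus pseudo data) follows the abstract.

```python
from collections import Counter, defaultdict


class ToyTranslator:
    """Word-for-word lookup model; a hypothetical stand-in for an NMT model."""

    def __init__(self):
        self.table = {}

    def fit(self, pairs):
        # pairs: list of (source_sentence, target_sentence, weight).
        # Words are aligned by position, which is a toy simplification.
        counts = defaultdict(Counter)
        for src, tgt, weight in pairs:
            for s, t in zip(src.split(), tgt.split()):
                counts[s][t] += weight
        self.table = {s: c.most_common(1)[0][0] for s, c in counts.items()}

    def translate(self, sentence):
        # Unknown words are copied through unchanged.
        return " ".join(self.table.get(w, w) for w in sentence.split())


def joint_train(parallel, mono_src, mono_tgt, iterations=2, pseudo_weight=0.5):
    """Alternate between generating pseudo data and retraining both models."""
    real_s2t = [(s, t, 1.0) for s, t in parallel]
    real_t2s = [(t, s, 1.0) for s, t in parallel]

    # Initial models are trained on parallel data only, one per direction.
    s2t, t2s = ToyTranslator(), ToyTranslator()
    s2t.fit(real_s2t)
    t2s.fit(real_t2s)

    for _ in range(iterations):
        # Each model translates monolingual data to build pseudo-training
        # data for the model of the opposite direction.
        pseudo_s2t = [(t2s.translate(t), t, pseudo_weight) for t in mono_tgt]
        pseudo_t2s = [(s2t.translate(s), s, pseudo_weight) for s in mono_src]

        # Two new models are learnt from parallel plus pseudo data.
        new_s2t, new_t2s = ToyTranslator(), ToyTranslator()
        new_s2t.fit(real_s2t + pseudo_s2t)
        new_t2s.fit(real_t2s + pseudo_t2s)
        s2t, t2s = new_s2t, new_t2s

    return s2t, t2s


if __name__ == "__main__":
    parallel = [("the cat", "le chat"), ("a dog", "un chien")]
    s2t, t2s = joint_train(parallel, mono_src=["the dog"], mono_tgt=["un chat"])
    print(s2t.translate("the dog"))
    print(t2s.translate("un chat"))
```

In a real system, `fit` would be a gradient-based training run and `translate` a beam-search decode; the lookup table here only keeps the example small and runnable.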
TensorLayer: A Versatile Library for Efficient Deep Learning Development
Deep learning has enabled major advances in the fields of computer vision,
natural language processing, and multimedia among many others. Developing a
deep learning system is arduous and complex, as it involves constructing neural
network architectures, managing training/trained models, tuning the optimization
process, preprocessing and organizing data, etc. TensorLayer is a versatile
Python library that aims at helping researchers and engineers efficiently
develop deep learning systems. It offers rich abstractions for neural networks,
model and data management, and parallel workflow mechanisms. While boosting
efficiency, TensorLayer maintains both performance and scalability. TensorLayer
was released in September 2016 on GitHub, and has helped people from academia
and industry develop real-world applications of deep learning.
Comment: ACM Multimedia 201