Generating Synthetic Data for Neural Keyword-to-Question Models

Authors: Bogdanova Dasha; Lin Chin-Yew; Mikolov Tomas; Ros German; Zheng Zhicheng
Publication venue: Association for Computing Machinery (ACM)
Publication date: 13/07/2018

Search typically relies on keyword queries, but these are often semantically ambiguous. We propose to overcome this by offering users natural language questions, based on their keyword queries, to disambiguate their intent. This keyword-to-question task may be addressed using neural machine translation techniques. Neural translation models, however, require massive amounts of training data (keyword-question pairs), which are unavailable for this task. The main idea of this paper is to generate large amounts of synthetic training data from a small seed set of hand-labeled keyword-question pairs. Since natural language questions are available in large quantities, we develop models to automatically generate the corresponding keyword queries. Further, we introduce various filtering mechanisms to ensure that the synthetic training data is of high quality. We demonstrate the feasibility of our approach using both automatic and manual evaluation. This is an extended version of the article published with the same title in the Proceedings of ICTIR'18.
Comment: Extended version of ICTIR'18 full paper, 11 pages
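
As a rough illustration of the data-generation idea in this abstract, the sketch below turns natural language questions into synthetic keyword queries and discards low-quality pairs. It is not the paper's method: the abstract says the authors train models for this step, whereas the stop-word heuristic, the `STOP_WORDS` set, the function names, and the `min_keywords` filter here are all assumptions made for illustration.

```python
import re

# Hypothetical stop-word list (assumption); a real system would use a larger one.
STOP_WORDS = {"what", "is", "the", "a", "an", "of", "in", "on", "how", "do",
              "i", "to", "for", "does", "are", "which", "who", "when", "where"}

def question_to_keywords(question):
    """Drop stop words and punctuation to imitate a keyword query."""
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def make_synthetic_pairs(questions, min_keywords=2):
    """Build (keyword query, question) pairs.

    Pairs whose keyword query has fewer than `min_keywords` terms are
    dropped; this is a stand-in for the paper's filtering mechanisms.
    """
    pairs = []
    for q in questions:
        keywords = question_to_keywords(q)
        if len(keywords) >= min_keywords:
            pairs.append((" ".join(keywords), q))
    return pairs

pairs = make_synthetic_pairs([
    "What is the capital of France?",
    "How do I install Python on Windows?",
    "Who is it?",  # yields a single keyword, so the filter removes it
])
for keywords, question in pairs:
    print(f"{keywords!r} -> {question!r}")
```

The resulting pairs could then serve as (source, target) training examples for a keyword-to-question translation model, with the keyword query as the source side.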

arXiv.org e-Print Archive

Crossref

Neural Machine Translation with Word Predictions

Authors: Chen Jiajun; Dai Xinyu; Huang Shujian; Weng Rongxiang; Zheng Zaixiang
Publication date: 01/01/2017

In the encoder-decoder architecture for neural machine translation (NMT), the hidden states of the recurrent structures in the encoder and decoder carry the crucial information about the sentence. These vectors are generated by parameters which are updated by back-propagation of translation errors through time. We argue that propagating errors through the end-to-end recurrent structures is not a direct way to control the hidden vectors. In this paper, we propose to use word predictions as a mechanism for direct supervision. More specifically, we require these vectors to be able to predict the vocabulary in the target sentence. Our simple mechanism ensures better representations in the encoder and decoder without using any extra data or annotation. It is also helpful in reducing the target-side vocabulary and improving the decoding efficiency. Experiments on Chinese-English and German-English machine translation tasks show BLEU improvements of 4.53 and 1.3, respectively.
Comment: Accepted at EMNLP201
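
The word-prediction idea in this abstract can be sketched as an auxiliary loss that asks a single hidden vector to predict which words occur in the target sentence. The code below is a minimal sketch under stated assumptions, not the authors' formulation: the linear projection `weights`, the softmax over the whole vocabulary, the averaging over target words, and the name `word_prediction_loss` are all illustrative choices.

```python
import math

def word_prediction_loss(hidden, weights, target_ids):
    """Average negative log-likelihood of the target-sentence words under a
    softmax over the vocabulary computed from one hidden vector.

    hidden:     list of floats (one encoder or decoder state)
    weights:    one row of floats per vocabulary word (assumed projection)
    target_ids: non-empty set of vocabulary indices in the target sentence
    """
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
    # Numerically stable log of the softmax normaliser.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(logits[i] - log_z for i in target_ids) / len(target_ids)

# Toy vocabulary of 4 words; the target sentence contains words 0 and 2.
uninformative = [[0.0, 0.0] for _ in range(4)]
informative = [[5.0, 0.0], [0.0, 0.0], [5.0, 0.0], [0.0, 0.0]]
state = [1.0, 0.0]

loss_before = word_prediction_loss(state, uninformative, {0, 2})
loss_after = word_prediction_loss(state, informative, {0, 2})
print(loss_before > loss_after)
```

In training, a term like this would be added to the usual translation loss so that the hidden vectors receive a direct gradient signal about the target words.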

arXiv.org e-Print Archive

Crossref

A Convolutional Encoder Model for Neural Machine Translation

Authors: Auli Michael; Dauphin Yann N.; Gehring Jonas; Grangier David
Publication date: 01/01/2017

The prevalent approach to neural machine translation relies on bi-directional LSTMs to encode the source sentence. In this paper we present a faster and simpler architecture based on a succession of convolutional layers. This allows the entire source sentence to be encoded simultaneously, in contrast to recurrent networks, for which computation is constrained by temporal dependencies. On WMT'16 English-Romanian translation we achieve accuracy competitive with the state of the art, and we outperform several recently published results on the WMT'15 English-German task. Our models obtain almost the same accuracy as a very deep LSTM setup on WMT'14 English-French translation. Our convolutional encoder speeds up CPU decoding by more than two times at the same or higher accuracy as a strong bi-directional LSTM baseline.
Comment: 13 pages
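
To illustrate why a convolutional encoder can process all source positions at once, the sketch below applies a single 1-D convolution over token embeddings. It is a simplified assumption-laden example, not the paper's architecture: the function name `conv_encode`, the scalar per-offset `kernel`, the zero padding, and the odd kernel width are illustrative, and a real encoder would use learned multi-channel filters and stack several such layers.

```python
def conv_encode(embeddings, kernel, bias=0.0):
    """Apply one 1-D convolution over token embeddings.

    embeddings: list of equal-length float vectors, one per source token
    kernel:     list of scalar weights, one per window offset (odd length assumed)
    Zero padding keeps one output vector per input position.
    """
    width = len(kernel)
    pad = width // 2
    dim = len(embeddings[0])
    zero = [0.0] * dim
    padded = [zero] * pad + list(embeddings) + [zero] * pad
    outputs = []
    # Each output depends only on a fixed input window, never on a previous
    # output, so the iterations of this loop are independent of one another.
    for i in range(len(embeddings)):
        window = padded[i:i + width]
        outputs.append([
            bias + sum(kernel[k] * window[k][d] for k in range(width))
            for d in range(dim)
        ])
    return outputs

encoded = conv_encode([[1.0], [2.0], [3.0]], kernel=[1.0, 1.0, 1.0])
print(encoded)
```

A recurrent encoder, by contrast, must compute the state at position i before position i + 1, which is the temporal dependency the abstract refers to.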

arXiv.org e-Print Archive

Crossref