Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
A convolutional neural network (CNN) is a neural network that can make use of
the internal structure of data such as the 2D structure of image data. This
paper studies CNN on text categorization to exploit the 1D structure (namely,
word order) of text data for accurate prediction. Instead of using
low-dimensional word vectors as input as is often done, we directly apply CNN
to high-dimensional text data, which leads to directly learning embedding of
small text regions for use in classification. In addition to a straightforward
adaptation of CNN from image to text, a simple but new variation which employs
bag-of-word conversion in the convolution layer is proposed. An extension to
combine multiple convolution layers is also explored for higher accuracy. The
experiments demonstrate the effectiveness of our approach in comparison with
state-of-the-art methods.
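
The two input representations mentioned in the abstract can be illustrated with a small sketch. It assumes a toy vocabulary and a fixed region size, and it only builds the region vectors that a convolution layer would consume; the function names (`one_hot`, `seq_regions`, `bow_regions`) are hypothetical and not taken from the paper's code.

```python
from typing import Dict, List


def one_hot(word: str, vocab: Dict[str, int]) -> List[int]:
    """Return a |V|-dimensional one-hot vector (all zeros for unknown words)."""
    vec = [0] * len(vocab)
    if word in vocab:
        vec[vocab[word]] = 1
    return vec


def seq_regions(tokens: List[str], vocab: Dict[str, int], region_size: int) -> List[List[int]]:
    """Straightforward adaptation: concatenate the one-hot vectors of each
    region, so word order inside the region is preserved (size p * |V|)."""
    regions = []
    for start in range(len(tokens) - region_size + 1):
        vec: List[int] = []
        for word in tokens[start:start + region_size]:
            vec.extend(one_hot(word, vocab))
        regions.append(vec)
    return regions


def bow_regions(tokens: List[str], vocab: Dict[str, int], region_size: int) -> List[List[int]]:
    """Bag-of-word variation: mark which vocabulary words occur in each
    region, ignoring their order inside the region (size |V|)."""
    regions = []
    for start in range(len(tokens) - region_size + 1):
        vec = [0] * len(vocab)
        for word in tokens[start:start + region_size]:
            if word in vocab:
                vec[vocab[word]] = 1
        regions.append(vec)
    return regions


# Toy vocabulary and document (assumptions for illustration only).
vocab = {"i": 0, "love": 1, "it": 2}
tokens = ["i", "love", "it"]
seq = seq_regions(tokens, vocab, region_size=2)
bow = bow_regions(tokens, vocab, region_size=2)
```

With a region size of 2, the sequential variant yields 6-dimensional vectors and distinguishes "i love" from "love i", while the bag-of-word variant yields 3-dimensional vectors and treats both regions identically.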
Semi-Supervised Learning for Neural Keyphrase Generation
We study the problem of generating keyphrases that summarize the key points
for a given document. While sequence-to-sequence (seq2seq) models have achieved
remarkable performance on this task (Meng et al., 2017), model training often
relies on large amounts of labeled data, which is only applicable to
resource-rich domains. In this paper, we propose semi-supervised keyphrase
generation methods by leveraging both labeled data and large-scale unlabeled
samples for learning. Two strategies are proposed. First, unlabeled documents
are tagged with synthetic keyphrases obtained from unsupervised keyphrase
extraction methods or a self-learning algorithm, and then combined with labeled
samples for training. Second, we investigate a multi-task learning
framework to jointly learn to generate keyphrases as well as the titles of the
articles. Experimental results show that our semi-supervised learning-based
methods outperform a state-of-the-art model trained with labeled data only.
Comment: To appear in EMNLP 2018 (12 pages, 7 figures, 6 tables)
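
The synthetic-tagging strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the frequency-based `extract_keyphrases` is a deliberately simple stand-in for an unsupervised keyphrase extractor, and `build_training_set`, `STOPWORDS`, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Tiny stopword list, assumed here only to keep the example self-contained.
STOPWORDS = frozenset(
    {"a", "an", "the", "of", "for", "and", "to", "in", "on", "is", "are", "with"}
)

Example = Tuple[str, List[str]]  # (document text, keyphrases)


def extract_keyphrases(document: str, top_k: int = 2) -> List[str]:
    """Stand-in unsupervised extractor: the top_k most frequent
    non-stopword tokens. Real systems would use a stronger method."""
    words = [
        w for w in document.lower().split()
        if w.isalpha() and w not in STOPWORDS
    ]
    # most_common orders by count; equal counts keep first-seen order.
    return [word for word, _ in Counter(words).most_common(top_k)]


def build_training_set(
    labeled: Sequence[Example], unlabeled: Sequence[str], top_k: int = 2
) -> List[Example]:
    """Tag unlabeled documents with synthetic keyphrases, then combine
    them with the human-labeled samples into one training set."""
    synthetic = [(doc, extract_keyphrases(doc, top_k)) for doc in unlabeled]
    return list(labeled) + synthetic


# Illustrative data (made up for this sketch).
labeled = [("convolutional networks for text", ["text categorization"])]
unlabeled = ["neural keyphrase generation with neural models for keyphrase tasks"]
training_set = build_training_set(labeled, unlabeled)
```

The combined list would then be fed to whatever seq2seq trainer is in use; how synthetic and labeled samples are weighted or ordered during training is left open here.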