4,943 research outputs found
Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
Convolutional neural network (CNN) is a neural network that can make use of
the internal structure of data such as the 2D structure of image data. This
paper studies CNN on text categorization to exploit the 1D structure (namely,
word order) of text data for accurate prediction. Instead of using
low-dimensional word vectors as input as is often done, we directly apply CNN
to high-dimensional text data, which leads to directly learning embedding of
small text regions for use in classification. In addition to a straightforward
adaptation of CNN from image to text, a simple but new variation which employs
bag-of-word conversion in the convolution layer is proposed. An extension to
combine multiple convolution layers is also explored for higher accuracy. The
experiments demonstrate the effectiveness of our approach in comparison with
state-of-the-art methods
Cross-language Text Classification with Convolutional Neural Networks From Scratch
Cross language classification is an important task in multilingual learning, where documents in different languages often share the same set of categories. The main goal is to reduce the labeling cost of training classification model for each individual language. The novel approach by using Convolutional Neural Networks for multilingual language classification is proposed in this article. It learns representation of knowledge gained from languages. Moreover, current method works for new individual language, which was not used in training. The results of empirical study on large dataset of 21 languages demonstrate robustness and competitiveness of the presented approach
- …