Character-level Convolutional Networks for Text Classification
This article offers an empirical exploration of the use of character-level
convolutional networks (ConvNets) for text classification. We constructed
several large-scale datasets to show that character-level convolutional
networks could achieve state-of-the-art or competitive results. Comparisons are
offered against traditional models such as bag of words, n-grams and their
TFIDF variants, and deep learning models such as word-based ConvNets and
recurrent neural networks.

Comment: An early version of this work, entitled "Text Understanding from
Scratch", was posted in Feb 2015 as arXiv:1502.01710. The present paper has
considerably more experimental results and a rewritten introduction. Advances
in Neural Information Processing Systems 28 (NIPS 2015).
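The core idea of the character-level approach can be illustrated with a minimal sketch: text is quantized into a one-hot character matrix and fed through 1-D convolutions over the length axis. The alphabet, frame length, and kernel below are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "  # illustrative alphabet
CHAR_TO_IDX = {c: i for i, c in enumerate(ALPHABET)}
MAX_LEN = 32  # illustrative; the paper uses much longer input frames

def quantize(text, max_len=MAX_LEN):
    """One-hot encode a string at the character level: (|alphabet|, max_len)."""
    m = np.zeros((len(ALPHABET), max_len))
    for j, c in enumerate(text.lower()[:max_len]):
        if c in CHAR_TO_IDX:
            m[CHAR_TO_IDX[c], j] = 1.0
    return m

def conv1d(x, kernel):
    """Valid 1-D convolution over the length axis; kernel: (|alphabet|, width)."""
    width = kernel.shape[1]
    out_len = x.shape[1] - width + 1
    return np.array([np.sum(x[:, t:t + width] * kernel) for t in range(out_len)])

x = quantize("hello world")
rng = np.random.default_rng(0)
feat = conv1d(x, rng.standard_normal((len(ALPHABET), 3)))
print(x.shape, feat.shape)  # (37, 32) (30,)
```

A full model would stack several such convolutions with pooling and end in fully connected layers; the point here is only that the input is raw characters, with no tokenization or word embeddings.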
Bag of Tricks for Efficient Text Classification
This paper explores a simple and efficient baseline for text classification.
Our experiments show that our fast text classifier fastText is often on par
with deep learning classifiers in terms of accuracy, and many orders of
magnitude faster for training and evaluation. We can train fastText on more
than one billion words in less than ten minutes using a standard multicore CPU,
and classify half a million sentences among 312K classes in less than a minute.
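The fastText recipe described above is essentially a bag of hashed word and n-gram features whose embeddings are averaged and passed through a linear classifier. The following is a minimal sketch under that reading; bucket count, dimensions, and random weights are illustrative stand-ins for trained parameters.

```python
import numpy as np

VOCAB_BUCKETS = 1000  # hashed feature buckets (illustrative size)
DIM = 16              # embedding dimension (illustrative)
N_CLASSES = 3

def features(sentence):
    """Hash unigrams and word bigrams into a fixed number of buckets."""
    words = sentence.lower().split()
    grams = words + [a + " " + b for a, b in zip(words, words[1:])]
    return [hash(g) % VOCAB_BUCKETS for g in grams]

rng = np.random.default_rng(0)
E = rng.standard_normal((VOCAB_BUCKETS, DIM)) * 0.1   # input embeddings
W = rng.standard_normal((DIM, N_CLASSES)) * 0.1       # linear classifier

def predict(sentence):
    """Average the feature embeddings, apply a linear layer and a softmax."""
    idx = features(sentence)
    h = E[idx].mean(axis=0)
    z = h @ W
    p = np.exp(z - z.max())
    return p / p.sum()

p = predict("fast and simple text classification")
print(p.shape)  # (3,)
```

Because the model is a single averaged embedding layer plus a linear layer, both training and inference are cheap, which is what makes the orders-of-magnitude speedup over deep classifiers plausible.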
Automatic Detection and Categorization of Election-Related Tweets
With the rise in popularity of public social media and micro-blogging
services, most notably Twitter, people have found a venue to hear and be
heard by their peers without an intermediary. As a consequence, and aided by
the public nature of Twitter, political scientists now potentially have the
means to analyse and understand the narratives that organically form, spread
and decline among the public in a political campaign. However, the volume and
diversity of the conversation on Twitter, combined with its noisy and
idiosyncratic nature, make this a hard task. Thus, advanced data mining and
language processing techniques are required to process and analyse the data. In
this paper, we present and evaluate a technical framework, based on recent
advances in deep neural networks, for identifying and analysing
election-related conversation on Twitter on a continuous, longitudinal basis.
Our models can detect election-related tweets with an F-score of 0.92 and can
categorize these tweets into 22 topics with an F-score of 0.90.

Comment: ICWSM'16, May 17-20, 2016, Cologne, Germany. In Proceedings of the
10th AAAI Conference on Weblogs and Social Media (ICWSM 2016).
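The F-scores quoted above are the usual F1 measure, the harmonic mean of precision and recall. As a reminder of the metric (this is not the authors' evaluation code), from true positives, false positives, and false negatives:

```python
def f1(tp, fp, fn):
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative counts: 92 true positives, 8 false positives, 8 false negatives
print(round(f1(92, 8, 8), 2))  # 0.92
```

An F1 of 0.92 thus means precision and recall are jointly high; a detector cannot reach it by being lopsided in either direction.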
Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder
We present Tweet2Vec, a novel method for generating general-purpose vector
representations of tweets. The model learns tweet embeddings using a
character-level CNN-LSTM encoder-decoder. We trained our model on 3 million
randomly selected English-language tweets. The model was evaluated using two
methods: tweet semantic similarity and tweet sentiment categorization,
outperforming the previous state-of-the-art in both tasks. The evaluations
demonstrate the power of the tweet embeddings generated by our model for
various tweet categorization tasks. The vector representations generated by our
model are generic, and hence can be applied to a variety of tasks. Though the
model presented in this paper is trained on English-language tweets, the method
presented can be used to learn tweet embeddings for different languages.

Comment: SIGIR 2016, July 17-21, 2016, Pisa, Italy. In Proceedings of SIGIR
2016.
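The tweet semantic similarity evaluation mentioned above compares embedding vectors, typically via cosine similarity. A minimal sketch, using small hand-made vectors in place of a real encoder's output:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Illustrative 4-d "tweet embeddings"; a trained encoder would produce these.
a = np.array([0.2, 0.8, 0.1, 0.3])    # e.g. a tweet about a topic
b = np.array([0.25, 0.75, 0.05, 0.4])  # a semantically similar tweet
c = np.array([-0.9, 0.1, 0.6, -0.2])   # an unrelated tweet

print(cosine(a, b) > cosine(a, c))  # True
```

Downstream tasks such as sentiment categorization then treat the embedding as a fixed feature vector and train a lightweight classifier on top of it.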
Shallow reading with Deep Learning: Predicting popularity of online content using only its title
With the ever-decreasing attention span of contemporary Internet users, the
title of online content (such as a news article or video) can be a major factor
in determining its popularity. To take advantage of this phenomenon, we propose
a new method based on a bidirectional Long Short-Term Memory (LSTM) neural
network designed to predict the popularity of online content using only its
title. We evaluate the proposed architecture on two distinct datasets of news
articles and news videos distributed in social media that contain over 40,000
samples in total. On those datasets, our approach improves the performance over
traditional shallow approaches by a margin of 15%. Additionally, we show that
using pre-trained word vectors in the embedding layer improves the results of
LSTM models, especially when the training set is small. To our knowledge, this
is the first attempt at predicting popularity using only textual information
from the title.
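Using pre-trained word vectors in the embedding layer, as described above, amounts to initializing the embedding matrix from an external vector table, falling back to random rows for out-of-vocabulary words. A minimal sketch with a hypothetical pretrained table (real systems would load e.g. word2vec or GloVe files):

```python
import numpy as np

DIM = 4  # illustrative embedding dimension
# Hypothetical pretrained vectors standing in for a word2vec/GloVe lookup.
pretrained = {"breaking": np.ones(DIM), "news": np.full(DIM, 0.5)}

vocab = ["<pad>", "breaking", "news", "video"]
rng = np.random.default_rng(0)

# Rows for known words come from the pretrained table; others are random.
E = np.stack([pretrained.get(w, rng.standard_normal(DIM) * 0.1) for w in vocab])

title = ["breaking", "news", "video"]
ids = [vocab.index(w) for w in title]
print(E[ids].shape)  # (3, 4): one embedding per title word, fed to the LSTM
```

With little training data, the pretrained rows carry distributional information the model could not learn on its own, which is consistent with the reported gains on small training sets.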
Cross-language Text Classification with Convolutional Neural Networks From Scratch
Cross-language classification is an important task in multilingual learning, where documents in different languages often share the same set of categories. The main goal is to reduce the cost of labeling training data for each individual language. This article proposes a novel approach to multilingual text classification using Convolutional Neural Networks: the model learns representations that transfer across languages, and the method also works for a new language that was not seen during training. The results of an empirical study on a large dataset covering 21 languages demonstrate the robustness and competitiveness of the presented approach.