Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm
Twitter is a popular social network platform where users can interact and
post texts of up to 280 characters called tweets. Hashtags, hyperlinked words
in tweets, have increasingly become crucial for tweet retrieval and search.
Using hashtags for tweet topic classification is a challenging problem because
of context dependence among words, slang, abbreviations, and emoticons in a short
tweet, along with the evolving use of hashtags. Since Twitter generates millions of
tweets daily, tweet analytics is a fundamental big data stream problem that
often requires real-time distributed processing. This paper proposes a
distributed online approach to tweet topic classification with hashtags.
Implemented on Apache Storm, a distributed real-time framework, our approach
incrementally identifies and updates a set of strong predictors in the Naïve
Bayes model for classifying each incoming tweet instance. Preliminary
experiments show promising results with up to 97% accuracy and 37% increase in
throughput on eight processors.
Comment: IEEE International Conference on Big Data 201
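As a rough illustration of the incremental Naïve Bayes idea described in this abstract, the sketch below updates per-topic hashtag counts one tweet at a time and classifies with Laplace smoothing. It is not the authors' implementation: it runs on a single machine rather than on Apache Storm, it does not select "strong predictors", and the names `hashtags`, `OnlineNaiveBayes`, `update`, and `predict` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def hashtags(tweet):
    """Return the lowercased hashtags found in a tweet (hypothetical helper)."""
    return [tag.lower() for tag in HASHTAG.findall(tweet)]


class OnlineNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated one instance at a time.

    Assumption: hashtags are the only features; the paper's predictor
    selection and Storm topology are not modeled here.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing constant
        self.class_counts = Counter()               # tweets seen per topic
        self.feature_counts = defaultdict(Counter)  # hashtag counts per topic
        self.vocabulary = set()                     # all hashtags seen so far

    def update(self, features, label):
        """Fold one labeled tweet into the running counts."""
        self.class_counts[label] += 1
        self.feature_counts[label].update(features)
        self.vocabulary.update(features)

    def predict(self, features):
        """Return the most probable topic, or None if nothing has been seen."""
        total = sum(self.class_counts.values())
        if total == 0:
            return None
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n / total)             # log prior
            for f in features:
                if f not in self.vocabulary:
                    continue                        # ignore hashtags never seen
                score += math.log((counts[f] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In a streaming setting, each incoming tweet would first be passed to `predict` and then, once its label is known, to `update`, so the model never needs to revisit earlier tweets.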
#Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds
Compounding of natural language units is a very common phenomenon. In this
paper, we show, for the first time, that Twitter hashtags, which could be
considered correlates of such linguistic units, undergo compounding. We
identify reasons for this compounding and propose a prediction model that can
identify with 77.07% accuracy whether the compound of a pair of hashtags will
become popular in the near future (i.e., 2 months after compounding). At longer times,
T = 6 and 10 months, the accuracies are 77.52% and 79.13%, respectively. This
technique has strong implications for trending hashtag recommendation since
newly formed hashtag compounds can be recommended early, even before the
compounding has taken place. Further, humans can predict compounds with an
overall accuracy of only 48.7% (treated as baseline). Notably, while humans can
discriminate the relatively easier cases, the automatic framework is successful
in classifying the relatively harder cases.
Comment: 14 pages, 4 figures, 9 tables, published in CSCW (Computer-Supported
Cooperative Work and Social Computing) 2016, in Proceedings of the 19th ACM
Conference on Computer-Supported Cooperative Work and Social Computing (CSCW
2016)
Is That Twitter Hashtag Worth Reading
Online social media such as Twitter, Facebook, wikis and LinkedIn have made a
great impact on the way we consume information in our day-to-day life. It
has become increasingly important that we come across appropriate content from
social media to avoid information explosion. In the case of Twitter, popular
information can be tracked using hashtags. Studying the characteristics of
tweets containing hashtags becomes important for a number of tasks, such as
breaking news detection, personalized message recommendation, friends
recommendation, and sentiment analysis among others.
In this paper, we have analyzed Twitter data based on trending hashtags,
which are widely used nowadays. We have used event-based hashtags to learn users'
thoughts on those events and to decide whether other users might find
them interesting. We have used topic modeling, which reveals the hidden
thematic structure of the documents (tweets in this case), in addition to
sentiment analysis to explore and summarize the content of the documents. A
technique to find the interestingness of an event-based Twitter hashtag and the
associated sentiment has been proposed. The proposed technique helps Twitter
followers read relevant and interesting hashtags.
Comment: 10 pages, 6 figures, Presented at the Third International Symposium
on Women in Computing and Informatics (WCI-2015)
On Identifying Hashtags in Disaster Twitter Data
Tweet hashtags have the potential to improve the search for information
during disaster events. However, there is a large number of disaster-related
tweets that do not have any user-provided hashtags. Moreover, only a small
number of tweets that contain actionable hashtags are useful for disaster
response. To facilitate progress on automatic identification (or extraction) of
disaster hashtags for Twitter data, we construct a unique dataset of
disaster-related tweets annotated with hashtags useful for filtering actionable
information. Using this dataset, we further investigate Long Short-Term
Memory-based models within a Multi-Task Learning framework. The best-performing
model achieves an F1-score as high as 92.22%. The dataset, code, and other
resources are available on GitHub.