Exploiting Topical Perceptions Over Multi-Lingual Text For Hashtag Suggestion On Twitter

Foroosh, Hassan; Gomez, Fernando; Karim, Asim; Tariq, Amara

Authors: Hassan Foroosh, Fernando Gomez, Asim Karim, Amara Tariq
Publication date: 1 January 2013
Publisher: 'Information Bulletin on Variable Stars (IBVS)'

Abstract

Microblogging websites, such as Twitter, provide a seemingly endless amount of textual information on a wide variety of topics generated by a large number of users. Microblog posts, or tweets on Twitter, are often written in an informal manner using multi-lingual styles. Ignoring informal styles or multiple languages can hamper the usefulness of microblog mining applications. In this paper, we present a statistical method for processing tweets according to users' perceptions of topics and hashtags. Based on the non-classical notion of relatedness of vocabulary terms to topics in a corpus, which is quantified by discriminative term weights, our method builds a ranked list of terms related to hashtags. Subsequently, given a new tweet, our method can suggest a ranked list of hashtags. Our method allows enhanced understanding and normalization of users' perceptions for improved information retrieval applications. We evaluate our method on a dataset of 14 million tweets collected over a period of 52 days. Results demonstrate that the method learns useful relationships between vocabulary terms and topics, and that its performance is better than that of a Naive Bayes suggestion system. Copyright © 2013, Association for the Advancement of Artificial Intelligence. All rights reserved.
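The pipeline described in the abstract (weight terms by how strongly they discriminate a hashtag, then rank hashtags for a new tweet) can be illustrated with a small sketch. The abstract does not give the weighting formula, so the smoothed log-ratio below is an assumption chosen for illustration, not the authors' measure; the function names `train` and `suggest` and the add-one smoothing are likewise hypothetical. Tokens are treated as opaque strings, so no language-specific processing is assumed.

```python
import math
from collections import Counter, defaultdict


def train(tweets):
    """Learn per-hashtag term weights from (tokens, hashtags) pairs.

    ASSUMPTION: the weight is a smoothed log-ratio of how often a term
    appears in tweets with the hashtag versus tweets without it. The
    paper's actual discriminative term weight may differ.
    """
    n = 0
    term_docs = Counter()                 # term -> tweets containing it
    tag_docs = Counter()                  # hashtag -> tweets carrying it
    term_tag_docs = defaultdict(Counter)  # hashtag -> term -> tweets with both

    for tokens, hashtags in tweets:
        n += 1
        terms = set(tokens)
        term_docs.update(terms)
        for tag in set(hashtags):
            tag_docs[tag] += 1
            term_tag_docs[tag].update(terms)

    weights = {}
    for tag, counts in term_tag_docs.items():
        n_in = tag_docs[tag]
        n_out = n - n_in
        tag_weights = {}
        for term, c_in in counts.items():
            c_out = term_docs[term] - c_in
            # Add-one smoothing keeps both probabilities strictly positive.
            p_in = (c_in + 1) / (n_in + 2)
            p_out = (c_out + 1) / (n_out + 2)
            ratio = p_in / p_out
            if ratio > 1:  # keep only terms that favour this hashtag
                tag_weights[term] = math.log(ratio)
        weights[tag] = tag_weights
    return weights


def suggest(weights, tokens, k=3):
    """Return up to k hashtags ranked by summed weight of the tweet's terms."""
    terms = set(tokens)
    scores = {
        tag: sum(tag_weights.get(term, 0.0) for term in terms)
        for tag, tag_weights in weights.items()
    }
    ranked = sorted(
        (tag for tag, score in scores.items() if score > 0),
        key=lambda tag: (-scores[tag], tag),  # ties broken alphabetically
    )
    return ranked[:k]
```

Sorting the entries of `weights[tag]` by value gives the ranked list of terms related to a hashtag, and `suggest` gives the ranked list of hashtags for a new tweet. A Naive Bayes baseline would instead multiply per-term likelihoods over all terms, whereas this sketch only accumulates evidence from terms that favour a hashtag.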

Available Versions

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

oai:stars.library.ucf.edu:scop...

Last time updated on 18/10/2022

CiteSeerX

oai:CiteSeerX.psu:10.1.1.722.8...

Last time updated on 30/10/2017