    Classifying sentiment in microblogs: is brevity an advantage?

    Microblogs, as a new textual domain, offer a unique proposition for sentiment analysis. Their short document length suggests that any sentiment they contain is compact and explicit. However, this short length, coupled with their noisy nature, can pose difficulties for standard machine learning document representations. In this work we examine the hypothesis that it is easier to classify the sentiment in these short-form documents than in longer-form documents. Surprisingly, we find classifying sentiment in microblogs easier than in blogs, and we make a number of observations pertaining to the challenge of supervised learning for sentiment analysis in microblogs.

    Cashtag piggybacking: uncovering spam and bot activity in stock microblogs on Twitter

    Microblogs are increasingly exploited for predicting prices and traded volumes of stocks in financial markets. However, it has been demonstrated that much of the content shared on microblogging platforms is created and publicized by bots and spammers. Yet the presence (or lack thereof) and the impact of fake stock microblogs have never been systematically investigated. Here, we study 9M tweets related to stocks of the 5 main financial markets in the US. By comparing tweets with financial data from Google Finance, we highlight important characteristics of Twitter stock microblogs. More importantly, we uncover a malicious practice, referred to as cashtag piggybacking, perpetrated by coordinated groups of bots and likely aimed at promoting low-value stocks by exploiting the popularity of high-value ones. Among the findings of our study is that as many as 71% of the authors of suspicious financial tweets are classified as bots by a state-of-the-art spambot detection algorithm. Furthermore, 37% of them were suspended by Twitter a few months after our investigation. Our results call for the adoption of spam and bot detection techniques in all studies and applications that exploit user-generated content for predicting the stock market.

    MAP: Microblogging Assisted Profiling of TV Shows

    Online microblogging services, which are increasingly used by people to share and exchange information, have emerged as a promising way to profile multimedia content, in the sense of providing users with a socialized abstraction and understanding of that content. In this paper, we propose a microblogging profiling framework to provide a social demonstration of TV shows. The challenges of this study are twofold. First, TV shows are generally offline, i.e., most of them do not originate from the Internet, so we need to create a connection between these TV shows and online microblogging services. Second, content in a microblogging service is extremely noisy for video profiling, so we need to strategically retrieve the most relevant information for TV show profiling. To address these challenges, we propose MAP, a microblogging-assisted profiling framework, with the following contributions: i) we propose a joint user and content retrieval scheme, which uses information about both the actors and the topics of a TV show to retrieve related microblogs; ii) we propose a social-aware profiling strategy, which profiles a video according to not only its content but also the social relationships of its microblogging users and its propagation in the social network; iii) we present some interesting analysis, based on our framework, to profile real-world TV shows.

    CLARITY at the TREC 2011 microblog track

    For the first year of the TREC Microblog Track, the CLARITY group concentrated on a number of areas: investigating the underlying term weighting scheme for ranking tweets, incorporating query expansion to introduce new terms into the query, and introducing an element of temporal re-weighting based on the temporal distribution of assumed relevant microblogs.