6,155 research outputs found

    Cashtag piggybacking: uncovering spam and bot activity in stock microblogs on Twitter

    Microblogs are increasingly exploited for predicting prices and traded volumes of stocks in financial markets. However, it has been demonstrated that much of the content shared on microblogging platforms is created and publicized by bots and spammers. Yet the presence (or lack thereof) and the impact of fake stock microblogs have never been systematically investigated. Here, we study 9M tweets related to stocks of the 5 main financial markets in the US. By comparing tweets with financial data from Google Finance, we highlight important characteristics of Twitter stock microblogs. More importantly, we uncover a malicious practice, referred to as cashtag piggybacking, perpetrated by coordinated groups of bots and likely aimed at promoting low-value stocks by exploiting the popularity of high-value ones. Among the findings of our study is that as much as 71% of the authors of suspicious financial tweets are classified as bots by a state-of-the-art spambot detection algorithm. Furthermore, 37% of them were suspended by Twitter a few months after our investigation. Our results call for the adoption of spam and bot detection techniques in all studies and applications that exploit user-generated content for predicting the stock market.

    A cloud-based tool for sentiment analysis in reviews about restaurants on TripAdvisor

    The tourism industry has been promoting its products and services based on the reviews that people write on travel websites such as TripAdvisor.com, Booking.com and similar platforms. These reviews have a profound effect on decision making, for example when choosing which places to visit or which restaurants to book. This contribution presents a cloud-based software tool for the large-scale analysis of this social media data (TripAdvisor.com). The main characteristics of the tool are: i) the ability to aggregate data obtained from social media; ii) the possibility of carrying out combined analyses of both people and comments; iii) the ability to detect the sentiment (positive, negative or neutral) of the comments, quantifying the degree to which they are positive or negative, as well as predicting behaviour patterns from this information; and iv) the ease of doing everything in the same application (data downloading, pre-processing, analysis and visualisation). As a test and validation case, more than 33,500 reviews written in English on restaurants in the Province of Granada (Spain) were analysed.

    CLARITY at the TREC 2011 microblog track

    For the first year of the TREC Microblog Track, the CLARITY group concentrated on three areas: investigating the underlying term weighting scheme for ranking tweets, incorporating query expansion to introduce new terms into the query, and introducing an element of temporal re-weighting based on the temporal distribution of assumed relevant microblogs.