7,183 research outputs found

    Towards Real-Time, Country-Level Location Classification of Worldwide Tweets

    In contrast to much previous work that has focused on location classification of tweets restricted to a specific country, here we undertake the task in a broader context by classifying global tweets at the country level, which is so far unexplored in a real-time scenario. We analyse the extent to which a tweet's country of origin can be determined by making use of eight tweet-inherent features for classification. Furthermore, we use two datasets, collected a year apart from each other, to analyse the extent to which a model trained from historical tweets can still be leveraged for classification of new tweets. With classification experiments on all 217 countries in our datasets, as well as on the top 25 countries, we offer some insights into the best use of tweet-inherent features for an accurate country-level classification of tweets. We find that the use of a single feature, such as the use of tweet content alone -- the most widely used feature in previous work -- leaves much to be desired. Choosing an appropriate combination of both tweet content and metadata can actually lead to substantial improvements of between 20% and 50%. We observe that tweet content, the user's self-reported location and the user's real name, all of which are inherent in a tweet and available in a real-time scenario, are particularly useful to determine the country of origin. We also experiment on the applicability of a model trained on historical tweets to classify new tweets, finding that the choice of a particular combination of features whose utility does not fade over time can actually lead to comparable performance, avoiding the need to retrain. However, the difficulty of achieving accurate classification increases slightly for countries with multiple commonalities, especially for English and Spanish speaking countries. Comment: Accepted for publication in IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE)

    Predicting Successful Memes using Network and Community Structure

    We investigate the predictability of successful memes using their early spreading patterns in the underlying social networks. We propose and analyze a comprehensive set of features and develop an accurate model to predict future popularity of a meme given its early spreading patterns. Our paper provides the first comprehensive comparison of existing predictive frameworks. We categorize our features into three groups: influence of early adopters, community concentration, and characteristics of adoption time series. We find that features based on community structure are the most powerful predictors of future success. We also find that early popularity of a meme is not a good predictor of its future popularity, contrary to common belief. Our methods outperform other approaches, particularly in the task of detecting very popular or unpopular memes. Comment: 10 pages, 6 figures, 2 tables. Proceedings of 8th AAAI Intl. Conf. on Weblogs and social media (ICWSM 2014)

    Arabia Felix 2.0: a cross-linguistic Twitter analysis of happiness patterns in the United Arab Emirates

    © 2019, The Author(s). The global popularity of social media platforms has given rise to unprecedented amounts of data, much of which reflects the thoughts, opinions and affective states of individual users. Systematic explorations of these large datasets can yield valuable information about a variety of psychological and sociocultural variables. The global nature of these platforms makes it important to extend this type of exploration across cultures and languages as each situation is likely to present unique methodological challenges and yield findings particular to the specific sociocultural context. To date, very few studies exploring large social media datasets have focused on the Arab world. This study examined social media use in Arabic and English across the United Arab Emirates (UAE), looking specifically at indicators of subjective wellbeing (happiness) across both languages. A large social media dataset, spanning 2013 to 2017, was extracted from Twitter. More than 17 million Twitter messages (tweets), written in Arabic and English and posted by users based in the UAE, were analyzed. Numerous differences were observed between individuals posting messages (tweeting) in English compared with those posting in Arabic. These differences included significant variations in the mean number of tweets posted, and the mean size of users' networks (e.g. the number of followers). Additionally, using lexicon-based sentiment analytic tools (Hedonometer and Valence Shift Word Graphs), temporal patterns of happiness (expressions of positive sentiment) were explored in both languages across all seven regions (Emirates) of the UAE. Findings indicate that 7:00 am was the happiest hour, and Friday was the happiest day for both languages (the least happy day varied by language). The happiest months differed based on language, and there were also significant variations in sentiment patterns, peaks and troughs in happiness, associated with events of sociopolitical and religio-cultural significance for the UAE.

    Data Portraits and Intermediary Topics: Encouraging Exploration of Politically Diverse Profiles

    In micro-blogging platforms, people connect and interact with others. However, due to cognitive biases, they tend to interact with like-minded people and read agreeable information only. Many efforts to make people connect with those who think differently have not worked well. In this paper, we hypothesize, first, that previous approaches have not worked because they have been direct -- they have tried to explicitly connect people with those having opposing views on sensitive issues. Second, that neither recommendation nor presentation of information is enough by itself to encourage behavioral change. We propose a platform that mixes a recommender algorithm and a visualization-based user interface to explore recommendations. It recommends politically diverse profiles in terms of distance of latent topics, and displays those recommendations in a visual representation of each user's personal content. We performed an "in the wild" evaluation of this platform, and found that people explored more recommendations when using a biased algorithm instead of ours. In line with our hypothesis, we also found that the mixture of our recommender algorithm and our user interface allowed politically interested users to exhibit an unbiased exploration of the recommended profiles. Finally, our results contribute insights in two aspects: first, which individual differences are important when designing platforms aimed at behavioral change; and second, which algorithms and user interfaces should be mixed to help users avoid cognitive mechanisms that lead to biased behavior. Comment: 12 pages, 7 figures. To be presented at ACM Intelligent User Interfaces 201