
    Examining Polarized COVID-19 Twitter Discussion Using Inverse Reinforcement Learning

    In this work, we model users' behavior on Twitter in discussion of the COVID-19 outbreak using inverse reinforcement learning to better understand the underlying forces that drive the observed pattern of polarization. In doing so, we address the largely untapped potential of inverse reinforcement learning to model users' behavior on social media, and contribute to the body of sociology, psychology, and communication research seeking to elucidate the causes of socio-cultural polarization. We hypothesize that structural characteristics of each week's retweet network, as well as COVID-19 data on cases, hospitalizations, and outcomes, are related to the Twitter users' reward function, which leads to polarized discussion of COVID-19 on the platform. To derive the state space of our inverse reinforcement learning model, we compute the relative modularity of retweet networks formed from retweets about COVID-19. The action space is determined by the distribution of mask-wearing sentiment in tweets about COVID-19. We build and fine-tune a BERT text classifier to determine mask-wearing sentiment in tweets. We design state features which reflect both structural characteristics of the retweet networks and COVID-19 data on cases, hospitalizations, and outcomes. Our results indicate that the reward function underlying polarized Twitter discussion about COVID-19 weighs features relating to the severity of the COVID-19 outbreak more heavily than features relating to the structure of retweet networks. Overall, our results demonstrate the aptitude of inverse reinforcement learning in helping understand user behavior on social media.
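
    The abstract above derives its state space from the modularity of weekly retweet networks. As a rough illustration only, the sketch below computes the standard Newman modularity of an undirected, unweighted graph in plain Python. The paper's "relative modularity" may be defined differently; the function name, the edge-list input, and the community labels are all assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, e.g. (retweeter, original author).
    community: dict mapping every node to a community label
               (hypothetical labels, e.g. two opposing camps).
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    # Q = sum over communities of (intra-edge share - expected share)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy retweet network: two triangles joined by a single bridge edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_camps = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_camp = {node: 0 for node in two_camps}
print(modularity(toy_edges, two_camps), modularity(toy_edges, one_camp))
```

    A higher value for the two-camp labeling than for the single-camp labeling is what one would expect from a network that splits into weakly connected groups.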

    RTbust: Exploiting Temporal Patterns for Botnet Detection on Twitter

    Within online social networks (OSNs), many of our supposedly online friends may instead be fake accounts called social bots, part of large groups that purposely re-share targeted content. Here, we study retweeting behaviors on Twitter, with the ultimate goal of detecting retweeting social bots. We collect a dataset of 10M retweets. We design a novel visualization that we leverage to highlight benign and malicious patterns of retweeting activity. In this way, we uncover a 'normal' retweeting pattern that is characteristic of human-operated accounts, and 3 suspicious patterns related to bot activities. Then, we propose a bot detection technique that stems from the previous exploration of retweeting behaviors. Our technique, called Retweet-Buster (RTbust), leverages unsupervised feature extraction and clustering. An LSTM autoencoder converts the retweet time series into compact and informative latent feature vectors, which are then clustered with a hierarchical density-based algorithm. Accounts belonging to large clusters characterized by malicious retweeting patterns are labeled as bots. RTbust obtains excellent detection results, with F1 = 0.87, whereas competitors achieve F1 < 0.76. Finally, we apply RTbust to a large dataset of retweets, uncovering 2 previously unknown active botnets with hundreds of accounts.

    Fake News Analysis and Graph Classification on a COVID-19 Twitter Dataset

    Earlier research has shown that the spread of fake news through social media can have a severe negative impact on society and on individuals. In this work we aim to study the spread of fake news compared to real news in a social network. We do that by performing classical social network analysis to discover various characteristics, and we formulate the problem as a binary classification over graphs that model the spread of fake and real news. For our experiments we rely on how news is propagated through a popular social media service, Twitter, during the pandemic caused by the COVID-19 virus. In the past, several other approaches have classified news as fake or real by deploying various graph embedding and deep learning techniques. In this project we focus on developing a dataset that contains tweets specific to COVID-19, initially by performing text mining on the content of the tweets. Further, we create graphs of the fake and real news along with their retweets and followers, and work on these graphs. We perform social network analysis and compare their characteristics. We study the propagation of fake and real news among users using community detection algorithms on the graphs. Finally, we create a model by deploying the Weisfeiler-Lehman graph kernel for graph classification on our labeled dataset. The model is able to predict whether a new article is real or fake based on how the corresponding graph of retweets and followers is connected.
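
    To make the graph-kernel step in the abstract above more concrete, here is a minimal pure-Python sketch of a Weisfeiler-Lehman subtree kernel. It is not the authors' implementation: the adjacency-dict graph format, the uniform initial node label "u", and the two toy propagation graphs are assumptions chosen for illustration. Refined labels are kept as nested tuples rather than hashed, which preserves equality between labels.

```python
from collections import Counter


def wl_features(adj, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one graph.

    adj: dict mapping node -> list of neighbours (undirected).
    labels: dict mapping node -> initial label (strings here).
    """
    current = {n: labels[n] for n in adj}
    counts = Counter(current.values())
    for _ in range(iterations):
        # New label = own label plus the sorted multiset of neighbour labels.
        current = {
            n: (current[n], tuple(sorted((current[nb] for nb in adj[n]), key=repr)))
            for n in adj
        }
        counts.update(current.values())
    return counts


def wl_kernel(g1, g2, iterations=2):
    """Kernel value: dot product of the two graphs' label-count vectors."""
    f1 = wl_features(g1[0], g1[1], iterations)
    f2 = wl_features(g2[0], g2[1], iterations)
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())


# Toy propagation graphs (hypothetical): a chain of retweets vs. a star
# where three accounts retweet one source directly.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star_adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
path = (path_adj, {n: "u" for n in path_adj})
star = (star_adj, {n: "u" for n in star_adj})
print(wl_kernel(path, path), wl_kernel(path, star))
```

    The resulting kernel matrix over all labeled graphs could then be passed to any kernel classifier, for example a support vector machine with a precomputed kernel; which classifier the authors used is not stated in the abstract.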

    Fake News on Twitter related to the Refugee Crisis 2016 : An exploratory case study

    Master's thesis in Information systems (IS501). Fake news has, in recent years, gained traction in the public media and as a research topic. Events such as the U.S. 2016 presidential election, Brexit, and the COVID-19 pandemic, amongst others, have seen traces of large amounts of fake news in social media. Social media sites like Twitter have enabled individuals, politicians, and companies to share content and opinions with a large number of people across the globe. This opportunity for mass communication has also led to Twitter becoming a place for fake news sharing. Various narratives by various actors partake in the same public discussions, and knowing what is true and what is fake is increasingly difficult. The purpose of this study was to examine and analyze a previously unstudied dataset of 14.3 million tweets related to the 2016 refugee crisis and attempt to find traces of fake news. The research approach chosen was an exploratory case study with mixed data analysis. The analysis focused on finding the characteristics of tweets and the most prominent topics, identifying fake news and some of the actors (webpages) spreading it, and classifying the type of fake news. To identify what content was fake, an extensive amount of literature in combination with three fact-checking services was utilized.

    Analysis of Retweeting Behavior Using Topic Models

    Social networks are nowadays a constant presence in our lives and increasingly have a role in important social and commercial phenomena. Microblogging services such as Twitter appear to play an important role in the process of information dissemination on the Internet, making it possible for messages to spread virally in a matter of minutes. In this research work we study the mechanism of re-broadcasting (called “retweeting”) information on Twitter. Specifically, we use Latent Dirichlet Allocation to analyze users and messages in terms of the topics that compose their text bodies, and by means of ANOVA we show that the topical distance between users and messages is shorter for tweets that are retweeted than for those that are not. Using Decision Tree learning we build several models in order to assess the accuracy and usefulness of our topic-based model of retweeting. Our results show that our topic-based model slightly outperforms a baseline prediction measure, so we conclude that such a model is a valid option to consider for predicting retweet behavior, with possibilities open for improvement.
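
    The abstract above compares the topical distance between a user and a message, both described by LDA topic distributions. The abstract does not say which distance measure was used, so the sketch below assumes the Jensen-Shannon distance, a common choice for comparing probability distributions. The function names and the three-topic example vectors are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance between two topic distributions.

    p, q: sequences of non-negative topic weights that each sum to 1.
    The result lies in [0, 1] because base-2 logarithms are used.
    """
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, divergence))


# Hypothetical topic mixtures over three topics.
user_topics = [0.7, 0.2, 0.1]
on_topic_tweet = [0.6, 0.3, 0.1]
off_topic_tweet = [0.05, 0.05, 0.9]
print(js_distance(user_topics, on_topic_tweet),
      js_distance(user_topics, off_topic_tweet))
```

    Under the abstract's finding, retweeted messages would tend to sit at a smaller distance from the retweeting user's topic mixture than messages that were not retweeted.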

    Flow of online misinformation during the peak of the COVID-19 pandemic in Italy

    The COVID-19 pandemic has impacted every human activity and, because of the urgency of finding the proper responses to such an unprecedented emergency, it generated a widespread societal debate. The online version of this discussion was not exempt from d/misinformation campaigns but, unlike what was witnessed in other debates, the COVID-19 -- intentional or not -- flow of false information put public health at severe risk, reducing the effectiveness of governments' countermeasures. In the present manuscript, we study the effective impact of misinformation in the Italian societal debate on Twitter during the pandemic, focusing on the various discursive communities. In order to extract the discursive communities, we focus on verified users, i.e. accounts whose identity is officially certified by Twitter. We thus infer the various discursive communities based on how verified users are perceived by standard ones: if two verified accounts are considered similar by unverified ones, we link them in the network of certified accounts. We first observe that, despite being a mostly scientific subject, the COVID-19 discussion shows a clear division into what turn out to be different political groups. At this point, by using a commonly available fact-checking software (NewsGuard), we assess the reputation of the pieces of news exchanged. We filter the network of retweets (i.e. users re-broadcasting the same elementary piece of information, or tweet) from random noise and check for the presence of messages displaying a URL. The impact of misinformation posts reaches 22.1% in the right and center-right wing community, and its contribution is even stronger in absolute numbers, due to the activity of this group: 96% of all non-reputable URLs shared by political groups come from this community.
    Comment: 25 pages, 4 figures. The Abstract, the Introduction, the Results, the Conclusions and the Methods were substantially rewritten. The plot of the network has been changed, as well as table