525 research outputs found
Examining Polarized COVID-19 Twitter Discussion Using Inverse Reinforcement Learning
In this work, we model users' behavior on Twitter in discussion of the COVID-19 outbreak using inverse reinforcement learning to better understand the underlying forces that drive the observed pattern of polarization. In doing so, we address the largely untapped potential of inverse reinforcement learning to model users' behavior on social media, and contribute to the body of sociology, psychology, and communication research seeking to elucidate the causes of socio-cultural polarization. We hypothesize that structural characteristics of each week's retweet network, as well as COVID-19 data on cases, hospitalizations, and outcomes, are related to the Twitter users' reward function, which leads to polarized discussion of COVID-19 on the platform. To derive the state space of our inverse reinforcement learning model, we compute the relative modularity of retweet networks formed from retweets about COVID-19. The action space is determined by the distribution of mask-wearing sentiment in tweets about COVID-19. We build and fine-tune a BERT text classifier to determine mask-wearing sentiment in tweets. We design state features which reflect both structural characteristics of the retweet networks and COVID-19 data on cases, hospitalizations, and outcomes. Our results indicate that polarized Twitter discussion about COVID-19 weighs more heavily on features relating to the severity of the COVID-19 outbreak and less heavily on features relating to the structure of retweet networks. Overall, our results demonstrate the aptitude of inverse reinforcement learning in helping understand user behavior on social media.
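The state space above is derived from the modularity of weekly retweet networks. As a rough illustration only, the sketch below computes plain Newman modularity for an undirected, unweighted graph with a given community assignment. The abstract's "relative modularity" may be normalized differently, and the function name, the toy edge list, and the community labels are all assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each tie listed once.
    community: dict mapping every node to a community label.
    """
    m = 0
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges whose two ends share a community
    for u, v in edges:
        m += 1
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if m == 0:
        return 0.0
    # Total degree of the nodes in each community.
    degree_sum = defaultdict(int)
    for node, d in degree.items():
        degree_sum[community[node]] += d
    # Q = sum over communities of (internal edge share - expected share).
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy retweet network (hypothetical): two triangles joined by one bridge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(edges, community)
print(round(q, 4))
```

A value near zero would suggest that the assumed partition is no more cohesive than chance, while larger values indicate more strongly separated groups.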
RTbust: Exploiting Temporal Patterns for Botnet Detection on Twitter
Within OSNs, many of our supposedly online friends may instead be fake
accounts called social bots, part of large groups that purposely re-share
targeted content. Here, we study retweeting behaviors on Twitter, with the
ultimate goal of detecting retweeting social bots. We collect a dataset of 10M
retweets. We design a novel visualization that we leverage to highlight benign
and malicious patterns of retweeting activity. In this way, we uncover a
'normal' retweeting pattern that is peculiar to human-operated accounts, and 3
suspicious patterns related to bot activities. Then, we propose a bot detection
technique that stems from the previous exploration of retweeting behaviors. Our
technique, called Retweet-Buster (RTbust), leverages unsupervised feature
extraction and clustering. An LSTM autoencoder converts the retweet time series
into compact and informative latent feature vectors, which are then clustered
with a hierarchical density-based algorithm. Accounts belonging to large
clusters characterized by malicious retweeting patterns are labeled as bots.
RTbust obtains excellent detection results, with F1 = 0.87, whereas competitors
achieve F1 < 0.76. Finally, we apply RTbust to a large dataset of retweets,
uncovering 2 previously unknown active botnets with hundreds of accounts.
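The final labeling step described above (accounts in large clusters are flagged as bots) can be sketched as follows. This is a simplification under stated assumptions: it flags clusters by size alone, whereas the abstract also requires the cluster to show a malicious retweeting pattern. The cluster labels are assumed to come from some density-based clustering that marks noise points with -1, and `flag_large_clusters`, `min_size`, and the toy assignment are hypothetical.

```python
from collections import Counter


def flag_large_clusters(cluster_of, min_size=3, noise_label=-1):
    """Return the accounts that sit in clusters of at least `min_size`.

    cluster_of: dict mapping account id -> cluster label.
    Accounts carrying `noise_label` are never flagged, however many there are.
    """
    sizes = Counter(
        label for label in cluster_of.values() if label != noise_label
    )
    return {
        account
        for account, label in cluster_of.items()
        if label != noise_label and sizes[label] >= min_size
    }


# Hypothetical clustering output: one cluster of 3, one singleton, 3 noise points.
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": -1, "f": -1, "g": -1}
flagged = flag_large_clusters(cluster_of, min_size=3)
print(sorted(flagged))
```

In practice the size threshold would need to be tuned, and each large cluster would still need to be checked against the suspicious patterns before its accounts are treated as bots.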
Fake News Analysis and Graph Classification on a COVID-19 Twitter Dataset
Earlier research has shown that the spread of fake news through social media can have a strongly negative impact on society and on individuals. In this work we aim to study the spread of fake news compared to real news in a social network. We do that by performing classical social network analysis to discover various characteristics, and by formulating the problem as a binary classification task in which graphs model the spread of fake and real news. For our experiments we rely on how news propagated through a popular social media service, Twitter, during the pandemic caused by the COVID-19 virus. In the past, several other approaches have classified news as fake or real by deploying various graph embedding and deep learning techniques.
In this project we focus on developing a dataset that contains tweets specific to COVID-19, starting with text mining on the content of the tweets. Further, we create graphs of the fake and real news along with their retweets and followers, and work on these graphs. We perform social network analysis and compare their characteristics. We study the propagation of fake and real news among users using community detection algorithms on the graphs. Finally, we create a model by deploying the Weisfeiler-Lehman graph kernel for graph classification on our labeled dataset. The model is able to predict whether a new article is real or fake based on how the corresponding graph of retweets and followers is connected.
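To make the Weisfeiler-Lehman graph kernel mentioned above more concrete, here is a minimal sketch of the subtree variant. Each node label is repeatedly refined with the sorted labels of its neighbours, and two graphs are compared by the dot product of their label counts. The graph encoding (adjacency dicts with string labels), the function names, and the two toy propagation graphs are assumptions for illustration. The classifier that would sit on top of the kernel is omitted.

```python
from collections import Counter


def wl_features(adj, labels, iterations=2):
    """Count node labels over `iterations` rounds of WL refinement.

    adj: dict node -> list of neighbours (undirected, every node is a key).
    labels: dict node -> initial string label (no commas or parentheses).
    """
    current = dict(labels)
    features = Counter(current.values())
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adj.items():
            neighbour_labels = sorted(current[n] for n in neighbours)
            # New label = own label plus the sorted multiset of neighbour labels.
            updated[node] = f"{current[node]}({','.join(neighbour_labels)})"
        current = updated
        features.update(current.values())
    return features


def wl_kernel(g1, g2, iterations=2):
    """WL subtree kernel: dot product of the two label-count vectors."""
    f1 = wl_features(g1[0], g1[1], iterations)
    f2 = wl_features(g2[0], g2[1], iterations)
    return sum(count * f2[label] for label, count in f1.items())


# Hypothetical propagation graphs with uniform initial labels:
# a broadcast-like star and a chain-like cascade, four users each.
star = ({0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}, {n: "u" for n in range(4)})
path = ({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}, {n: "u" for n in range(4)})

print(wl_kernel(star, star, iterations=1))
print(wl_kernel(star, path, iterations=1))
```

The resulting kernel values could be passed to any kernel-based classifier as a precomputed Gram matrix, which is one common way to use such a kernel for graph classification.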
Fake News on Twitter related to the Refugee Crisis 2016 : An exploratory case study
Master's thesis in Information systems (IS501)
Fake news has, in recent years, gained traction in the public media and as a research topic. Events such as the 2016 U.S. presidential election, Brexit, and the COVID-19 pandemic, amongst others, have seen traces of large amounts of fake news in social media. Social media sites like Twitter have enabled individuals, politicians, and companies to share content and opinions with a large number of people across the globe. This opportunity for mass communication has also led to Twitter becoming a place for fake news sharing. Various narratives by various actors partake in the same public discussions, and knowing what is true and what is fake is increasingly difficult. The purpose of this study was to examine and analyze a previously unstudied dataset of 14.3 million tweets related to the 2016 refugee crisis and attempt to find traces of fake news. The research approach chosen was an exploratory case study with mixed data analysis. The analysis focused on finding the characteristics of tweets and the most prominent topics, identifying fake news and some of the actors (webpages) spreading it, and classifying the type of fake news. To identify what content was fake, an extensive amount of literature in combination with three fact-checking services was utilized.
Analysis of Retweeting Behavior Using Topic Models
Virtual social networks intertwined with everyday life play an ever-growing role
in social and commercial phenomena. Microblogging services such as Twitter play an
important role in information exchange on the Internet, making it possible for
messages to spread within minutes. This study analyzes the spread of re-broadcast
messages (retweets) on Twitter. Using a Latent Dirichlet Allocation model to
distinguish topics, we show that the relative distance between users and the topics
contained in messages is shorter for messages that are retweeted. Using decision
trees, we assess the accuracy and usefulness of the topic-based retweet model. As a
result, we show that the topic-based model has stronger predictive power than
baseline models, on which basis we argue that this approach is suitable for
predicting retweeted messages and for further development.
Social networks are nowadays a constant presence in our lives and increasingly have a role in
important social and commercial phenomena. Microblogging services such as Twitter appear to
play an important role in the process of information dissemination on the Internet making it
possible for messages to spread virally in a matter of minutes. In this research work we study the
mechanism of re-broadcasting (called “retweeting”) information on Twitter; specifically we use
Latent Dirichlet Allocation to analyze users and messages in terms of the topics that compose
their text bodies and by means of ANOVA we are able to show that the topical distance between
users and messages is shorter for tweets that are retweeted than for those that are not. Using
Decision Tree learning we build several models in order to assess the accuracy and usefulness of
our topic-based model of retweeting. Our results show that our topic-based model slightly
outperforms a baseline prediction measure, so we conclude that such a model is indeed a valid
option to consider for predicting retweet behavior, with room for improvement.
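The notion of topical distance between a user and a message can be illustrated with a short sketch. It assumes that topic distributions have already been inferred (for example by an LDA model) and compares them with the Hellinger distance. The abstract does not say which distance measure was used, so the choice of Hellinger distance, the function name, and the toy distributions below are assumptions, not the study's data or method.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    return math.sqrt(
        0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    )


# Hypothetical topic mixtures over three topics.
user = [0.7, 0.2, 0.1]          # topics of the user's own past tweets
on_topic_tweet = [0.6, 0.3, 0.1]
off_topic_tweet = [0.1, 0.1, 0.8]

near = hellinger(user, on_topic_tweet)
far = hellinger(user, off_topic_tweet)
print(near < far)
```

Under the abstract's finding, retweeted messages would tend to sit at the smaller distances. Such a distance could also serve as an input feature to a decision tree.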
Flow of online misinformation during the peak of the COVID-19 pandemic in Italy
The COVID-19 pandemic has affected every human activity and, because of
the urgency of finding proper responses to such an unprecedented emergency,
it generated a widespread societal debate. The online version of this discussion
was not exempt from d/misinformation campaigns but, unlike what had been
witnessed in other debates, the flow of false information about COVID-19,
intentional or not, put public health at severe risk, reducing
the effectiveness of governments' countermeasures. In the present manuscript,
we study the effective impact of misinformation in the Italian societal debate
on Twitter during the pandemic, focusing on the various discursive communities.
In order to extract the discursive communities, we focus on verified users,
i.e. accounts whose identity is officially certified by Twitter. We thus infer
the various discursive communities based on how verified users are perceived by
standard ones: if two verified accounts are considered similar by
unverified ones, we link them in the network of certified accounts. We first
observe that, despite being a mostly scientific subject, the COVID-19 discussion
shows a clear division into what turn out to be different political groups. At this
point, by using a commonly available fact-checking software (NewsGuard), we
assess the reputation of the pieces of news exchanged. We filter the network of
retweets (i.e. users re-broadcasting the same elementary piece of information,
or tweet) from random noise and check the presence of messages displaying a
URL. The impact of misinformation posts reaches 22.1% in the right and
center-right wing community, and its contribution is even stronger in absolute
numbers, due to the activity of this group: 96% of all non-reputable URLs
shared by political groups come from this community.
Comment: 25 pages, 4 figures. The Abstract, the Introduction, the Results, the
Conclusions and the Methods were substantially rewritten. The plot of the
network has been changed, as well as table
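The idea of linking two verified accounts when unverified users treat them as similar can be sketched as a simple projection. The version below links two verified accounts when they share at least `min_shared` unverified retweeters. This plain co-occurrence threshold stands in for whatever similarity test the study actually applies. The use of retweets as the interaction, the function name, the threshold, and the toy data are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def project_verified(retweets, verified, min_shared=2):
    """Link verified accounts that share enough unverified retweeters.

    retweets: iterable of (retweeter, author) pairs.
    verified: set of verified account ids.
    Returns a dict mapping (account_a, account_b) -> number of shared retweeters.
    """
    audience = defaultdict(set)
    for retweeter, author in retweets:
        # Keep only unverified users retweeting verified accounts.
        if author in verified and retweeter not in verified:
            audience[author].add(retweeter)
    links = {}
    for a, b in combinations(sorted(audience), 2):
        shared = len(audience[a] & audience[b])
        if shared >= min_shared:
            links[(a, b)] = shared
    return links


# Hypothetical data: V1 and V2 share two retweeters, V2 and V3 only one.
verified = {"V1", "V2", "V3"}
retweets = [("u1", "V1"), ("u1", "V2"), ("u2", "V1"), ("u2", "V2"),
            ("u3", "V2"), ("u3", "V3"), ("u1", "V1")]
links = project_verified(retweets, verified)
print(links)
```

Communities could then be extracted from the resulting network of verified accounts with any standard community detection method.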
- …