Who let the trolls out? Towards understanding state-sponsored trolls
Recent evidence has emerged of coordinated campaigns by state-sponsored actors to manipulate public opinion on the Web. Campaigns revolving around major political events are enacted via mission-focused "trolls." While trolls are involved in spreading disinformation on social media, there is little understanding of how they operate, what type of content they disseminate, how their strategies evolve over time, and how they influence the Web's information ecosystem. In this paper, we begin to address this gap by analyzing 10M posts by 5.5K Twitter and Reddit users identified as Russian and Iranian state-sponsored trolls. We compare the behavior of each group of state-sponsored trolls, focusing on how their strategies change over time, the different campaigns they embark on, and the differences between trolls operated by Russia and Iran. Among other things, we find: 1) that Russian trolls were pro-Trump while Iranian trolls were anti-Trump; 2) evidence that the campaigns undertaken by such actors are influenced by real-world events; and 3) that the behavior of such actors is not consistent over time, hence detection is not straightforward. Using Hawkes Processes, we quantify the influence these accounts have in pushing URLs to four platforms: Twitter, Reddit, 4chan's Politically Incorrect board (/pol/), and Gab. In general, Russian trolls were more influential and efficient in pushing URLs to all the other platforms, with the exception of /pol/, where Iranian trolls were more influential. Finally, we release our source code to ensure the reproducibility of our results and to encourage other researchers to work on understanding other emerging kinds of state-sponsored troll accounts on Twitter.
https://arxiv.org/pdf/1811.03130.pdf (Accepted manuscript)
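As an illustration of the Hawkes-based influence analysis, the sketch below fits a multivariate exponential Hawkes process by maximum likelihood, where the fitted cross-excitation weight alpha[i, j] estimates how strongly events (URL posts) on platform j trigger subsequent events on platform i. This is a minimal sketch, not the paper's released pipeline: the two-platform setup, synthetic timestamps, and shared decay rate beta are all assumptions made for illustration.

import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, events, T, beta=1.0):
    # events: one sorted array of event times per platform; T: observation window.
    D = len(events)
    mu = params[:D]                    # baseline intensities
    alpha = params[D:].reshape(D, D)   # alpha[i, j]: excitation of platform i by j
    ll = 0.0
    for i in range(D):
        for t in events[i]:
            lam = mu[i]
            for j in range(D):
                past = events[j][events[j] < t]
                lam += alpha[i, j] * beta * np.exp(-beta * (t - past)).sum()
            ll += np.log(max(lam, 1e-12))
        ll -= mu[i] * T  # compensator: integral of the intensity over [0, T]
        for j in range(D):
            ll -= alpha[i, j] * (1 - np.exp(-beta * (T - events[j]))).sum()
    return -ll

rng = np.random.default_rng(0)
events = [np.sort(rng.uniform(0, 100, 40)),   # e.g., troll URL post times on Twitter
          np.sort(rng.uniform(0, 100, 25))]   # e.g., URL appearance times on Reddit
D = len(events)
x0 = np.concatenate([np.full(D, 0.1), np.full(D * D, 0.05)])
res = minimize(neg_log_likelihood, x0, args=(events, 100.0),
               bounds=[(1e-6, None)] * len(x0), method="L-BFGS-B")
mu_hat, alpha_hat = res.x[:D], res.x[D:].reshape(D, D)
print(alpha_hat)  # off-diagonal entries: estimated cross-platform influence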
The Web of False Information: Rumors, Fake News, Hoaxes, Clickbait, and Various Other Shenanigans
A new era of Information Warfare has arrived. Various actors, including state-sponsored ones, are weaponizing information on Online Social Networks to run false information campaigns with targeted manipulation of public opinion on specific topics. These false information campaigns can have dire consequences for the public: altering their opinions and actions, especially with respect to critical world events like major elections. Evidently, the problem of false information on the Web is a crucial one, and needs increased public awareness, as well as immediate attention from law enforcement agencies, public institutions, and, in particular, the research community. In this paper, we take a step in this direction by providing a typology of the Web's false information ecosystem, comprising various types of false information, actors, and their motives. We report a comprehensive overview of existing research on the false information ecosystem by identifying several lines of work: 1) how the public perceives false information; 2) understanding the propagation of false information; 3) detecting and containing false information on the Web; and 4) false information on the political stage. In this work, we pay particular attention to political false information as: 1) it can have dire consequences for the community (e.g., when election outcomes are altered) and 2) previous work shows that this type of false information propagates faster and further than other types of false information. Finally, for each of these lines of work, we report several future research directions that can help us better understand and mitigate the emerging problem of false information dissemination on the Web.
Large scale crowdsourcing and characterization of Twitter abusive behavior
In recent years, online social networks have suffered an increase in sexism, racism, and other types of aggressive and cyberbullying behavior, often manifested through offensive, abusive, or hateful language. Past scientific work focused on studying these forms of abusive activity on popular online social networks, such as Facebook and Twitter. Building on such work, we present an eight-month study of the various forms of abusive behavior on Twitter, in a holistic fashion. Departing from past work, we examine a wide variety of labeling schemes, which cover different forms of abusive behavior. We propose an incremental and iterative methodology that leverages the power of crowdsourcing to annotate a large collection of tweets with a set of abuse-related labels. By applying our methodology and performing statistical analysis for label merging or elimination, we identify a reduced but robust set of labels to characterize abuse-related tweets. Finally, we offer a characterization of our annotated dataset of 80 thousand tweets, which we make publicly available for further scientific exploration.
Accepted manuscript
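To make the annotation pipeline concrete, the sketch below aggregates crowdsourced labels per tweet by majority vote and counts how often pairs of labels are assigned to the same tweet; frequently co-occurring labels are natural candidates for merging. This is a minimal sketch under assumed label names and a toy annotation set, not the authors' exact statistical procedure.

from collections import Counter
from itertools import combinations

annotations = {  # tweet_id -> labels assigned by five crowd workers (toy data)
    "t1": ["abusive", "offensive", "abusive", "hateful", "abusive"],
    "t2": ["normal", "normal", "spam", "normal", "normal"],
}

def majority_label(labels):
    # The label chosen most often wins; ties resolve arbitrarily here.
    return Counter(labels).most_common(1)[0][0]

aggregated = {tid: majority_label(ls) for tid, ls in annotations.items()}

pair_counts = Counter()  # how often two labels co-occur on the same tweet
for ls in annotations.values():
    for pair in combinations(sorted(set(ls)), 2):
        pair_counts[pair] += 1

print(aggregated)                  # {'t1': 'abusive', 't2': 'normal'}
print(pair_counts.most_common(3))  # highly co-occurring pairs: merge candidates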
Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models
Misinformation on YouTube is a significant concern, necessitating robust detection strategies. In this paper, we introduce a novel methodology for video classification, focusing on the veracity of the content. We convert the conventional video classification task into a text classification task by leveraging the textual content derived from the video transcripts. We employ advanced machine learning techniques like transfer learning to solve the classification challenge. Our approach incorporates two forms of transfer learning: (a) fine-tuning base transformer models such as BERT, RoBERTa, and ELECTRA, and (b) few-shot learning using the sentence-transformers MPNet and RoBERTa-large. We apply the trained models to three datasets: (a) YouTube vaccine-misinformation videos, (b) YouTube pseudoscience videos, and (c) a Fake-News dataset (a collection of articles). Including the Fake-News dataset extends the evaluation of our approach beyond YouTube videos. Using these datasets, we evaluate the models' ability to distinguish valid information from misinformation. The fine-tuned models yield a Matthews Correlation Coefficient > 0.81, accuracy > 0.90, and F1 score > 0.90 on two of the three datasets. Interestingly, the few-shot models outperform the fine-tuned ones by 20% in both accuracy and F1 score on the YouTube pseudoscience dataset, highlighting the potential utility of this approach, especially in the context of limited training data.
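The fine-tuning form of transfer learning described here corresponds to a standard transformer sequence-classification recipe; a minimal sketch using Hugging Face transformers follows. The tiny in-memory dataset, hyperparameters, and the choice of roberta-base are assumptions for illustration; swapping in BERT or ELECTRA would only change the model name.

from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

texts = ["first video transcript ...", "second video transcript ..."]
labels = [0, 1]  # 0 = valid information, 1 = misinformation
ds = Dataset.from_dict({"text": texts, "label": labels})

tok = AutoTokenizer.from_pretrained("roberta-base")
ds = ds.map(lambda b: tok(b["text"], truncation=True,
                          padding="max_length", max_length=512),
            batched=True)

model = AutoModelForSequenceClassification.from_pretrained("roberta-base",
                                                           num_labels=2)
args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=8)
Trainer(model=model, args=args, train_dataset=ds).train()

Evaluating against the reported metrics would then use, e.g., sklearn.metrics.matthews_corrcoef and f1_score on held-out predictions.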
"It is just a flu": {A}ssessing the Effect of Watch History on {YouTube}'s Pseudoscientific Video Recommendations
YouTube has revolutionized the way people discover and consume videos, becoming one of the primary news sources for Internet users. Since content on YouTube is generated by its users, the platform is particularly vulnerable to misinformative and conspiratorial videos. Even worse, the role played by YouTube's recommendation algorithm in unwittingly promoting questionable content is not well understood, and could potentially exacerbate the problem. This can have dire real-world consequences, especially when pseudoscientific content is promoted to users at critical times, e.g., during the COVID-19 pandemic. In this paper, we set out to characterize and detect pseudoscientific misinformation on YouTube. We collect 6.6K videos related to COVID-19, the flat Earth theory, and the anti-vaccination and anti-mask movements; using crowdsourcing, we annotate them as pseudoscience, legitimate science, or irrelevant. We then train a deep learning classifier to detect pseudoscientific videos with an accuracy of 76.1%. Next, we quantify user exposure to this content on various parts of the platform (i.e., a user's homepage, recommended videos while watching a specific video, or search results) and how this exposure changes based on the user's watch history. We find that YouTube's recommendation algorithm is more aggressive in suggesting pseudoscientific content when users are searching for specific topics, while such recommendations are less common on a user's homepage or when actively watching pseudoscientific videos. Finally, we shed light on how a user's watch history substantially affects the type of recommended videos.
"how over is it?" Understanding the Incel Community on YouTube
YouTube is by far the largest host of user-generated video content worldwide. Alas, the platform has also come under fire for hosting inappropriate, toxic, and hateful content. One community that has often been linked to sharing and publishing hateful and misogynistic content is the Involuntary Celibates (Incels), a loosely defined movement ostensibly focusing on men's issues. In this paper, we set out to analyze the Incel community on YouTube by focusing on this community's evolution over the last decade and understanding whether YouTube's recommendation algorithm steers users towards Incel-related videos. We collect videos shared on Incel communities within Reddit and perform a data-driven characterization of the content posted on YouTube.
Among other things, we find that the Incel community on YouTube is gaining traction and that, during the last decade, the number of Incel-related videos and comments has risen substantially. We also find that users have a 6.3% chance of being suggested an Incel-related video by YouTube's recommendation algorithm within five hops when starting from a non-Incel-related video. Overall, our findings paint an alarming picture of online radicalization: not only is Incel activity increasing over time, but platforms may also play an active role in steering users towards such extreme content.
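The five-hop figure can be read as a random-walk measurement over the recommendation graph; the sketch below estimates such a rate by simulation. The toy graph, labels, and hop/walk counts are assumptions; the paper's measurement crawls live YouTube recommendations rather than a static graph.

import random

recs = {  # toy recommendation graph: video_id -> recommended video_ids
    "a": ["b", "c"], "b": ["c", "x"], "c": ["a", "b"], "x": ["x", "b"],
}
incel_related = {"x"}  # assumed labels for the toy graph

def hit_rate(start, hops=5, walks=10_000):
    # Fraction of random walks reaching a labeled video within `hops` steps.
    hits = 0
    for _ in range(walks):
        v = start
        for _ in range(hops):
            v = random.choice(recs[v])
            if v in incel_related:
                hits += 1
                break
    return hits / walks

print(hit_rate("a"))  # analogous to the reported 6.3%, but on toy data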
What is Gab? A bastion of free speech or an alt-right echo chamber?
H2020 Marie Skłodowska-Curie Action