87 research outputs found
Analyzing the Digital Traces of Political Manipulation: The 2016 Russian Interference Twitter Campaign
Until recently, social media was seen to promote democratic discourse on
social and political issues. However, this powerful communication platform has
come under scrutiny for allowing hostile actors to exploit online discussions
in an attempt to manipulate public opinion. A case in point is the U.S.
Congress's ongoing investigation of Russian interference in the 2016 U.S.
election campaign, in which Russia stands accused of using trolls (malicious
accounts created to manipulate opinion) and bots to spread misinformation and
politically biased
information. In this study, we explore the effects of this manipulation
campaign, taking a closer look at users who re-shared the posts produced on
Twitter by the Russian troll accounts publicly disclosed by the U.S. Congress
investigation. We collected a dataset with over 43 million election-related
posts shared on Twitter between September 16 and October 21, 2016, by about 5.7
million distinct users. This dataset included accounts associated with the
identified Russian trolls. We use label propagation to infer the ideology of
all users based on the news sources they shared. This method enables us to
classify a large number of users as liberal or conservative with precision and
recall above 90%. Conservatives retweeted Russian trolls about 31 times more
often than liberals and produced about 36 times more tweets. Additionally, most
retweets of troll content originated from two Southern states: Tennessee and Texas.
Using state-of-the-art bot detection techniques, we estimated that about 4.9%
and 6.2% of liberal and conservative users respectively were bots. Text
analysis on the content shared by trolls reveals that they had a mostly
conservative, pro-Trump agenda. Although an ideologically broad swath of
Twitter users was exposed to Russian trolls in the period leading up to the
2016 U.S. presidential election, it was mainly conservatives who helped amplify
their message.
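The label propagation method above can be sketched in miniature. This is a minimal illustration, not the paper's implementation: the graph here links users who shared the same news sources, and the edges and seed labels are toy data invented for the example.

```python
# Minimal sketch of label propagation for inferring user ideology.
# Assumption: an undirected graph whose edges connect users with shared
# news sources; 'edges' and 'seeds' below are toy data, not the paper's.
from collections import Counter, defaultdict

def label_propagation(edges, seeds, n_iters=10):
    """Spread seed labels ('liberal'/'conservative') to unlabeled nodes."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    labels = dict(seeds)
    for _ in range(n_iters):
        updated = dict(labels)
        for node in graph:
            if node in seeds:
                continue  # seed labels stay fixed
            votes = Counter(labels[n] for n in graph[node] if n in labels)
            if votes:
                updated[node] = votes.most_common(1)[0][0]  # majority vote
        labels = updated
    return labels

edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f")]
seeds = {"a": "liberal", "d": "conservative"}
print(label_propagation(edges, seeds))
```

Each unlabeled user repeatedly adopts the majority label among its neighbors, so seed labels percolate outward through the sharing graph.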
Predicting Cyber Events by Leveraging Hacker Sentiment
Recent high-profile cyber attacks exemplify why organizations need better
cyber defenses. Cyber threats are hard to accurately predict because attackers
usually try to mask their traces. However, they often discuss exploits and
techniques on hacking forums. The community behavior of hackers may provide
insights into groups' collective malicious activity. We propose a novel
approach to predict cyber events using sentiment analysis. We test our approach
using cyber attack data from two major business organizations. We consider three
types of events: malicious software installation, malicious destination visits,
and malicious emails that surpassed the target organizations' defenses. We
construct predictive signals by applying sentiment analysis on hacker forum
posts to better understand hacker behavior. We analyze over 400K posts
generated between January 2016 and January 2018 on over 100 hacking forums on
both the surface web and the dark web. We find that some forums have
significantly more predictive power than others. Sentiment-based models that
leverage specific forums can outperform state-of-the-art deep learning and
time-series models at forecasting cyber attacks weeks ahead of the events.
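A sentiment-based predictive signal of the kind described can be sketched as follows. This is a toy illustration under loud assumptions: the word-list sentiment scorer, the threshold, and the sample posts are all invented here, standing in for the paper's far richer sentiment models and forum data.

```python
# Minimal sketch of a sentiment-based predictive signal from forum posts.
# Assumptions: a tiny hand-made word list scores each post, and a day is
# flagged when mean sentiment crosses a threshold; all data below is toy.
NEGATIVE = {"exploit", "attack", "leak", "breach", "malware"}
POSITIVE = {"patch", "fix", "secure", "defend"}

def post_sentiment(post):
    """Score a post: hostile words minus benign words, per word."""
    words = post.lower().split()
    score = sum(w in NEGATIVE for w in words) - sum(w in POSITIVE for w in words)
    return score / max(len(words), 1)

def daily_signal(posts_by_day, threshold=0.1):
    """Flag days whose mean post sentiment crosses the threshold."""
    signal = {}
    for day, posts in posts_by_day.items():
        mean = sum(post_sentiment(p) for p in posts) / len(posts)
        signal[day] = mean > threshold
    return signal

posts = {
    "2016-01-01": ["new exploit for the breach", "malware attack dump"],
    "2016-01-02": ["patch released today", "how to secure your server"],
}
print(daily_signal(posts))
```

In practice the flagged days would feed a forecasting model as features rather than serve as predictions on their own.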
Dynamics of Content Quality in Collaborative Knowledge Production
We explore the dynamics of user performance in collaborative knowledge
production by studying the quality of answers to questions posted on Stack
Exchange. We propose four indicators of answer quality: answer length, the
number of code lines and hyperlinks to external web content it contains, and
whether it is accepted by the asker as the most helpful answer to the question.
Analyzing millions of answers posted over the period from 2008 to 2014, we
uncover regular short-term and long-term changes in quality. In the short term,
quality deteriorates over the course of a single session, with each successive
answer becoming shorter, containing fewer code lines and links, and being less
likely to be accepted. In contrast, performance improves over the long term, with more
experienced users producing higher quality answers. These trends are not a
consequence of data heterogeneity, but rather have a behavioral origin. Our
findings highlight the complex interplay between short-term deterioration in
performance, potentially due to mental fatigue or attention depletion, and
long-term performance improvement due to learning and skill acquisition, and
its impact on the quality of user-generated content.
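The four quality indicators named above can be computed mechanically from an answer's text. The sketch below is an assumption-laden illustration: the answer format (a dict with a raw body and an accepted flag), the four-space-indent convention for code, and the sample answer are all invented here.

```python
# Minimal sketch of the four answer-quality indicators: length, code
# lines, hyperlinks, and acceptance. Assumption: answers arrive as dicts
# with markdown-style text where code lines are indented four spaces.
import re

def quality_indicators(answer):
    """Return the four quality indicators for one answer."""
    text = answer["body"]
    return {
        "length": len(text.split()),                      # answer length in words
        "code_lines": sum(1 for line in text.splitlines()
                          if line.startswith("    ")),    # indented code lines
        "links": len(re.findall(r"https?://\S+", text)),  # hyperlinks to external content
        "accepted": answer["accepted"],                   # accepted by the asker
    }

answer = {
    "body": "Use a dict lookup:\n    d = {}\n    d['k'] = 1\n"
            "See https://docs.python.org/3/ for details.",
    "accepted": True,
}
print(quality_indicators(answer))
```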
Who Falls for Online Political Manipulation?
Social media, once hailed as a vehicle for democratization and the promotion
of positive social change across the globe, are under attack for becoming a
tool of political manipulation and the spread of disinformation. A case in point is
the alleged use of trolls by Russia to spread malicious content in Western
elections. This paper examines the Russian interference campaign in the 2016 US
presidential election on Twitter. Our aim is twofold: first, we test whether
predicting users who spread trolls' content is feasible in order to gain
insight on how to contain their influence in the future; second, we identify
features that are most predictive of users who either intentionally or
unintentionally play a vital role in spreading this malicious content. We
collected a dataset with over 43 million election-related posts shared on
Twitter between September 16 and November 9, 2016, by about 5.7 million users.
This dataset includes accounts associated with the Russian trolls identified by
the US Congress. The proposed models identify users who spread the trolls'
content very accurately (average AUC score of 96%, using 10-fold
cross-validation). We show that political ideology, bot likelihood scores, and
some activity-related account metadata are the most predictive features of
whether a user spreads trolls' content or not.
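The AUC metric used to evaluate such spreader-prediction models can be illustrated directly. This sketch is not the paper's classifier: the linear scoring rule, its weights, and the five toy users with bot-likelihood and ideology features are all assumptions made for the example.

```python
# Minimal sketch of evaluating a spreader-detection model with AUC.
# Assumptions: toy users with (bot_likelihood, conservative_ideology)
# features and an invented linear score; not the paper's trained model.

def auc(labels, scores):
    """Area under the ROC curve via pairwise score comparisons."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Probability a random positive outranks a random negative (ties count half).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Each user: ((bot_likelihood, conservative_ideology), spread_troll_content)
users = [
    ((0.9, 1), 1), ((0.7, 1), 1), ((0.2, 0), 0), ((0.1, 0), 0), ((0.4, 1), 0),
]
scores = [0.6 * bot + 0.4 * ideo for (bot, ideo), _ in users]
labels = [y for _, y in users]
print(round(auc(labels, scores), 3))
```

An AUC of 96%, as reported, means a randomly chosen spreader is ranked above a randomly chosen non-spreader 96% of the time.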
- …