87 research outputs found

    Analyzing the Digital Traces of Political Manipulation: The 2016 Russian Interference Twitter Campaign

    Until recently, social media was seen to promote democratic discourse on social and political issues. However, this powerful communication platform has come under scrutiny for allowing hostile actors to exploit online discussions in an attempt to manipulate public opinion. A case in point is the ongoing U.S. Congress investigation of Russian interference in the 2016 U.S. election campaign, with Russia accused of using trolls (malicious accounts created to manipulate) and bots to spread misinformation and politically biased information. In this study, we explore the effects of this manipulation campaign, taking a closer look at users who re-shared the posts produced on Twitter by the Russian troll accounts publicly disclosed by the U.S. Congress investigation. We collected a dataset with over 43 million election-related posts shared on Twitter between September 16 and October 21, 2016, by about 5.7 million distinct users. This dataset included accounts associated with the identified Russian trolls. We use label propagation to infer the ideology of all users based on the news sources they shared. This method enables us to classify a large number of users as liberal or conservative with precision and recall above 90%. Conservatives retweeted Russian trolls about 31 times more often than liberals and produced 36 times more tweets. Additionally, most retweets of troll content originated from two Southern states: Tennessee and Texas. Using state-of-the-art bot detection techniques, we estimated that about 4.9% and 6.2% of liberal and conservative users respectively were bots. Text analysis on the content shared by trolls reveals that they had a mostly conservative, pro-Trump agenda. Although an ideologically broad swath of Twitter users was exposed to Russian trolls in the period leading up to the 2016 U.S. Presidential election, it was mainly conservatives who helped amplify their message.
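
The label propagation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the graph or the update rule, so everything here is an assumption: users are nodes in an undirected interaction graph, a few seed users carry labels derived from the news sources they shared, and every other user repeatedly adopts the majority label among its labeled neighbors. The function name `propagate_labels` and its parameters are hypothetical.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=20):
    """Spread seed labels over an undirected graph by neighbor majority vote.

    edges: iterable of (user_a, user_b) pairs (assumed interaction graph).
    seeds: dict mapping a user to a known label, e.g. "liberal".
    Returns a dict with a label for every user the seeds could reach.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    labels = dict(seeds)
    for _ in range(iterations):
        updated = dict(labels)
        for node in neighbors:
            if node in seeds:
                continue  # seed labels stay fixed
            counts = defaultdict(int)
            for nb in neighbors[node]:
                if nb in labels:
                    counts[labels[nb]] += 1
            if counts:
                # Ties are broken alphabetically so the result is deterministic.
                updated[node] = max(sorted(counts), key=counts.get)
        if updated == labels:
            break  # converged
        labels = updated
    return labels
```

Precision and recall would then be measured against a held-out set of users whose ideology is known independently; that evaluation is not shown here.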

    Predicting Cyber Events by Leveraging Hacker Sentiment

    Recent high-profile cyber attacks exemplify why organizations need better cyber defenses. Cyber threats are hard to accurately predict because attackers usually try to mask their traces. However, they often discuss exploits and techniques on hacking forums. The community behavior of the hackers may provide insights into groups' collective malicious activity. We propose a novel approach to predict cyber events using sentiment analysis. We test our approach using cyber attack data from 2 major business organizations. We consider 3 types of events: malicious software installation, malicious destination visits, and malicious emails that surpassed the target organizations' defenses. We construct predictive signals by applying sentiment analysis to hacker forum posts to better understand hacker behavior. We analyze over 400K posts generated between January 2016 and January 2018 on over 100 hacking forums on both the surface web and the Dark Web. We find that some forums have significantly more predictive power than others. Sentiment-based models that leverage specific forums can outperform state-of-the-art deep learning and time-series models on forecasting cyber attacks weeks ahead of the events.
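
As a rough illustration of how a per-forum sentiment signal might be built from forum posts, the sketch below scores each post with a toy word lexicon and averages the scores per forum and ISO week. The abstract does not describe the sentiment method, so the lexicon, the scoring rule, and the names `post_sentiment` and `weekly_signal` are all hypothetical stand-ins.

```python
from collections import defaultdict
from datetime import date

# Toy lexicon, purely illustrative; a real study would use an established one.
POSITIVE = {"works", "success", "easy"}
NEGATIVE = {"patched", "failed", "blocked"}


def post_sentiment(text):
    """Return (positive hits - negative hits) / word count for one post."""
    words = text.lower().split()
    if not words:
        return 0.0
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return score / len(words)


def weekly_signal(posts):
    """Average post sentiment per (forum, ISO year, ISO week).

    posts: iterable of (datetime.date, forum_name, text) tuples.
    """
    buckets = defaultdict(list)
    for day, forum, text in posts:
        iso = day.isocalendar()
        buckets[(forum, iso[0], iso[1])].append(post_sentiment(text))
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}
```

Each forum's weekly series could then be fed to a forecasting model as one candidate predictor, which makes it possible to compare forums by predictive power.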

    Dynamics of Content Quality in Collaborative Knowledge Production

    We explore the dynamics of user performance in collaborative knowledge production by studying the quality of answers to questions posted on Stack Exchange. We propose four indicators of answer quality: answer length, the number of code lines and hyperlinks to external web content it contains, and whether it is accepted by the asker as the most helpful answer to the question. Analyzing millions of answers posted over the period from 2008 to 2014, we uncover regular short-term and long-term changes in quality. In the short term, quality deteriorates over the course of a single session, with each successive answer becoming shorter, with fewer code lines and links, and less likely to be accepted. In contrast, performance improves over the long term, with more experienced users producing higher quality answers. These trends are not a consequence of data heterogeneity, but rather have a behavioral origin. Our findings highlight the complex interplay between short-term deterioration in performance, potentially due to mental fatigue or attention depletion, and long-term performance improvement due to learning and skill acquisition, and its impact on the quality of user-generated content.
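
The four quality indicators named in this abstract can be computed from a single answer roughly as follows. This is a sketch under stated assumptions: the answer body is HTML with code in `<pre><code>` blocks, length is measured in words of prose outside code, and only `http`/`https` anchors count as links. The function name `quality_indicators` and the dictionary keys are hypothetical.

```python
import re

CODE_BLOCK = re.compile(r"<pre><code>(.*?)</code></pre>", re.DOTALL)
LINK = re.compile(r'<a\s[^>]*href="https?://', re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")


def quality_indicators(body_html, accepted):
    """Return the four indicators for one answer as a dict."""
    code_blocks = CODE_BLOCK.findall(body_html)
    # Count only non-blank lines inside code blocks.
    code_lines = sum(
        len([line for line in block.splitlines() if line.strip()])
        for block in code_blocks
    )
    # Drop code blocks, then strip the remaining tags to get the prose.
    prose = TAG.sub(" ", CODE_BLOCK.sub(" ", body_html))
    return {
        "length_words": len(prose.split()),
        "code_lines": code_lines,
        "links": len(LINK.findall(body_html)),
        "accepted": bool(accepted),
    }
```

Tracking these values against an answer's position within a session, or against the author's cumulative answer count, would expose the short-term and long-term trends the abstract describes.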

    Who Falls for Online Political Manipulation?

    Social media, once hailed as a vehicle for democratization and the promotion of positive social change across the globe, are under attack for becoming a tool of political manipulation and spread of disinformation. A case in point is the alleged use of trolls by Russia to spread malicious content in Western elections. This paper examines the Russian interference campaign in the 2016 US presidential election on Twitter. Our aim is twofold: first, we test whether predicting users who spread trolls' content is feasible in order to gain insight into how to contain their influence in the future; second, we identify features that are most predictive of users who either intentionally or unintentionally play a vital role in spreading this malicious content. We collected a dataset with over 43 million election-related posts shared on Twitter between September 16 and November 9, 2016, by about 5.7 million users. This dataset includes accounts associated with the Russian trolls identified by the US Congress. The proposed models are able to very accurately identify users who spread the trolls' content (average AUC score of 96%, using 10-fold cross-validation). We show that political ideology, bot likelihood scores, and some activity-related account metadata are the most predictive features of whether a user spreads trolls' content or not.
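
For readers unfamiliar with the reported metric, the sketch below computes AUC as the probability that a randomly chosen positive example (a user who spread troll content) receives a higher score than a randomly chosen negative one, and shows one simple way to split indices into folds. It is not the authors' evaluation code; the names `auc` and `k_fold_indices` and the round-robin fold assignment are assumptions.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10):
    """Assign indices 0..n-1 to k folds in round-robin order."""
    return [list(range(i, n, k)) for i in range(k)]
```

An average AUC is obtained by training on all folds but one, scoring the held-out fold with `auc`, and taking the mean over the k held-out folds.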