25 research outputs found

    Bot Electioneering Volume: Visualizing Social Bot Activity During Elections

    It has been widely recognized that automated bots may have a significant impact on the outcomes of national events. It is important to raise public awareness about the threat of bots on social media during these important events, such as the 2018 US midterm election. To this end, we deployed a web application to help the public explore the activities of likely bots on Twitter on a daily basis. The application, called Bot Electioneering Volume (BEV), reports on the level of likely bot activities and visualizes the topics targeted by them. With this paper we release our code base for the BEV framework, with the goal of facilitating future efforts to combat malicious bots on social media. Comment: 3 pages, 3 figures. In submissio

    Scalable and Generalizable Social Bot Detection through Data Selection

    Efficient and reliable social bot classification is crucial for detecting information manipulation on social media. Despite rapid development, state-of-the-art bot detection models still face generalization and scalability challenges, which greatly limit their applications. In this paper we propose a framework that uses minimal account metadata, enabling efficient analysis that scales up to handle the full stream of public tweets of Twitter in real time. To ensure model accuracy, we build a rich collection of labeled datasets for training and validation. We deploy a strict validation system so that model performance on unseen datasets is also optimized, in addition to traditional cross-validation. We find that strategically selecting a subset of training data yields better model accuracy and generalization than exhaustively training on all available data. Thanks to the simplicity of the proposed model, its logic can be interpreted to provide insights into social bot characteristics. Comment: AAAI 202

    Uncovering Coordinated Networks on Social Media

    Coordinated campaigns are used to influence and manipulate social media platforms and their users, a critical challenge to the free exchange of information online. Here we introduce a general network-based framework to uncover groups of accounts that are likely coordinated. The proposed method constructs coordination networks based on arbitrary behavioral traces shared among accounts. We present five case studies of influence campaigns in the diverse contexts of U.S. elections, Hong Kong protests, the Syrian civil war, and cryptocurrencies. In each of these cases, we detect networks of coordinated Twitter accounts by examining their identities, images, hashtag sequences, retweets, and temporal patterns. The proposed framework proves to be broadly applicable to uncover different kinds of coordination across information warfare scenarios.
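The idea of building a coordination network from behavioral traces shared among accounts can be illustrated with a minimal sketch. The abstract does not state which similarity measure or threshold the authors use, so the Jaccard similarity, the `min_similarity` cutoff, and all account names below are assumptions chosen only for illustration.

```python
from itertools import combinations


def coordination_network(traces, min_similarity=0.5):
    """Link accounts whose trace sets overlap strongly.

    traces: dict mapping an account name to a set of behavioral traces
    (for example hashtags). Jaccard similarity and the threshold are
    illustrative assumptions, not the paper's stated method.
    """
    edges = {}
    for a, b in combinations(sorted(traces), 2):
        union = traces[a] | traces[b]
        if not union:
            continue
        similarity = len(traces[a] & traces[b]) / len(union)
        if similarity >= min_similarity:
            edges[(a, b)] = similarity
    return edges


def coordinated_groups(edges):
    """Return connected components of the filtered network as sets."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical accounts and hashtags, for illustration only.
example_traces = {
    "acct_a": {"#x", "#y", "#z"},
    "acct_b": {"#x", "#y", "#z"},
    "acct_c": {"#x", "#y"},
    "acct_d": {"#q"},
}
example_edges = coordination_network(example_traces, min_similarity=0.6)
example_groups = coordinated_groups(example_edges)
```

In this toy input the first three accounts share most of their hashtags and end up in one group, while the fourth account stays unconnected.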

    Massive Multi-Agent Data-Driven Simulations of the GitHub Ecosystem

    Simulating and predicting planetary-scale techno-social systems poses heavy computational and modeling challenges. The DARPA SocialSim program set the challenge to model the evolution of GitHub, a large collaborative software-development ecosystem, using massive multi-agent simulations. We describe our best performing models and our agent-based simulation framework, which we are currently extending to allow simulating other planetary-scale techno-social systems. The challenge problem measured participants' ability, given 30 months of meta-data on user activity on GitHub, to predict the next months' activity as measured by a broad range of metrics applied to ground truth, using agent-based simulation. The challenge required scaling to a simulation of roughly 3 million agents producing a combined 30 million actions, acting on 6 million repositories with commodity hardware. It was also important to use the data optimally to predict the agents' next moves. We describe the agent framework and the data analysis employed by one of the winning teams in the challenge. Six different agent models were tested based on a variety of machine learning and statistical methods. While no single method proved the most accurate on every metric, the most broadly successful agents sampled from a stationary probability distribution of actions and repositories for each agent. Two reasons for the success of these agents were their use of a distinct characterization of each agent, and that GitHub users change their behavior relatively slowly.
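An agent that samples from a stationary per-agent distribution over actions and repositories can be sketched as follows. This is only an illustration of the general idea: the class name `StationaryAgent`, the event tuples, and the use of raw historical frequencies as the distribution are assumptions, not details taken from the paper.

```python
import random
from collections import Counter


class StationaryAgent:
    """Hypothetical agent that replays its own historical event frequencies."""

    def __init__(self, history, seed=None):
        # history: list of (action, repository) pairs observed for one user.
        counts = Counter(history)
        self.events = list(counts)
        self.weights = [counts[event] for event in self.events]
        self.rng = random.Random(seed)

    def probability(self, event):
        """Empirical probability of an event under the fixed distribution."""
        total = sum(self.weights)
        if event not in self.events:
            return 0.0
        return self.weights[self.events.index(event)] / total

    def next_events(self, n):
        """Draw n future events; the distribution never changes over time."""
        return self.rng.choices(self.events, weights=self.weights, k=n)


# Hypothetical history for a single user, for illustration only.
example_history = [("PushEvent", "repo1")] * 3 + [("WatchEvent", "repo2")]
example_agent = StationaryAgent(example_history, seed=0)
example_draws = example_agent.next_events(5)
```

Giving each user its own agent and its own frequency table reflects the two points the abstract credits for the approach: a distinct characterization of each agent and slowly changing user behavior.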

    BotSlayer: DIY Real-Time Influence Campaign Detection

    BotSlayer is an application that helps track and detect potential manipulation of information spreading on Twitter. It can be used by journalists, corporations, political candidates, and civil society organizations to discover online coordinated campaigns in real time. BotSlayer uses an anomaly detection algorithm to flag hashtags, links, accounts, phrases, and media that are trending and amplified in a coordinated fashion by likely bots. A Web dashboard lets users explore the tweets and accounts associated with suspicious campaigns, visualize their spread, and search related content on multiple search engines and social media platforms. BotSlayer is easily installed and configured in the cloud. It will aid in the study and early detection of social media manipulation phenomena.

    The Hoaxy Misinformation and Fact-Checking Diffusion Network

    Massive amounts of misinformation flood social media like Twitter and Facebook. Digital misinformation includes articles about hoaxes, conspiracy theories, fake news, and other misleading claims. This content has been alleged to disrupt the public debate, leading to questions about its impact on the real world. A number of research questions have been formulated around the ways misinformation spreads, who its main purveyors are, and whether fact-checking efforts can be helpful in mitigating its diffusion. Here we release a large longitudinal dataset from Twitter, consisting of retweeted messages with links to misinformation and fact-checking articles. These data have been collected using Hoaxy (hoaxy.iuni.iu.edu), an open social media analytics platform whose goal is to provide a comprehensive picture of how digital misinformation spreads and competes with fact-checking efforts. The released dataset contains over 20 million retweets, spanning the period from May 2016 to the end of 2017. We provide basic statistics about the data and the associated diffusion networks.

    Anatomy of an online misinformation network

    Massive amounts of fake news and conspiratorial content have spread over social media before and after the 2016 US Presidential Elections despite intense fact-checking efforts. How do the spread of misinformation and fact-checking compete? What are the structural and dynamic characteristics of the core of the misinformation diffusion network, and who are its main purveyors? How can the overall amount of misinformation be reduced? To explore these questions we built Hoaxy, an open platform that enables large-scale, systematic studies of how misinformation and fact-checking spread and compete on Twitter. Hoaxy captures public tweets that include links to articles from low-credibility and fact-checking sources. We perform k-core decomposition on a diffusion network obtained from two million retweets produced by several hundred thousand accounts over the six months before the election. As we move from the periphery to the core of the network, fact-checking nearly disappears, while social bots proliferate. The number of users in the main core reaches equilibrium around the time of the election, with limited churn and increasingly dense connections. We conclude by quantifying how effectively the network can be disrupted by penalizing the most central nodes. These findings provide a first look at the anatomy of a massive online misinformation diffusion network.
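For readers unfamiliar with k-core decomposition, the following minimal sketch shows the standard peeling procedure on a small undirected graph. The abstract does not say how the retweet network is represented or which implementation the authors used, so the undirected adjacency dictionary and the node names here are assumptions for illustration.

```python
def core_numbers(adjacency):
    """Compute the core number of every node by repeated peeling.

    adjacency: dict mapping each node to the set of its neighbors
    (undirected, no self-loops). A node has core number k if it belongs
    to the k-core but not to the (k+1)-core.
    """
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Remove the node with the smallest remaining degree.
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbor in adjacency[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core


def main_core(adjacency):
    """Return the nodes with the highest core number."""
    core = core_numbers(adjacency)
    top = max(core.values())
    return {node for node, value in core.items() if value == top}


# Hypothetical graph: a triangle (a, b, c) with a pendant node d.
example_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
example_core = core_numbers(example_graph)
example_main = main_core(example_graph)
```

The pendant node is peeled first and stays in the periphery, while the triangle forms the main core. This selection is quadratic in the number of nodes and is meant for small examples; a network of the size described in the abstract would call for a bucket-based implementation.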