Better Safe Than Sorry: An Adversarial Approach to Improve Social Bot Detection
The arms race between spambots and spambot detectors is made of several cycles
(or generations): a new wave of spambots is created (and new spam is spread),
new spambot filters are derived and old spambots mutate (or evolve) to new
species. Recently, with the spread of adversarial learning, a new practice has
emerged: purposely manipulating target samples in order to build stronger
detection models. Here, we manipulate generations of Twitter
social bots, to obtain - and study - their possible future evolutions, with the
aim of eventually deriving more effective detection techniques. In detail, we
propose and experiment with a novel genetic algorithm for the synthesis of
online accounts. The algorithm creates synthetic, evolved versions of current
state-of-the-art social bots. Results demonstrate that the synthetic bots
effectively evade current detection techniques. However, they also provide all
the elements needed to improve such techniques, enabling a proactive approach to
the design of social bot detection systems.
Comment: This is the pre-final version of a paper accepted at the 11th ACM
Conference on Web Science, June 30-July 3, 2019, Boston, U
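The abstract's core idea, a genetic algorithm that evolves bot accounts until they evade a detector, can be sketched as follows. This is a minimal illustration, not the paper's method: the feature vector, the `HUMAN_RANGES` thresholds, and the rule-based detector used as the fitness signal are all hypothetical stand-ins.

```python
import random

random.seed(42)

# Each "account" is a toy feature vector: (tweets_per_day, retweet_ratio, url_ratio).
# A hypothetical detector flags features outside human-like ranges; fitness rewards
# accounts that the detector fails to flag. All thresholds are illustrative.
HUMAN_RANGES = [(0.5, 30.0), (0.0, 0.6), (0.0, 0.4)]

def detector_score(acct):
    """Fraction of features outside human-like ranges (higher = more bot-like)."""
    out = sum(1 for v, (lo, hi) in zip(acct, HUMAN_RANGES) if not lo <= v <= hi)
    return out / len(acct)

def fitness(acct):
    # Evolved bots should evade the detector: a lower detector score is better.
    return 1.0 - detector_score(acct)

def mutate(acct, scale=0.2):
    """Randomly perturb each feature, keeping values non-negative."""
    return tuple(max(0.0, v + random.uniform(-scale, scale) * (1 + v)) for v in acct)

def crossover(a, b):
    """Pick each feature from one of the two parents."""
    return tuple(random.choice(pair) for pair in zip(a, b))

def evolve(seed_bots, generations=30, pop_size=40):
    pop = list(seed_bots)
    while len(pop) < pop_size:
        pop.append(mutate(random.choice(seed_bots)))
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # elitism: keep the best half
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

# Seed bots: hyperactive, retweet-heavy, link-spamming accounts the detector flags.
seeds = [(120.0, 0.95, 0.9), (80.0, 0.99, 0.7)]
best = evolve(seeds)
print(detector_score(seeds[0]), detector_score(best))
```

Because the best half of each generation survives unchanged, the top fitness never decreases, which is what lets the synthetic generations drift away from the detector's decision boundary.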
RTbust: Exploiting Temporal Patterns for Botnet Detection on Twitter
Within OSNs, many of our supposed online friends may instead be fake
accounts called social bots, part of large groups that purposely re-share
targeted content. Here, we study retweeting behaviors on Twitter, with the
ultimate goal of detecting retweeting social bots. We collect a dataset of 10M
retweets. We design a novel visualization that we leverage to highlight benign
and malicious patterns of retweeting activity. In this way, we uncover a
'normal' retweeting pattern that is peculiar to human-operated accounts, and 3
suspicious patterns related to bot activities. Then, we propose a bot detection
technique that stems from the previous exploration of retweeting behaviors. Our
technique, called Retweet-Buster (RTbust), leverages unsupervised feature
extraction and clustering. An LSTM autoencoder converts the retweet time series
into compact and informative latent feature vectors, which are then clustered
with a hierarchical density-based algorithm. Accounts belonging to large
clusters characterized by malicious retweeting patterns are labeled as bots.
RTbust obtains excellent detection results, with F1 = 0.87, whereas competitors
achieve F1 < 0.76. Finally, we apply RTbust to a large dataset of retweets,
uncovering 2 previously unknown active botnets with hundreds of accounts.
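The shape of the RTbust pipeline (compress each account's retweet time series into a compact feature vector, cluster the vectors, and flag large clusters with bot-like regularity) can be sketched with standard-library stand-ins. The real system uses an LSTM autoencoder and hierarchical density-based clustering; here, as a hedged simplification, two summary statistics of inter-retweet gaps play the role of the latent features, and a naive proximity grouping plays the role of the clusterer.

```python
from statistics import mean, pstdev

def latent_features(retweet_times):
    """Stand-in for the autoencoder: mean and std of inter-retweet gaps."""
    gaps = [b - a for a, b in zip(retweet_times, retweet_times[1:])]
    return (mean(gaps), pstdev(gaps))

def cluster(features, eps=5.0):
    """Naive density-style grouping: vectors within eps share a cluster label."""
    labels = [-1] * len(features)
    next_label = 0
    for i, fi in enumerate(features):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j, fj in enumerate(features):
            if labels[j] == -1 and all(abs(a - b) <= eps for a, b in zip(fi, fj)):
                labels[j] = labels[i]
        next_label += 1
    return labels

def flag_bots(accounts, min_cluster=3, max_std=1.0):
    """Flag members of large clusters of near-identical, clockwork retweeters."""
    feats = {name: latent_features(ts) for name, ts in accounts.items()}
    names = list(feats)
    labels = cluster([feats[n] for n in names])
    flagged = set()
    for lab in set(labels):
        members = [n for n, l in zip(names, labels) if l == lab]
        if len(members) >= min_cluster and all(feats[n][1] <= max_std for n in members):
            flagged.update(members)
    return flagged

# Three accounts retweeting every 60s in lockstep (botnet-like) vs. two irregular humans.
accounts = {
    "bot1": [0, 60, 120, 180, 240],
    "bot2": [1, 61, 121, 181, 241],
    "bot3": [2, 62, 122, 182, 242],
    "human1": [0, 300, 320, 900, 1400],
    "human2": [10, 50, 400, 420, 2000],
}
print(flag_bots(accounts))
```

The key property the sketch preserves is that the bots are detected as a group: no single account looks suspicious in isolation, but their latent vectors collapse into one large, low-variance cluster.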
A General Language for Modeling Social Media Account Behavior
Malicious actors exploit social media to inflate stock prices, sway
elections, spread misinformation, and sow discord. To these ends, they employ
tactics that include the use of inauthentic accounts and campaigns. Methods to
detect these abuses currently rely on features specifically designed to target
suspicious behaviors. However, the effectiveness of these methods decays as
malicious behaviors evolve. To address this challenge, we propose a general
language for modeling social media account behavior. Words in this language,
called BLOC, consist of symbols drawn from distinct alphabets representing user
actions and content. The language is highly flexible and can be applied to
model a broad spectrum of legitimate and suspicious online behaviors without
extensive fine-tuning. Using BLOC to represent the behaviors of Twitter
accounts, we achieve performance comparable to or better than state-of-the-art
methods in the detection of social bots and coordinated inauthentic behavior.
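The idea of a behavioral language can be illustrated with a toy encoder. Note that the symbols and the action alphabet below are made up for this sketch; they are not BLOC's actual alphabets. The point is only the mechanism: each action maps to a symbol, an account becomes a string, and standard text techniques (here, bigram counts) then apply.

```python
# Hypothetical action alphabet, chosen for illustration only.
ACTION_ALPHABET = {"tweet": "T", "retweet": "r", "reply": "p"}

def encode(actions):
    """Turn an account's chronological action list into a behavior string."""
    return "".join(ACTION_ALPHABET.get(a, ".") for a in actions)

def bigram_profile(behavior):
    """Word-like representation: counts of adjacent symbol pairs."""
    profile = {}
    for pair in zip(behavior, behavior[1:]):
        key = "".join(pair)
        profile[key] = profile.get(key, 0) + 1
    return profile

amplifier = encode(["retweet"] * 6 + ["tweet"])               # retweet-heavy, bot-like
conversationalist = encode(["tweet", "reply", "tweet", "reply"])
print(amplifier, bigram_profile(amplifier))
```

A retweet-amplification account and a conversational account yield visibly different strings and bigram profiles, which is what lets one detector architecture cover many behaviors without per-behavior feature engineering.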
Do you really follow them? Automatic detection of credulous Twitter users
Online Social Media represent a pervasive source of information able to reach
a huge audience. Sadly, recent studies show how online social bots (automated,
often malicious accounts, populating social networks and mimicking genuine
users) are able to amplify the dissemination of (fake) information by orders of
magnitude. Using Twitter as a benchmark, in this work we focus on what we
define as credulous users, i.e., human-operated accounts with a high percentage of
bots among their followings. Being more exposed to the harmful activities of
social bots, credulous users may run the risk of being more influenced than
other users; even worse, they could unknowingly become spreaders of
misleading information (e.g., by retweeting bots). We design and develop a
supervised classifier to automatically recognize credulous users. The best
tested configuration achieves an accuracy of 93.27% and AUC-ROC of 0.93, thus
leading to positive and encouraging results.
Comment: 8 pages, 2 tables. Accepted for publication at IDEAL 2019 (20th
International Conference on Intelligent Data Engineering and Automated
Learning, Manchester, UK, 14-16 November, 2019). The present version is the
accepted version, and it is not the final published version.
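The core signal behind the definition, the fraction of bots among an account's followings, is straightforward to compute. This is a hedged sketch: the paper trains a supervised classifier on richer features, while here a single ratio and an illustrative threshold stand in for it, and the bot oracle is just a set of known bot names.

```python
def bot_ratio(followings, is_bot):
    """Fraction of an account's followings that a bot oracle labels as bots."""
    if not followings:
        return 0.0
    return sum(1 for f in followings if is_bot(f)) / len(followings)

def is_credulous(followings, is_bot, threshold=0.5):
    """Flag human-operated accounts whose followings are majority bots (illustrative rule)."""
    return bot_ratio(followings, is_bot) >= threshold

# Hypothetical data: 3 of Alice's 4 followings are known bots.
known_bots = {"spammer1", "amplifier7", "newsbotX"}
alice_follows = ["spammer1", "amplifier7", "newsbotX", "bob"]
print(bot_ratio(alice_follows, known_bots.__contains__))
```

In practice the interesting part is upstream of this ratio: labeling the followings requires running a bot detector over each of them, which is exactly why an inexpensive classifier for credulous users is useful.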
Unmasking the Web of Deceit: Uncovering Coordinated Activity to Expose Information Operations on Twitter
Social media platforms, particularly Twitter, have become pivotal arenas for
influence campaigns, often orchestrated by state-sponsored information
operations (IOs). This paper delves into the detection of key players driving
IOs by employing similarity graphs constructed from behavioral pattern data. We
show that well-known yet underutilized network properties can help
accurately identify coordinated IO drivers. Drawing from a comprehensive
dataset of 49 million tweets from six countries, which includes multiple
verified IOs, our study reveals that traditional network filtering techniques
do not consistently pinpoint IO drivers across campaigns. We first propose a
node-pruning framework that proves superior, particularly when
combining multiple behavioral indicators across different networks. Then, we
introduce a supervised machine learning model that harnesses a vector
representation of the fused similarity network. This model achieves a
precision exceeding 0.95, accurately classifies IO drivers on a global scale, and
reliably forecasts their temporal engagements. Our findings are crucial in the
fight against deceptive influence campaigns on social media, helping us better
understand and detect them.
Comment: Accepted at the 2024 ACM Web Conference
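Node pruning on a behavioral similarity graph can be sketched in a few lines. As a hedged simplification of the paper's fused multi-network framework, each node's strength is the sum of its incident similarity weights, and only the strongest fraction of nodes survives, on the intuition that coordinated IO drivers are unusually similar to one another. The graph, weights, and cutoff below are all illustrative.

```python
def node_strength(edges):
    """Sum of similarity weights incident to each node. edges: {(u, v): weight}."""
    strength = {}
    for (u, v), w in edges.items():
        strength[u] = strength.get(u, 0.0) + w
        strength[v] = strength.get(v, 0.0) + w
    return strength

def prune(edges, keep_fraction=0.5):
    """Keep only the top keep_fraction of nodes by strength."""
    strength = node_strength(edges)
    ranked = sorted(strength, key=strength.get, reverse=True)
    return set(ranked[: max(1, int(len(ranked) * keep_fraction))])

# Toy similarity graph: io1-io3 act in near-lockstep; organic users barely overlap.
edges = {
    ("io1", "io2"): 0.9, ("io1", "io3"): 0.85, ("io2", "io3"): 0.95,
    ("io1", "user1"): 0.1, ("user1", "user2"): 0.15, ("user2", "user3"): 0.05,
}
print(prune(edges))
```

Pruning by strength rather than by a global edge-weight threshold is what makes the filter robust to campaigns whose absolute similarity levels differ, one of the failure modes of traditional network filtering the abstract notes.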
A Large Scale Behavioural Analysis of Bots and Humans on Twitter
Recent research has shown a substantial active presence of bots in online social networks (OSNs). In this
paper we perform a comparative analysis of the usage and impact of bots and humans on Twitter — one
of the largest OSNs in the world. We collect a large-scale Twitter dataset and define various metrics based
on tweet metadata. Using a human annotation task we assign ‘bot’ and ‘human’ ground-truth labels to the
dataset, and compare the annotations against an online bot detection tool for evaluation. We then ask a series
of questions to discern important behavioural characteristics of bots and humans using metrics within and
among four popularity groups. From the comparative analysis we draw clear differences and interesting
similarities between the two entities.
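The comparative setup, metrics compared within and among popularity groups, can be sketched as follows. The follower-count bins, the metric (share of retweets among an account's tweets), and the record fields are illustrative assumptions, not the paper's actual definitions.

```python
# Hypothetical popularity bins by follower count.
BINS = [(0, 1_000), (1_000, 10_000), (10_000, 100_000), (100_000, float("inf"))]

def popularity_group(followers):
    """Index of the follower-count bin an account falls into."""
    for i, (lo, hi) in enumerate(BINS):
        if lo <= followers < hi:
            return i

def mean_retweet_fraction(accounts, label):
    """Average share of retweets among tweets, per popularity group, for one label."""
    groups = {}
    for a in accounts:
        if a["label"] != label:
            continue
        g = popularity_group(a["followers"])
        groups.setdefault(g, []).append(a["retweets"] / a["tweets"])
    return {g: sum(v) / len(v) for g, v in groups.items()}

# Toy annotated dataset: per-account tweet-metadata aggregates with ground-truth labels.
accounts = [
    {"label": "bot",   "followers": 500,    "tweets": 100, "retweets": 90},
    {"label": "bot",   "followers": 5_000,  "tweets": 200, "retweets": 150},
    {"label": "human", "followers": 800,    "tweets": 100, "retweets": 30},
    {"label": "human", "followers": 50_000, "tweets": 400, "retweets": 80},
]
print(mean_retweet_fraction(accounts, "bot"))
```

Comparing the per-group dictionaries for the "bot" and "human" labels is the per-metric version of the paper's question-by-question analysis.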