113 research outputs found
Detecting Political Bots on Twitter during the 2019 Finnish Parliamentary Election
In recent years, political discussion has been preoccupied with the impact of bots used to manipulate public opinion. A number of sources have reported a widespread presence of political bots on social media sites such as Twitter. Compared to other countries, however, the influence of bots on Finnish politics has received little attention from the media and researchers. This study investigates the influence of bots on Finnish political Twitter, based on a dataset consisting of the accounts following major Finnish politicians before the Finnish parliamentary election of 2019. To identify the bots, we extend existing models with user-level metadata and state-of-the-art classification models. The results support our model as a suitable instrument for detecting Twitter bots. We found that, although a large number of bot accounts follow major Finnish politicians, this is unlikely to result from foreign entities' attempts to influence the Finnish parliamentary election.
Detecting and analyzing bots on Finnish political Twitter
This master's thesis develops a machine learning model for detecting Twitter bots and applies the model to assess whether bots were used to influence the 2019 Finnish parliamentary election. The aim of the thesis is to contribute to the growing information systems science literature on the use of social media and information systems to influence voters, as well as to increase general awareness in Finland of the effects of bots on Twitter.
The thesis relies primarily on quantitative analysis of a dataset consisting of 550,000 unique Twitter accounts. The data was collected from Twitter during March 2019. The accounts in the dataset belong to humans and bots that were following 14 prominent Finnish politicians on Twitter. To determine which accounts are bots and to assess the feasibility of a new method for Twitter bot detection, a machine learning model that utilizes metadata-based features for classifying Twitter accounts as bots or humans is developed and tested on the dataset.
The findings of this thesis indicate that a metadata-based approach is suitable for detecting bots and that there are several large botnets in the Finnish Twittersphere. Over 30% of the 550,000 accounts are labeled as bots by the model, which implies that the prevalence of bots is much higher than previously suggested by Twitter's official estimates. Furthermore, a majority of the accounts seem inactive: either no longer in use, or dormant and awaiting activation. The purpose of most of the bot accounts is obscure, and it is not certain how many of them follow the politicians in order to inflate their popularity on purpose. Although the bots clearly increase the visibility of certain politicians, the effects of the bots on Finnish political Twitter are deemed negligible.
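The metadata-based classification described above can be sketched roughly as follows. This is an illustrative reconstruction, not the thesis's actual pipeline: the feature names, the toy accounts, and the random-forest choice are all assumptions.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical metadata features of the kind a metadata-based detector uses;
# the thesis's actual feature set is not reproduced here.
FEATURES = ["followers", "friends", "statuses", "account_age_days", "default_profile"]

def to_vector(account: dict) -> list:
    """Map an account's metadata dict to a fixed-order feature vector."""
    return [float(account[f]) for f in FEATURES]

def train_bot_classifier(accounts: list, labels: list) -> RandomForestClassifier:
    """Fit a classifier on metadata vectors; 1 = bot, 0 = human."""
    X = [to_vector(a) for a in accounts]
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X, labels)
    return clf

# Toy illustration: dormant, low-activity accounts vs. long-lived active humans.
bots = [{"followers": 2, "friends": 900, "statuses": 0,
         "account_age_days": 30, "default_profile": 1} for _ in range(20)]
humans = [{"followers": 300, "friends": 250, "statuses": 4000,
           "account_age_days": 2000, "default_profile": 0} for _ in range(20)]
clf = train_bot_classifier(bots + humans, [1] * 20 + [0] * 20)

probe = {"followers": 1, "friends": 800, "statuses": 0,
         "account_age_days": 25, "default_profile": 1}
print(clf.predict([to_vector(probe)])[0])  # classifies the dormant probe account
```

The appeal of the metadata-only approach, as the thesis notes, is that it needs no tweet content, so even accounts that have never posted can be scored.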
BotArtist: Twitter bot detection Machine Learning model based on Twitter suspension
Twitter, as one of the most popular social networks, offers a means for communication and online discourse, which has unfortunately been the target of bots and fake accounts, leading to the manipulation and spreading of false information. Towards this end, we gather a challenging, multilingual dataset of social discourse on Twitter, originating from 9M users and concerning the recent Russo-Ukrainian war, in order to detect bot accounts and the conversation involving them. We collect the ground truth for our dataset through the Twitter API suspended-accounts collection, containing approximately 343K bot accounts and 8M normal users. Additionally, we use a dataset provided by Botometer-V3 with 1,777 Varol, 483 German, and 1,321 US accounts. Besides the publicly available datasets, we also collect two independent datasets around popular discussion topics: the 2022 energy crisis and the 2022 conspiracy discussions. Both of these datasets were labeled according to the Twitter suspension mechanism. We build a novel ML model for bot detection using the state-of-the-art XGBoost model, combining it with a high volume of tweets labeled according to the Twitter suspension mechanism ground truth. The model requires only a limited set of profile features, allowing the dataset to be labeled at different time periods after collection, as it is independent of the Twitter API. In comparison with Botometer, our methodology achieves an average 11% higher ROC-AUC score over two real-case scenario datasets.
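A rough sketch of this kind of pipeline, with scikit-learn's gradient boosting standing in for XGBoost (same model family), and with made-up profile features and suspension-derived labels; none of these specifics come from the paper:

```python
import random
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

random.seed(0)

# Hypothetical limited profile-feature set: followers, friends, tweet count.
# The label plays the role of the Twitter-suspension ground truth.
def make_account(is_bot: int) -> list:
    base = [5, 600, 10] if is_bot else [400, 300, 3000]
    return [v * random.uniform(0.5, 1.5) for v in base]

X = [make_account(1) for _ in range(100)] + [make_account(0) for _ in range(100)]
y = [1] * 100 + [0] * 100

# Gradient boosting stands in for XGBoost here.
clf = GradientBoostingClassifier(random_state=0).fit(X, y)
scores = clf.predict_proba(X)[:, 1]      # bot probability per account
auc = roc_auc_score(y, scores)           # the metric the paper reports
print(round(auc, 2))
```

Because only profile features are used, accounts collected at one point in time can be re-labeled later via the suspension list without any further API calls, which is the independence property the abstract emphasizes.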
BotMoE: Twitter Bot Detection with Community-Aware Mixtures of Modal-Specific Experts
Twitter bot detection has become a crucial task in efforts to combat online
misinformation, mitigate election interference, and curb malicious propaganda.
However, advanced Twitter bots often attempt to mimic the characteristics of
genuine users through feature manipulation and disguise themselves to fit in
diverse user communities, posing challenges for existing Twitter bot detection
models. To this end, we propose BotMoE, a Twitter bot detection framework that
jointly utilizes multiple user information modalities (metadata, textual
content, network structure) to improve the detection of deceptive bots.
Furthermore, BotMoE incorporates a community-aware Mixture-of-Experts (MoE)
layer to improve domain generalization and adapt to different Twitter
communities. Specifically, BotMoE constructs modal-specific encoders for
metadata features, textual content, and graphical structure, which jointly
model Twitter users from three modal-specific perspectives. We then employ a
community-aware MoE layer to automatically assign users to different
communities and leverage the corresponding expert networks. Finally, user
representations from metadata, text, and graph perspectives are fused with an
expert fusion layer, combining all three modalities while measuring the
consistency of user information. Extensive experiments demonstrate that BotMoE
significantly advances the state-of-the-art on three Twitter bot detection
benchmarks. Studies also confirm that BotMoE captures advanced and evasive
bots, alleviates the reliance on training data, and better generalizes to new
and previously unseen user communities.
Comment: Accepted at SIGIR 202
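The community-aware MoE routing described above can be illustrated with a minimal NumPy sketch. The dimensions, the random weights, and the single-layer gate are assumptions for illustration; BotMoE's actual encoders and expert networks are trained neural models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts: K community experts, a softmax gate routing each
# fused user representation to them (all sizes are illustrative).
K, D = 3, 8                              # number of experts, representation size
W_gate = rng.normal(size=(D, K))         # gating network (one linear layer)
W_experts = rng.normal(size=(K, D, 2))   # each expert: linear map to 2 logits

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_forward(x):
    """Route user representation x through community experts and mix outputs."""
    gate = softmax(x @ W_gate)                                    # (K,) weights
    expert_out = np.stack([x @ W_experts[k] for k in range(K)])   # (K, 2)
    return gate @ expert_out             # weighted fusion -> bot/human logits

x = rng.normal(size=D)                   # stands in for the fused metadata/text/graph embedding
logits = moe_forward(x)
print(logits.shape)
```

The gate's soft assignment is what lets one model adapt across communities: users from different communities activate different experts instead of sharing a single decision boundary.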
Should we agree to disagree about Twitter's bot problem?
Bots, simply defined as accounts controlled by automation, can be used as a
weapon for online manipulation and pose a threat to the health of platforms.
Researchers have studied online platforms to detect, estimate, and characterize
bot accounts. Concerns about the prevalence of bots were raised following Elon
Musk's bid to acquire Twitter. Twitter's recent estimate that 5% of monetizable daily active users are bot accounts raised questions about its methodology. This estimate is based on a specific population of active users and relies on Twitter's own criteria for bot accounts. In this work, we stress that crucial questions need to be answered in order to make a proper estimation and to compare different methodologies. We argue that assumptions about bot-like behavior, the detection approach, and the population inspected can all affect the estimated percentage of bots on Twitter. Finally, we emphasize the responsibility of platforms to be vigilant, transparent, and unbiased in dealing with threats that may affect their users.
Comment: 22 pages, 5 figures
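The paper's point that the detection threshold and the population inspected drive the estimate can be illustrated with a toy computation; all scores below are made up:

```python
# Illustrative only: the same detector yields very different bot-prevalence
# estimates depending on the score threshold and the population inspected.
def bot_rate(scores, threshold):
    """Fraction of accounts whose bot score meets or exceeds the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

all_accounts = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9, 0.95, 0.05, 0.15, 0.8]
active_only = [0.1, 0.2, 0.3, 0.05, 0.15]  # e.g. monetizable daily active users

for t in (0.5, 0.8):
    print(t, bot_rate(all_accounts, t), bot_rate(active_only, t))
```

In this invented example the estimate ranges from 0% (active users, strict threshold) to 50% (all accounts, lenient threshold), which is the kind of divergence the paper argues makes headline bot percentages incomparable without stated assumptions.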
BotPercent: Estimating Bot Populations in Twitter Communities
Twitter bot detection is vital in combating misinformation and safeguarding
the integrity of social media discourse. While malicious bots are becoming more
and more sophisticated and personalized, standard bot detection approaches are
still agnostic to the social environments (henceforth, communities) the bots operate in. In this work, we introduce community-specific bot detection, estimating the percentage of bots given the context of a community. Our method, BotPercent, is an amalgamation of Twitter bot detection datasets and feature-, text-, and graph-based models, adjusted to a particular community on Twitter. We introduce an approach that performs confidence calibration across bot detection models, which addresses generalization issues in existing community-agnostic models targeting individual bots and leads to more accurate community-level bot estimations. Experiments demonstrate that BotPercent achieves state-of-the-art performance in community-level Twitter bot detection across both balanced and imbalanced class distribution settings, presenting a less biased estimator of Twitter bot populations within the communities we analyze. We then analyze bot rates in several Twitter groups, including users who engage with partisan news media, political communities in different countries, and more. Our results reveal that the presence of Twitter bots is not homogeneous but exhibits a spatio-temporal distribution with considerable heterogeneity, which should be taken into account for content moderation and social media policy making. The implementation of BotPercent is available at https://github.com/TamSiuhin/BotPercent.
Comment: Accepted to Findings of EMNLP 202
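One simple form of confidence calibration across models is temperature scaling followed by averaging. The sketch below uses that scheme with invented parameters and users; it is not BotPercent's actual calibration procedure.

```python
import math

# Combine several detectors' raw logits into one calibrated bot probability
# per user, then average over the community. Temperatures are made up.
def calibrate(logit: float, temperature: float) -> float:
    """Temperature-scaled sigmoid: a common post-hoc calibration scheme."""
    return 1.0 / (1.0 + math.exp(-logit / temperature))

def community_bot_rate(users_logits, temperatures):
    """Average calibrated bot probabilities over models, then over users."""
    per_user = []
    for logits in users_logits:          # one list of model logits per user
        probs = [calibrate(l, t) for l, t in zip(logits, temperatures)]
        per_user.append(sum(probs) / len(probs))
    return sum(per_user) / len(per_user)

# Three hypothetical models (feature-, text-, graph-based) scoring two users:
# the first looks bot-like to all three, the second looks human.
users = [[2.0, 1.5, 3.0], [-2.5, -1.0, -3.0]]
rate = community_bot_rate(users, temperatures=[1.0, 2.0, 1.5])
print(round(rate, 3))
```

Calibrating before averaging matters because an overconfident model would otherwise dominate the ensemble and bias the community-level estimate, which is the generalization issue the abstract points to.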
HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention
Twitter bot detection has become an increasingly important and challenging
task to combat online misinformation, facilitate social content moderation, and
safeguard the integrity of social platforms. Though existing graph-based
Twitter bot detection methods achieved state-of-the-art performance, they are
all based on the homophily assumption, which assumes users with the same label
are more likely to be connected, making it easy for Twitter bots to disguise
themselves by following a large number of genuine users. To address this issue,
we propose HOFA, a novel graph-based Twitter bot detection framework that combats the heterophilous disguise challenge with a homophily-oriented graph augmentation module (Homo-Aug) and a frequency-adaptive attention module (FaAt). Specifically, Homo-Aug extracts user representations, computes a k-NN graph using an MLP, and improves the Twitter graph's homophily by injecting the k-NN graph. For FaAt, we propose an attention mechanism that adaptively serves as a low-pass filter along a homophilic edge and a high-pass filter along a heterophilic edge, preventing user features from being over-smoothed by their neighborhood. We also introduce a weight guidance loss to guide the frequency-adaptive attention module. Our experiments demonstrate that HOFA achieves state-of-the-art performance on three widely acknowledged Twitter bot detection benchmarks, significantly outperforming vanilla graph-based bot detection techniques and strong heterophilic baselines. Furthermore, extensive studies confirm the effectiveness of our Homo-Aug and FaAt modules, and HOFA's ability to demystify the heterophilous disguise challenge.
Comment: 11 pages, 7 figures
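The homophily-oriented augmentation step, connecting users to their nearest neighbours in representation space, can be sketched as follows. Raw features stand in for the MLP-learned representations, and the two-cluster setup is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a k-NN graph over user representations so that similar users get
# connected, raising the graph's homophily even when bots follow many humans.
def knn_edges(reps, k):
    """Return (i, j) edges linking each node to its k nearest neighbours."""
    dists = np.linalg.norm(reps[:, None, :] - reps[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)          # exclude self-loops
    edges = []
    for i in range(len(reps)):
        for j in np.argsort(dists[i])[:k]:
            edges.append((i, int(j)))
    return edges

# Two tight clusters standing in for bots and humans in representation space.
reps = np.vstack([rng.normal(0.0, 0.1, size=(5, 4)),
                  rng.normal(3.0, 0.1, size=(5, 4))])
edges = knn_edges(reps, k=2)
same_cluster = sum((i < 5) == (j < 5) for i, j in edges)
print(same_cluster, len(edges))
```

When the learned representations separate bots from humans, every injected edge connects same-label nodes, which is exactly the homophily the follow graph lacks under heterophilous disguise; the frequency-adaptive attention then decides per edge whether to smooth or sharpen features.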
- …