4 research outputs found
A Deep Learning Approach for Robust Detection of Bots in Twitter Using Transformers
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

During the last decades, the volume of multimedia content posted on social networks has grown exponentially, and such information is immediately propagated and consumed by a significant number of users. In this scenario, the disruption caused by fake-news providers and bot accounts spreading propaganda and sensitive content throughout the network has fostered applied research into automatically measuring the reliability of social network accounts via Artificial Intelligence (AI). In this paper, we present a multilingual approach to the bot identification task in Twitter via Deep Learning (DL) approaches, to support end users when checking the credibility of a given Twitter account. To do so, several experiments were conducted using state-of-the-art Multilingual Language Models to generate an encoding of the text-based features of the user account, which is later concatenated with the rest of the metadata to build a potential input vector on top of a Dense Network denoted as Bot-DenseNet. Consequently, this paper assesses the language constraint from previous studies, where the encoding of the user account considered either the metadata information alone or the metadata information together with some basic semantic text features. Moreover, the Bot-DenseNet produces a low-dimensional representation of the user account which can be used for any application within the Information Retrieval (IR) framework.
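The feature-fusion idea in this abstract can be sketched in a few lines: encode the account's text with a transformer, concatenate that embedding with the numeric profile metadata, and feed the result to a small dense network. The sketch below uses random stand-ins for both feature groups (the 768-dimensional embedding size, the six metadata fields, and the 32-unit hidden layer are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical inputs: a 768-d multilingual transformer embedding of the
# account's text fields, plus standardized numeric profile metadata
# (followers, friends, statuses, account age, ...). Both are random here.
text_embedding = rng.normal(size=768)
metadata = rng.normal(size=6)

# Bot-DenseNet-style input: simple concatenation of both feature groups.
x = np.concatenate([text_embedding, metadata])          # shape (774,)

# Two dense layers with randomly initialized weights; the 32-d hidden
# activation plays the role of the low-dimensional account representation
# the abstract mentions.
W1 = rng.normal(scale=0.05, size=(x.size, 32))
b1 = np.zeros(32)
W2 = rng.normal(scale=0.05, size=(32, 1))
b2 = np.zeros(1)

representation = relu(x @ W1 + b1)                      # shape (32,)
bot_probability = sigmoid(representation @ W2 + b2)[0]  # scalar in (0, 1)
```

In a real pipeline the weights would of course be trained end to end on labeled accounts; the point here is only the input construction: text encoding and metadata enter the dense network as one concatenated vector.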
HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention
Twitter bot detection has become an increasingly important and challenging
task to combat online misinformation, facilitate social content moderation, and
safeguard the integrity of social platforms. Though existing graph-based
Twitter bot detection methods achieved state-of-the-art performance, they are
all based on the homophily assumption, which assumes users with the same label
are more likely to be connected, making it easy for Twitter bots to disguise
themselves by following a large number of genuine users. To address this issue,
we propose HOFA, a novel graph-based Twitter bot detection framework that
combats the heterophilous disguise challenge with a homophily-oriented graph
augmentation module (Homo-Aug) and a frequency adaptive attention module
(FaAt). Specifically, the Homo-Aug extracts user representations and computes a
k-NN graph using an MLP and improves Twitter's homophily by injecting the k-NN
graph. For the FaAt, we propose an attention mechanism that adaptively serves
as a low-pass filter along a homophilic edge and a high-pass filter along a
heterophilic edge, preventing user features from being over-smoothed by their
neighborhood. We also introduce a weight guidance loss to guide the frequency
adaptive attention module. Our experiments demonstrate that HOFA achieves
state-of-the-art performance on three widely-acknowledged Twitter bot detection
benchmarks, which significantly outperforms vanilla graph-based bot detection
techniques and strong heterophilic baselines. Furthermore, extensive studies
confirm the effectiveness of our Homo-Aug and FaAt module, and HOFA's ability
to demystify the heterophilous disguise challenge.

Comment: 11 pages, 7 figures
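The Homo-Aug step described above can be illustrated with a toy example: build a symmetric k-NN graph from user representations and inject its edges into the original follow graph, so that feature-similar users become connected even when bots have wired themselves to genuine accounts. The representations, graph, and k value below are all synthetic assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical user representations (in HOFA these come from an MLP):
# 8 users with 16-d feature vectors.
Z = rng.normal(size=(8, 16))

# Original follow graph as an adjacency matrix; an edge between a bot
# and a genuine user it follows would be a heterophilic edge here.
A = np.zeros((8, 8))
A[0, 5] = A[5, 0] = 1.0

def knn_graph(Z, k):
    """Symmetric k-NN adjacency from pairwise Euclidean distances."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self-loops
    knn = np.zeros_like(d)
    idx = np.argsort(d, axis=1)[:, :k]     # k nearest neighbors per user
    rows = np.repeat(np.arange(Z.shape[0]), k)
    knn[rows, idx.ravel()] = 1.0
    return np.maximum(knn, knn.T)          # symmetrize

# Homophily-oriented augmentation: inject the k-NN edges into the graph.
A_aug = np.maximum(A, knn_graph(Z, k=2))
```

The frequency adaptive attention (FaAt) module then decides per edge whether to smooth (low-pass, homophilic edge) or sharpen (high-pass, heterophilic edge) the aggregated features; that part depends on learned attention weights and is not reproduced in this sketch.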
Birdspotter: A Tool for Analyzing and Labeling Twitter Users
The impact of online social media on societal events and institutions is
profound; and with the rapid increases in user uptake, we are just starting to
understand its ramifications. Social scientists and practitioners who model
online discourse as a proxy for real-world behavior, often curate large social
media datasets. A lack of available tooling aimed at non-data science experts
frequently leaves this data (and the insights it holds) underutilized. Here, we
propose birdspotter, a tool to analyze and label Twitter users, and
birdspotter.ml, an exploratory visualizer for the computed metrics.
birdspotter provides an end-to-end analysis pipeline, from the processing of
pre-collected Twitter data, to general-purpose labeling of users, and
estimating their social influence, within a few lines of code. The package
features tutorials and detailed documentation. We also illustrate how to train
birdspotter into a fully-fledged bot detector that achieves better than
state-of-the-art performance without making any online Twitter API calls, and
we showcase its usage in an exploratory analysis of a topical COVID-19 dataset.
Análisis y detección de bots en Twitter (Analysis and Detection of Bots on Twitter)
Social networks are nowadays a fundamental part of people's daily lives. In recent years we have seen them take on political and media roles and become essential tools for many companies. It is therefore not surprising that this has triggered a rise in conspiratorial thinking and in the amount of fake news in circulation.
Numerous studies try to tackle this problem by focusing on analyzing the published content and contrasting it against different kinds of reliable sources; however, this task has proved extremely complex and carries a significant risk of bias. Since the popularity of this kind of news is often due to the use of automated profiles, or bots, the trend in recent years has instead been to attack the medium through which such information propagates.
This work describes different methods for detecting fake users, comparing their approaches and, in some cases, measuring their results. To this end, we use the "cresci-2017" dataset developed by MIB (My Information Bubble), which stands out as one of the most complete datasets to date in the field of bot detection. We also make use of several algorithms and techniques well known in machine learning, such as Random Forest (RF), Neural Networks (MLP), and unsupervised clustering (KMeans). In particular, we focus on analyzing whether unsupervised methods can serve as an alternative to those currently employed.
Finally, we propose solutions to the most relevant problems facing the development of a sustainable, generalizable solution over time, and we discuss possible alternatives to explore in the future.
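The supervised-versus-unsupervised comparison described in this abstract can be sketched with scikit-learn. The snippet below trains a Random Forest and, separately, clusters with KMeans and maps each cluster to a class by majority vote (necessary because cluster ids are arbitrary). The data is a synthetic two-cluster stand-in, not cresci-2017:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic "account features": two blobs, label 1 = bot, 0 = genuine.
X, y = make_blobs(n_samples=400, centers=2, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Supervised baseline: Random Forest.
rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
rf_acc = rf.score(X_te, y_te)

# Unsupervised alternative: KMeans, then map each cluster to the
# majority label among the training points it captured.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
mapping = {c: np.bincount(y_tr[km.labels_ == c]).argmax() for c in (0, 1)}
km_pred = np.array([mapping[c] for c in km.predict(X_te)])
km_acc = (km_pred == y_te).mean()
```

On real bot data the gap between the two approaches is the interesting quantity: when the classes form separable clusters in feature space, the unsupervised route can approach the supervised one without needing labels, which is precisely the question this work investigates.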