4 research outputs found

    A Deep Learning Approach for Robust Detection of Bots in Twitter Using Transformers

    © 2021 IEEE. During the last decades, the volume of multimedia content posted on social networks has grown exponentially, and such information is immediately propagated and consumed by a large number of users. In this scenario, the disruption caused by fake-news providers and bot accounts spreading propaganda and sensitive content throughout the network has fostered applied research into automatically measuring the reliability of social network accounts via Artificial Intelligence (AI). In this paper, we present a multilingual approach to the bot-identification task on Twitter via Deep Learning (DL) to support end users in checking the credibility of a given Twitter account. To do so, several experiments were conducted using state-of-the-art multilingual language models to generate an encoding of the text-based features of the user account, which is then concatenated with the rest of the metadata to build a potential input vector on top of a dense network denoted as Bot-DenseNet. Consequently, this paper addresses the language constraint of previous studies, in which the encoding of the user account considered either the metadata information alone or the metadata information together with some basic semantic text features. Moreover, the Bot-DenseNet produces a low-dimensional representation of the user account which can be used for any application within the Information Retrieval (IR) framework.
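The core idea of the abstract above, concatenating a language-model text encoding with account metadata and feeding the result through a dense network to obtain a low-dimensional account representation, can be sketched as follows. This is a minimal NumPy sketch under assumed dimensions (a 768-d text embedding, 10 metadata features, a 16-d output representation); the actual Bot-DenseNet layer sizes and weights are not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_layer(x, w, b):
    """One dense layer with ReLU activation."""
    return np.maximum(w @ x + b, 0.0)

# Hypothetical inputs: a 768-d sentence embedding from a multilingual
# language model, plus 10 numeric account-metadata features
# (e.g. follower counts, account age). Random stand-ins here.
text_embedding = rng.standard_normal(768)
metadata = rng.standard_normal(10)

# Concatenate both views into one input vector, as the paper describes.
x = np.concatenate([text_embedding, metadata])        # shape (778,)

# A small dense network producing a low-dimensional representation.
w1, b1 = rng.standard_normal((64, 778)) * 0.05, np.zeros(64)
w2, b2 = rng.standard_normal((16, 64)) * 0.05, np.zeros(16)
representation = dense_layer(dense_layer(x, w1, b1), w2, b2)

# A final sigmoid unit scores the account as bot vs. genuine.
w3, b3 = rng.standard_normal(16) * 0.05, 0.0
bot_score = 1.0 / (1.0 + np.exp(-(w3 @ representation + b3)))
print(representation.shape, float(bot_score))
```

The 16-d `representation` is the reusable account encoding the abstract refers to; the sigmoid head is only one possible downstream use.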

    HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention

    Twitter bot detection has become an increasingly important and challenging task to combat online misinformation, facilitate social content moderation, and safeguard the integrity of social platforms. Though existing graph-based Twitter bot detection methods achieve state-of-the-art performance, they are all based on the homophily assumption, which assumes users with the same label are more likely to be connected, making it easy for Twitter bots to disguise themselves by following a large number of genuine users. To address this issue, we propose HOFA, a novel graph-based Twitter bot detection framework that combats the heterophilous disguise challenge with a homophily-oriented graph augmentation module (Homo-Aug) and a frequency adaptive attention module (FaAt). Specifically, the Homo-Aug extracts user representations, computes a k-NN graph using an MLP, and improves Twitter's homophily by injecting the k-NN graph. For the FaAt, we propose an attention mechanism that adaptively serves as a low-pass filter along a homophilic edge and a high-pass filter along a heterophilic edge, preventing user features from being over-smoothed by their neighborhood. We also introduce a weight guidance loss to guide the frequency adaptive attention module. Our experiments demonstrate that HOFA achieves state-of-the-art performance on three widely acknowledged Twitter bot detection benchmarks, significantly outperforming vanilla graph-based bot detection techniques and strong heterophilic baselines. Furthermore, extensive studies confirm the effectiveness of our Homo-Aug and FaAt modules, and HOFA's ability to demystify the heterophilous disguise challenge. Comment: 11 pages, 7 figures.
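Two building blocks from the abstract above, the k-NN graph injected by Homo-Aug and the per-edge low-pass vs. high-pass behavior of FaAt, can be illustrated in a few lines. This is a hedged sketch with made-up embeddings and cosine similarity as an assumed metric; it shows the operations, not the paper's learned attention.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical user embeddings (in HOFA these come from an MLP).
n, d, k = 8, 4, 2
z = rng.standard_normal((n, d))

# Build a k-NN graph: connect each user to its k most similar users
# in embedding space (cosine similarity, no self-loops).
norm = z / np.linalg.norm(z, axis=1, keepdims=True)
sim = norm @ norm.T
np.fill_diagonal(sim, -np.inf)
knn = np.argsort(-sim, axis=1)[:, :k]       # indices of k nearest users

adj = np.zeros((n, n))
for i in range(n):
    adj[i, knn[i]] = 1.0                    # inject k-NN edges

# Along an edge, a low-pass filter averages the endpoints (smoothing),
# while a high-pass filter takes their difference (sharpening). FaAt
# chooses between the two per edge; here we just apply both to user 0
# and its first neighbour.
i, j = 0, int(knn[0, 0])
low_pass = (z[i] + z[j]) / 2.0
high_pass = (z[i] - z[j]) / 2.0
print(adj.sum(axis=1))                      # k outgoing edges per user
```

Note that `low_pass + high_pass` recovers the original feature `z[i]`, which is why mixing the two filters per edge can preserve information that pure smoothing would wash out.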

    Birdspotter: A Tool for Analyzing and Labeling Twitter Users

    The impact of online social media on societal events and institutions is profound; and with the rapid increases in user uptake, we are just starting to understand its ramifications. Social scientists and practitioners who model online discourse as a proxy for real-world behavior often curate large social media datasets. A lack of available tooling aimed at non-data-science experts frequently leaves this data (and the insights it holds) underutilized. Here, we propose birdspotter -- a tool to analyze and label Twitter users -- and birdspotter.ml -- an exploratory visualizer for the computed metrics. birdspotter provides an end-to-end analysis pipeline, from the processing of pre-collected Twitter data, to general-purpose labeling of users and estimating their social influence, within a few lines of code. The package features tutorials and detailed documentation. We also illustrate how to train birdspotter into a fully-fledged bot detector that achieves better-than-state-of-the-art performance without making any online Twitter API calls, and we showcase its usage in an exploratory analysis of a topical COVID-19 dataset.

    Análisis y detección de bots en Twitter (Analysis and Detection of Bots on Twitter)

    Today, social networks are a fundamental part of people's daily lives. In recent years we have seen them take on political and media roles and become essential tools for many companies. It is therefore unsurprising that this has triggered a rise in conspiracy thinking and in the amount of fake news distributed. Numerous studies try to tackle this problem by focusing on analyzing the published content and contrasting it with various types of trustworthy sources; however, this task has proven extremely complex and carries a significant risk of bias. Since the popularity of this type of news is often due to the use of automated profiles, or bots, the trend in recent years has instead been to attack the medium through which this information propagates. This work details different methods for detecting fake users, comparing their approaches and, in some cases, measuring their results. To this end, we use the "cresci-2017" dataset, developed by MIB (My Information Bubble), which stands out as one of the most complete datasets to date in the field of bot detection. We also make use of various algorithms and techniques well known in the machine-learning landscape, such as Random Forest (RF), neural networks (MLP), and unsupervised clustering (KMeans). In particular, we focus on analyzing whether the use of unsupervised methods can be an alternative to those currently employed. Finally, we propose solutions to the most relevant problems facing the development of a sustainable and generalizable solution over time, and discuss possible alternatives to explore in the future.
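The comparison the abstract above describes, supervised classifiers (RF, MLP) versus unsupervised clustering (KMeans) for bot detection, can be sketched with scikit-learn. The feature matrix below is synthetic stand-in data, not cresci-2017; only the evaluation pattern is illustrated.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)

# Synthetic stand-in for account features (e.g. tweet counts,
# follower ratios); labels: 0 = genuine user, 1 = bot.
n = 400
X = np.vstack([rng.normal(0.0, 1.0, (n // 2, 5)),
               rng.normal(2.0, 1.0, (n // 2, 5))])
y = np.array([0] * (n // 2) + [1] * (n // 2))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Supervised baselines.
rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
mlp = MLPClassifier(max_iter=500, random_state=0).fit(X_tr, y_tr)

# Unsupervised alternative: cluster into two groups, then align
# cluster ids with labels (the assignment is arbitrary, so take the
# better of the two orientations).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
pred = km.predict(X_te)
acc_km = max(np.mean(pred == y_te), np.mean((1 - pred) == y_te))

print(rf.score(X_te, y_te), mlp.score(X_te, y_te), acc_km)
```

On clearly separated synthetic classes all three approaches score highly; the open question the thesis examines is how this gap behaves on real, noisier account data.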