DNA-inspired online behavioral modeling and its application to spambot detection
We propose a strikingly novel, simple, and effective approach to model online
user behavior: we extract and analyze digital DNA sequences from user online
actions and we use Twitter as a benchmark to test our proposal. We obtain an
incisive and compact DNA-inspired characterization of user actions. Then, we
apply standard DNA analysis techniques to discriminate between genuine and
spambot accounts on Twitter. An experimental campaign supports our proposal,
showing its effectiveness and viability. To the best of our knowledge, we are
the first ones to identify and adapt DNA-inspired techniques to online user
behavioral modeling. While Twitter spambot detection is a specific use case on
a specific social media platform, our proposed methodology is platform and
technology agnostic, hence paving the way for diverse behavioral
characterization tasks.
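The core idea of encoding user actions as DNA-like sequences can be sketched as follows. The action alphabet and the longest-common-substring similarity below are illustrative assumptions for a minimal demonstration, not the paper's exact pipeline:

```python
# Minimal sketch of digital-DNA-style behavioral encoding.
# The three-character action alphabet is a hypothetical choice;
# the actual encoding used in the paper may differ.
ACTION_ALPHABET = {"tweet": "A", "reply": "C", "retweet": "T"}

def digital_dna(actions):
    """Encode a chronological list of user actions as a DNA-like string."""
    return "".join(ACTION_ALPHABET[a] for a in actions)

def longest_common_substring(s1, s2):
    """Length of the longest substring shared by two sequences (dynamic programming)."""
    best = 0
    prev = [0] * (len(s2) + 1)
    for ch1 in s1:
        cur = [0] * (len(s2) + 1)
        for j, ch2 in enumerate(s2, 1):
            if ch1 == ch2:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

genuine = digital_dna(["tweet", "reply", "tweet", "retweet", "reply"])   # "ACATC"
bot_a = digital_dna(["retweet", "retweet", "retweet", "retweet", "tweet"])  # "TTTTA"
bot_b = digital_dna(["retweet", "retweet", "retweet", "retweet", "reply"])  # "TTTTC"

# Accounts driven by the same automation share long common substrings:
print(longest_common_substring(bot_a, bot_b))    # 4
print(longest_common_substring(genuine, bot_a))  # 1
```

The intuition is that spambots controlled by the same software exhibit near-identical action sequences, so group-level sequence similarity separates them from the heterogeneous behavior of genuine users.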
The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race
Recent studies in social media spam and automation provide anecdotal
argumentation of the rise of a new generation of spambots, so-called social
spambots. Here, for the first time, we extensively study this novel phenomenon
on Twitter and we provide quantitative evidence that a paradigm-shift exists in
spambot design. First, we measure Twitter's current capabilities of detecting
the new social spambots. Later, we assess the human performance in
discriminating between genuine accounts, social spambots, and traditional
spambots. Then, we benchmark several state-of-the-art techniques proposed by
the academic literature. Results show that neither Twitter, nor humans, nor
cutting-edge applications are currently capable of accurately detecting the new
social spambots. Our results call for new approaches capable of turning the
tide in the fight against this rising phenomenon. We conclude by reviewing the
latest literature on spambot detection and we highlight an emerging common
research trend based on the analysis of collective behaviors. Insights derived
from both our extensive experimental campaign and survey shed light on the most
promising directions of research and lay the foundations for the arms race
against the novel social spambots. Finally, to foster research on this novel
phenomenon, we make publicly available to the scientific community all the
datasets used in this study.
Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science Track, Perth, Australia, 3-7 April, 2017).
On the need of opening up crowdsourced emergency management systems
Nowadays, social media analysis systems are feeding on user-contributed data, either for beneficial purposes, such as emergency management, or for user profiling and mass surveillance. Here, we carry out a discussion about the power and pitfalls of public accessibility to social media-based systems, with specific regard to the emergency management application EARS (Earthquake Alert and Report System). We investigate whether opening such systems to the population at large would further strengthen the link between communities of volunteer citizens, intelligent systems, and decision makers, thus going in the direction of developing more sustainable and resilient societies. Our analysis highlights fundamental challenges and provides interesting insights into a number of research directions with the aim of developing human-centered social media-based systems.
A Socio-Informatic Approach to Automated Account Classification on Social Media
Automated accounts on social media have become increasingly problematic. We
propose a key feature in combination with existing methods to improve machine
learning algorithms for bot detection. We successfully improve classification
performance by including the proposed feature.
Comment: International Conference on Social Media and Society.
Scalable and Generalizable Social Bot Detection through Data Selection
Efficient and reliable social bot classification is crucial for detecting
information manipulation on social media. Despite rapid development,
state-of-the-art bot detection models still face generalization and scalability
challenges, which greatly limit their applications. In this paper we propose a
framework that uses minimal account metadata, enabling efficient analysis that
scales up to handle the full stream of public tweets of Twitter in real time.
To ensure model accuracy, we build a rich collection of labeled datasets for
training and validation. We deploy a strict validation system so that model
performance on unseen datasets is also optimized, in addition to traditional
cross-validation. We find that strategically selecting a subset of training
data yields better model accuracy and generalization than exhaustively training
on all available data. Thanks to the simplicity of the proposed model, its
logic can be interpreted to provide insights into social bot characteristics.
Comment: AAAI 202
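Because the model relies only on fields present in every account-metadata record, scoring can keep pace with the full public tweet stream. The feature set below is a hypothetical illustration of such minimal-metadata features (rates and ratios derivable from a single user object), not the paper's exact feature list:

```python
# Sketch: lightweight features from account metadata alone.
# Field and feature names are illustrative assumptions.
def metadata_features(user):
    """Derive rate/ratio features from a raw account-metadata record."""
    age_days = max(user["account_age_days"], 1)
    followers = user["followers_count"]
    friends = user["friends_count"]
    return {
        "tweets_per_day": user["statuses_count"] / age_days,
        "followers_per_day": followers / age_days,
        "friends_per_day": friends / age_days,
        "follower_friend_ratio": followers / max(friends, 1),
        "has_default_profile": int(user["default_profile"]),
    }

# Hypothetical account: very high posting rate, few followers, many friends.
user = {
    "statuses_count": 120_000,
    "followers_count": 30,
    "friends_count": 2_900,
    "account_age_days": 400,
    "default_profile": True,
}
feats = metadata_features(user)
print(feats["tweets_per_day"])  # 300.0 tweets/day, a common automation signal
```

Such features are cheap to compute per account, which is what makes real-time scoring of the full stream feasible, and simple enough that a trained model's weights remain interpretable.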
Uncovering Coordinated Networks on Social Media
Coordinated campaigns are used to influence and manipulate social media
platforms and their users, a critical challenge to the free exchange of
information online. Here we introduce a general network-based framework to
uncover groups of accounts that are likely coordinated. The proposed method
constructs coordination networks based on arbitrary behavioral traces shared
among accounts. We present five case studies of influence campaigns in the
diverse contexts of U.S. elections, Hong Kong protests, the Syrian civil war,
and cryptocurrencies. In each of these cases, we detect networks of coordinated
Twitter accounts by examining their identities, images, hashtag sequences,
retweets, and temporal patterns. The proposed framework proves to be broadly
applicable to uncover different kinds of coordination across information
warfare scenarios.
Identifying Bots on Twitter with Benford’s Law
Over time, Online Social Networks (OSNs) have grown exponentially in terms of active users and have become an influential factor in the formation of public opinion. Due to this, the use of bots and botnets for spreading misinformation on OSNs has become a widespread concern. The most prominent example of this was the 2016 American Presidential Election, during which Russian bots on Twitter pumped out fake news to influence the election results.
Identifying bots and botnets on Twitter cannot rely on visual analysis alone and can require complex statistical methods that score a profile on multiple features and compute a result. Benford's Law, or the Law of Anomalous Numbers, states that in any naturally occurring sequence of numbers, the frequencies of the first significant leading digits follow a particular pattern: they are unevenly distributed and decrease as the digit grows. This principle can be applied to the first-degree egocentric network of a Twitter profile to assess its conformity to Benford's Law and classify it as a bot profile or a normal profile.
This project focuses on leveraging Benford's Law in combination with various Machine Learning (ML) classifiers to identify bot profiles on Twitter. In addition, the project discusses various statistical methods used to verify the classification results.
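The conformity test at the heart of this approach can be sketched as follows. Under Benford's Law, leading digit d occurs with frequency log10(1 + 1/d); the chi-square-style statistic and the choice of input counts (e.g. follower counts of a profile's friends) are illustrative assumptions:

```python
# Sketch: test whether leading digits of counts drawn from a profile's
# first-degree egocentric network (e.g. its friends' follower counts)
# conform to Benford's law. The statistic is a plain chi-square measure;
# the project's exact verification methods may differ.
import math

# Expected Benford frequency of each leading digit 1..9: log10(1 + 1/d)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(n):
    """First significant digit of a positive integer."""
    return int(str(abs(int(n)))[0])

def benford_chi_square(values):
    """Chi-square statistic of observed leading-digit counts vs Benford."""
    values = [v for v in values if int(v) != 0]
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[leading_digit(v)] += 1
    n = len(values)
    return sum(
        (counts[d] - n * BENFORD[d]) ** 2 / (n * BENFORD[d])
        for d in range(1, 10)
    )

# Powers of 2 are a classic Benford-conforming sequence; uniform numbers
# (equal leading-digit frequencies) are not.
natural = [2 ** k for k in range(1, 200)]
uniform = list(range(100, 1000))
print(benford_chi_square(natural) < benford_chi_square(uniform))  # True
```

A high statistic flags an egocentric network whose counts deviate from the natural distribution, which is the signal the project feeds, alongside other features, into its ML classifiers.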
Towards a Digital Ecosystem of Trust: Ethical, Legal and Societal Implications
The European vision of a digital ecosystem of trust rests on innovation, powerful technological solutions, a comprehensive regulatory framework and respect for the core values and principles of ethics. Innovation in the digital domain strongly relies on data, as has become obvious during the current pandemic. Successful data science,
especially where health data are concerned, necessitates establishing a framework where data subjects can feel safe to share their data. In this paper, methods for facilitating data sharing, privacy-preserving technologies, decentralization, data altruism, as well as the interplay between the Data Governance Act and the GDPR, are presented and discussed by reference to use cases from the largest pan-European social science data research project, SoBigData++. In doing so, we argue that innovation can be turned into responsible innovation and Europe can make its ethics work in digital practice.