
    DNA-inspired online behavioral modeling and its application to spambot detection

    We propose a strikingly novel, simple, and effective approach to model online user behavior: we extract and analyze digital DNA sequences from users' online actions, and we use Twitter as a benchmark to test our proposal. We obtain an incisive and compact DNA-inspired characterization of user actions. Then, we apply standard DNA analysis techniques to discriminate between genuine and spambot accounts on Twitter. An experimental campaign supports our proposal, showing its effectiveness and viability. To the best of our knowledge, we are the first to identify and adapt DNA-inspired techniques for online user behavioral modeling. While Twitter spambot detection is a specific use case on a specific social media platform, our proposed methodology is platform and technology agnostic, hence paving the way for diverse behavioral characterization tasks.
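    As a rough illustration of the digital DNA idea, the sketch below encodes a timeline of user actions as a string over a small alphabet and compares accounts by their longest common substring, so that accounts driven by the same automation stand out. The action alphabet and the pairwise comparison shown here are illustrative assumptions, not the authors' exact pipeline.

```python
# Hedged sketch: encode each account's timeline as a "digital DNA" string and
# compare accounts via their longest common substring. The action alphabet and
# the pairwise comparison are illustrative assumptions.
from difflib import SequenceMatcher

# Hypothetical alphabet: A = tweet, C = reply, T = retweet
ACTION_CODES = {"tweet": "A", "reply": "C", "retweet": "T"}

def digital_dna(actions):
    """Map an ordered list of user actions to a DNA-like string."""
    return "".join(ACTION_CODES[a] for a in actions)

def lcs_length(seq1, seq2):
    """Length of the longest common substring between two DNA strings."""
    match = SequenceMatcher(None, seq1, seq2).find_longest_match(
        0, len(seq1), 0, len(seq2))
    return match.size

# Spambots driven by the same script tend to share long behavioral substrings,
# while genuine users typically do not.
bot_a = digital_dna(["tweet", "retweet", "retweet", "tweet", "reply"])
bot_b = digital_dna(["tweet", "retweet", "retweet", "tweet", "tweet"])
human = digital_dna(["reply", "tweet", "reply", "retweet", "reply"])

print(lcs_length(bot_a, bot_b))  # long shared substring -> suspicious
print(lcs_length(bot_a, human))  # short shared substring -> likely genuine
```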

    The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race

    Recent studies in social media spam and automation provide anecdotal argumentation of the rise of a new generation of spambots, so-called social spambots. Here, for the first time, we extensively study this novel phenomenon on Twitter and we provide quantitative evidence that a paradigm shift exists in spambot design. First, we measure Twitter's current capabilities of detecting the new social spambots. Later, we assess human performance in discriminating between genuine accounts, social spambots, and traditional spambots. Then, we benchmark several state-of-the-art techniques proposed in the academic literature. Results show that neither Twitter, nor humans, nor cutting-edge applications are currently capable of accurately detecting the new social spambots. Our results call for new approaches capable of turning the tide in the fight against this rising phenomenon. We conclude by reviewing the latest literature on spambot detection and we highlight an emerging common research trend based on the analysis of collective behaviors. Insights derived from both our extensive experimental campaign and survey shed light on the most promising directions of research and lay the foundations for the arms race against the novel social spambots. Finally, to foster research on this novel phenomenon, we make all the datasets used in this study publicly available to the scientific community.
    Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science Track, Perth, Australia, 3-7 April 2017).

    On the need of opening up crowdsourced emergency management systems

    Nowadays, social media analysis systems are feeding on user-contributed data, either for beneficial purposes, such as emergency management, or for user profiling and mass surveillance. Here, we carry out a discussion about the power and pitfalls of public accessibility to social media-based systems, with specific regard to the emergency management application EARS (Earthquake Alert and Report System). We investigate whether opening such systems to the population at large would further strengthen the link between communities of volunteer citizens, intelligent systems, and decision makers, thus going in the direction of developing more sustainable and resilient societies. Our analysis highlights fundamental challenges and provides interesting insights into a number of research directions with the aim of developing human-centered social media-based systems.

    A Socio-Informatic Approach to Automated Account Classification on Social Media

    Automated accounts on social media have become increasingly problematic. We propose a key feature, used in combination with existing methods, to improve machine learning algorithms for bot detection. We successfully improve classification performance by including the proposed feature.
    Comment: International Conference on Social Media and Society.
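    A minimal sketch of the evaluation pattern behind this kind of study: measure the gain from one additional feature on top of an existing bot-detection feature set. The synthetic data, the random forest baseline, and the stand-in feature are all assumptions for illustration; the paper's actual feature and classifiers are described in the publication.

```python
# Hedged sketch: compare cross-validated accuracy with and without one extra
# feature. Data, baseline model, and the feature itself are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
X_base = rng.normal(size=(n, 5))                  # stand-in for existing features
y = rng.integers(0, 2, size=n)                    # 1 = bot, 0 = human
new_feature = y + rng.normal(scale=0.8, size=n)   # hypothetical extra signal

clf = RandomForestClassifier(n_estimators=200, random_state=0)
baseline = cross_val_score(clf, X_base, y, cv=5).mean()
augmented = cross_val_score(clf, np.column_stack([X_base, new_feature]), y, cv=5).mean()
print(f"baseline accuracy:  {baseline:.3f}")
print(f"augmented accuracy: {augmented:.3f}")
```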

    Scalable and Generalizable Social Bot Detection through Data Selection

    Efficient and reliable social bot classification is crucial for detecting information manipulation on social media. Despite rapid development, state-of-the-art bot detection models still face generalization and scalability challenges, which greatly limit their applications. In this paper we propose a framework that uses minimal account metadata, enabling efficient analysis that scales up to handle the full stream of public tweets of Twitter in real time. To ensure model accuracy, we build a rich collection of labeled datasets for training and validation. We deploy a strict validation system so that model performance on unseen datasets is also optimized, in addition to traditional cross-validation. We find that strategically selecting a subset of training data yields better model accuracy and generalization than exhaustively training on all available data. Thanks to the simplicity of the proposed model, its logic can be interpreted to provide insights into social bot characteristics.
    Comment: AAAI 2020.
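    The sketch below illustrates the general recipe of training on lightweight account-metadata features and validating on a dataset the model never saw. The feature names, derived rates, and holdout protocol are assumptions modeled loosely on the abstract, not the paper's exact configuration.

```python
# Hedged sketch: bot classification from account metadata only, validated on an
# unseen dataset to probe generalization. Column names and the protocol are
# assumptions for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

FEATURES = [
    "statuses_count", "followers_count", "friends_count",
    "favourites_count", "listed_count", "account_age_days",
]

def add_rate_features(df):
    """Derive per-day rates so everything stays computable from metadata alone."""
    out = df.copy()
    for col in ["statuses_count", "followers_count", "friends_count"]:
        out[col + "_per_day"] = out[col] / out["account_age_days"].clip(lower=1)
    return out

def train_and_validate(train_df, holdout_df):
    train_df, holdout_df = add_rate_features(train_df), add_rate_features(holdout_df)
    feats = FEATURES + [c for c in train_df.columns if c.endswith("_per_day")]
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(train_df[feats], train_df["is_bot"])
    scores = model.predict_proba(holdout_df[feats])[:, 1]
    return roc_auc_score(holdout_df["is_bot"], scores)

# Usage (hypothetical CSVs with the columns above):
# auc = train_and_validate(pd.read_csv("train_accounts.csv"),
#                          pd.read_csv("unseen_dataset.csv"))
```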

    Uncovering Coordinated Networks on Social Media

    Coordinated campaigns are used to influence and manipulate social media platforms and their users, a critical challenge to the free exchange of information online. Here we introduce a general network-based framework to uncover groups of accounts that are likely coordinated. The proposed method constructs coordination networks based on arbitrary behavioral traces shared among accounts. We present five case studies of influence campaigns in the diverse contexts of U.S. elections, Hong Kong protests, the Syrian civil war, and cryptocurrencies. In each of these cases, we detect networks of coordinated Twitter accounts by examining their identities, images, hashtag sequences, retweets, and temporal patterns. The proposed framework proves to be broadly applicable to uncover different kinds of coordination across information warfare scenarios.
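    A minimal sketch of the coordination-network construction described above: accounts are linked when they share suspiciously many behavioral traces, and connected components of the resulting graph are candidate coordinated groups. The choice of trace, the edge weighting, and the threshold are illustrative assumptions; the framework itself accepts arbitrary traces.

```python
# Hedged sketch: link accounts that share many behavioral traces (e.g., identical
# hashtag sequences). Weighting and threshold are illustrative assumptions.
from itertools import combinations
from collections import defaultdict
import networkx as nx

def coordination_network(account_traces, min_shared=3):
    """account_traces: dict mapping account id -> set of behavioral traces."""
    by_trace = defaultdict(set)
    for account, traces in account_traces.items():
        for trace in traces:
            by_trace[trace].add(account)

    weights = defaultdict(int)
    for accounts in by_trace.values():
        for a, b in combinations(sorted(accounts), 2):
            weights[(a, b)] += 1          # one shared trace -> +1 edge weight

    graph = nx.Graph()
    for (a, b), w in weights.items():
        if w >= min_shared:               # keep only strongly coordinated pairs
            graph.add_edge(a, b, weight=w)
    return graph

# Connected components of the resulting graph are candidate coordinated groups:
# groups = list(nx.connected_components(coordination_network(traces)))
```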

    Identifying Bots on Twitter with Benford’s Law

    Over time, Online Social Networks (OSNs) have grown exponentially in terms of active users and have now become an influential factor in the formation of public opinions. Due to this, the use of bots and botnets for spreading misinformation on OSNs has become a widespread concern. The biggest example of this was during the 2016 American Presidential Elections, where Russian bots on Twitter pumped out fake news to influence the election results. Identifying bots and botnets on Twitter cannot rely on visual analysis alone and can require complex statistical methods to score a profile based on multiple features and compute a result. Benford's Law, or the Law of Anomalous Numbers, states that in any naturally occurring sequence of numbers, the frequency of the first significant digit follows a particular pattern: the digits are unevenly distributed, with frequencies decreasing from 1 to 9. This principle can be applied to the first-degree egocentric network of a Twitter profile to assess its conformity to Benford's Law and classify it as a bot profile or a normal profile. This project focuses on leveraging Benford's Law in combination with various Machine Learning (ML) classifiers to identify bot profiles on Twitter. In addition, the project also discusses various statistical methods that are used to verify the classification results.
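    As a rough sketch of the Benford-based signal, the code below compares the first-significant-digit distribution of counts drawn from a profile's first-degree egocentric network (for example, its friends' follower counts) against Benford's expected distribution and returns a chi-square p-value that could feed an ML classifier. The choice of count and the goodness-of-fit test are assumptions for illustration.

```python
# Hedged sketch: Benford's Law conformity of an egocentric network's counts.
# The input counts and the chi-square test are illustrative assumptions.
import math
from scipy.stats import chisquare

# Expected P(first digit = d) under Benford's Law, d = 1..9
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit(n):
    return int(str(abs(int(n)))[0])

def benford_conformity(counts):
    """Chi-square p-value of the observed first-digit distribution vs Benford."""
    digits = [first_digit(c) for c in counts if c and int(c) > 0]
    if not digits:
        return float("nan")
    observed = [digits.count(d) for d in range(1, 10)]
    expected = [p * len(digits) for p in BENFORD]
    return chisquare(observed, f_exp=expected).pvalue

# A very small p-value means the egocentric network deviates from Benford's Law,
# which this line of work treats as a bot signal:
# p = benford_conformity(follower_counts_of_friends)
```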

    Towards a Digital Ecosystem of Trust: Ethical, Legal and Societal Implications

    The European vision of a digital ecosystem of trust rests on innovation, powerful technological solutions, a comprehensive regulatory framework, and respect for the core values and principles of ethics. Innovation in the digital domain strongly relies on data, as has become obvious during the current pandemic. Successful data science, especially where health data are concerned, necessitates establishing a framework where data subjects can feel safe to share their data. In this paper, methods for facilitating data sharing, privacy-preserving technologies, decentralization, data altruism, as well as the interplay between the Data Governance Act and the GDPR, are presented and discussed with reference to use cases from the largest pan-European social science data research project, SoBigData++. In doing so, we argue that innovation can be turned into responsible innovation and that Europe can make its ethics work in digital practice.