DNA-inspired online behavioral modeling and its application to spambot detection
We propose a strikingly novel, simple, and effective approach to model online
user behavior: we extract and analyze digital DNA sequences from user online
actions and we use Twitter as a benchmark to test our proposal. We obtain an
incisive and compact DNA-inspired characterization of user actions. Then, we
apply standard DNA analysis techniques to discriminate between genuine and
spambot accounts on Twitter. An experimental campaign supports our proposal,
showing its effectiveness and viability. To the best of our knowledge, we are
the first ones to identify and adapt DNA-inspired techniques to online user
behavioral modeling. While Twitter spambot detection is a specific use case on
a specific social media, our proposed methodology is platform and technology
agnostic, hence paving the way for diverse behavioral characterization tasks.
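As a rough illustration of the encoding idea described in this abstract, the sketch below maps a chronologically ordered list of user actions to a string. The three-letter alphabet and the action names are assumptions made for this example only; the abstract does not specify them.

```python
from typing import Iterable

# Hypothetical alphabet: one character per action type.
# These labels are illustrative assumptions, not taken from the abstract.
ACTION_TO_BASE = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode_digital_dna(actions: Iterable[str]) -> str:
    """Encode a chronologically ordered sequence of actions as a string."""
    return "".join(ACTION_TO_BASE[action] for action in actions)


timeline = ["tweet", "tweet", "retweet", "reply", "tweet"]
print(encode_digital_dna(timeline))  # AATCA
```

Once accounts are represented as strings, standard string and sequence analysis tools can be applied to them, which is the property the abstract emphasizes.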
The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race
Recent studies in social media spam and automation provide anecdotal
argumentation of the rise of a new generation of spambots, so-called social
spambots. Here, for the first time, we extensively study this novel phenomenon
on Twitter and we provide quantitative evidence that a paradigm-shift exists in
spambot design. First, we measure Twitter's current capabilities of detecting
the new social spambots. Later, we assess the human performance in
discriminating between genuine accounts, social spambots, and traditional
spambots. Then, we benchmark several state-of-the-art techniques proposed by
the academic literature. Results show that neither Twitter, nor humans, nor
cutting-edge applications are currently capable of accurately detecting the new
social spambots. Our results call for new approaches capable of turning the
tide in the fight against this rising phenomenon. We conclude by reviewing the
latest literature on spambot detection and we highlight an emerging common
research trend based on the analysis of collective behaviors. Insights derived
from both our extensive experimental campaign and survey shed light on the most
promising directions of research and lay the foundations for the arms race
against the novel social spambots. Finally, to foster research on this novel
phenomenon, we make publicly available to the scientific community all the
datasets used in this study.
Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science
Track, Perth, Australia, 3-7 April, 2017)
LSSL-SSD: Social spammer detection with Laplacian score and semi-supervised learning
© Springer International Publishing AG 2016. The rapid development of social networks makes it easy for people to communicate online. However, because of their openness, social networks usually suffer from social spammers. Spammers deliver information for economic purposes, and they pose threats to the security of social networks. To maintain the long-term running of online social networks, many detection methods have been proposed. However, current methods normally use high-dimensional features with supervised learning algorithms to find spammers, resulting in low detection performance. To solve this problem, in this paper, we first apply the Laplacian score method, an unsupervised feature selection method, to obtain useful features. Based on the selected features, semi-supervised ensemble learning is then used to train the detection model. Experimental results on the Twitter dataset show the efficiency of our approach after feature selection. Moreover, the proposed method maintains high detection performance in the face of limited labeled data.
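The Laplacian score step mentioned above can be sketched as follows. This is a minimal pure-Python illustration of the standard Laplacian score, not the authors' implementation: it assumes a fully connected similarity graph with a heat kernel (a k-nearest-neighbor graph is the more common choice), and the function name and the bandwidth parameter `t` are hypothetical. Lower scores indicate features that better preserve local structure.

```python
import math
from typing import List


def laplacian_scores(X: List[List[float]], t: float = 1.0) -> List[float]:
    """Return one Laplacian score per feature column (lower is better)."""
    n, d = len(X), len(X[0])
    # Heat-kernel similarities on a fully connected graph (a simplifying assumption).
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist2 = sum((X[i][k] - X[j][k]) ** 2 for k in range(d))
                S[i][j] = math.exp(-dist2 / t)
    deg = [sum(row) for row in S]  # diagonal of the degree matrix D
    total = sum(deg)
    scores = []
    for r in range(d):
        f = [X[i][r] for i in range(n)]
        # Remove the degree-weighted mean of the feature.
        mean = sum(f[i] * deg[i] for i in range(n)) / total
        f = [v - mean for v in f]
        # f^T L f equals 0.5 * sum_ij S_ij (f_i - f_j)^2 for symmetric S.
        num = 0.5 * sum(S[i][j] * (f[i] - f[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(deg[i] * f[i] ** 2 for i in range(n))  # f^T D f
        scores.append(num / den if den > 0 else float("inf"))
    return scores


# Toy data: feature 0 separates two groups, feature 1 varies within groups.
X = [[0.0, 0.3], [0.1, 0.9], [0.05, 0.1],
     [5.0, 0.8], [5.1, 0.2], [4.95, 0.6]]
print(laplacian_scores(X))
```

Features would then be ranked by ascending score and the top-ranked ones passed to the downstream classifier; how many to keep is a tuning choice that the abstract does not state.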
Social spammer detection: A multi-relational embedding approach
© Springer International Publishing AG, part of Springer Nature 2018. Since relations are the main form of data in social networks, social spammer detection needs a relation-dependent but content-independent framework. Some recent detection methods transform the social relations into a set of topological features, such as degree, k-core, etc. However, the multiple heterogeneous relations and the direction within each relation have not been fully explored for identifying social spammers. In this paper, we attempt to adopt the Multi-Relational Embedding (MRE) approach for learning latent features of the social network. The MRE model is able to fuse multiple kinds of relations and also learns two latent vectors for each relation, indicating the sending role and the receiving role of every user, respectively. Experimental results on a real-world multi-relational social network demonstrate that the latent features extracted by our MRE model can improve the detection performance remarkably.
Social Fingerprinting: detection of spambot groups through DNA-inspired behavioral modeling
Spambot detection in online social networks is a long-lasting challenge
involving the study and design of detection techniques capable of efficiently
identifying ever-evolving spammers. Recently, a new wave of social spambots has
emerged, with advanced human-like characteristics that allow them to go
undetected even by current state-of-the-art algorithms. In this paper, we show
that efficient spambot detection can be achieved via an in-depth analysis of
their collective behaviors exploiting the digital DNA technique for modeling
the behaviors of social network users. Inspired by its biological counterpart,
in the digital DNA representation the behavioral lifetime of a digital account
is encoded in a sequence of characters. Then, we define a similarity measure
for such digital DNA sequences. We build upon digital DNA and the similarity
between groups of users to characterize both genuine accounts and spambots.
Leveraging such characterization, we design the Social Fingerprinting
technique, which is able to discriminate between spambots and genuine accounts in
both a supervised and an unsupervised fashion. We finally evaluate the
effectiveness of Social Fingerprinting and we compare it with three
state-of-the-art detection algorithms. Among the peculiarities of our approach
is the possibility to apply off-the-shelf DNA analysis techniques to study
online users' behaviors and to efficiently rely on a limited number of
lightweight account characteristics.
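To make the idea of a similarity measure over digital DNA sequences concrete, here is a minimal sketch. It assumes, for illustration only, that similarity is based on the longest common substring of two sequences, normalized by the shorter length; the abstract does not define the measure, and both function names are hypothetical.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run ending at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def sequence_similarity(a: str, b: str) -> float:
    """Illustrative similarity: shared substring length over the shorter length."""
    if not a or not b:
        return 0.0
    return len(longest_common_substring(a, b)) / min(len(a), len(b))


print(longest_common_substring("AATCAT", "CATCAA"))  # ATCA
print(sequence_similarity("AATCAT", "CATCAA"))
```

Under this assumption, accounts driven by the same automation would tend to share long substrings, while genuine accounts would share only short ones, which is one way a group-level characterization could be built.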
Detecting video spammers in YouTube social media
Social media is any site that provides a network of people with a place to make connections. One example is YouTube, which connects people through video sharing. Unfortunately, due to the explosive growth in the number of users and the variety of shared content, there are malicious users who aim to self-promote their videos or broadcast unrelated content. Even though the detection of malicious users draws on various features, such as content details, social activity, social network analysis, or hybrids of these, the detection rate is still considered low (i.e., 46%). This study proposes a new set of features constructed using the Edge Rank algorithm. Experiments were performed using nine classifiers from different learning families: decision tree, function-based, and Bayesian. The results showed that the proposed video spammer detection feature set is beneficial, as the highest accuracy (i.e., average) is as high as 98% and the lowest was 74%. The proposed work would benefit YouTube users, as malicious users who share irrelevant content can be detected automatically. It would also allow system resources to be optimized, as YouTube users are presented only with the required content.
GAD-NR: Graph Anomaly Detection via Neighborhood Reconstruction
Graph Anomaly Detection (GAD) is a technique used to identify abnormal nodes
within graphs, finding applications in network security, fraud detection,
social media spam detection, and various other domains. A common method for GAD
is Graph Auto-Encoders (GAEs), which encode graph data into node
representations and identify anomalies by assessing the reconstruction quality
of the graphs based on these representations. However, existing GAE models are
primarily optimized for direct link reconstruction, resulting in nodes
connected in the graph being clustered in the latent space. As a result, they
excel at detecting cluster-type structural anomalies but struggle with more
complex structural anomalies that do not conform to clusters. To address this
limitation, we propose a novel solution called GAD-NR, a new variant of GAE
that incorporates neighborhood reconstruction for graph anomaly detection.
GAD-NR aims to reconstruct the entire neighborhood of a node, encompassing the
local structure, self-attributes, and neighbor attributes, based on the
corresponding node representation. By comparing the neighborhood reconstruction
loss between anomalous nodes and normal nodes, GAD-NR can effectively detect
any anomalies. Extensive experimentation conducted on six real-world datasets
validates the effectiveness of GAD-NR, showcasing significant improvements (by
up to 30% in AUC) over state-of-the-art competitors. The source code for GAD-NR
is openly available. Importantly, the comparative analysis reveals that the
existing methods perform well only in detecting one or two types of anomalies
out of the three types studied. In contrast, GAD-NR excels at detecting all
three types of anomalies across the datasets, demonstrating its comprehensive
anomaly detection capabilities.
Comment: Accepted at the 17th ACM International Conference on Web Search and
Data Mining (WSDM-2024)