POISED: Spotting Twitter Spam Off the Beaten Paths
Cybercriminals have found in online social networks a propitious medium to
spread spam and malicious content. Existing techniques for detecting spam
include predicting the trustworthiness of accounts and analyzing the content of
these messages. However, advanced attackers can still successfully evade these
defenses.
Online social networks bring together people who have personal connections or
share common interests, forming communities. In this paper, we first show that
users within a networked community share topics of interest. Moreover, content
shared on these social networks tends to propagate according to the interests
of its audience: dissemination paths emerge in which communities post similar
messages, driven by those communities' interests. Spam and other malicious
content, on the other hand, follow different spreading patterns.
In this paper, we follow this insight and present POISED, a system that
leverages the differences in propagation between benign and malicious messages
on social networks to identify spam and other unwanted content. We test our
system on a dataset of 1.3M tweets collected from 64K users, and we show that
our approach is effective in detecting malicious messages, reaching 91%
precision and 93% recall. We also show that POISED's detection is more
comprehensive than that of previous systems by comparing it against three
state-of-the-art spam detection systems from the research literature; POISED
significantly outperforms each of them. Moreover, through simulations, we show
that POISED detects spam messages early and is resilient against two well-known
adversarial machine learning attacks.
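The core intuition can be sketched in a few lines. The following is an illustrative toy, not the authors' implementation: all function names, the Jaccard similarity measure, the threshold, and the sample data are assumptions. It flags a message as spam when its propagation path (the set of communities it reached) does not resemble any known benign dissemination path.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of community labels."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def classify(message_path, benign_paths, threshold=0.5):
    """Label a message 'spam' if its propagation path is dissimilar to
    every known benign dissemination path (threshold is illustrative)."""
    best = max((jaccard(message_path, p) for p in benign_paths), default=0.0)
    return "benign" if best >= threshold else "spam"

# Toy benign paths for topic-focused communities.
benign_paths = [
    {"sports", "fitness"},
    {"politics", "news"},
]

# A topically coherent spread vs. one cutting across unrelated communities.
print(classify({"sports", "fitness", "health"}, benign_paths))            # benign
print(classify({"sports", "politics", "crypto", "deals"}, benign_paths))  # spam
```

The design choice mirrors the abstract: no per-account trust scores or message-content analysis are needed, only the shape of the spread.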
Multilevel User Credibility Assessment in Social Networks
Online social networks are one of the largest platforms for disseminating
both real and fake news. Many users on these networks, intentionally or
unintentionally, spread harmful content, fake news, and rumors in fields such
as politics and business. As a result, numerous studies have been conducted in
recent years to assess the credibility of users. A shortcoming of most
existing methods is that they assign users to one of only two categories, real
or fake. However, in real-world applications it is usually more desirable to
distinguish several levels of user credibility. Another shortcoming is that
existing approaches use only a subset of the important features, which
degrades their performance. In this paper, due to the lack of
an appropriate dataset for multilevel user credibility assessment, first we
design a method to collect data suitable to assess credibility at multiple
levels. Then, we develop the MultiCred model that places users at one of
several levels of credibility, based on a rich and diverse set of features
extracted from users' profiles, tweets, and comments. MultiCred exploits deep
language models to analyze textual data and deep neural models to process
non-textual features. Our extensive experiments reveal that MultiCred
considerably outperforms existing approaches in terms of several accuracy
measures.
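The shift from binary to multilevel assessment can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not MultiCred itself: the function name, the specific features, the weights, and the level cutoffs are made up; the real model uses deep language and neural networks rather than a weighted sum.

```python
def credibility_level(text_score, profile_features, weights,
                      cutoffs=(0.33, 0.66)):
    """Fuse a [0, 1] text-model score with numeric profile features
    (e.g. account age, follower ratio) via a weighted sum, then bucket
    the result into ordered levels instead of a binary real/fake label."""
    score = weights[0] * text_score
    for w, f in zip(weights[1:], profile_features):
        score += w * f
    if score < cutoffs[0]:
        return "low"
    if score < cutoffs[1]:
        return "medium"
    return "high"

# Strong text score and healthy profile signals -> high credibility.
print(credibility_level(0.9, [1.0, 0.8], [0.5, 0.3, 0.2]))  # high
```

The point of the sketch is the output space: ordered levels with cutoffs, so downstream applications can act differently on "medium" users than on clearly "low" or "high" ones.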
- …