POISED: Spotting Twitter Spam Off the Beaten Paths
Cybercriminals have found in online social networks a propitious medium to
spread spam and malicious content. Existing techniques for detecting spam
include predicting the trustworthiness of accounts and analyzing the content of
these messages. However, advanced attackers can still successfully evade these
defenses.
Online social networks bring people who have personal connections or share
common interests to form communities. In this paper, we first show that users
within a networked community share some topics of interest. Moreover, content
shared on these social networks tends to propagate according to the interests of
people. Dissemination paths may emerge where some communities post similar
messages, based on the interests of those communities. Spam and other malicious
content, on the other hand, follow different spreading patterns.
In this paper, we follow this insight and present POISED, a system that
leverages the differences in propagation between benign and malicious messages
on social networks to identify spam and other unwanted content. We test our
system on a dataset of 1.3M tweets collected from 64K users, and we show that
our approach is effective in detecting malicious messages, reaching 91%
precision and 93% recall. We also show that POISED's detection is more
comprehensive than previous systems, by comparing it to three state-of-the-art
spam detection systems that have been proposed by the research community in the
past. POISED significantly outperforms each of these systems. Moreover, through
simulations, we show how POISED is effective in the early detection of spam
messages and how it is resilient against two well-known adversarial machine
learning attacks.
Statistical analysis of identity risk of exposure and cost using the ecosystem of identity attributes
Personally Identifiable Information (PII) is often called the "currency of the Internet" because identity assets are collected, shared, sold, and used for almost every transaction on the Internet. PII is used in applications ranging from access control to credit score calculations to targeted advertising. Every market sector relies on PII to know and authenticate its customers and employees. With so many businesses and government agencies relying on PII to make important decisions, and so many people being asked to share personal data, it is critical to better understand the fundamentals of identity in order to protect it and use it responsibly. The previously developed, comprehensive UT CID Identity Ecosystem uses graphs to model PII assets and their relationships, and is populated with empirical data from almost 6,000 real-world identity theft and fraud news reports. We analyze the UT CID Identity Ecosystem using graph theory and report numerous novel statistics on identity asset content, structure, value, accessibility, and impact. Our work sheds light on how identity is used and paves the way for improving identity protection.
Electrical and Computer Engineerin
Seminar Users in the Arabic Twitter Sphere
We introduce the notion of "seminar users", who are social media users
engaged in propaganda in support of a political entity. We develop a framework
that can identify such users with 84.4% precision and 76.1% recall. While our
dataset is from the Arab region, omitting language-specific features has only a
minor impact on classification performance, and thus, our approach could work
for detecting seminar users in other parts of the world and in other languages.
We further explored a controversial political topic to observe the prevalence
and potential potency of such users. In our case study, we found that 25% of
the users engaged in the topic are in fact seminar users and their tweets make
up nearly a third of the on-topic tweets. Moreover, they are often successful in
affecting mainstream discourse with coordinated hashtag campaigns.
Comment: to appear in SocInfo 201