Machine Learning Approaches for Modeling Spammer Behavior
Spam is commonly understood as unsolicited or unwanted email on the
Internet, and it poses a potential threat to Internet security. Users spend a
valuable amount of time deleting spam emails. More importantly, ever-increasing
spam occupies server storage space and consumes network bandwidth. Keyword-based
spam filtering strategies will eventually become less successful at modeling
spammer behavior, as spammers constantly change their tricks to circumvent
these filters. The evasive tactics that spammers use are patterns, and these
patterns can be modeled to combat spam. This paper investigates the
possibilities of modeling spammer behavioral patterns with well-known
classification algorithms such as the Naïve Bayesian classifier (Naïve Bayes),
Decision Tree Induction (DTI) and Support Vector Machines (SVMs). Preliminary
experimental results demonstrate a promising detection rate of around 92%,
a considerable improvement over similar spammer behavior modeling research.
Comment: 12 pages, 3 figures, 5 tables, Submitted to AIRS 201
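As a rough illustration of one of the classifier families named in the abstract above, the following is a minimal Bernoulli Naïve Bayes sketch over binary behavioral features. The three feature names, the toy data, and the helper names `train_bernoulli_nb` and `predict` are hypothetical and are not taken from the paper, whose actual features and experimental setup are not given here.

```python
import math

def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Estimate class priors and per-feature probabilities (Laplace-smoothed)."""
    n_features = len(samples[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [s for s, y in zip(samples, labels) if y == c]
        prior = len(rows) / len(samples)
        # P(feature_j = 1 | class c), smoothed so no probability is 0 or 1
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (prior, probs)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    best_class, best_score = None, float("-inf")
    for c, (prior, probs) in model.items():
        score = math.log(prior)
        for xj, p in zip(x, probs):
            score += math.log(p if xj else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical binary features: [forged_sender, obfuscated_words, many_recipients]
X = [[1, 1, 1], [1, 0, 1], [0, 1, 1], [0, 0, 0], [0, 0, 1], [1, 0, 0]]
y = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = train_bernoulli_nb(X, y)
print(predict(model, [1, 1, 1]))
print(predict(model, [0, 0, 0]))
```

The sketch works in log space to avoid underflow when many features are multiplied together, and the smoothing term `alpha` keeps unseen feature values from forcing a zero probability.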
Probabilistic Matching: Causal Inference under Measurement Errors
The abundance of data produced daily from a large variety of sources has
boosted the need for novel approaches to causal inference analysis from
observational data. Observational data often contain noisy or missing entries.
Moreover, causal inference studies may require unobserved high-level
information which needs to be inferred from other observed attributes. In such
cases, inaccuracies of the applied inference methods will result in noisy
outputs. In this study, we propose a novel approach for causal inference when
one or more key variables are noisy. Our method utilizes the knowledge about
the uncertainty of the real values of key variables in order to reduce the bias
induced by noisy measurements. We evaluate our approach in comparison with
existing methods both on simulated and real scenarios and we demonstrate that
our method reduces the bias and avoids false causal inference conclusions in
most cases.
Comment: In Proceedings of International Joint Conference Of Neural Networks
(IJCNN) 201
Modeling Spammer Behavior: Artificial Neural Network vs. Naïve Bayesian Classifier
The exponential growth of spam emails in recent years is a fact of life. Internet subscribers world-wide are unwittingly paying an estimated €10 billion a year in connection costs just to receive “junk” emails, according to a study undertaken for the European Commission. Though there is no universal definition of spam, unwanted and unsolicited commercial email sent as a mass mailing to a large number of recipients is generally known to the Internet community as junk email or spam. Spam is considered a potential threat to Internet security. Its direct effects include the consumption of computer and network resources and the cost in human time and attention of dismissing unwanted messages. More importantly, ever-increasing spam is taking various forms and finding a home not only in mailboxes but also in newsgroups, discussion forums, etc., without the consent of the recipients. Overflowing mailboxes overwhelm users, and newsgroups and discussion forums are flooded with irrelevant or inappropriate messages. As a consequence, users are discouraged from using these systems, even though they can provide numerous benefits.
The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race
Recent studies in social media spam and automation provide anecdotal
argumentation of the rise of a new generation of spambots, so-called social
spambots. Here, for the first time, we extensively study this novel phenomenon
on Twitter and we provide quantitative evidence that a paradigm-shift exists in
spambot design. First, we measure Twitter's current capabilities of detecting
the new social spambots. Later, we assess the human performance in
discriminating between genuine accounts, social spambots, and traditional
spambots. Then, we benchmark several state-of-the-art techniques proposed by
the academic literature. Results show that neither Twitter, nor humans, nor
cutting-edge applications are currently capable of accurately detecting the new
social spambots. Our results call for new approaches capable of turning the
tide in the fight against this rising phenomenon. We conclude by reviewing the
latest literature on spambot detection and we highlight an emerging common
research trend based on the analysis of collective behaviors. Insights derived
from both our extensive experimental campaign and survey shed light on the most
promising directions of research and lay the foundations for the arms race
against the novel social spambots. Finally, to foster research on this novel
phenomenon, we make publicly available to the scientific community all the
datasets used in this study.
Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science
Track, Perth, Australia, 3-7 April, 2017)
DNA-inspired online behavioral modeling and its application to spambot detection
We propose a strikingly novel, simple, and effective approach to model online
user behavior: we extract and analyze digital DNA sequences from user online
actions and we use Twitter as a benchmark to test our proposal. We obtain an
incisive and compact DNA-inspired characterization of user actions. Then, we
apply standard DNA analysis techniques to discriminate between genuine and
spambot accounts on Twitter. An experimental campaign supports our proposal,
showing its effectiveness and viability. To the best of our knowledge, we are
the first ones to identify and adapt DNA-inspired techniques to online user
behavioral modeling. While Twitter spambot detection is a specific use case on
a specific social medium, our proposed methodology is platform- and technology-
agnostic, hence paving the way for diverse behavioral characterization tasks
- …
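To make the idea in the abstract above more concrete, here is a minimal sketch of encoding account actions as a character string and comparing accounts by the longest substring they share. The three-letter alphabet (A = tweet, C = reply, T = retweet), the toy timelines, and the choice of longest common substring as the similarity measure are all assumptions made for illustration; the abstract itself only refers to "standard DNA analysis techniques" without naming them.

```python
# Hypothetical action alphabet; not specified in the abstract.
ALPHABET = {"tweet": "A", "reply": "C", "retweet": "T"}

def encode(actions):
    """Turn a chronological list of actions into a 'digital DNA' string."""
    return "".join(ALPHABET[a] for a in actions)

def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b (O(len(a)*len(b)))."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]

# Toy timelines (invented): two accounts with near-identical rhythm, one varied.
bot_1 = encode(["tweet", "retweet", "retweet", "tweet", "retweet", "retweet"])
bot_2 = encode(["retweet", "tweet", "retweet", "retweet", "tweet", "retweet"])
human = encode(["reply", "tweet", "reply", "retweet", "tweet", "reply"])

shared_bots = longest_common_substring(bot_1, bot_2)
shared_mixed = longest_common_substring(bot_1, human)
print(len(shared_bots), len(shared_mixed))
```

In this toy setting the two automated-looking accounts share a much longer run of actions than the automated and varied account do, which is the kind of signal such a comparison is meant to surface. A real study would need many accounts and a principled threshold.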