Search CORE

12,357 research outputs found

Making the Most of Tweet-Inherent Features for Social Spam Detection on Twitter

Author: Liakata Maria
Procter Rob
Wang Bo
Zubiaga Arkaitz
Publication venue
Publication date: 01/01/2015
Field of study

Social spam produces a great amount of noise on social media services such as Twitter, which reduces the signal-to-noise ratio that both end users and data mining applications observe. Existing techniques on social spam detection have focused primarily on the identification of spam accounts by using extensive historical and network-based data. In this paper we focus on the detection of spam tweets, which optimises the amount of data that needs to be gathered by relying only on tweet-inherent features. This enables the application of the spam detection system to a large set of tweets in a timely fashion, potentially applicable in a real-time or near real-time setting. Using two large hand-labelled datasets of tweets containing spam, we study the suitability of five classification algorithms and four different feature sets to the social spam detection task. Our results show that, by using the limited set of features readily available in a tweet, we can achieve encouraging results which are competitive when compared against existing spammer detection systems that make use of additional, costly user features. Our study is the first that attempts at generalising conclusions on the optimal classifiers and sets of features for social spam detection over different datasets

arXiv.org e-Print Archive

Warwick Research Archives Portal Repository

Analyzing the Social Structure and Dynamics of E-mail and Spam in Massive Backbone Internet Traffic

Author: Moradi Farnaz
Olovsson Tomas
Tsigas Philippas
Publication venue
Publication date: 01/01/2010
Field of study

E-mail is probably the most popular application on the Internet, with everyday business and personal communications dependent on it. Spam or unsolicited e-mail has been estimated to cost businesses significant amounts of money. However, our understanding of the network-level behavior of legitimate e-mail traffic and how it differs from spam traffic is limited. In this study, we have passively captured SMTP packets from a 10 Gbit/s Internet backbone link to construct a social network of e-mail users based on their exchanged e-mails. The focus of this paper is on the graph metrics indicating various structural properties of e-mail networks and how they evolve over time. This study also looks into the differences in the structural and temporal characteristics of spam and non-spam networks. Our analysis on the collected data allows us to show several differences between the behavior of spam and legitimate e-mail traffic, which can help us to understand the behavior of spammers and give us the knowledge to statistically model spam traffic on the network-level in order to complement current spam detection techniques.Comment: 15 pages, 20 figures, technical repor

arXiv.org e-Print Archive

Chalmers Research

Chalmers Publication Library

Evaluating Third-Party Bad Neighborhood Blacklists for Spam Detection

Author: Moreira Moura Giovane C.M.
Pras Aiko
Sadre Ramin
Sperotto Anna
Publication venue: IEEE Communications Society
Publication date: 01/01/2013
Field of study

The distribution of malicious hosts over the IP address space is far from being uniform. In fact, malicious hosts tend to be concentrate in certain portions of the IP address space, forming the so-called Bad Neighborhoods. This phenomenon has been previously exploited to filter Spam by means of Bad Neighborhood blacklists. In this paper, we evaluate how much a network administrator can rely upon different Bad Neighborhood blacklists generated by third-party sources to fight Spam. One could expect that Bad Neighborhood blacklists generated from different sources contain, to a varying degree, disjoint sets of entries. Therefore, we investigate (i) how specific a blacklist is to its source, and (ii) whether different blacklists can be interchangeably used to protect a target from Spam. We analyze five Bad Neighborhood blacklists generated from real-world measurements and study their effectiveness in protecting three production mail servers from Spam. Our findings lead to several operational considerations on how a network administrator could best benefit from Bad Neighborhood-based Spam filtering

University of Twente Research Information

Artificial intelligence in the cyber domain: Offense and defense

Author: Diep Quoc Bao
Truong Thanh Cong
Zelinka Ivan
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
Field of study

Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.Web of Science123art. no. 41

Multidisciplinary Digital Publishing Institute

DSpace at VSB Technical University of Ostrava