4,454 research outputs found
An ontology enhanced parallel SVM for scalable spam filter training
This is the post-print version of the final paper published in Neurocomputing. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright @ 2013 Elsevier B.V.Spam, under a variety of shapes and forms, continues to inflict increased damage. Varying approaches including Support Vector Machine (SVM) techniques have been proposed for spam filter training and classification. However, SVM training is a computationally intensive process. This paper presents a MapReduce based parallel SVM algorithm for scalable spam filter training. By distributing, processing and optimizing the subsets of the training data across multiple participating computer nodes, the parallel SVM reduces the training time significantly. Ontology semantics are employed to minimize the impact of accuracy degradation when distributing the training data among a number of SVM classifiers. Experimental results show that ontology based augmentation improves the accuracy level of the parallel SVM beyond the original sequential counterpart
Social Turing Tests: Crowdsourcing Sybil Detection
As popular tools for spreading spam and malware, Sybils (or fake accounts)
pose a serious threat to online communities such as Online Social Networks
(OSNs). Today, sophisticated attackers are creating realistic Sybils that
effectively befriend legitimate users, rendering most automated Sybil detection
techniques ineffective. In this paper, we explore the feasibility of a
crowdsourced Sybil detection system for OSNs. We conduct a large user study on
the ability of humans to detect today's Sybil accounts, using a large corpus of
ground-truth Sybil accounts from the Facebook and Renren networks. We analyze
detection accuracy by both "experts" and "turkers" under a variety of
conditions, and find that while turkers vary significantly in their
effectiveness, experts consistently produce near-optimal results. We use these
results to drive the design of a multi-tier crowdsourcing Sybil detection
system. Using our user study data, we show that this system is scalable, and
can be highly effective either as a standalone system or as a complementary
technique to current tools
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector
Machine Applications
Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning
Learning-based pattern classifiers, including deep networks, have shown
impressive performance in several application domains, ranging from computer
vision to cybersecurity. However, it has also been shown that adversarial input
perturbations carefully crafted either at training or at test time can easily
subvert their predictions. The vulnerability of machine learning to such wild
patterns (also referred to as adversarial examples), along with the design of
suitable countermeasures, have been investigated in the research field of
adversarial machine learning. In this work, we provide a thorough overview of
the evolution of this research area over the last ten years and beyond,
starting from pioneering, earlier work on the security of non-deep learning
algorithms up to more recent work aimed to understand the security properties
of deep learning algorithms, in the context of computer vision and
cybersecurity tasks. We report interesting connections between these
apparently-different lines of work, highlighting common misconceptions related
to the security evaluation of machine-learning algorithms. We review the main
threat models and attacks defined to this end, and discuss the main limitations
of current work, along with the corresponding future challenges towards the
design of more secure learning algorithms.Comment: Accepted for publication on Pattern Recognition, 201
Survey on Security Enhancement at the Design Phase
Pattern classification is a branch of machine learning that focuses on recognition of patterns and regularities in data. In adversarial applications like biometric authentication, spam filtering, network intrusion detection the pattern classification systems are used [6]. In this paper, we have to evaluate the security pattern by classifications based on the files uploaded by the users. We have also proposed the method of spam filtering to prevent the attack of the files from other users. We evaluate our approach for security task of uploading word files and pdf files.
DOI: 10.17762/ijritcc2321-8169.150314
- …