Search CORE

732 research outputs found

Let Your CyberAlter Ego Share Information and Manage Spam

Author: Boykin P. Oscar
Kong Joseph S.
Rezaei Behnam A.
Roychowdhury Vwani P.
Sarshar Nima
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/05/2005

Almost all of us have multiple cyberspace identities, and these cyberalter egos are networked together to form a vast cyberspace social network. This network is distinct from the World Wide Web (WWW), which is being queried and mined to the tune of billions of dollars every day, and until recently it has gone largely unexplored. Empirically, cyberspace social networks have been found to possess many of the same complex features that characterize their real-world counterparts, including scale-free degree distributions, low diameter, and extensive connectivity. We show that these topological features make the latent networks particularly suitable for exploration and management via local-only messaging protocols. Cyberalter egos can communicate via their direct links (i.e., using only their own address books) and set up a highly decentralized and scalable message-passing network that can allow large-scale sharing of information and data. As one particular example of such collaborative systems, we provide a design of a spam filtering system, and our large-scale simulations show that the system achieves a spam detection rate close to 100%, while the false positive rate is kept around zero. This system has several advantages over other recent proposals: (i) it uses an already existing network, created by the same social dynamics that govern our daily lives, and no dedicated peer-to-peer (P2P) systems or centralized server-based systems need be constructed; (ii) it utilizes a percolation search algorithm that makes the query-generated traffic scalable; (iii) the network has a built-in trust system (just as in social networks) that can be used to thwart malicious attacks; (iv) it can be implemented right now as a plugin to popular email programs, such as MS Outlook, Eudora, and Sendmail. Comment: 13 pages, 10 figures

arXiv.org e-Print Archive

Crossref

"May I borrow Your Filter?" Exchanging Filters to Combat Spam in a Community

Author: Battiti Roberto
Cascella Roberto G.
Garg Anurag
Publication date: 01/11/2005

Leveraging social networks in computer systems can be effective in dealing with a number of trust and security issues. Spam is one such issue where the "wisdom of crowds" can be harnessed by mining the collective knowledge of ordinary individuals. In this paper, we present a mechanism through which members of a virtual community can exchange information to combat spam. Previous attempts at collaborative spam filtering have concentrated on digest-based indexing techniques to share digests or fingerprints of emails that are known to be spam. We take a different approach and allow users to share their spam filters instead, thus dramatically reducing the amount of traffic generated in the network. The resultant diversity in the filters and cooperation in a community allow it to respond to spam in an autonomic fashion. As a test case for exchanging filters we use the popular SpamAssassin spam filtering software and show that exchanging spam filters provides an alternative method to improve spam filtering performance.

Unitn-eprints Research

A collaborative approach for spam detection

Author: Cortez Paulo
Machado Artur
Rio Miguel
Rocha Miguel
Sousa Pedro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Electronic mail is nowadays one of the most important Internet networking services. However, there are still many challenges to be faced in order to provide better e-mail service quality, such as the growing dissemination of unsolicited e-mail (spam) over the Internet. This work aims to foster new research efforts, giving ground to the development of novel collaborative approaches to deal with spam proliferation. Using the proposed system, which is able to complement other anti-spam solutions, end users are allowed to share and combine spam filters in a flexible way, increasing the accuracy and resilience levels of anti-spam techniques.

Universidade do Minho: RepositoriUM

UCL Discovery

Symbiotic filtering for spam email detection

Author: Cortez Paulo
Lopes Clotilde
Rio Miguel
Rocha Miguel
Sousa Pedro
Publication venue: 'Elsevier BV'
Publication date: 01/08/2011

This paper presents a novel spam filtering technique called Symbiotic Filtering (SF) that aggregates distinct local filters from several users to improve the overall performance of spam detection. SF is a hybrid approach combining some features from both Collaborative (CF) and Content-Based Filtering (CBF). It allows for the use of social networks to personalize and tailor the set of filters that serve as input to the filtering. A comparison is performed against the commonly used Naive Bayes CBF algorithm. Several experiments were held with the well-known Enron data, under both fixed and incremental symbiotic groups. We show that our system is competitive in performance and is robust against both dictionary and focused contamination attacks. Moreover, it can be implemented and deployed with little effort and low communication costs, while assuring privacy. Funding: Fundação para a Ciência e a Tecnologia (FCT), grant PTDC/EIA/64541/200
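The idea of aggregating several users' local filters can be illustrated with a minimal sketch. The abstract does not say how SF actually combines the filters, so the weighted-average rule, the function name `aggregate_filters`, the `threshold` parameter, and the toy keyword filters below are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# A local filter maps a message to a spam score in [0, 1] (assumed interface).
LocalFilter = Callable[[str], float]


def aggregate_filters(
    message: str,
    filters: Sequence[LocalFilter],
    weights: Optional[Sequence[float]] = None,
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Combine local filter scores with a weighted average (illustrative rule only)."""
    if not filters:
        raise ValueError("at least one local filter is required")
    if weights is None:
        weights = [1.0] * len(filters)
    if len(weights) != len(filters):
        raise ValueError("weights and filters must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * f(message) for w, f in zip(weights, filters)) / total_weight
    return score, score >= threshold


# Toy local filters standing in for filters shared by three users.
def flags_lottery(msg: str) -> float:
    return 1.0 if "lottery" in msg.lower() else 0.0


def flags_links(msg: str) -> float:
    return 1.0 if "http://" in msg.lower() else 0.0


def always_unsure(msg: str) -> float:
    return 0.5


score, is_spam = aggregate_filters(
    "You won the LOTTERY, claim now",
    [flags_lottery, flags_links, always_unsure],
)
```

In this sketch the weights could stand in for how much a user trusts each peer in their social network, which is one way to read the personalization mentioned in the abstract; that reading is also an assumption.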

Universidade do Minho: RepositoriUM

UCL Discovery


MapReduce based RDF assisted distributed SVM for high throughput spam filtering

Author: Caruana Godwin
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. Electronic mail has become cast and embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability, and its ease of use have all acted as catalysts for such pervasive proliferation. Unfortunately, the same can be said of unsolicited bulk email, or spam. Various methods and enabling architectures are available to try to mitigate spam permeation. In this respect, this dissertation complements existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing their respective strengths and weaknesses. Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet-scale, data-intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have been proven effective. SVM training is, however, a computationally intensive process. In this dissertation, an M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart.
Effectively exploiting large-scale, ‘Cloud’ based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneity-aware task-to-node matching and allocation scheme, is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the-box Hadoop counterpart in a typical Cloud based infrastructure. The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback.

Brunel University Research Archive

Spam filtering based on preference ranking

Author: Lan Mingjun
Zhou Wanlei
Publication venue: IEEE Computer Society
Publication date: 01/01/2005

When the average number of spam messages received is continually increasing exponentially, both the Internet service provider and the end user suffer. The lack of an efficient solution may threaten the usability of email as a means of communication. In this paper we present a filtering mechanism applying the idea of preference ranking. This filtering mechanism will distinguish spam emails from other email on the Internet. The preference ranking gives the similarity values for nominated emails and spam emails specified by users, so that the ISP/end users can deal with spam emails at filtering points. We designed three filtering points to classify nominated emails into spam email, unsure email and legitimate email. This filtering mechanism can be applied both in middleware and at the client side. The experiments show that high precision, recall and TCR (total cost ratio) of spam emails can be predicted for the preference based filtering mechanisms.
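The abstract names precision, recall and TCR without defining them. As a minimal sketch, the code below uses the definitions common in the spam filtering literature, where TCR divides the number of spam messages by the weighted number of errors; it is an assumption that this paper uses the same definition. The function name `spam_metrics` and the parameter `cost_lambda` (the relative cost of blocking a legitimate message) are hypothetical.

```python
from typing import Dict


def spam_metrics(
    spam_caught: int,
    spam_missed: int,
    legit_blocked: int,
    cost_lambda: float = 1.0,
) -> Dict[str, float]:
    """Compute spam precision, spam recall and total cost ratio (TCR).

    TCR = n_spam / (cost_lambda * legit_blocked + spam_missed),
    where n_spam = spam_caught + spam_missed. A TCR above 1 means the filter
    does better than using no filter at all under this cost model.
    """
    n_spam = spam_caught + spam_missed
    flagged = spam_caught + legit_blocked
    precision = spam_caught / flagged if flagged else 0.0
    recall = spam_caught / n_spam if n_spam else 0.0
    weighted_errors = cost_lambda * legit_blocked + spam_missed
    # With no errors at all the ratio is unbounded; report infinity.
    tcr = n_spam / weighted_errors if weighted_errors else float("inf")
    return {"precision": precision, "recall": recall, "tcr": tcr}


# Made-up counts, for illustration only: 90 spam caught, 10 missed,
# 1 legitimate message blocked, and a blocked legitimate message
# weighted 9 times as heavily as a missed spam message.
example = spam_metrics(spam_caught=90, spam_missed=10, legit_blocked=1, cost_lambda=9.0)
```

The counts in the example are invented to show the arithmetic and do not come from the paper's experiments.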

Deakin Research Online

Cognitive Spam Recognition Using Hadoop and Multicast-Update

Author: K. Chandrasekaran
Mukund YR
Sunil Sandeep Nayak
Publication venue: RonPub
Publication date: 01/01/2015

In today's world of exponentially growing technology, spam is a very common issue faced by users on the internet. Spam not only hinders the performance of a network, but also wastes space and time, causes general irritation, and presents a multitude of dangers: viruses, malware, spyware and consequent system failure, identity theft, and other cybercriminal activity. In this context, cognition provides us with a method to help improve the performance of the distributed system. It enables the system to learn what it is supposed to do for different input types as different classifications are made over time, and this learning helps it increase its accuracy as time passes. Each system on its own can only do so much learning, because of the limited sample set of inputs that it gets to process. However, in a network, we can make sure that every system knows the different kinds of inputs available and learns what it is supposed to do with a better success rate. Thus, distribution and combination of this cognition across different components of the network leads to an overall improvement in the performance of the system. In this paper, we describe a method to make machines cognitively label spam using Machine Learning and the Naive Bayesian approach. We also present two possible methods of implementation, one using a MapReduce framework (Hadoop) and one using messages coupled with a multicast-send based network, each with their own subtypes, along with the pros and cons of each. We finally present a comparative analysis of the two main methods and provide a basic idea about the usefulness of the two in various different scenarios.

RonPub -- Research Online Publishing

On the use of Locality for Improving SVM-Based Spam Filtering

Author: Longe O. B.
Ojo F. O.
Okesola J. O.
Publication date: 01/01/2015

Recent growth in the use of email for communication and the corresponding growth in the volume of email received have made automatic processing of emails desirable. In tandem is the prevailing problem of advance fee fraud e-mails that pervades inboxes globally. These genres of e-mails solicit financial transactions and funds transfers from unsuspecting users. Most modern mail-reading software packages provide some form of programmable automatic filtering, typically in the form of sets of rules that file or otherwise dispose of mail based on keywords detected in the headers or message body. Unfortunately, programming these filters is an arcane and sometimes inefficient process. An adaptive mail system which can learn its users’ mail sorting preferences would therefore be more desirable. Premised on the work of Blanzieri & Bryl (2007), we propose a framework dedicated to the phenomenon of locality in email data analysis of advance fee fraud e-mails, which engages a Support Vector Machine (SVM) classifier for building local decision rules into the classification process of the spam filter design for this genre of e-mails.

Covenant University Repository

Towards symbiotic spam e-mail filtering

Author: Cortez Paulo
Lopes Clotilde
Sousa Pedro
Publication date: 01/01/2010

This position paper discusses the use of symbiotic filtering, a novel distributed data mining approach that combines content-based and collaborative filtering for spam detection.

Universidade do Minho: RepositoriUM