4 research outputs found

    AntispamLab - A Tool for Realistic Evaluation of Email Spam Filters

    Get PDF
    The existing tools for testing spam filters evaluate a filter instance by simply feeding it with a stream of emails, possibly also providing a feedback to the filter about the correctness of the detection. In such a scenario the evaluated filter is disconnected from the network of email servers, filters, and users, which makes the approach inappropriate for testing many of the filters that exploit some of the information about spam bulkiness, users' actions and social relations among the users. Corresponding evaluation results might be wrong, because the information that is normally used by the filter is missing, incomplete or inappropriate. In this paper we present a tool for testing spam filters in a very realistic scenario. Our tool consists of a set of Python scripts for unix/linux environment. The tool takes as inputs the filter to be tested and an affordable set of interconnected machines (e.g., PlanetLab machines, or locally created virtual machines). When started from a central place, the tool uses the provided machines to build a network of real email servers, installs instances of the filter, deploys and runs simulated email users and spammers, and computes the detection results statistic. Email servers are implemented using Postfix, a standard linux email server. Only per-email-server filters are currently supported, whereas per-email-client filters testing would require additional tool development. The size of the created emailing network is constrained only by the number of available PlanetLab or virtual machines. The run time is much shorter then the simulated system time, due to a time scaling mechanism. Testing a new filter is as simple as installing one copy of it in a real emailing network, which unifies the jobs of a new filter development, testing and prototyping. As a usage example, we test the SpamAssassin filter

    Artificial Immune Systems for Spam Detection

    Get PDF
    Tato práce se zabývá tvorbou hybridního systému, založeného na agregaci umělého imunitního systému s vhodnými heuristikami, aby co nejúspěšněji detekoval nevyžádanou poštu. V práci jsou popsány hlavní principy biologického a umělého imunitního systému,  klasické techniky pro rozpoznávání spamu včetně několika klasifikátorů. Navržený systém je testován na známých datových korpusech a je provedeno srovnání výsledných experimentů.This work deals with creating a hybrid system based on the aggregation of artificial immune system with appropriate heuristics to make the most effective spam detection. This work describes the main principles of biological and artificial immune system and conventional techniques to detect spam including several classifiers. The developed system is tested using well known database corpuses and a comparison of the final experiments is made.

    Artificial immune system for the Internet

    Get PDF
    We investigate the usability of the Artificial Immune Systems (AIS) approach for solving selected problems in computer networks. Artificial immune systems are created by using the concepts and algorithms inspired by the theory of how the Human Immune System (HIS) works. We consider two applications: detection of routing misbehavior in mobile ad hoc networks, and email spam filtering. In mobile ad hoc networks the multi-hop connectivity is provided by the collaboration of independent nodes. The nodes follow a common protocol in order to build their routing tables and forward the packets of other nodes. As there is no central control, some nodes may defect to follow the common protocol, which would have a negative impact on the overall connectivity in the network. We build an AIS for the detection of routing misbehavior by directly mapping the standard concepts and algorithms used for explaining how the HIS works. The implementation and evaluation in a simulator shows that the AIS mimics well most of the effects observed in the HIS, e.g. the faster secondary reaction to the already encountered misbehavior. However, its effectiveness and practical usability are very constrained, because some particularities of the problem cannot be accounted for by the approach, and because of the computational constrains (reported also in AIS literature) of the used negative selection algorithm. For the spam filtering problem, we apply the AIS concepts and algorithms much more selectively and in a less standard way, and we obtain much better results. We build the AIS for antispam on top of a standard technique for digest-based collaborative email spam filtering. We notice un advantageous and underemphasized technological difference between AISs and the HIS, and we exploit this difference to incorporate the negative selection in an innovative and computationally efficient way. We also improve the representation of the email digests used by the standard collaborative spam filtering scheme. We show that this new representation and the negative selection, when used together, improve significantly the filtering performance of the standard scheme on top of which we build our AIS. Our complete AIS for antispam integrates various innate and adaptive AIS mechanisms, including the mentioned specific use of the negative selection and the use of innate signalling mechanisms (PAMP and danger signals). In this way the AIS takes into account users' profiles, implicit or explicit feedback from the users, and the bulkiness of spam. We show by simulations that the overall AIS is very good both in detecting spam and in avoiding misdetection of good emails. Interestingly, both the innate and adaptive mechanisms prove to be crucial for achieving the good overall performance. We develop and test (within a simulator) our AIS for collaborative spam filtering in the case of email communications. The solution however seems to be well applicable to other types of Internet communications: Internet telephony, chat/sms, forum, news, blog, or web. In all these cases, the aim is to allow the wanted communications (content) and prevent those unwanted from reaching the end users and occupying their time and communication resources. The filtering problems, faced or likely to be faced in the near future by these applications, have or are likely to have the settings similar to those that we have in the email case: need for openness to unknown senders (creators of content, initiators of the communication), bulkiness in receiving spam (many recipients are usually affected by the same spam content), tolerance of the system to a small damage (to small amounts of unfiltered spam), possibility to implicitly or explicitly and in a cheap way obtain a feedback from the recipients about the damage (about spam that they receive), need for strong tolerance to wanted (non-spam) content. Our experiments with the email spam filtering show that our AIS, i.e. the way how we build it, is well fitted to such problem settings
    corecore