Camouflages and Token Manipulations-The Changing Faces of the Nigerian Fraudulent 419 Spammers
The inefficiency of current spam filters against fraudulent (419) mails is not unrelated to spammers' use of good-word
attacks, topic drifts, parasitic spamming, wrong categorization and recategorization of electronic mails by e-mail clients and, of
course, the fuzzy factors of greed and gullibility on the part of recipients who respond to fraudulent spam mail offers. In this
paper, we establish that mail token manipulations remain, above any other tactic, the most potent tool used by Nigerian
scammers to fool statistical spam filters. While hoping that uncovering this evidence of manipulation will prove useful in
future antispam research, our findings also alert spam filter developers to the need to build into their antispam
architecture robust modules that can deal with the identified camouflages.
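To illustrate why token manipulation can defeat a statistical filter, the following minimal sketch trains a toy naive Bayes scorer and shows two evasions: padding a message with legitimate-looking words (a good-word attack) and obfuscating the spam tokens themselves. The corpus, the whitespace tokenization, the add-one smoothing, and every function name are illustrative assumptions; this is not the paper's data or method.

```python
import math
from collections import Counter

# Toy training corpus (hypothetical, for illustration only).
SPAM = ["urgent transfer of funds from bank", "claim your funds urgent reply"]
HAM = ["meeting agenda for project review", "please review the project report"]


def train(docs):
    """Count token occurrences across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.split())
    return counts


def spam_log_odds(message, spam_counts, ham_counts):
    """Sum per-token log-likelihood ratios with add-one smoothing.

    Positive values lean spam, negative values lean ham (equal priors assumed).
    """
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)
    score = 0.0
    for token in message.split():
        if token not in vocab:
            continue  # unseen tokens carry no evidence in this sketch
        p_spam = (spam_counts[token] + 1) / spam_total
        p_ham = (ham_counts[token] + 1) / ham_total
        score += math.log(p_spam / p_ham)
    return score


spam_counts, ham_counts = train(SPAM), train(HAM)

original = "urgent transfer of funds"
# Good-word attack: append tokens the filter associates with legitimate mail.
padded = original + " project review meeting agenda report"
# Token manipulation: rewrite spam tokens so the filter has never seen them.
obfuscated = "urg3nt tr4nsfer 0f fund5"

score_original = spam_log_odds(original, spam_counts, ham_counts)
score_padded = spam_log_odds(padded, spam_counts, ham_counts)
score_obfuscated = spam_log_odds(obfuscated, spam_counts, ham_counts)
```

In this toy setting the original message scores as spam, the padded message drops below zero, and the obfuscated message contributes no evidence at all, which is the kind of blind spot the abstract attributes to statistical filters.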
Effective Mechanism for Social Recommendation of News
Recommendation systems represent an important tool for news distribution on
the Internet. In this work we modify a recently proposed social recommendation
model in order to deal with the absence of explicit user ratings of news. The model
consists of a network of users which continually adapts in order to achieve
efficient news traffic. To optimize the network's topology we propose different
stochastic algorithms that are scalable with respect to the network's size.
Agent-based simulations reveal the features and the performance of these
algorithms. To overcome the drawbacks of each method we introduce two
improved algorithms and show that they can optimize the network's topology almost
as fast and effectively as other non-scalable methods that make use of much
more information.
Preventing Distributed Denial-of-Service Attacks on the IMS Emergency Services Support through Adaptive Firewall Pinholing
Emergency services are vital services that Next Generation Networks (NGNs)
have to provide. As the IP Multimedia Subsystem (IMS) is at the heart of NGNs,
3GPP has carried the burden of specifying a standardized IMS-based emergency
services framework. Unfortunately, like any other IP-based standard, the
IMS-based emergency service framework is prone to Distributed Denial of Service
(DDoS) attacks. In this work we propose a simple but efficient solution that
can prevent certain types of such attacks by creating firewall pinholes that
regular clients, in contrast to attackers' clients, will reliably be able to
pass. Our solution was implemented and tested in an appropriate testbed, and
its efficiency was demonstrated. Comment: 17 pages, IJNGN Journal
Spam Classification Using Machine Learning Techniques - Sinespam
Most e-mail readers spend a non-trivial amount of time regularly deleting junk e-mail (spam)
messages, even as an expanding volume of such e-mail occupies server storage space and
consumes network bandwidth. An ongoing challenge, therefore, rests within the development
and refinement of automatic classifiers that can distinguish legitimate e-mail from spam. Some
published studies have examined spam detectors using Naïve Bayesian approaches and large
feature sets of binary attributes that determine the existence of common keywords in spam,
and many commercial applications also use Naïve Bayesian techniques.
Spammers recognize these attempts to block their messages and have developed tactics to
circumvent these filters, but these evasive tactics are themselves patterns that human readers
can often identify quickly. The objective of this work was to develop an alternative approach
using a neural network (NN) classifier trained on a corpus of e-mail messages from several
users. The feature selection used in this work is one of the major improvements, because the
feature set uses descriptive characteristics of words and messages similar to those that a
human reader would use to identify spam, and the best feature set was chosen by
forward feature selection. Another objective of this work was to raise spam
detection accuracy to near 95% using artificial neural networks (ANNs); to date, nobody has reached
more than 89% accuracy using ANNs.
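The forward feature selection mentioned above can be sketched as a greedy loop that adds one feature at a time and keeps it only if the score improves. The sketch below is a generic illustration under stated assumptions: the three features (capital-letter ratio, exclamation-mark count, word count), the toy data, and the nearest-centroid scorer standing in for the neural network are all hypothetical, and scoring on the training rows is a simplification; a real study would score on held-out data.

```python
def centroid_accuracy(rows, labels, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`.

    Stand-in scorer for this sketch; the abstract's classifier is a neural network.
    """
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [sum(r[f] for r in members) / len(members) for f in features]
    correct = 0
    for row, y in zip(rows, labels):
        def dist(label):
            # Squared Euclidean distance to the centroid of `label`.
            return sum((row[f] - c) ** 2 for f, c in zip(features, centroids[label]))
        predicted = min(sorted(centroids), key=dist)
        correct += predicted == y
    return correct / len(rows)


def forward_select(rows, labels, n_features, score):
    """Greedily add the feature that most improves `score`; stop when none does."""
    selected, best_score = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        candidate, candidate_score = None, best_score
        for f in remaining:
            s = score(rows, labels, selected + [f])
            if s > candidate_score:
                candidate, candidate_score = f, s
        if candidate is None:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Hypothetical columns: [capital-letter ratio, exclamation marks, word count]
ROWS = [
    [0.9, 3, 5], [0.8, 1, 2], [0.7, 2, 9],   # spam
    [0.1, 2, 4], [0.2, 3, 8], [0.3, 1, 3],   # legitimate
]
LABELS = [1, 1, 1, 0, 0, 0]

selected, best = forward_select(ROWS, LABELS, 3, centroid_accuracy)
```

On this toy data the loop keeps only the first column, because it already separates the two classes and no further feature can raise the score.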
Distributed Mail Transfer Agent
Technological advances have given society the means to communicate easily through many channels, starting with radio and television, moving on through e-mail and SMS, and nowadays reaching Internet browsing itself through channels such as Google Ads and Webpush notifications. Digital marketing has flooded these channels for product promotion and customer engagement, in order to offer customers the best that organizations have. E-goi is a web platform whose main objective is to facilitate digital marketing for all its customers, ranging from SMB to Corporate/Enterprise, and to help them strengthen relationships with their own customers through digital communication. The platform's most widely used channel is e-mail, which is responsible for about fifteen million deliveries per day. The e-mail delivery system currently employed by E-goi is functional and fault-tolerant to a certain degree; however, it has several flaws, such as its monolithic architecture, which is responsible for high hardware usage and a lack of layer centralization, and the lack of deliverability-related functionalities. This thesis aims to analyze and improve the architecture of E-goi's e-mail delivery system, a critical system of high importance and value for the product and the company. Business analysis tools will be used in this analysis to prove the value created for the company and its product, aiming at reduced maintenance and infrastructure costs as well as increased functionality, both of which are valid points for creating business value. The project's main objectives comprise an extensive analysis of the currently employed solution and the context to which it belongs, followed by a comparative discussion of existing competitors and technologies that may aid the development of a new solution. Next, the solution's functional and non-functional requirements will be gathered.
These requirements will dictate how the solution shall be developed. A thorough analysis of the project's value will follow, discussing which solution will bring the most value to E-goi as a product and as an organization. Once the best solution has been chosen, its design will be developed based on the previously gathered requirements and the best software design patterns, and will support the implementation phase that follows. Once implemented, the solution will need to pass several defined tests and hypotheses, which will ensure its performance, robustness, and validity. Finally, the conclusion will summarize all the project results and define future work for the newly created solution.
SNARE: Spatio-temporal Network-level Automatic Reputation Engine
Current spam filtering techniques classify email based on
content and IP reputation blacklists or whitelists. Unfortunately,
spammers can alter spam content to evade content-based
filters, and spammers continually change the IP addresses
from which they send spam. Previous work has suggested
that filters based on network-level behavior might be
more efficient and robust, by making decisions based on how
messages are sent, as opposed to what is being sent or who
is sending them.
This paper presents a technique to identify spammers
based on features that exploit the network-level spatio-temporal
behavior of email senders to differentiate the spamming
IPs from legitimate senders. Our behavioral classifier
has two benefits: (1) it is early (i.e., it can automatically
detect spam without seeing a large amount of email from
a sending IP address, sometimes even upon seeing only a
single packet); (2) it is evasion-resistant (i.e., it is based on
spatial and temporal features that are difficult for a sender
to change). We build classifiers based on these features using
two different machine learning methods, support vector
machines and decision trees, and we study the efficacy
of these classifiers using labeled data from a deployed commercial
spam-filtering system. Surprisingly, using only features
from a single IP packet header (i.e., without looking at
packet contents), our classifier can identify spammers with
about 93% accuracy and a reasonably low false-positive rate
(about 7%). After looking at a single message, spammer
identification accuracy improves to more than 94% with a
false rate of just over 5%. These results suggest an effective sender
reputation mechanism.
MapReduce based RDF assisted distributed SVM for high throughput spam filtering
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University.
Electronic mail has become firmly embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability, and its ease of use have all acted as catalysts for this pervasive proliferation. Unfortunately, the same can be said of unsolicited bulk email, or spam. Various methods, as well as enabling architectures, are available to try to mitigate spam permeation. In this respect, this dissertation complements existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing their respective strengths and weaknesses.
Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet-scale, data-intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have been proven effective. SVM training is, however, a computationally intensive process. In this dissertation, an M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart.
Effectively exploiting large-scale, 'Cloud' based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneity-aware task-to-node matching and allocation scheme, is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the-box Hadoop counterpart in a typical Cloud based infrastructure.
The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback.
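The idea of training on data subsets across nodes and then combining the results can be sketched in a few lines. This is not MRSMO: the abstract's algorithm is SMO-based and runs on Hadoop, whereas the sketch below swaps in Pegasos-style subgradient steps on the hinge loss for a linear SVM without a bias term, runs the "map" step as a plain loop, and "reduces" by averaging weight vectors. The data, function names, and hyperparameters are hypothetical, and the RDF feedback loop is not modelled.

```python
def train_linear_svm(samples, lam=0.1, epochs=5):
    """Linear SVM via Pegasos-style subgradient steps on the hinge loss."""
    w = [0.0] * len(samples[0][0])
    t = 0
    for _ in range(epochs):
        for x, y in samples:  # fixed order keeps the sketch deterministic
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            shrink = 1.0 - 1.0 / t         # regularization shrinkage (= 1 - eta * lam)
            w = [shrink * wi for wi in w]
            if margin < 1.0:               # sample violates the margin
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def map_phase(partitions):
    """Train one local model per data partition (one per node in a real cluster)."""
    return [train_linear_svm(part) for part in partitions]


def reduce_phase(weight_vectors):
    """Combine local models by averaging their weights component-wise."""
    return [sum(col) / len(weight_vectors) for col in zip(*weight_vectors)]


def predict(w, x):
    """Return +1 (spam) or -1 (legitimate) from the sign of the decision value."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1


# Hypothetical 2-D feature vectors with labels +1 (spam) and -1 (legitimate).
SAMPLES = [
    ((2.0, 1.0), 1), ((-2.0, -1.0), -1), ((1.0, 2.0), 1), ((-1.0, -2.0), -1),
    ((2.0, 2.0), 1), ((-2.0, -2.0), -1), ((3.0, 1.0), 1), ((-1.0, -3.0), -1),
]

partitions = [SAMPLES[i::2] for i in range(2)]  # split across two "nodes"
local_models = map_phase(partitions)
global_w = reduce_phase(local_models)
```

Each local model sees only half of the data, which is where a distributed scheme saves training time and also where accuracy can degrade on harder data; the abstract reports an RDF-based feedback loop as its remedy for that loss.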
Towards secure message systems
Message systems, which transfer information from sender to recipient via communication networks, are indispensable to our modern society. The enormous user base of message systems and their critical role in information delivery make securing them a top priority. This dissertation focuses on securing the two most representative and dominant message systems, e-mail and instant messaging (IM), from two complementary aspects: defending against unwanted messages and ensuring reliable delivery of wanted messages.
To curtail unwanted messages and protect e-mail and instant messaging users, this dissertation proposes two mechanisms, DBSpam and HoneyIM, which can effectively thwart e-mail spam laundering and foil malicious instant message spreading, respectively. DBSpam exploits the distinct characteristics of connection correlation and packet symmetry embedded in the behavior of spam laundering and utilizes a simple statistical method, the Sequential Probability Ratio Test, to detect and break spam laundering activities inside a customer network in a timely manner. The experimental results demonstrate that DBSpam is effective in quickly and accurately capturing and suppressing e-mail spam laundering activities and is capable of coping with high-speed network traffic. HoneyIM leverages the inherent spreading characteristic of IM malware and applies honeypot technology to the detection of malicious instant messages. More specifically, HoneyIM uses decoy accounts in normal users' contact lists as honeypots to capture malicious messages sent by IM malware and suppresses the spread of malicious instant messages by performing network-wide blocking. The efficacy of HoneyIM has been validated through both simulations and real experiments.
To improve e-mail reliability, that is, to prevent losses of wanted e-mail, this dissertation proposes a collaboration-based autonomous e-mail reputation system called CARE. CARE introduces inter-domain collaboration without a central authority or third party and enables each e-mail service provider to independently build its reputation database, including frequently contacted and unacquainted sending domains, based on the local e-mail history and the information exchanged with other collaborating domains. The effectiveness of CARE in improving e-mail reliability has been validated through a number of experiments, including a comparison of two large e-mail log traces from two universities, a real experiment of DNS snooping on more than 36,000 domains, and extensive simulation experiments in a large-scale environment.
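The Sequential Probability Ratio Test named in the DBSpam description can be sketched as follows. This is Wald's generic test for binary observations, not DBSpam's implementation: what counts as an observation (here, a hypothetical 0/1 indicator per monitored event), the probabilities `p0` and `p1`, and the error targets `alpha` and `beta` are all assumptions made for illustration.

```python
import math


def sprt(observations, p0, p1, alpha=0.01, beta=0.01):
    """Wald's SPRT for Bernoulli observations.

    H0: each observation is 1 with probability p0 (normal traffic, assumed).
    H1: each observation is 1 with probability p1 (suspicious traffic, assumed).
    Returns (decision, number of observations consumed).
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0                              # running log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(observations)


suspicious = sprt([1, 1, 1, 1, 1], p0=0.2, p1=0.8)
normal = sprt([0, 0, 0, 0, 0], p0=0.2, p1=0.8)
mixed = sprt([1, 0, 1, 0], p0=0.2, p1=0.8)
```

The test stops as soon as the accumulated evidence crosses either threshold, so a consistent run of observations is decided after only a handful of samples, while conflicting evidence leaves the test undecided until more data arrives. That early-stopping property is what makes the method suitable for timely detection.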
- …