Search CORE

7 research outputs found

Detecting spam relays by SMTP traffic characteristics using an autonomous detection system

Author: Hao Wu (65943)
Publication venue
Publication date: 01/01/2011
Field of study

Spam emails are flooding the Internet. Research to prevent spam is an ongoing concern. SMTP traffic was collected from different sources in real networks and analyzed to determine the difference regarding SMTP traffic characteristics of legitimate email clients, legitimate email servers and spam relays. It is found that SMTP traffic from legitimate sites and non-legitimate sites are different and could be distinguished from each other. Some methods, which are based on analyzing SMTP traffic characteristics, were purposed to identify spam relays in the network in this thesis. An autonomous combination system, in which machine learning technologies were employed, was developed to identify spam relays in this thesis. This system identifies spam relays in real time before spam emails get to an end user by using SMTP traffic characteristics never involving email real content. A series of tests were conducted to evaluate the performance of this system. And results show that the system can identify spam relays with a high spam relay detection rate and an acceptable ratio of false positive errors

Loughborough University Institutional Repository

Spam Classification Using Machine Learning Techniques - Sinespam

Author: Norte Sosa José
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2010
Field of study

Most e-mail readers spend a non-trivial amount of time regularly deleting junk e-mail (spam) messages, even as an expanding volume of such e-mail occupies server storage space and consumes network bandwidth. An ongoing challenge, therefore, rests within the development and refinement of automatic classifiers that can distinguish legitimate e-mail from spam. Some published studies have examined spam detectors using Naïve Bayesian approaches and large feature sets of binary attributes that determine the existence of common keywords in spam, and many commercial applications also use Naïve Bayesian techniques. Spammers recognize these attempts to prevent their messages and have developed tactics to circumvent these filters, but these evasive tactics are themselves patterns that human readers can often identify quickly. This work had the objectives of developing an alternative approach using a neural network (NN) classifier brained on a corpus of e-mail messages from several users. The features selection used in this work is one of the major improvements, because the feature set uses descriptive characteristics of words and messages similar to those that a human reader would use to identify spam, and the model to select the best feature set, was based on forward feature selection. Another objective in this work was to improve the spam detection near 95% of accuracy using Artificial Neural Networks; actually nobody has reached more than 89% of accuracy using ANN

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Detecting Abnormal Behavior in Web Applications

Author: Chu Zi
Publication venue: W&M ScholarWorks
Publication date: 01/01/2012
Field of study

The rapid advance of web technologies has made the Web an essential part of our daily lives. However, network attacks have exploited vulnerabilities of web applications, and caused substantial damages to Internet users. Detecting network attacks is the first and important step in network security. A major branch in this area is anomaly detection. This dissertation concentrates on detecting abnormal behaviors in web applications by employing the following methodology. For a web application, we conduct a set of measurements to reveal the existence of abnormal behaviors in it. We observe the differences between normal and abnormal behaviors. By applying a variety of methods in information extraction, such as heuristics algorithms, machine learning, and information theory, we extract features useful for building a classification system to detect abnormal behaviors.;In particular, we have studied four detection problems in web security. The first is detecting unauthorized hotlinking behavior that plagues hosting servers on the Internet. We analyze a group of common hotlinking attacks and web resources targeted by them. Then we present an anti-hotlinking framework for protecting materials on hosting servers. The second problem is detecting aggressive behavior of automation on Twitter. Our work determines whether a Twitter user is human, bot or cyborg based on the degree of automation. We observe the differences among the three categories in terms of tweeting behavior, tweet content, and account properties. We propose a classification system that uses the combination of features extracted from an unknown user to determine the likelihood of being a human, bot or cyborg. Furthermore, we shift the detection perspective from automation to spam, and introduce the third problem, namely detecting social spam campaigns on Twitter. Evolved from individual spammers, spam campaigns manipulate and coordinate multiple accounts to spread spam on Twitter, and display some collective characteristics. We design an automatic classification system based on machine learning, and apply multiple features to classifying spam campaigns. Complementary to conventional spam detection methods, our work brings efficiency and robustness. Finally, we extend our detection research into the blogosphere to capture blog bots. In this problem, detecting the human presence is an effective defense against the automatic posting ability of blog bots. We introduce behavioral biometrics, mainly mouse and keyboard dynamics, to distinguish between human and bot. By passively monitoring user browsing activities, this detection method does not require any direct user participation, and improves the user experience

College of William & Mary: W&M Publish

Pieniä tietojenkäsittelytieteellisiä tutkimuksia - Kevät 2007

Author
Publication venue: Tampereen yliopisto
Publication date: 01/01/2007
Field of study

Trepo - Institutional Repository of Tampere University

Towards secure message systems

Author: Xie Mengjun
Publication venue: W&M ScholarWorks
Publication date: 01/01/2010
Field of study

Message systems, which transfer information from sender to recipient via communication networks, are indispensable to our modern society. The enormous user base of message systems and their critical role in information delivery make it the top priority to secure message systems. This dissertation focuses on securing the two most representative and dominant messages systems---e-mail and instant messaging (IM)---from two complementary aspects: defending against unwanted messages and ensuring reliable delivery of wanted messages.;To curtail unwanted messages and protect e-mail and instant messaging users, this dissertation proposes two mechanisms DBSpam and HoneyIM, which can effectively thwart e-mail spam laundering and foil malicious instant message spreading, respectively. DBSpam exploits the distinct characteristics of connection correlation and packet symmetry embedded in the behavior of spam laundering and utilizes a simple statistical method, Sequential Probability Ratio Test, to detect and break spam laundering activities inside a customer network in a timely manner. The experimental results demonstrate that DBSpam is effective in quickly and accurately capturing and suppressing e-mail spam laundering activities and is capable of coping with high speed network traffic. HoneyIM leverages the inherent characteristic of spreading of IM malware and applies the honey-pot technology to the detection of malicious instant messages. More specifically, HoneyIM uses decoy accounts in normal users\u27 contact lists as honey-pots to capture malicious messages sent by IM malware and suppresses the spread of malicious instant messages by performing network-wide blocking. The efficacy of HoneyIM has been validated through both simulations and real experiments.;To improve e-mail reliability, that is, prevent losses of wanted e-mail, this dissertation proposes a collaboration-based autonomous e-mail reputation system called CARE. CARE introduces inter-domain collaboration without central authority or third party and enables each e-mail service provider to independently build its reputation database, including frequently contacted and unacquainted sending domains, based on the local e-mail history and the information exchanged with other collaborating domains. The effectiveness of CARE on improving e-mail reliability has been validated through a number of experiments, including a comparison of two large e-mail log traces from two universities, a real experiment of DNS snooping on more than 36,000 domains, and extensive simulation experiments in a large-scale environment

College of William & Mary: W&M Publish

Cybersecurity and safety analysis in online social networks

Author: Liu Donghai
Publication venue: Deakin University, Faculty of Science Engineering and Built Environment, School of Information Technology
Publication date: 10/02/2018
Field of study

The research work deal with the security and safety issues related to the use of online social networks and it successfully presented AI-based solutions to address these issues in online social networks

Deakin Research Online

Recommended from our members

An Approach Towards Self-Supervised Classification Using Cyc

Author: Coursey Kino High
Publication venue: 'University of North Texas Libraries'
Publication date: 01/12/2006
Field of study

Due to the long duration required to perform manual knowledge entry by human knowledge engineers it is desirable to find methods to automatically acquire knowledge about the world by accessing online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers to provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research conducted shows that a system can be created that utilizes statistical text classification methods to extract information from an ad-hoc generated information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed along with future work

UNT Digital Library