4 research outputs found

    Support Vector Machine for Network Intrusion and Cyber-Attack Detection

    Get PDF
    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.Cyber-security threats are a growing concern in networked environments. The development of Intrusion Detection Systems (IDSs) is fundamental in order to provide extra level of security. We have developed an unsupervised anomaly-based IDS that uses statistical techniques to conduct the detection process. Despite providing many advantages, anomaly-based IDSs tend to generate a high number of false alarms. Machine Learning (ML) techniques have gained wide interest in tasks of intrusion detection. In this work, Support Vector Machine (SVM) is deemed as an ML technique that could complement the performance of our IDS, providing a second line of detection to reduce the number of false alarms, or as an alternative detection technique. We assess the performance of our IDS against one-class and two-class SVMs, using linear and non-linear forms. The results that we present show that linear two-class SVM generates highly accurate results, and the accuracy of the linear one-class SVM is very comparable, and it does not need training datasets associated with malicious data. Similarly, the results evidence that our IDS could benefit from the use of ML techniques to increase its accuracy when analysing datasets comprising of non-homogeneous features

    Intrusion Detection Using Honeypot and Support Vector Machine Classifier

    Get PDF
    The rapid growth of internet and web based applications has given rise to the number of attacks on the network. The way the attacker attacks the system differs from one attacker to the other. The sequence of attack or the signature of an attacker should be stored, analyzed and used to generate rules for mitigating future attack attempts. We have deployed honeypot to record the activities of the attacker. While the attacker prepares for an attack, the IDS redirects him to the honeypot. We make the attacker believe that he is working with the actual system. The activities related to the attack are recorded by the honeypot by interacting with the intruder. The recorded activities are analyzed by the network administrator, and the rule database is updated. As a result, we improve the detection accuracy and security of the system using honeypot without any loss or damage to the original system. As the number of threats to the information is increasing, there is a need for a powerful intrusion detection system that can actually fulfil the requirement of security against the threat. This type of security can be achieved by identifying the particular type of attack. The classification of attack activities ensures the efficient countermeasure for the attack. The work focuses on the classification of attack using multiclass support vector machine approach. The support vector machine is used for binary classification. This approach is extended to the multiclass classification of attack with improved accuracy of classification. We have used three benchmark datasets for training and testing purpose: KDD corrected dataset, NSLKDD dataset, Gure KDD dataset. We have also compared the results with existing work. The evaluation gives better accuracy for detection of attack than the existing approach. The evaluation provides better accuracy for detection of attack than the existing approac

    Abstraction, aggregation and recursion for generating accurate and simple classifiers

    Get PDF
    An important goal of inductive learning is to generate accurate and compact classifiers from data. In a typical inductive learning scenario, instances in a data set are simply represented as ordered tuples of attribute values. In our research, we explore three methodologies to improve the accuracy and compactness of the classifiers: abstraction, aggregation, and recursion;Firstly, abstraction is aimed at the design and analysis of algorithms that generate and deal with taxonomies for the construction of compact and robust classifiers. In many applications of the data-driven knowledge discovery process, taxonomies have been shown to be useful in constructing compact, robust, and comprehensible classifiers. However, in many application domains, human-designed taxonomies are unavailable. We introduce algorithms for automated construction of taxonomies inductively from both structured (such as UCI Repository) and unstructured (such as text and biological sequences) data. We introduce AVT-Learner, an algorithm for automated construction of attribute value taxonomies (AVT) from data, and Word Taxonomy Learner (WTL), an algorithm for automated construction of word taxonomy from text and sequence data. We describe experiments on the UCI data sets and compare the performance of AVT-NBL (an AVT-guided Naive Bayes Learner) with that of the standard Naive Bayes Learner (NBL). Our results show that the AVTs generated by AVT-Learner are compeitive with human-generated AVTs (in cases where such AVTs are available). AVT-NBL using AVTs generated by AVT-Learner achieves classification accuracies that are comparable to or higher than those obtained by NBL; and the resulting classifiers are significantly more compact than those generated by NBL. Similarly, our experimental results of WTL and WTNBL on protein localization sequences and Reuters newswire text categorization data sets show that the proposed algorithms can generate Naive Bayes classifiers that are more compact and often more accurate than those produced by standard Naive Bayes learner for the Multinomial Model;Secondly, we apply aggregation to construct features as a multiset of values for the intrusion detection task. For this task, we propose a bag of system calls representation for system call traces and describe misuse and anomaly detection results on the University of New Mexico (UNM) and MIT Lincoln Lab (MIT LL) system call sequences with the proposed representation. With the feature representation as input, we compare the performance of several machine learning techniques for misuse detection and show experimental results on anomaly detection. The results show that standard machine learning and clustering techniques using the simple bag of system calls representation based on the system call traces generated by the operating system\u27s kernel is effective and often performs better than approaches that use foreign contiguous sequences in detecting intrusive behaviors of compromised processes;Finally, we construct a set of classifiers by recursive application of the Naive Bayes learning algorithms. Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real world classification tasks. We describe recursive Naive Bayes learner (RNBL), which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on an event model (one model for each class at each node in the tree). In our experiments on protein sequences, Reuters newswire documents and UC-Irvine benchmark data sets, we observe that RNBL substantially outperforms NB classifier. Furthermore, our experiments on the protein sequences and the text documents show that RNBL outperforms C4.5 decision tree learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies that are comparable to those of support vector machines (SVM) using similar information

    Analysis and Classification of Current Trends in Malicious HTTP Traffic

    Get PDF
    Web applications are highly prone to coding imperfections which lead to hacker-exploitable vulnerabilities. The contribution of this thesis includes detailed analysis of malicious HTTP traffic based on data collected from four advertised high-interaction honeypots, which hosted different Web applications, each in duration of almost four months. We extract features from Web server logs that characterize malicious HTTP sessions in order to present them as data vectors in four fully labeled datasets. Our results show that the supervised learning methods, Support Vector Machines (SVM) and Decision Trees based J48 and PART, can be used to efficiently distinguish attack sessions from vulnerability scan sessions, as well as efficiently classify twenty-two different types of malicious activities with high probability of detection and very low probability of false alarms for most cases. Furthermore, feature selection methods can be used to select important features in order to improve the computational complexity of the learners
    corecore