2,945 research outputs found

    Application of bagging, boosting and stacking to intrusion detection

    No full text
    This paper investigates the possibility of using ensemble algorithms to improve the performance of network intrusion detection systems. We use an ensemble of three different methods, bagging, boosting and stacking, in order to improve the accuracy and reduce the false positive rate. We use four different data mining algorithms, naïve bayes, J48 (decision tree), JRip (rule induction) and iBK( nearest neighbour), as base classifiers for those ensemble methods. Our experiment shows that the prototype which implements four base classifiers and three ensemble algorithms achieves an accuracy of more than 99% in detecting known intrusions, but failed to detect novel intrusions with the accuracy rates of around just 60%. The use of bagging, boosting and stacking is unable to significantly improve the accuracy. Stacking is the only method that was able to reduce the false positive rate by a significantly high amount (46.84%); unfortunately, this method has the longest execution time and so is insufficient to implement in the intrusion detection fiel

    TSE-IDS: A Two-Stage Classifier Ensemble for Intelligent Anomaly-based Intrusion Detection System

    Get PDF
    Intrusion detection systems (IDS) play a pivotal role in computer security by discovering and repealing malicious activities in computer networks. Anomaly-based IDS, in particular, rely on classification models trained using historical data to discover such malicious activities. In this paper, an improved IDS based on hybrid feature selection and two-level classifier ensembles is proposed. An hybrid feature selection technique comprising three methods, i.e. particle swarm optimization, ant colony algorithm, and genetic algorithm, is utilized to reduce the feature size of the training datasets (NSL-KDD and UNSW-NB15 are considered in this paper). Features are selected based on the classification performance of a reduced error pruning tree (REPT) classifier. Then, a two-level classifier ensembles based on two meta learners, i.e., rotation forest and bagging, is proposed. On the NSL-KDD dataset, the proposed classifier shows 85.8% accuracy, 86.8% sensitivity, and 88.0% detection rate, which remarkably outperform other classification techniques recently proposed in the literature. Results regarding the UNSW-NB15 dataset also improve the ones achieved by several state of the art techniques. Finally, to verify the results, a two-step statistical significance test is conducted. This is not usually considered by IDS research thus far and, therefore, adds value to the experimental results achieved by the proposed classifier

    In-depth comparative evaluation of supervised machine learning approaches for detection of cybersecurity threats

    Get PDF
    This paper describes the process and results of analyzing CICIDS2017, a modern, labeled data set for testing intrusion detection systems. The data set is divided into several days, each pertaining to different attack classes (Dos, DDoS, infiltration, botnet, etc.). A pipeline has been created that includes nine supervised learning algorithms. The goal was binary classification of benign versus attack traffic. Cross-validated parameter optimization, using a voting mechanism that includes five classification metrics, was employed to select optimal parameters. These results were interpreted to discover whether certain parameter choices were dominant for most (or all) of the attack classes. Ultimately, every algorithm was retested with optimal parameters to obtain the final classification scores. During the review of these results, execution time, both on consumerand corporate-grade equipment, was taken into account as an additional requirement. The work detailed in this paper establishes a novel supervised machine learning performance baseline for CICIDS2017

    A novel ensemble modeling for intrusion detection system

    Get PDF
    Vast increase in data through internet services has made computer systems more vulnerable and difficult to protect from malicious attacks. Intrusion detection systems (IDSs) must be more potent in monitoring intrusions. Therefore an effectual Intrusion Detection system architecture is built which employs a facile classification model and generates low false alarm rates and high accuracy. Noticeably, IDS endure enormous amounts of data traffic that contain redundant and irrelevant features, which affect the performance of the IDS negatively. Despite good feature selection approaches leads to a reduction of unrelated and redundant features and attain better classification accuracy in IDS. This paper proposes a novel ensemble model for IDS based on two algorithms Fuzzy Ensemble Feature selection (FEFS) and Fusion of Multiple Classifier (FMC). FEFS is a unification of five feature scores. These scores are obtained by using feature-class distance functions. Aggregation is done using fuzzy union operation. On the other hand, the FMC is the fusion of three classifiers. It works based on Ensemble decisive function. Experiments were made on KDD cup 99 data set have shown that our proposed system works superior to well-known methods such as Support Vector Machines (SVMs), K-Nearest Neighbor (KNN) and Artificial Neural Networks (ANNs). Our examinations ensured clearly the prominence of using ensemble methodology for modeling IDSs. And hence our system is robust and efficient

    One-Class Classification: Taxonomy of Study and Review of Techniques

    Full text link
    One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This unique situation constrains the learning of efficient classifiers by defining class boundary just with the knowledge of positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, algorithms used and the application domains applied. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
    corecore