208 research outputs found

    Intensive pre-processing of KDD Cup 99 for network intrusion classification using machine learning techniques

    Get PDF
    © 2019, International Association of Online Engineering. Network security engineers work to keep services available all the time by handling intruder attacks. Intrusion Detection System (IDS) is one of the obtainable mechanism that used to sense and classify any abnormal actions. Therefore, the IDS must be always up to date with the latest intruder attacks signatures to preserve confidentiality, integrity and availability of the services. The speed of the IDS is very important issue as well learning the new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different Machine Learning Techniques. It mainly focuses on the KDD preprocess part in order to prepare a decent and fair experimental data set. The techniques J48, Random Forest, Random Tree, MLP, Naïve Bayes and Bayes Network classifiers have been chosen for this study. It has been proven that the Random forest classifier has achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type (DOS, R2L, U2R, and PROBE)

    Recurrent Neural Network Architectures Toward Intrusion Detection

    Get PDF
    Recurrent Neural Networks (RNN) show a remarkable result in sequence learning, particularly in architectures with gated unit structures such as Long Short-term Memory (LSTM). In recent years, several permutations of LSTM architecture have been proposed mainly to overcome the computational complexity of LSTM. In this dissertation, a novel study is presented that will empirically investigate and evaluate LSTM architecture variants such as Gated Recurrent Unit (GRU), Bi-Directional LSTM, and Dynamic-RNN for LSTM and GRU specifically on detecting network intrusions. The investigation is designed to identify the learning time required for each architecture algorithm and to measure the intrusion prediction accuracy. RNN was evaluated on the DARPA/KDD Cup’99 intrusion detection dataset for each architecture. Feature selection mechanisms were also implemented to help in identifying and removing nonessential variables from data that do not affect the accuracy of the prediction models, in this case Principle Component Analysis (PCA) and the RandomForest (RF) algorithm. The results showed that RF captured more significant features over PCA when the accuracy for RF 97.86% for LSTM and 96.59% for GRU, were PCA 64.34% for LSTM and 67.97% for GRU. In terms of RNN architectures, prediction accuracy of each variant exhibited improvement at specific parameters, yet with a large dataset and a suitable time training, the standard vanilla LSTM tended to lead among all other RNN architectures which scored 99.48%. Although Dynamic RNN’s offered better performance with accuracy, Dynamic-RNN GRU scored 99.34%, however they tended to take a longer time to be trained with high training cycles, Dynamic-RNN LSTM needs 25284.03 seconds at 1000 training cycle. GRU architecture had one variant introduced to reduce LSTM complexity, which developed with fewer parameters resulting in a faster-trained model compared to LSTM needs 1903.09 seconds when LSTM required 2354.93 seconds for the same training cycle. It also showed equivalent performance with respect to the parameters such as hidden layers and time-step. BLSTM offered impressive training time as 190 seconds at 100 training cycle, though the accuracy was below that of the other RNN architectures which didn’t exceed 90%

    A Convolutional Neural Network for Network Intrusion Detection System

    Get PDF
    System administrators can benefit from deploying Network Intrusion Detection Systems (NIDS) to find potential security breaches. However, security attacks tend to be unpredictable. There are many challenges to develop a flexible and effective NIDS in order to prevent high false alarm rates and low detection accuracy against unknown attacks. In this paper, we propose a deep learning method to implement an effective and flexible NIDS. We used a convolutional neural network (CNN), an advanced deep learning technique, on NSL-KDD, a benchmark dataset for network intrusion. Our experimental results of a 99.79% detection rate when compared against the NSL-KDD test dataset show that CNNs can be applied as a learning method for Intrusion Detection Systems (IDSs)

    Intrusion detection by machine learning = Behatolás detektálás gépi tanulás által

    Get PDF
    Since the early days of information technology, there have been many stakeholders who used the technological capabilities for their own benefit, be it legal operations, or illegal access to computational assets and sensitive information. Every year, businesses invest large amounts of effort into upgrading their IT infrastructure, yet, even today, they are unprepared to protect their most valuable assets: data and knowledge. This lack of protection was the main reason for the creation of this dissertation. During this study, intrusion detection, a field of information security, is evaluated through the use of several machine learning models performing signature and hybrid detection. This is a challenging field, mainly due to the high velocity and imbalanced nature of network traffic. To construct machine learning models capable of intrusion detection, the applied methodologies were the CRISP-DM process model designed to help data scientists with the planning, creation and integration of machine learning models into a business information infrastructure, and design science research interested in answering research questions with information technology artefacts. The two methodologies have a lot in common, which is further elaborated in the study. The goals of this dissertation were two-fold: first, to create an intrusion detector that could provide a high level of intrusion detection performance measured using accuracy and recall and second, to identify potential techniques that can increase intrusion detection performance. Out of the designed models, a hybrid autoencoder + stacking neural network model managed to achieve detection performance comparable to the best models that appeared in the related literature, with good detections on minority classes. To achieve this result, the techniques identified were synthetic sampling, advanced hyperparameter optimization, model ensembles and autoencoder networks. In addition, the dissertation set up a soft hierarchy among the different detection techniques in terms of performance and provides a brief outlook on potential future practical applications of network intrusion detection models as well

    Intelligent and Improved Self-Adaptive Anomaly based Intrusion Detection System for Networks

    Get PDF
    With the advent of digital technology, computer networks have developed rapidly at an unprecedented pace contributing tremendously to social and economic development. They have become the backbone for all critical sectors and all the top Multi-National companies. Unfortunately, security threats for computer networks have increased dramatically over the last decade being much brazen and bolder. Intrusions or attacks on computers and networks are activities or attempts to jeopardize main system security objectives, which called as confidentiality, integrity and availability. They lead mostly in great financial losses, massive sensitive data leaks, thereby decreasing efficiency and the quality of productivity of an organization. There is a great need for an effective Network Intrusion Detection System (NIDS), which are security tools designed to interpret the intrusion attempts in incoming network traffic, thereby achieving a solid line of protection against inside and outside intruders. In this work, we propose to optimize a very popular soft computing tool prevalently used for intrusion detection namely Back Propagation Neural Network (BPNN) using a novel machine learning framework called “ISAGASAA”, based on Improved Self-Adaptive Genetic Algorithm (ISAGA) and Simulated Annealing Algorithm (SAA). ISAGA is our variant of standard Genetic Algorithm (GA), which is developed based on GA improved through an Adaptive Mutation Algorithm (AMA) and optimization strategies. The optimization strategies carried out are Parallel Processing (PP) and Fitness Value Hashing (FVH) that reduce execution time, convergence time and save processing power. While, SAA was incorporated to ISAGA in order to optimize its heuristic search. Experimental results based on Kyoto University benchmark dataset version 2015 demonstrate that our optimized NIDS based BPNN called “ANID BPNN-ISAGASAA” outperforms several state-of-art approaches in terms of detection rate and false positive rate. Moreover, improvement of GA through FVH and PP saves processing power and execution time. Thus, our model is very much convenient for network anomaly detection.

    Enabling intrusion detection systems with dueling double deep Q-learning

    Get PDF
    Purpose – In this research, the authors demonstrate the advantage of reinforcement learning (RL) based intrusion detection systems (IDS) to solve very complex problems (e.g. selecting input features, considering scarce resources and constrains) that cannot be solved by classical machine learning. The authors include a comparative study to build intrusion detection based on statistical machine learning and representational learning, using knowledge discovery in databases (KDD) Cup99 and Installation Support Center of Expertise (ISCX) 2012. Design/methodology/approach – The methodology applies a data analytics approach, consisting of data exploration and machine learning model training and evaluation. To build a network-based intrusion detection system, the authors apply dueling double deep Q-networks architecture enabled with costly features, k-nearest neighbors (K-NN), support-vector machines (SVM) and convolution neural networks (CNN). Findings – Machine learning-based intrusion detection are trained on historical datasets which lead to model drift and lack of generalization whereas RL is trained with data collected through interactions. RL is bound to learn from its interactions with a stochastic environment in the absence of a training dataset whereas supervised learning simply learns from collected data and require less computational resources. Research limitations/implications – All machine learning models have achieved high accuracy values and performance. One potential reason is that both datasets are simulated, and not realistic. It was not clear whether a validation was ever performed to show that data were collected from real network traffics. Practical implications – The study provides guidelines to implement IDS with classical supervised learning, deep learning and RL. Originality/value – The research applied the dueling double deep Q-networks architecture enabled with costly features to build network-based intrusion detection from network traffics. This research presents a comparative study of reinforcement-based instruction detection with counterparts built with statistical and representational machine learning
    • …
    corecore