Network Intrusion Detection System using Deep Learning Technique

Abstract

The rise in internet usage in recent times has led to tremendous development in computer networks, with large volumes of information transported daily. This development has generated numerous security threats and privacy concerns for networks and data. To tackle these issues, several protective measures have been developed, including Intrusion Detection Systems (IDSs). An IDS forms a major backbone of network security and provides an extra layer of security to the other defence mechanisms in a network. However, existing signature-based IDSs such as Snort and the like are unable to detect unknown and novel threats. Anomaly detection-based IDSs that use Machine Learning (ML) approaches do not scale well when enormous data are presented; during modelling, the runtime grows with the dataset size, which demands high computational resources to meet the runtime requirements. This thesis proposes a Feedforward Deep Neural Network (FFDNN) for intrusion detection that performs binary classification on the popular NSL-Knowledge Discovery and Data Mining (NSL-KDD) dataset. The model was developed with the Keras API integrated into TensorFlow in the Google Colaboratory environment. Three variants of FFDNNs were trained on the NSL-KDD dataset; each architecture consisted of two hidden layers with 64 and 32, 32 and 16, and 512 and 256 neurons respectively, each using the ReLU activation function. The sigmoid activation function was used in the output layer for binary classification, and the prediction loss function was binary cross-entropy. Regularization was applied with a dropout rate of 0.2, and the Adam optimizer was used. The deep neural networks were trained for 16, 20, and 20 epochs respectively, with batch sizes of 256, 64, and 128. After evaluating the performance of the FFDNNs on the training data, predictions were made on the test data, and accuracies of 89%, 84%, and 87% were achieved. The experiment was also conducted on the same training dataset (NSL-KDD) using conventional machine learning algorithms (Random Forest, K-Nearest Neighbours, Logistic Regression, Decision Tree, and Naïve Bayes), and the predictions of these algorithms on the test data gave accuracies of 81%, 76%, 77%, 77%, and 77%, respectively. The performance of the FFDNNs was calculated using several important metrics (FPR, FAR, F1 measure, Precision) and compared to that of the conventional ML algorithms. The outcome shows that the deep neural networks performed best: their dense architecture made them scalable to the large dataset and offered a faster runtime during training, in contrast to the slow runtime of the conventional ML algorithms. This implies that when the dataset is large and faster computation is required, an FFDNN is the better choice for best performance accuracy.
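
To make the described architecture concrete, the following is a minimal sketch of the first FFDNN variant (64- and 32-neuron hidden layers, ReLU activations, dropout of 0.2, sigmoid output, binary cross-entropy loss, Adam optimizer, 16 epochs with a batch size of 256) using the Keras API in TensorFlow. The input dimension of 122 is an assumption corresponding to a one-hot encoded NSL-KDD feature vector and should be adjusted to the actual preprocessed data; the variable names X_train and y_train are placeholders, not from the thesis.

    # Sketch of FFDNN variant 1, assuming a 122-dimensional preprocessed input.
    import tensorflow as tf
    from tensorflow.keras import layers, models

    def build_ffdnn(input_dim: int = 122) -> tf.keras.Model:
        model = models.Sequential([
            layers.Input(shape=(input_dim,)),
            layers.Dense(64, activation="relu"),    # first hidden layer
            layers.Dropout(0.2),                    # dropout regularization
            layers.Dense(32, activation="relu"),    # second hidden layer
            layers.Dropout(0.2),
            layers.Dense(1, activation="sigmoid"),  # binary output: normal vs. attack
        ])
        model.compile(optimizer="adam",
                      loss="binary_crossentropy",
                      metrics=["accuracy"])
        return model

    # Example usage (X_train, y_train are the preprocessed NSL-KDD features/labels):
    # model = build_ffdnn()
    # model.fit(X_train, y_train, epochs=16, batch_size=256, validation_split=0.1)

The other two variants would differ only in the hidden-layer widths (32/16 and 512/256), the number of epochs (20), and the batch sizes (64 and 128).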
