Transferability of Intrusion Detection Systems Using Machine Learning between Networks

Abstract

Intrusion detection systems (IDSs) that use machine learning are a next-generation tool for strengthening the cyber security of networks. Such systems have the potential to detect zero-day attacks: attacks that are unknown to researchers and are occurring for the first time. This thesis explores novel ideas in this research domain and addresses foreseeable issues in the practical deployment of such a tool. The main issue addressed is the situation in which an entity intends to deploy a machine-learning IDS on its network but has no attack data from that network with which to train it. One solution is to train the IDS on attack data from other networks. However, it is uncertain whether this is feasible, since different networks run different applications and serve different purposes, and an IDS trained on data from an entirely different network may not operate adequately on the target network. The methodology proposed in this research recommends a training set that combines attack data collected from other networks with benign traffic originating from the network on which the IDS is to be deployed. This method is compared with a training set composed entirely of attack and benign data from a different network. The best-performing model, trained on each of the two training sets, demonstrated the feasibility of both scenarios: the two versions of that model achieved F1 scores of 0.82 and 0.81 respectively, and both detected roughly 70% of attacks and correctly classified 99% of benign traffic. However, most IDSs yielded their best results when trained on the former training set. The main benefit of training a model on benign data from the target network is that it minimizes false positive classifications. The average model saw a 113% increase in precision compared with its counterpart trained on benign data from a foreign network.
Another issue addressed in this thesis is the detection scope of the IDS, which is limited to the attacks it is trained on. Using the proposed IDS training set, an intuitive feature selection scheme, and classification threshold adjustment, this thesis broadens the detection scope of the IDS to include attacks outside its training data. Feature selection can steer an IDS toward detecting specific attacks not included in its training data. Using threshold tuning, the IDSs in this thesis detected up to 200% more attacks. Both issues and their solutions are simulated and verified in two separate scenarios using neural networks and random forests.
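
As a rough illustration of the two ideas summarized above, the following minimal Python sketch shows how a training set might be assembled from foreign-network attack records and target-network benign records, and how lowering the classification threshold trades precision for a wider detection rate. It is a sketch under stated assumptions, not the implementation used in the thesis: the function names (`build_training_set`, `classify`, `recall_and_precision`), the label convention (1 = attack, 0 = benign), and all scores are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed convention for this sketch: label 1 = attack, label 0 = benign.
Record = Tuple[Sequence[float], int]


def build_training_set(foreign_attacks: List[Sequence[float]],
                       target_benign: List[Sequence[float]]) -> List[Record]:
    """Combine attack records from other networks with benign records
    from the network the IDS will be deployed on."""
    return [(x, 1) for x in foreign_attacks] + [(x, 0) for x in target_benign]


def classify(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Flag a record as an attack when its model score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def recall_and_precision(y_true: Sequence[int],
                         y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (recall, precision) for the attack class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


# Made-up labels and model scores, NOT results from the thesis.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.2, 0.45, 0.1, 0.05, 0.3]

print(recall_and_precision(y_true, classify(scores, 0.5)))   # (0.5, 1.0)
print(recall_and_precision(y_true, classify(scores, 0.35)))  # (0.75, 0.75)
```

In this toy example, lowering the threshold from 0.5 to 0.35 catches one additional attack at the cost of one false positive, which is the kind of trade-off that threshold tuning has to balance.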
