Data-Driven Network Intrusion Detection: A Taxonomy of Challenges and Methods
Data-driven methods have been widely used in network intrusion detection
(NID) systems. However, a number of challenges stem from
how the datasets are collected. Most attack classes in network intrusion
datasets are minority classes compared with normal traffic, and many
datasets are collected through virtual machines or other simulated environments
rather than real-world networks. These challenges undermine the performance of
machine learning models for intrusion detection, because models such as random
forests or support vector machines are fit to unrepresentative "sandbox" datasets. This
survey presents a carefully designed taxonomy highlighting eight main
challenges and solutions and explores common datasets from 1999 to 2020. Trends
in the distribution of challenges addressed over the past decade are analyzed,
and future directions are proposed: expanding NID into cloud-based
environments, devising scalable models for larger amounts of network intrusion
data, and creating labeled datasets collected in real-world networks.
Comment: 38 pages, 14 figures
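
As a rough illustration of the class-imbalance challenge described in the abstract, the sketch below computes inverse-frequency class weights, one common mitigation that is not necessarily among the solutions the survey catalogs. The label names and counts are made up for the example, and `balanced_class_weights` is a hypothetical helper, not something taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class that is inversely proportional to its frequency.

    Uses the common "balanced" heuristic:
        weight(c) = n_samples / (n_classes * count(c))
    so rare attack classes get larger weights than abundant normal traffic.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, heavily imbalanced label distribution (assumed for illustration).
labels = ["normal"] * 950 + ["dos"] * 40 + ["probe"] * 10
weights = balanced_class_weights(labels)

# Print classes from most to least frequent, with their weights.
for cls, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{cls}: {weight:.2f}")
```

Weights of this kind can be passed to a classifier's loss or sampling step so that errors on minority attack classes count for more; they do nothing about the separate problem of simulated data failing to represent real-world networks.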