Intrusion detection in IoT networks using machine learning

Abstract

The exponential growth of Internet of Things (IoT) infrastructure has introduced significant security challenges due to the large-scale deployment of interconnected devices. IoT devices are present in every aspect of our modern life; they are essential components of Industry 4.0, smart cities, and critical infrastructures. Therefore, the detection of attacks on this platform becomes necessary through an Intrusion Detection Systems (IDS). These tools are dedicated hardware devices or software that monitors a network to detect and automatically alert the presence of malicious activity. This study aimed to assess the viability of Machine Learning Models for IDS within IoT infrastructures. Five classifiers, encompassing a spectrum from linear models like Logistic Regression, Decision Trees from Trees Algorithms, Gaussian Naïve Bayes from Probabilistic models, Random Forest from ensemble family and Multi-Layer Perceptron from Artificial Neural Networks, were analysed. These models were trained using supervised methods on a public IoT attacks dataset, with three tasks ranging from binary classification (determining if a sample was part of an attack) to multiclassification of 8 groups of attack categories and the multiclassification of 33 individual attacks. Various metrics were considered, from performance to execution times and all models were trained and tuned using cross-validation of 10 k-folds. On the three classification tasks, Random Forest was found to be the model with best performance, at expenses of time consumption. Gaussian Naïve Bayes was the fastest algorithm in all classification¿s tasks, but with a lower performance detecting attacks. Whereas Decision Trees shows a good balance between performance and processing speed. Classifying among 8 attack categories, most models showed vulnerabilities to specific attack types, especially those in minority classes due to dataset imbalances. In more granular 33 attack type classifications, all models generally faced challenges, but Random Forest remained the most reliable, despite vulnerabilities. In conclusion, Machine Learning algorithms proves to be effective for IDS in IoT infrastructure, with Random Forest model being the most robust, but with Decision Trees offering a good balance between speed and performance.Objectius de Desenvolupament Sostenible::9 - Indústria, Innovació i Infraestructur

    Similar works