    Malicious URLs Detection Using Data Streaming Algorithms

    Advances in technology and connected devices mean that data is now generated at an effectively unbounded rate, emanating from a vast array of networks and devices as well as daily operations such as credit card transactions and mobile phone use. A data stream is sequential, continuous, real-time data arriving in the form of an evolving stream. The traditional machine learning approach, however, is a batch learning model in which labelled training data are given a priori to train a model with some machine learning algorithm. This technique requires the entire training set to be available before learning begins, and training is mostly done offline owing to its high cost. Consequently, traditional batch learning suffers from serious drawbacks, such as poor scalability for real-time detection of phishing websites, because the model mostly requires re-training from scratch when new training samples arrive. This paper therefore applies streaming algorithms to the detection of malicious URLs using selected online learners: Hoeffding Tree (HT), Naïve Bayes (NB), and OzaBag. Experimental results on two prominent phishing datasets showed that OzaBag produced promising results in terms of accuracy, Kappa and Kappa Temp on the dataset with large samples, while HT and NB had the lowest prediction time, with accuracy and Kappa comparable to OzaBag, for the real-time detection of phishing websites.
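
The contrast between batch and stream learning in this abstract can be made concrete with a small sketch. The code below is not the paper's implementation: it is a minimal, illustrative online Naïve Bayes over categorical features, evaluated in the usual test-then-train (prequential) fashion, reporting accuracy and Cohen's Kappa. The class name `OnlineNaiveBayes`, the function `prequential`, and the two toy URL features are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class OnlineNaiveBayes:
    """Incremental categorical Naive Bayes with Laplace smoothing (illustrative)."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        # feature_counts[label][feature_index][value] -> count
        self.feature_counts = defaultdict(
            lambda: defaultdict(lambda: defaultdict(int))
        )
        self.values_seen = defaultdict(set)

    def predict(self, x):
        if not self.class_counts:
            return None  # nothing learned yet
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / total)
            for i, value in enumerate(x):
                n_values = len(self.values_seen[i] | {value})
                count = self.feature_counts[label][i][value]
                score += math.log((count + 1) / (n_label + n_values))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def learn_one(self, x, y):
        # One cheap update per instance; no re-training from scratch.
        self.class_counts[y] += 1
        for i, value in enumerate(x):
            self.feature_counts[y][i][value] += 1
            self.values_seen[i].add(value)


def prequential(stream, model):
    """Test-then-train: predict each instance first, then learn from it."""
    n = correct = 0
    confusion = defaultdict(int)  # (true label, predicted label) -> count
    for x, y in stream:
        y_pred = model.predict(x)
        n += 1
        correct += int(y_pred == y)
        confusion[(y, y_pred)] += 1
        model.learn_one(x, y)
    if n == 0:
        return 0.0, 0.0
    accuracy = correct / n
    # Cohen's Kappa: agreement beyond what chance alone would produce.
    labels = {t for t, _ in confusion} | {p for _, p in confusion}
    p_chance = sum(
        (sum(c for (t, _), c in confusion.items() if t == label) / n)
        * (sum(c for (_, p), c in confusion.items() if p == label) / n)
        for label in labels
    )
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0
    return accuracy, kappa


# Hypothetical toy stream: features are (has_ip_in_url, has_at_symbol).
toy_stream = [
    ((1, 0), "phishing"),
    ((0, 0), "benign"),
    ((1, 1), "phishing"),
    ((0, 1), "benign"),
] * 25

accuracy, kappa = prequential(toy_stream, OnlineNaiveBayes())
print(f"accuracy={accuracy:.3f} kappa={kappa:.3f}")
```

Because the model is updated one instance at a time, new labelled URLs can be absorbed as they arrive, which is the scalability property the abstract attributes to streaming learners.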

    Empirical Analysis of Data Streaming and Batch Learning Models for Network Intrusion Detection

    Network intrusion, such as denial of service, probing attacks, and phishing, comprises some of the complex threats that have put the online community at risk. The growing number of these attacks has prompted serious interest in the research community in curbing the menace. One such research effort is to put an intrusion detection mechanism in place. Batch learning and data streaming are two approaches to processing the huge amount of data required for proper intrusion detection. Batch learning, despite its advantages, has been faulted for poor scalability because the model must be constantly re-trained on new training instances. Hence, this paper conducts a comparative study of selected batch learning and data streaming algorithms. The algorithms considered are J48 and projective adaptive resonance theory (PART) for batch learning, and Hoeffding tree (HT) and OzaBagAdwin (OBA) for data streaming. Both binary and multiclass classification problems are considered for the tested algorithms. Experimental results show that data streaming algorithms achieved considerably higher performance than batch learning algorithms in binary classification problems. Specifically, in terms of accuracy, binary classification produced J48 (94.73), PART (92.83), HT (98.38), and OBA (99.67), and multiclass classification produced J48 (87.66), PART (87.05), HT (71.98), and OBA (82.80). Hence, the use of data streaming algorithms to solve the scalability issue and allow real-time detection of network intrusion is highly recommended.
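
As a rough illustration of the ensemble idea behind the OzaBag family named in this abstract, the sketch below implements online bagging: each base learner is trained on every incoming instance k times, with k drawn from a Poisson(1) distribution, which approximates bootstrap resampling on a stream. It is a sketch under stated assumptions, not the paper's setup: the ADWIN drift-detection component of OzaBagAdwin is omitted, and the base learner `MajorityByFeature` is a hypothetical toy stand-in for a real stream learner such as a Hoeffding tree. All names and the toy data are invented for illustration.

```python
import math
import random
from collections import Counter


class MajorityByFeature:
    """Toy base learner: tracks label counts per value of a single feature."""

    def __init__(self, feature_index):
        self.feature_index = feature_index
        self.counts = {}          # feature value -> Counter of labels
        self.overall = Counter()  # fallback for unseen feature values

    def learn_one(self, x, y):
        self.counts.setdefault(x[self.feature_index], Counter())[y] += 1
        self.overall[y] += 1

    def predict(self, x):
        seen = self.counts.get(x[self.feature_index]) or self.overall
        return seen.most_common(1)[0][0] if seen else None


def poisson1(rng):
    """Draw k ~ Poisson(1) using Knuth's multiplication method."""
    threshold, k, product = math.exp(-1.0), 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


class OnlineBagging:
    """Online bagging: each learner sees each instance Poisson(1) times."""

    def __init__(self, base_learners, seed=0):
        self.learners = base_learners
        self.rng = random.Random(seed)

    def learn_one(self, x, y):
        for learner in self.learners:
            for _ in range(poisson1(self.rng)):
                learner.learn_one(x, y)

    def predict(self, x):
        # Majority vote, ignoring learners that have not seen any data yet.
        votes = Counter(learner.predict(x) for learner in self.learners)
        votes.pop(None, None)
        return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy stream: only the first feature determines the label.
toy_stream = [
    ((1, 0), "attack"),
    ((0, 0), "normal"),
    ((1, 1), "attack"),
    ((0, 1), "normal"),
] * 50

ensemble = OnlineBagging([MajorityByFeature(i) for i in (0, 0, 0, 1, 1)])
for x, y in toy_stream:
    ensemble.learn_one(x, y)

print(ensemble.predict((1, 0)), ensemble.predict((0, 1)))
```

The Poisson weighting lets the ensemble be updated in a single pass over the data, so no stored training set or periodic re-training is needed.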