381 research outputs found

    Anomaly Detection Algorithms and Techniques for Network Intrusion Detection Systems

    Get PDF
    In recent years, many deep learning-based models have been proposed for anomaly detection. This thesis presents a comparison of selected deep autoencoding models and classical anomaly detection methods on three modern network intrusion detection datasets. We experiment with different configurations and architectures of the selected models, as well as aggregation techniques for input preprocessing and output postprocessing. We propose a methodology for creating benchmark datasets for the evaluation of the methods in different settings. We provide a statistical comparison of the performance of the selected techniques. We conclude that the deep autoencoding models, in particular AE and VAE, systematically outperform the classic methods. Furthermore, we show that aggregating input network flow data improves the overall performance. In general, the tested techniques are promising regarding their application in network intrusion detection systems. However, secondary techniques must be employed to reduce the high numbers of generated false alarms

    K-Means+ID3 and dependence tree methods for supervised anomaly detection

    Get PDF
    In this dissertation, we present two novel methods for supervised anomaly detection. The first method K-Means+ID3 performs supervised anomaly detection by partitioning the training data instances into k clusters using Euclidean distance similarity. Then, on each cluster representing a density region of normal or anomaly instances, an ID3 decision tree is built. The ID3 decision tree on each cluster refines the decision boundaries by learning the subgroups within a cluster. To obtain a final decision on detection, the k-Means and ID3 decision trees are combined using two rules: (1) the nearest neighbor rule; and (2) the nearest consensus rule. The performance of the K-Means+ID3 is demonstrated over three data sets: (1) network anomaly data, (2) Duffing equation data, and (3) mechanical system data, which contain measurements drawn from three distinct application domains of computer networks, an electronic circuit implementing a forced Duffing equation, and a mechanical mass beam system subjected to fatigue stress, respectively. Results show that the detection accuracy of the K-Means+ID3 method is as high as 96.24 percent on network anomaly data; the total accuracy is as high as 80.01 percent on mechanical system data; and 79.9 percent on Duffing equation data. Further, the performance of K-Means+ID3 is compared with individual k-Means and ID3 methods implemented for anomaly detection. The second method dependence tree based anomaly detection performs supervised anomaly detection using the Bayes classification rule. The class conditional probability densities in the Bayes classification rule are approximated by dependence trees, which represent second-order product approximations of probability densities. We derive the theoretical relationship between dependence tree classification error and Bayes error rate and show that the dependence tree approximation minimizes an upper bound on the Bayes error rate. To improve the classification performance of dependence tree based anomaly detection, we use supervised and unsupervised Maximum Relevance Minimum Redundancy (MRMR) feature selection method to select a set of features that optimally characterize class information. We derive the theoretical relationship between the Bayes error rate and the MRMR feature selection criterion and show that MRMR feature selection criterion minimizes an upper bound on the Bayes error rate. The performance of the dependence tree based anomaly detection method is demonstrated on the benchmark KDD Cup 1999 intrusion detection data set. Results show that the detection accuracies of the dependence tree based anomaly detection method are as high as 99.76 percent in detecting normal traffic, 93.88 percent in detecting denial-of-service attacks, 94.88 percent in detecting probing attacks, 86.40 percent in detecting user-to-root attacks, and 24.44 percent in detecting remote-to-login attacks. Further, the performance of dependence tree based anomaly detection method is compared with the performance of naïve Bayes and ID3 decision tree methods as well as with the performance of two anomaly detection methods reported in recent literature

    A Framework for Hybrid Intrusion Detection Systems

    Get PDF
    Web application security is a definite threat to the world’s information technology infrastructure. The Open Web Application Security Project (OWASP), generally defines web application security violations as unauthorized or unintentional exposure, disclosure, or loss of personal information. These breaches occur without the company’s knowledge and it often takes a while before the web application attack is revealed to the public, specifically because the security violations are fixed. Due to the need to protect their reputation, organizations have begun researching solutions to these problems. The most widely accepted solution is the use of an Intrusion Detection System (IDS). Such systems currently rely on either signatures of the attack used for the data breach or changes in the behavior patterns of the system to identify an intruder. These systems, either signature-based or anomaly-based, are readily understood by attackers. Issues arise when attacks are not noticed by an existing IDS because the attack does not fit the pre-defined attack signatures the IDS is implemented to discover. Despite current IDSs capabilities, little research has identified a method to detect all potential attacks on a system. This thesis intends to address this problem. A particular emphasis will be placed on detecting advanced attacks, such as those that take place at the application layer. These types of attacks are able to bypass existing IDSs, increase the potential for a web application security breach to occur and not be detected. In particular, the attacks under study are all web application layer attacks. Those included in this thesis are SQL injection, cross-site scripting, directory traversal and remote file inclusion. This work identifies common and existing data breach detection methods as well as the necessary improvements for IDS models. Ultimately, the proposed approach combines an anomaly detection technique measured by cross entropy and a signature-based attack detection framework utilizing genetic algorithm. The proposed hybrid model for data breach detection benefits organizations by increasing security measures and allowing attacks to be identified in less time and more efficiently

    Cyber-attack detection in network traffic using machine learning

    Get PDF
    Rapid shifting by government sectors and companies to provide their services and products over the internet, has immensely increased internet usage by individuals. Through extranets to network services or corporate networks used for personal purposes, computer hackers can lead to financial losses and manpower/time consumption. Therefore, it is vital to take all necessary measures to minimize losses by detecting attacks preemptively. Due to learning algorithms in cyberspace security challenges, deep learning-based cyber defense has lately become a hot topic. Penetration testing, malware categorization and identification, spam filtering, and spoofing detection are just a few of the key concerns in cyber defense that were tackled using Machine Learning (ML) approaches (Somme, 2020). Result, effective adaptive approaches, such as machine learning approaches could result in increased response times, reduced probability of false alerts, as well as cheaper computing and communication expenses. Our primary point is to demonstrate that the problem of detecting malware is distinct from other technologies, making it far more difficult for the access control group to properly use machine learning

    Network anomaly detection research: a survey

    Get PDF
    Data analysis to identifying attacks/anomalies is a crucial task in anomaly detection and network anomaly detection itself is an important issue in network security. Researchers have developed methods and algorithms for the improvement of the anomaly detection system. At the same time, survey papers on anomaly detection researches are available. Nevertheless, this paper attempts to analyze futher and to provide alternative taxonomy on anomaly detection researches focusing on methods, types of anomalies, data repositories, outlier identity and the most used data type. In addition, this paper summarizes information on application network categories of the existing studies

    Anomaly Detection in Sequential Data: A Deep Learning-Based Approach

    Get PDF
    Anomaly Detection has been researched in various domains with several applications in intrusion detection, fraud detection, system health management, and bio-informatics. Conventional anomaly detection methods analyze each data instance independently (univariate or multivariate) and ignore the sequential characteristics of the data. Anomalies in the data can be detected by grouping the individual data instances into sequential data and hence conventional way of analyzing independent data instances cannot detect anomalies. Currently: (1) Deep learning-based algorithms are widely used for anomaly detection purposes. However, significant computational overhead time is incurred during the training process due to static constant batch size and learning rate parameters for each epoch, (2) the threshold to decide whether an event is normal or malicious is often set as static. This can drastically increase the false alarm rate if the threshold is set low or decrease the True Alarm rate if it is set to a remarkably high value, (3) Real-life data is messy. It is impossible to learn the data features by training just one algorithm. Therefore, several one-class-based algorithms need to be trained. The final output is the ensemble of the output from all the algorithms. The prediction accuracy can be increased by giving a proper weight to each algorithm\u27s output. By extending the state-of-the-art techniques in learning-based algorithms, this dissertation provides the following solutions: (i) To address (1), we propose a hybrid, dynamic batch size and learning rate tuning algorithm that reduces the overall training time of the neural network. (ii) As a solution for (2), we present an adaptive thresholding algorithm that reduces high false alarm rates. (iii) To overcome (3), we propose a multilevel hybrid ensemble anomaly detection framework that increases the anomaly detection rate of the high dimensional dataset
    • …
    corecore