683 research outputs found

    Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods

    Get PDF
    Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are: being able to detect unknown attacks, and reducing the false positive ratio. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques. The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns. The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payments fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity significantly differs from one domain to the other. The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method, random forest. Obtained results showed that this technique can outperform other similar techniques. The fourth technique is based on an MLP neural network, and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy

    Anomaly intrusion detection: a distributed ARTMAP neural network approach

    Get PDF
    Computer network and information security are becoming more important as the use and availability of the Internet continues to increase. Intrusion detection systems are being integrated with computer network security to monitor for misuse or anomalous behavior within a computer network. In this thesis, a Distributed ARTMAP (dARTMAP) artificial neural network is used to analyze network packet data for anomalies. The dARTMAP artificial neural network can perform fast, stable on-line learning in real-time with novelty patterns and has the ability to encode rare events. These features make dARTMAP a strong candidate for intrusion detection. The dARTMAP\u27s on-line learning capability combined with noisy input data can cause the neural network to over-learn resulting in an abundance of category nodes and an increase in computations. The architecture of the Distributed ARTMAP was modified to incorporate category node pruning and category node hibernation to overcome issues associated with category proliferation and computational complexity. Pruning attempts to reduce the number of category nodes required to encode the information. Hibernation attempts to reduce the catastrophic forgetting introduced by pruning. Three pruning techniques and one hibernation technique were examined for their effectiveness in maximizing the accuracy of the neural network and minimizing the time required to train the network data. Since the purpose of the dARTMAP artificial neural network is to detect intrusions, the supervised neural network has to autonomously predict the level of intrusion and refine this prediction through the learning process. Triggers of the neural network\u27s dynamic learning process are used as indicators for intrusive events to predict training outputs for the neural network. These triggers correspond to the three fundamental properties of security: confidentiality, integrity, and availability

    Self-organizing maps in computer security

    Get PDF

    Performance Metrics for Network Intrusion Systems

    Get PDF
    Intrusion systems have been the subject of considerable research during the past 33 years, since the original work of Anderson. Much has been published attempting to improve their performance using advanced data processing techniques including neural nets, statistical pattern recognition and genetic algorithms. Whilst some significant improvements have been achieved they are often the result of assumptions that are difficult to justify and comparing performance between different research groups is difficult. The thesis develops a new approach to defining performance focussed on comparing intrusion systems and technologies. A new taxonomy is proposed in which the type of output and the data scale over which an intrusion system operates is used for classification. The inconsistencies and inadequacies of existing definitions of detection are examined and five new intrusion levels are proposed from analogy with other detection-based technologies. These levels are known as detection, recognition, identification, confirmation and prosecution, each representing an increase in the information output from, and functionality of, the intrusion system. These levels are contrasted over four physical data scales, from application/host through to enterprise networks, introducing and developing the concept of a footprint as a pictorial representation of the scope of an intrusion system. An intrusion is now defined as “an activity that leads to the violation of the security policy of a computer system”. Five different intrusion technologies are illustrated using the footprint with current challenges also shown to stimulate further research. Integrity in the presence of mixed trust data streams at the highest intrusion level is identified as particularly challenging. Two metrics new to intrusion systems are defined to quantify performance and further aid comparison. Sensitivity is introduced to define basic detectability of an attack in terms of a single parameter, rather than the usual four currently in use. Selectivity is used to describe the ability of an intrusion system to discriminate between attack types. These metrics are quantified experimentally for network intrusion using the DARPA 1999 dataset and SNORT. Only nine of the 58 attack types present were detected with sensitivities in excess of 12dB indicating that detection performance of the attack types present in this dataset remains a challenge. The measured selectivity was also poor indicting that only three of the attack types could be confidently distinguished. The highest value of selectivity was 3.52, significantly lower than the theoretical limit of 5.83 for the evaluated system. Options for improving selectivity and sensitivity through additional measurements are examined.Stochastic Systems Lt

    Online Adaboost-based parameterized methods for dynamic distributed network intrusion detection

    Get PDF
    Current network intrusion detection systems lack adaptability to the frequently changing network environments. Furthermore, intrusion detection in the new distributed archi- tectures is now a major requirement. In this paper, we propose two online Adaboost-based intrusion detection algorithms. In the first algorithm, a traditional online Adaboost process is used where decision stumps are used as weak classifiers. In the second algorithm, an improved online Adaboost process is proposed, and online Gaussian mixture models (GMMs) are used as weak classifiers. We further propose a distributed intrusion detection framework, in which a local parameterized detection model is constructed in each node using the online Adaboost algorithm. A global detection model is constructed in each node by combining the local parametric models using a small number of samples in the node. This combination is achieved using an algorithm based on particle swarm optimization (PSO) and support vector ma- chines. The global model in each node is used to detect intrusions. Experimental results show that the improved online Adaboost process with GMMs obtains a higher detection rate and a lower false alarm rate than the traditional online Adaboost process that uses decision stumps. Both the algorithms outperform existing intrusion detection algorithms. It is also shown that our PSO, and SVM-based algorithm effectively combines the local detection models into the global model in each node; the global model in a node can handle the intrusion types that are found in other nodes, without sharing the samples of these intrusion types

    Self-organizing maps in computer security

    Get PDF

    Feature Subset Selection in Intrusion Detection Using Soft Computing Techniques

    Get PDF
    Intrusions on computer network systems are major security issues these days. Therefore, it is of utmost importance to prevent such intrusions. The prevention of such intrusions is entirely dependent on their detection that is a main part of any security tool such as Intrusion Detection System (IDS), Intrusion Prevention System (IPS), Adaptive Security Alliance (ASA), checkpoints and firewalls. Therefore, accurate detection of network attack is imperative. A variety of intrusion detection approaches are available but the main problem is their performance, which can be enhanced by increasing the detection rates and reducing false positives. Such weaknesses of the existing techniques have motivated the research presented in this thesis. One of the weaknesses of the existing intrusion detection approaches is the usage of a raw dataset for classification but the classifier may get confused due to redundancy and hence may not classify correctly. To overcome this issue, Principal Component Analysis (PCA) has been employed to transform raw features into principal features space and select the features based on their sensitivity. The sensitivity is determined by the values of eigenvalues. The recent approaches use PCA to project features space to principal feature space and select features corresponding to the highest eigenvalues, but the features corresponding to the highest eigenvalues may not have the optimal sensitivity for the classifier due to ignoring many sensitive features. Instead of using traditional approach of selecting features with the highest eigenvalues such as PCA, this research applied a Genetic Algorithm (GA) to search the principal feature space that offers a subset of features with optimal sensitivity and the highest discriminatory power. Based on the selected features, the classification is performed. The Support Vector Machine (SVM) and Multilayer Perceptron (MLP) are used for classification purpose due to their proven ability in classification. This research work uses the Knowledge Discovery and Data mining (KDD) cup dataset, which is considered benchmark for evaluating security detection mechanisms. The performance of this approach was analyzed and compared with existing approaches. The results show that proposed method provides an optimal intrusion detection mechanism that outperforms the existing approaches and has the capability to minimize the number of features and maximize the detection rates

    High Performance Data Mining Techniques For Intrusion Detection

    Get PDF
    The rapid growth of computers transformed the way in which information and data was stored. With this new paradigm of data access, comes the threat of this information being exposed to unauthorized and unintended users. Many systems have been developed which scrutinize the data for a deviation from the normal behavior of a user or system, or search for a known signature within the data. These systems are termed as Intrusion Detection Systems (IDS). These systems employ different techniques varying from statistical methods to machine learning algorithms. Intrusion detection systems use audit data generated by operating systems, application softwares or network devices. These sources produce huge amount of datasets with tens of millions of records in them. To analyze this data, data mining is used which is a process to dig useful patterns from a large bulk of information. A major obstacle in the process is that the traditional data mining and learning algorithms are overwhelmed by the bulk volume and complexity of available data. This makes these algorithms impractical for time critical tasks like intrusion detection because of the large execution time. Our approach towards this issue makes use of high performance data mining techniques to expedite the process by exploiting the parallelism in the existing data mining algorithms and the underlying hardware. We will show that how high performance and parallel computing can be used to scale the data mining algorithms to handle large datasets, allowing the data mining component to search a much larger set of patterns and models than traditional computational platforms and algorithms would allow. We develop parallel data mining algorithms by parallelizing existing machine learning techniques using cluster computing. These algorithms include parallel backpropagation and parallel fuzzy ARTMAP neural networks. We evaluate the performances of the developed models in terms of speedup over traditional algorithms, prediction rate and false alarm rate. Our results showed that the traditional backpropagation and fuzzy ARTMAP algorithms can benefit from high performance computing techniques which make them well suited for time critical tasks like intrusion detection
    corecore