    Developing Cyberspace Data Understanding: Using CRISP-DM for Host-based IDS Feature Mining

    Current intrusion detection systems generate a large number of specific alerts but do not provide actionable information. These alerts must often be analyzed by a network defender, a time-consuming and tedious task that can occur hours or days after an attack. Improved understanding of the cyberspace domain can lead to significant advances in cyberspace situational awareness research and development. This thesis applies the Cross Industry Standard Process for Data Mining (CRISP-DM) to develop an understanding of a host system under attack. Data is generated by launching scans and exploits at a machine outfitted with a set of host-based data collectors. Through knowledge discovery, features are identified within the collected data which can be used to enhance host-based intrusion detection. By discovering relationships between the collected data and the attack events, human understanding of the activity is demonstrated. This method of searching for hidden relationships between sensors greatly enhances understanding of new attacks and vulnerabilities, bolstering our ability to defend the cyberspace domain.
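
    The knowledge-discovery step described above amounts to ranking candidate host features by how strongly they co-vary with known attack activity. A minimal sketch of that idea follows, assuming host sensor output has been flattened into one row per time window; the column names ("proc_creates", "reg_writes", "label") are hypothetical stand-ins, not the thesis' actual sensors.

        # Rank numeric host features by absolute correlation with attack labels.
        import pandas as pd

        def rank_candidate_features(df: pd.DataFrame, label_col: str = "label") -> pd.Series:
            numeric = df.drop(columns=[label_col]).select_dtypes("number")
            return numeric.corrwith(df[label_col]).abs().sort_values(ascending=False)

        windows = pd.DataFrame({
            "proc_creates": [3, 4, 40, 38, 5],    # processes spawned per window
            "reg_writes":   [10, 12, 11, 95, 9],  # registry writes per window
            "label":        [0, 0, 1, 1, 0],      # 1 = window overlaps a known exploit
        })
        print(rank_candidate_features(windows))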

    Anomaly-based Correlation of IDS Alarms

    An Intrusion Detection System (IDS) is one of the major techniques for securing information systems and keeping pace with current and potential threats and vulnerabilities in computing systems. It is an indisputable fact that the art of detecting intrusions is still far from perfect, and IDSs tend to generate a large number of false alarms. Hence a human analyst inevitably has to validate those alarms before any action can be taken. As IT infrastructures become larger and more complicated, the number of alarms that need to be reviewed can escalate rapidly, making this task very difficult to manage. The need for an automated correlation and reduction system is therefore evident. In addition, alarm correlation is valuable in providing operators with a more condensed view of potential security issues within the network infrastructure. The thesis presents a comprehensive evaluation of the problem of false alarms and a proposal for an automated alarm correlation system. A critical analysis of existing alarm correlation systems is presented, along with a description of the need for an enhanced correlation system. The study concludes that while a large body of work has been carried out on improving correlation techniques, none of it is perfect: existing systems either require extensive domain knowledge from human experts to run effectively or are unable to provide high-level information about false alerts for future tuning. The overall objective of the research has therefore been to establish an alarm correlation framework and system which enables the administrator to effectively group alerts from the same attack instance and subsequently reduce the volume of false alarms without the need for domain knowledge. The achievement of this aim has comprised the proposal of an attribute-based approach, which is used as a foundation to systematically develop an unsupervised two-stage correlation technique. On this foundation, a novel SOM K-Means Alarm Reduction Tool (SMART) architecture has been modelled as the framework through which a time- and attribute-based aggregation technique is offered. The thesis describes the design and features of the proposed architecture, focusing upon the key components forming the underlying architecture, the alert attributes, and the way they are processed and applied to correlate alerts. The architecture is strengthened by the development of a statistical tool, which offers a means of analysing and comparing alerts and results. The main concepts of the novel architecture are validated through the implementation of a prototype system. A series of experiments were conducted to assess the effectiveness of SMART in reducing false alarms, aiming to prove the viability of implementing the system in a practical environment and to demonstrate that the study provides an appropriate contribution to knowledge in this field.
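
    A minimal sketch of a two-stage SOM-then-K-Means correlation in the spirit of SMART is shown below, assuming alerts have already been encoded as numeric attribute vectors (e.g. normalised addresses, ports, signature IDs and timestamps); the grid size, cluster count and random data are illustrative stand-ins, not the thesis' exact configuration.

        import numpy as np
        from minisom import MiniSom          # pip install minisom
        from sklearn.cluster import KMeans   # pip install scikit-learn

        alerts = np.random.rand(200, 4)      # 200 alerts, 4 normalised attributes

        # Stage 1: a small SOM maps each alert to its best-matching unit (BMU).
        som = MiniSom(6, 6, input_len=4, sigma=1.0, learning_rate=0.5, random_seed=1)
        som.train_random(alerts, num_iteration=1000)
        bmus = np.array([som.winner(a) for a in alerts], dtype=float)

        # Stage 2: K-Means groups BMU coordinates into candidate attack instances.
        groups = KMeans(n_clusters=5, n_init=10, random_state=1).fit_predict(bmus)
        print("alerts per group:", np.bincount(groups))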

    Holistic Network Defense: Fusing Host and Network Features for Attack Classification

    This work presents a hybrid network-host monitoring strategy which fuses data from both the network and the host to recognize malware infections, classifying activity into three categories: Normal, Scanning, and Infected. The network-host sensor fusion is accomplished by extracting 248 features from network traffic using the Fullstats Network Feature generator, and from the host by text mining, examining the frequency of the 500 most common strings and analyzing them as word vectors. Detection performance is improved by synergistically fusing network features obtained from IP packet flows with host features obtained by text mining port, processor, and logon information, among others. In addition, the work compares three different machine learning algorithms and updates the script required to obtain network features. The hybrid method outperformed host-only classification by 31.7% and network-only classification by 25%. The new approach also reduces the number of alerts while remaining accurate compared with the SNORT IDS, and produces alert classification messages that even typical users can understand.
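
    A minimal sketch of the fusion idea follows, assuming flow statistics are already numeric and host activity is available as raw text; the sample data, feature counts and classifier choice are illustrative assumptions, not the configuration evaluated in the work.

        import numpy as np
        from scipy.sparse import hstack, csr_matrix
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.ensemble import RandomForestClassifier

        flow_feats = np.random.rand(4, 10)                  # stand-in for per-flow statistics
        host_text = ["svchost.exe port 445 logon admin",    # stand-in host activity dumps
                     "chrome.exe port 443 logon user",
                     "nc.exe port 4444 logon system",
                     "explorer.exe port 0 logon user"]
        labels = ["infected", "normal", "infected", "normal"]

        # Host side: frequencies of the most common strings, treated as word vectors.
        host_feats = CountVectorizer(max_features=500).fit_transform(host_text)

        # Fusion: concatenate host and network features into one vector per sample.
        fused = hstack([csr_matrix(flow_feats), host_feats], format="csr")
        clf = RandomForestClassifier(random_state=1).fit(fused, labels)
        print(clf.predict(fused[:1]))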

    Performance Metrics for Network Intrusion Systems

    Intrusion systems have been the subject of considerable research during the past 33 years, since the original work of Anderson. Much has been published attempting to improve their performance using advanced data processing techniques, including neural nets, statistical pattern recognition and genetic algorithms. Whilst some significant improvements have been achieved, they are often the result of assumptions that are difficult to justify, and comparing performance between different research groups is difficult. The thesis develops a new approach to defining performance, focussed on comparing intrusion systems and technologies. A new taxonomy is proposed in which the type of output and the data scale over which an intrusion system operates are used for classification. The inconsistencies and inadequacies of existing definitions of detection are examined, and five new intrusion levels are proposed by analogy with other detection-based technologies. These levels are known as detection, recognition, identification, confirmation and prosecution, each representing an increase in the information output from, and functionality of, the intrusion system. These levels are contrasted over four physical data scales, from application/host through to enterprise networks, introducing and developing the concept of a footprint as a pictorial representation of the scope of an intrusion system. An intrusion is now defined as “an activity that leads to the violation of the security policy of a computer system”. Five different intrusion technologies are illustrated using the footprint, with current challenges also shown to stimulate further research. Integrity in the presence of mixed-trust data streams at the highest intrusion level is identified as particularly challenging. Two metrics new to intrusion systems are defined to quantify performance and further aid comparison. Sensitivity is introduced to define the basic detectability of an attack in terms of a single parameter, rather than the usual four currently in use. Selectivity is used to describe the ability of an intrusion system to discriminate between attack types. These metrics are quantified experimentally for network intrusion using the DARPA 1999 dataset and SNORT. Only nine of the 58 attack types present were detected with sensitivities in excess of 12 dB, indicating that detection performance on this dataset remains a challenge. The measured selectivity was also poor, indicating that only three of the attack types could be confidently distinguished. The highest value of selectivity was 3.52, significantly lower than the theoretical limit of 5.83 for the evaluated system. Options for improving selectivity and sensitivity through additional measurements are examined.
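
    The thesis defines sensitivity as a single dB-scale parameter replacing the usual four confusion-matrix counts. The exact definition is the thesis' own; the sketch below uses an assumed signal-to-noise style stand-in (detection rate over false-alarm rate, in decibels) purely to illustrate how four counts collapse into one comparable figure.

        import math

        def sensitivity_db(tp: int, fn: int, fp: int, tn: int) -> float:
            # Assumed illustrative definition, not the thesis' formula.
            detection_rate = tp / (tp + fn)     # fraction of attacks detected
            false_alarm_rate = fp / (fp + tn)   # fraction of benign events flagged
            return 10 * math.log10(detection_rate / false_alarm_rate)

        # e.g. 90% detection at a 1% false-alarm rate is about 19.5 dB
        print(round(sensitivity_db(tp=90, fn=10, fp=10, tn=990), 2))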

    A Critical Analysis of Payload Anomaly-Based Intrusion Detection Systems

    Examining payload content is an important aspect of network security, particularly in today's volatile computing environment. An Intrusion Detection System (IDS) that simply analyzes packet header information cannot adequately secure a network from malicious attacks. The alternative is to perform deep-packet analysis using n-gram language parsing and neural network technology. Self-Organizing Map (SOM), PAYL over Self-Organizing Maps for Intrusion Detection (POSEIDON), Anomalous Payload-based Network Intrusion Detection (PAYL), and Anagram are next-generation unsupervised payload anomaly-based IDSs. This study examines the efficacy of each system using the design-science research methodology. A collection of quantitative data and qualitative features exposes their strengths and weaknesses.
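
    To make the payload-modelling idea concrete, below is a minimal PAYL-style 1-gram sketch: the byte-frequency distribution of normal payloads is learned, and new payloads are scored with the simplified Mahalanobis distance PAYL uses; the tiny training set here is a stand-in for real traffic.

        import numpy as np

        def byte_freq(payload: bytes) -> np.ndarray:
            counts = np.bincount(np.frombuffer(payload, dtype=np.uint8), minlength=256)
            return counts / max(len(payload), 1)

        normal = [b"GET /index.html HTTP/1.1", b"GET /style.css HTTP/1.1"]
        freqs = np.array([byte_freq(p) for p in normal])
        mean, std = freqs.mean(axis=0), freqs.std(axis=0)

        def anomaly_score(payload: bytes, alpha: float = 0.001) -> float:
            # Simplified Mahalanobis distance: sum of |x - mean| / (std + alpha).
            return float(np.sum(np.abs(byte_freq(payload) - mean) / (std + alpha)))

        print(anomaly_score(b"GET /index.html HTTP/1.1"))      # low: looks normal
        print(anomaly_score(b"\x90\x90\x90\x90 shellcode"))    # higher: unusual bytes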

    AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments

    This report considers the application of Artificial Intelligence (AI) techniques to the problem of misuse detection and misuse localisation within telecommunications environments. A broad survey of techniques is provided, covering inter alia rule-based systems, model-based systems, case-based reasoning, pattern matching, clustering and feature extraction, artificial neural networks, genetic algorithms, artificial immune systems, agent-based systems, data mining and a variety of hybrid approaches. The report then considers the central issue of event correlation, which is at the heart of many misuse detection and localisation systems. The notion of being able to infer misuse by correlating individual, temporally distributed events within a multiple data stream environment is explored, along with a range of techniques covering model-based approaches, 'programmed' AI and machine learning paradigms. It is found that, in general, correlation is best achieved via rule-based approaches, but that these suffer from a number of drawbacks, such as the difficulty of developing and maintaining an appropriate knowledge base, and the inability to generalise from known misuses to new, unseen misuses. Two distinct approaches are evident. One attempts to encode knowledge of known misuses, typically within rules, and uses this to screen events; this approach cannot generally detect misuses for which it has not been programmed, i.e. it is prone to issuing false negatives. The other attempts to 'learn' the features of event patterns that constitute normal behaviour and, by observing patterns that do not match expected behaviour, detect when a misuse has occurred; this approach is prone to issuing false positives, i.e. inferring misuse from innocent patterns of behaviour that the system was not trained to recognise. Contemporary approaches are seen to favour hybridisation, often combining detection or localisation mechanisms for both abnormal and normal behaviour, the former to capture known cases of misuse, the latter to capture unknown cases. In some systems these mechanisms even update each other to increase detection rates and lower false positive rates. It is concluded that hybridisation offers the most promising future direction, but that a rule- or state-based component is likely to remain, being the most natural approach to the correlation of complex events. The challenge, then, is to mitigate the weaknesses of canonical programmed systems such that learning, generalisation and adaptation are more readily facilitated.
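
    A minimal sketch of the rule-based correlation idea follows: a known event sequence from a single source, occurring within a time window, is treated as evidence of misuse. The event names, rule and window are illustrative assumptions, not drawn from the report.

        from collections import defaultdict

        RULE = ("auth_failure", "auth_failure", "auth_success")  # e.g. a guessed password
        WINDOW = 60.0                                            # seconds

        def correlate(events):
            # events: iterable of (timestamp, source, event_type), time-ordered.
            history = defaultdict(list)
            for ts, src, etype in events:
                recent = [(t, e) for t, e in history[src] if ts - t <= WINDOW]
                recent.append((ts, etype))
                history[src] = recent
                if tuple(e for _, e in recent[-len(RULE):]) == RULE:
                    yield (src, ts)

        stream = [(0.0, "10.0.0.5", "auth_failure"),
                  (5.0, "10.0.0.5", "auth_failure"),
                  (9.0, "10.0.0.5", "auth_success")]
        print(list(correlate(stream)))   # -> [('10.0.0.5', 9.0)]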

    A graph oriented approach for network forensic analysis

    Network forensic analysis is a process that analyzes intrusion evidence captured from networked environments to identify suspicious entities and the stepwise actions in an attack scenario. Unfortunately, the overwhelming amount and low quality of output from security sensors make it difficult for analysts to obtain a succinct, high-level view of complex multi-stage intrusions. This dissertation presents a novel graph-based network forensic analysis system. The evidence graph model provides an intuitive representation of collected evidence as well as the foundation for forensic analysis. Based on the evidence graph, we develop a set of analysis components in a hierarchical reasoning framework. Local reasoning utilizes fuzzy inference to infer the functional states of a host-level entity from its local observations. Global reasoning performs graph structure analysis to identify the set of highly correlated hosts that belong to the coordinated attack scenario. In global reasoning, we apply spectral clustering and PageRank methods for generic and targeted investigation, respectively. An interactive hypothesis testing procedure is developed to identify hidden attackers from non-explicitly-malicious evidence. Finally, we introduce the notion of the target-oriented effective event sequence (TOEES) to semantically reconstruct stealthy attack scenarios with less dependency on ad hoc expert knowledge. Well-established computational methods used in our approach provide the scalability needed to perform post-incident analysis in large networks. We evaluate the techniques with a number of intrusion detection datasets, and the experimental results show that our approach is effective in identifying complex multi-stage attacks.
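
    A minimal sketch of the targeted-investigation step follows: hosts in a small evidence graph are ranked with personalised PageRank toward a known victim. The hosts, edges and weights are stand-ins; the dissertation's evidence graph carries richer attributes and its own weighting scheme.

        import networkx as nx

        g = nx.DiGraph()
        # Edges point from suspected source to victim; weight = evidence strength.
        g.add_weighted_edges_from([
            ("10.0.0.9", "10.0.0.2", 0.9),   # exploit alert
            ("10.0.0.2", "10.0.0.3", 0.6),   # lateral scan
            ("10.0.0.7", "10.0.0.3", 0.2),   # weak, possibly benign evidence
        ])

        # Reverse the graph and personalise toward the victim, so hosts with
        # strong evidence paths leading to it rank highest.
        scores = nx.pagerank(g.reverse(), alpha=0.85, weight="weight",
                             personalization={"10.0.0.3": 1.0})
        for host, score in sorted(scores.items(), key=lambda kv: -kv[1]):
            print(host, round(score, 3))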