5 research outputs found

    Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process

    The identification of cyberattacks that target information and communication systems has been a focus of the research community for years. Network intrusion detection is a complex problem that presents a diverse set of challenges. Many attacks currently remain undetected, while newer ones emerge due to the proliferation of connected devices and the evolution of communication technology. In this survey, we review the methods that have been applied to network data with the purpose of developing an intrusion detector but, contrary to previous reviews in the area, we analyze them from the perspective of the Knowledge Discovery in Databases (KDD) process. As such, we discuss the techniques used for the capture, preparation and transformation of the data, as well as the data mining and evaluation methods. In addition, we present the characteristics and motivations behind the use of each of these techniques and propose more adequate and up-to-date taxonomies and definitions for intrusion detectors based on the terminology used in the area of data mining and KDD. Special importance is given to the evaluation procedures followed to assess the different detectors, discussing their applicability in current real networks. Finally, as a result of this literature review, we investigate some open issues that will need to be considered for further research in the area of network security.

    A distributed approach to network anomaly detection based on independent component analysis

    Network anomalies, circumstances in which the network behavior deviates from its normal operational baseline, can be due to various factors such as network overload conditions, malicious/hostile activities, denial of service attacks, and network intrusions. New detection schemes based on machine learning principles are therefore desirable, as they can learn the nature of normal traffic behavior and autonomously adapt to variations in the structure of 'normality', as well as recognize significant deviations as suspicious or anomalous events. The main advantages of these techniques are that, in principle, they are not restricted to any specific environment and that they can provide a way of detecting unknown attacks. Detection performance is directly correlated with the quality of the traffic model, that is, its ability to represent the traffic behavior through its most characteristic internal dynamics. Starting from these ideas, we developed a two-stage anomaly detection strategy based on multiple distributed sensors located throughout the network. By using Independent Component Analysis, the first step, modeled as a Blind Source Separation problem, extracts the fundamental traffic components (the 'source' signals), corresponding to the independent traffic dynamics, from the multidimensional time series coming from the sensors, which correspond to the perceived 'mixed/aggregate' effect of traffic on their interfaces. These components are then used to build the baseline traffic profiles needed in the second, supervised phase, which is based on a binary classification scheme (detection is cast as an anomalous/normal classification problem) driven by machine learning-inferred decision trees. Copyright © 2013 John Wiley & Sons, Ltd
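
    The two stages described in this abstract can be written compactly as below. This is only a sketch in standard Blind Source Separation notation: the symbols x, s, A, W, f and the dimensions m and n are assumptions introduced here for illustration and are not taken from the paper. The sensors observe x(t), an unknown mixture of independent traffic sources s(t); ICA estimates an unmixing matrix W that recovers the sources only up to permutation and scaling, and a decision-tree classifier f then labels features derived from the recovered components.

```latex
% Notation assumed for illustration; not taken from the paper.
\begin{align}
  \mathbf{x}(t) &= A\,\mathbf{s}(t),
    & \mathbf{x}(t) &\in \mathbb{R}^{m},\ \mathbf{s}(t) \in \mathbb{R}^{n}
    && \text{(sensor readings as mixed sources)} \\
  \hat{\mathbf{s}}(t) &= W\,\mathbf{x}(t),
    & W &\approx A^{+}
    && \text{(stage 1: ICA unmixing)} \\
  \hat{y}(t) &= f\big(\hat{\mathbf{s}}(t)\big),
    & \hat{y}(t) &\in \{\text{normal},\ \text{anomalous}\}
    && \text{(stage 2: decision tree)}
\end{align}
```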

    Anomalous behaviour detection using heterogeneous data

    Anomaly detection is one of the most important methods for processing data and finding abnormal records, as it can distinguish between normal and abnormal behaviour. Anomaly detection has been applied in many areas such as the medical sector, fraud detection in finance, fault detection in machines, intrusion detection in networks, surveillance systems for security, as well as forensic investigations. Abnormal behaviour can provide information or answer questions when an investigator is performing an investigation. Anomaly detection is one way to simplify big data by focusing on data that have been grouped or clustered by the anomaly detection method. Forensic data usually consist of heterogeneous data which come in several forms or types, such as qualitative or quantitative, structured or unstructured, and primary or secondary. For example, when a crime takes place, the evidence can be in the form of various types of data. The combination of all the data types can produce rich insights. Nowadays, data has become 'big' because it is generated every second of every day, and processing it has become time-consuming and tedious. Therefore, in this study, a new method to detect abnormal behaviour is proposed that uses heterogeneous data and combines them with a data fusion technique. VAST Challenge data and image data are used to demonstrate the approach on heterogeneous data. The first contribution of this study is applying heterogeneous data to detect anomalies. The recently introduced anomaly detection technique known as Empirical Data Analytics (EDA) is applied to detect abnormal behaviour in the data sets. Standardised eccentricity (a measure newly introduced within EDA that offers a simplified form of the well-known Chebyshev inequality) can be applied to any data distribution. The second contribution is applying image data. The image data are processed using a pre-trained deep learning network, and classification is done using a support vector machine (SVM). The last contribution is combining the anomaly results from the heterogeneous data and the image recognition using a new data fusion technique. There are five types of data with three different modalities and different dimensionalities, so the data cannot simply be combined and integrated. Therefore, the new data fusion technique first analyses the abnormality in each data type separately, determines a degree of suspicion between 0 and 1 for each, and then sums the degrees of suspicion. This method is not intended to be a fully automatic system that resolves investigations, which would likely be unacceptable in any case. The aim is rather to simplify the role of the humans so that they can focus on a small number of cases to be examined in more detail. The proposed approach simplifies the processing of such huge amounts of data and can assist human experts in their investigations and in making final decisions.
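
    The fusion step described in this abstract (score each data type separately, map the result to a degree of suspicion between 0 and 1, then sum) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation. The standardised eccentricity is written as 1 + ||x - mean||^2 / variance, which is how it is commonly given in the EDA literature, with values above n^2 + 1 treated as n-sigma outliers. The abstract does not say how a score is mapped to [0, 1], so the clipped linear mapping in `suspicion_degrees` is a hypothetical choice, as are all function names and the toy data.

```python
"""Sketch: per-modality anomaly scoring followed by additive fusion."""


def standardised_eccentricity(samples, x):
    """Return 1 + ||x - mean||^2 / variance for x relative to samples.

    `samples` is a list of equal-length numeric vectors. The variance is the
    mean squared Euclidean distance of the samples to their mean.
    """
    dim = len(x)
    count = len(samples)
    mean = [sum(s[d] for s in samples) / count for d in range(dim)]
    variance = sum(
        sum((s[d] - mean[d]) ** 2 for d in range(dim)) for s in samples
    ) / count
    if variance == 0:
        raise ValueError("all samples are identical; eccentricity is undefined")
    distance_sq = sum((x[d] - mean[d]) ** 2 for d in range(dim))
    return 1.0 + distance_sq / variance


def suspicion_degrees(features, n_sigma=3.0):
    """Map each entity's eccentricity in one modality to a degree in [0, 1].

    ASSUMPTION: the linear mapping below (0 at the mean, 1 at or beyond the
    n-sigma threshold n_sigma**2 + 1) is hypothetical, not from the source.
    """
    samples = list(features.values())
    degrees = {}
    for name, vector in features.items():
        eccentricity = standardised_eccentricity(samples, vector)
        degrees[name] = min(1.0, (eccentricity - 1.0) / n_sigma ** 2)
    return degrees


def fuse(*per_modality_degrees):
    """Sum the degrees of suspicion of each entity across modalities."""
    total = {}
    for degrees in per_modality_degrees:
        for name, degree in degrees.items():
            total[name] = total.get(name, 0.0) + degree
    return total


if __name__ == "__main__":
    # Toy data: two modalities with different dimensionalities.
    access_counts = {"ann": [1.0], "bob": [1.0], "cat": [1.0], "eve": [9.0]}
    image_scores = {
        "ann": [0.1, 0.2], "bob": [0.2, 0.1],
        "cat": [0.1, 0.1], "eve": [0.9, 0.8],
    }
    fused = fuse(suspicion_degrees(access_counts), suspicion_degrees(image_scores))
    # Rank entities so a human expert can start with the most suspicious ones.
    for name in sorted(fused, key=fused.get, reverse=True):
        print(name, round(fused[name], 3))
```

    Because each modality is reduced to a score in [0, 1] before fusion, modalities with different dimensionalities never have to be concatenated, which matches the motivation given in the abstract. The ranked output is meant to focus a human expert on a few cases, not to make the final decision.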

    Theoretical and Applied Foundations for Intrusion Detection in Single and Federated Clouds

    Cloud Computing systems are becoming more and more complex, dynamic and heterogeneous. Such an environment frequently produces complex and noisy data that make Intrusion Detection Systems (IDSs) unable to detect unknown variants of known attacks. A single intrusion or an attack in such a heterogeneous system could take various forms that are logically but not synthetically similar. This, in turn, makes traditional IDSs unable to identify these attacks, since they are designed for specific and limited infrastructures. Therefore, the accuracy of detection in the cloud will be very negatively affected. In addition to the problem of the cloud computing environment, cyber attacks are getting more sophisticated and harder to detect. Thus, it is becoming increasingly difficult for a single cloud-based IDS to detect all attacks, because of limited and incomplete knowledge about attacks and their implications. The problem with existing cloud-based IDS solutions is that they overlook the dynamic and changing nature of the cloud. Moreover, they rely fundamentally on local knowledge and experience to classify attacks and normal patterns. This renders the cloud vulnerable to "Zero-Day" attacks. To this end, we address throughout this thesis two challenges associated with cloud-based IDSs: the detection of cyber attacks in complex, dynamic and heterogeneous environments; and the detection of cyber attacks under limited and/or incomplete information about intrusions and their implications.
In this thesis, we are interested in allowing cloud-based IDSs to be generic, in order to identify intrusions regardless of the infrastructure used. Therefore, whenever an intrusion has been identified, an IDS should be able to recognize all the different structures of such an attack, regardless of the infrastructure that is being used. Moreover, we are interested in allowing cloud-based IDSs to cooperate and share knowledge with each other, so that they benefit from each other's expertise to cover unknown attack patterns. The originality of this thesis lies in two aspects: 1) the design of a generic cloud-based IDS that allows detection in changing and heterogeneous environments and 2) the design of a multi-cloud cooperative IDS that ensures trustworthiness, fairness and sustainability. By trustworthiness, we mean that the cloud-based IDS should be able to ensure that it will consult, cooperate and share knowledge only with trusted parties (i.e., cloud-based IDSs). By fairness, we mean that the cloud-based IDS should be able to guarantee that mutual benefits will be achieved by minimising the chance of cooperating with selfish IDSs. This is useful to give IDSs the motivation to participate in the community.