236 research outputs found

    The time-varying dependency patterns of NetFlow statistics

    Get PDF
    We investigate where and how key dependency structure between measures of network activity change throughout the course of daily activity. Our approach to data-mining is probabilistic in nature, we formulate the identification of dependency patterns as a regularised statistical estimation problem. The resulting model can be interpreted as a set of time-varying graphs and provides a useful visual interpretation of network activity. We believe this is the first application of dynamic graphical modelling to network traffic of this kind. Investigations are performed on 9 days of real-world network traffic across a subset of IP's. We demonstrate that dependency between features may change across time and discuss how these change at an intra and inter-day level. Such variation in feature dependency may have important consequences for the design and implementation of probabilistic intrusion detection systems

    The Time-Varying Dependency Patterns of NetFlow Statistics

    Get PDF
    We investigate where and how key dependency structure between measures of network activity change throughout the course of daily activity. Our approach to data-mining is probabilistic in nature, we formulate the identification of dependency patterns as a regularised statistical estimation problem. The resulting model can be interpreted as a set of time-varying graphs and provides a useful visual interpretation of network activity. We believe this is the first application of dynamic graphical modelling to network traffic of this kind. Investigations are performed on 9 days of real-world network traffic across a subset of IP's. We demonstrate that dependency between features may change across time and discuss how these change at an intra and inter-day level. Such variation in feature dependency may have important consequences for the design and implementation of probabilistic intrusion detection systems

    A Cyber Threat Intelligence Sharing Scheme based on Federated Learning for Network Intrusion Detection

    Full text link
    The uses of Machine Learning (ML) in detection of network attacks have been effective when designed and evaluated in a single organisation. However, it has been very challenging to design an ML-based detection system by utilising heterogeneous network data samples originating from several sources. This is mainly due to privacy concerns and the lack of a universal format of datasets. In this paper, we propose a collaborative federated learning scheme to address these issues. The proposed framework allows multiple organisations to join forces in the design, training, and evaluation of a robust ML-based network intrusion detection system. The threat intelligence scheme utilises two critical aspects for its application; the availability of network data traffic in a common format to allow for the extraction of meaningful patterns across data sources. Secondly, the adoption of a federated learning mechanism to avoid the necessity of sharing sensitive users' information between organisations. As a result, each organisation benefits from other organisations cyber threat intelligence while maintaining the privacy of its data internally. The model is trained locally and only the updated weights are shared with the remaining participants in the federated averaging process. The framework has been designed and evaluated in this paper by using two key datasets in a NetFlow format known as NF-UNSW-NB15-v2 and NF-BoT-IoT-v2. Two other common scenarios are considered in the evaluation process; a centralised training method where the local data samples are shared with other organisations and a localised training method where no threat intelligence is shared. The results demonstrate the efficiency and effectiveness of the proposed framework by designing a universal ML model effectively classifying benign and intrusive traffic originating from multiple organisations without the need for local data exchange

    APIC: A method for automated pattern identification and classification

    Get PDF
    Machine Learning (ML) is a transformative technology at the forefront of many modern research endeavours. The technology is generating a tremendous amount of attention from researchers and practitioners, providing new approaches to solving complex classification and regression tasks. While concepts such as Deep Learning have existed for many years, the computational power for realising the utility of these algorithms in real-world applications has only recently become available. This dissertation investigated the efficacy of a novel, general method for deploying ML in a variety of complex tasks, where best feature selection, data-set labelling, model definition and training processes were determined automatically. Models were developed in an iterative fashion, evaluated using both training and validation data sets. The proposed method was evaluated using three distinct case studies, describing complex classification tasks often requiring significant input from human experts. The results achieved demonstrate that the proposed method compares with, and often outperforms, less general, comparable methods designed specifically for each task. Feature selection, data-set annotation, model design and training processes were optimised by the method, where less complex, comparatively accurate classifiers with lower dependency on computational power and human expert intervention were produced. In chapter 4, the proposed method demonstrated improved efficacy over comparable systems, automatically identifying and classifying complex application protocols traversing IP networks. In chapter 5, the proposed method was able to discriminate between normal and anomalous traffic, maintaining accuracy in excess of 99%, while reducing false alarms to a mere 0.08%. Finally, in chapter 6, the proposed method discovered more optimal classifiers than those implemented by comparable methods, with classification scores rivalling those achieved by state-of-the-art systems. The findings of this research concluded that developing a fully automated, general method, exhibiting efficacy in a wide variety of complex classification tasks with minimal expert intervention, was possible. The method and various artefacts produced in each case study of this dissertation are thus significant contributions to the field of ML

    Characterising Dependency in Computer Networks using Spectral Coherence

    Get PDF
    The quantification of normal and anomalous traffic flows across computer networks is a topic of pervasive interest in network se- curity, and requires the timely application of time-series methods. The transmission or reception of packets passing between computers can be represented in terms of time-stamped events and the resulting activity understood in terms of point-processes. Interestingly, in the disparate do- main of neuroscience, models for describing dependent point-processes are well developed. In particular, spectral methods which decompose second-order dependency across different frequencies allow for a rich characterisation of point-processes. In this paper, we investigate using the spectral coherence statistic to characterise computer network activ- ity, and determine if, and how, device messaging may be dependent. We demonstrate on real data, that for many devices there appears to be very little dependency between device messaging channels. However, when sig- nificant coherence is detected it appears highly structured, a result which suggests coherence may prove useful for discriminating between types of activity at the network level

    BotGM: Unsupervised Graph Mining to Detect Botnets in Traffic Flows

    Get PDF
    International audienceBotnets are one of the most dangerous and serious cybersecurity threats since they are a major vector of large-scale attack campaigns such as phishing, distributed denial-of-service (DDoS) attacks, trojans, spams, etc. A large body of research has been accomplished on botnet detection, but recent security incidents show that there are still several challenges remaining to be addressed, such as the ability to develop detectors which can cope with new types of botnets. In this paper, we propose BotGM, a new approach to detect botnet activities based on behavioral analysis of network traffic flow. BotGM identifies network traffic behavior using graph-based mining techniques to detect botnets behaviors and model the dependencies among flows to trace-back the root causes then. We applied BotGM on a publicly available large dataset of Botnet network flows, where it detects various botnet behaviors with a high accuracy without any prior knowledge of them

    Symmetry degree measurement and its applications to anomaly detection

    Get PDF
    IEEE Anomaly detection is an important technique used to identify patterns of unusual network behavior and keep the network under control. Today, network attacks are increasing in terms of both their number and sophistication. To avoid causing significant traffic patterns and being detected by existing techniques, many new attacks tend to involve gradual adjustment of behaviors, which always generate incomplete sessions due to their running mechanisms. Accordingly, in this work, we employ the behavior symmetry degree to profile the anomalies and further identify unusual behaviors. We first proposed a symmetry degree to identify the incomplete sessions generated by unusual behaviors; we then employ a sketch to calculate the symmetry degree of internal hosts to improve the identification efficiency for online applications. To reduce the memory cost and probability of collision, we divide the IP addresses into four segments that can be used as keys of the hash functions in the sketch. Moreover, to further improve detection accuracy, a threshold selection method is proposed for dynamic traffic pattern analysis. The hash functions in the sketch are then designed using Chinese remainder theory, which can analytically trace the IP addresses associated with the anomalies. We tested the proposed techniques based on traffic data collected from the northwest center of CERNET (China Education and Research Network); the results show that the proposed methods can effectively detect anomalies in large-scale networks
    • …
    corecore