828 research outputs found

    APIC: A method for automated pattern identification and classification

    Get PDF
    Machine Learning (ML) is a transformative technology at the forefront of many modern research endeavours. The technology is generating a tremendous amount of attention from researchers and practitioners, providing new approaches to solving complex classification and regression tasks. While concepts such as Deep Learning have existed for many years, the computational power for realising the utility of these algorithms in real-world applications has only recently become available. This dissertation investigated the efficacy of a novel, general method for deploying ML in a variety of complex tasks, where best feature selection, data-set labelling, model definition and training processes were determined automatically. Models were developed in an iterative fashion, evaluated using both training and validation data sets. The proposed method was evaluated using three distinct case studies, describing complex classification tasks often requiring significant input from human experts. The results achieved demonstrate that the proposed method compares with, and often outperforms, less general, comparable methods designed specifically for each task. Feature selection, data-set annotation, model design and training processes were optimised by the method, where less complex, comparatively accurate classifiers with lower dependency on computational power and human expert intervention were produced. In chapter 4, the proposed method demonstrated improved efficacy over comparable systems, automatically identifying and classifying complex application protocols traversing IP networks. In chapter 5, the proposed method was able to discriminate between normal and anomalous traffic, maintaining accuracy in excess of 99%, while reducing false alarms to a mere 0.08%. Finally, in chapter 6, the proposed method discovered more optimal classifiers than those implemented by comparable methods, with classification scores rivalling those achieved by state-of-the-art systems. The findings of this research concluded that developing a fully automated, general method, exhibiting efficacy in a wide variety of complex classification tasks with minimal expert intervention, was possible. The method and various artefacts produced in each case study of this dissertation are thus significant contributions to the field of ML

    Hierarchical TCP network traffic classification with adaptive optimisation

    Get PDF
    Nowadays, with the increasing deployment of modern packet-switching networks, traffic classification is playing an important role in network administration. To identify what kinds of traffic transmitting across networks can improve network management in various ways, such as traffic shaping, differential services, enhanced security, etc. By applying different policies to different kinds of traffic, Quality of Service (QoS) can be achieved and the granularity can be as fine as flow-level. Since illegal traffic can be identified and filtered, network security can be enhanced by employing advanced traffic classification. There are various traditional techniques for traffic classification. However, some of them cannot handle traffic generated by applications using non-registered ports or forged ports, some of them cannot deal with encrypted traffic and some techniques require too much computational resources. The newly proposed technique by other researchers, which uses statistical methods, gives an alternative approach. It requires less resources, does not rely on ports and can deal with encrypted traffic. Nevertheless, the performance of the classification using statistical methods can be further improved. In this thesis, we are aiming for optimising network traffic classification based on the statistical approach. Because of the popularity of the TCP protocol, and the difficulties for classification introduced by TCP traffic controls, our work is focusing on classifying network traffic based on TCP protocol. An architecture has been proposed for improving the classification performance, in terms of accuracy and response time. Experiments have been taken and results have been evaluated for proving the improved performance of the proposed optimised classifier. In our work, network packets are reassembled into TCP flows. Then, the statistical characteristics of flows are extracted. Finally the classes of input flows can be determined by comparing them with the profiled samples. Instead of using only one algorithm for classifying all traffic flows, our proposed system employs a series of binary classifiers, which use optimised algorithms to detect different traffic classes separately. There is a decision making mechanism for dealing with controversial results from the binary classifiers. Machining learning algorithms including k-nearest neighbour, decision trees and artificial neural networks have been taken into consideration together with a kind of non-parametric statistical algorithm — Kolmogorov-Smirnov test. Besides algorithms, some parameters are also optimised locally, such as detection windows, acceptance thresholds. This hierarchical architecture gives traffic classifier more flexibility, higher accuracy and less response time

    Recognition of traffic generated by WebRTC communication

    Get PDF
    Network traffic recognition serves as a basic condition for network operators to differentiate and prioritize traffic for a number of purposes, from guaranteeing the Quality of Service (QoS), to monitoring safety, as well as monitoring and detecting anomalies. Web Real-Time Communication (WebRTC) is an open-source project that enables real-time audio, video, and text communication among browsers. Since WebRTC does not include any characteristic pattern for semantically based traffic recognition, this paper proposes models for recognizing traffic generated during WebRTC audio and video communication based on statistical characteristics and usage of machine learning in Weka tool. Five classification algorithms have been used for model development, such as Naive Bayes, J48, Random Forest, REP tree, and Bayes Net. The results show that J48 and BayesNet have the best performances in this experimental case of WebRTC traffic recognition. Future work will be focused on comparison of a wide range of machine learning algorithms using a large enough dataset to improve the significance of the results

    Network intrusion detection system for DDoS attacks in ICS using deep autoencoders

    Get PDF
    Anomaly detection in industrial control and cyber-physical systems has gained much attention over the past years due to the increasing modernisation and exposure of industrial environments. Current dangers to the connected industry include the theft of industrial intellectual property, denial of service, or the compromise of cloud components; all of which might result in a cyber-attack across the operational network. However, most scientific work employs device logs, which necessitate substantial understanding and preprocessing before they can be used in anomaly detection. In this paper, we propose a network intrusion detection system (NIDS) architecture based on a deep autoencoder trained on network flow data, which has the advantage of not requiring prior knowledge of the network topology or its underlying architecture. Experimental results show that the proposed model can detect anomalies, caused by distributed denial of service attacks, providing a high detection rate and low false alarms, outperforming the state-of-the-art and a baseline model in an unsupervised learning environment. Furthermore, the deep autoencoder model can detect abnormal behaviour in legitimate devices after an attack. We also demonstrate the suitability of the proposed NIDS in a real industrial plant from the alimentary sector, analysing the false positive rate and the viability of the data generation, filtering and preprocessing procedure for a near real time scenario. The suggested NIDS architecture is a low-cost solution that uses only fifteen network-based features, requires minimal processing, operates in unsupervised mode, and is straightforward to deploy in real-world scenarios.Axencia Galega de Innovación | Ref. IN854A 2019/15Centro para el Desarrollo Tecnológico Industrial | Ref. CER-20191012Agencia Estatal de Investigación | Ref. MTM2017-89422-PFinanciado para publicación en acceso aberto: Universidade de Vigo/CISU

    Anomaly Detection in BACnet/IP managed Building Automation Systems

    Get PDF
    Building Automation Systems (BAS) are a collection of devices and software which manage the operation of building services. The BAS market is expected to be a $19.25 billion USD industry by 2023, as a core feature of both the Internet of Things and Smart City technologies. However, securing these systems from cyber security threats is an emerging research area. Since initial deployment, BAS have evolved from isolated standalone networks to heterogeneous, interconnected networks allowing external connectivity through the Internet. The most prominent BAS protocol is BACnet/IP, which is estimated to hold 54.6% of world market share. BACnet/IP security features are often not implemented in BAS deployments, leaving systems unprotected against known network threats. This research investigated methods of detecting anomalous network traffic in BACnet/IP managed BAS in an effort to combat threats posed to these systems. This research explored the threats facing BACnet/IP devices, through analysis of Internet accessible BACnet devices, vendor-defined device specifications, investigation of the BACnet specification, and known network attacks identified in the surrounding literature. The collected data were used to construct a threat matrix, which was applied to models of BACnet devices to evaluate potential exposure. Further, two potential unknown vulnerabilities were identified and explored using state modelling and device simulation. A simulation environment and attack framework were constructed to generate both normal and malicious network traffic to explore the application of machine learning algorithms to identify both known and unknown network anomalies. To identify network patterns between the generated normal and malicious network traffic, unsupervised clustering, graph analysis with an unsupervised community detection algorithm, and time series analysis were used. The explored methods identified distinguishable network patterns for frequency-based known network attacks when compared to normal network traffic. However, as stand-alone methods for anomaly detection, these methods were found insufficient. Subsequently, Artificial Neural Networks and Hidden Markov Models were explored and found capable of detecting known network attacks. Further, Hidden Markov Models were also capable of detecting unknown network attacks in the generated datasets. The classification accuracy of the Hidden Markov Models was evaluated using the Matthews Correlation Coefficient which accounts for imbalanced class sizes and assess both positive and negative classification ability for deriving its metric. The Hidden Markov Models were found capable of repeatedly detecting both known and unknown BACnet/IP attacks with True Positive Rates greater than 0.99 and Matthews Correlation Coefficients greater than 0.8 for five of six evaluated hosts. This research identified and evaluated a range of methods capable of identifying anomalies in simulated BACnet/IP network traffic. Further, this research found that Hidden Markov Models were accurate at classifying both known and unknown attacks in the evaluated BACnet/IP managed BAS network

    Security related self-protected networks: Autonomous threat detection and response (ATDR)

    Get PDF
    >Magister Scientiae - MScCybersecurity defense tools, techniques and methodologies are constantly faced with increasing challenges including the evolution of highly intelligent and powerful new-generation threats. The main challenges posed by these modern digital multi-vector attacks is their ability to adapt with machine learning. Research shows that many existing defense systems fail to provide adequate protection against these latest threats. Hence, there is an ever-growing need for self-learning technologies that can autonomously adjust according to the behaviour and patterns of the offensive actors and systems. The accuracy and effectiveness of existing methods are dependent on decision making and manual input by human experts. This dependence causes 1) administration overhead, 2) variable and potentially limited accuracy and 3) delayed response time

    Security related self-protected networks: autonomous threat detection and response (ATDR)

    Get PDF
    Doctor EducationisCybersecurity defense tools, techniques and methodologies are constantly faced with increasing challenges including the evolution of highly intelligent and powerful new generation threats. The main challenges posed by these modern digital multi-vector attacks is their ability to adapt with machine learning. Research shows that many existing defense systems fail to provide adequate protection against these latest threats. Hence, there is an ever-growing need for self-learning technologies that can autonomously adjust according to the behaviour and patterns of the offensive actors and systems. The accuracy and effectiveness of existing methods are dependent on decision making and manual input by human expert. This dependence causes 1) administration overhead, 2) variable and potentially limited accuracy and 3) delayed response time. In this thesis, Autonomous Threat Detection and Response (ATDR) is a proposed general method aimed at contributing toward security related self-protected networks. Through a combination of unsupervised machine learning and Deep learning, ATDR is designed as an intelligent and autonomous decision-making system that uses big data processing requirements and data frame pattern identification layers to learn sequences of patterns and derive real-time data formations. This system enhances threat detection and response capabilities, accuracy and speed. Research provided a solid foundation for the proposed method around the scope of existing methods and the unanimous problem statements and findings by other authors

    Numerical Analysis for Relevant Features in Intrusion Detection (NARFid)

    Get PDF
    Identification of cyber attacks and network services is a robust field of study in the machine learning community. Less effort has been focused on understanding the domain space of real network data in identifying important features for cyber attack and network service classification. Motivations for such work allow for anomaly detection systems with less requirements on data “sniffed” off the network, extraction of features from the traffic, reduced learning time of algorithms, and ideally increased classification performance of anomalous behavior. This thesis evaluates the usefulness of a good feature subset for the general classification task of identifying cyber attacks and network services. The generality of the selected features elucidates the relevance or irrelevance of the feature set for the classification task of intrusion detection. Additionally, the thesis provides an extension to the Bhattacharyya method, which selects features by means of inter-class separability (Bhattacharyya coefficient). The extension for multiple class problems selects a minimal set of features with the best separability across all class pairs. Several feature selection algorithms (e.g., accuracy rate with genetic algorithm, RELIEF-F, GRLVQI, median Bhattacharyya and minimum surface Bhattacharyya methods) create feature subsets that describe the decision boundary for intrusion detection problems. The selected feature subsets maintain or improve the classification performance for at least three out of the four anomaly detectors (i.e., classifiers) under test. The feature subsets, which illustrate generality for the intrusion detection problem, range in size from 12 to 27 features. The original feature set consists of 248 features. Of the feature subsets demonstrating generality, the extension to the Bhattacharyya method generates the second smallest feature subset. This thesis quantitatively demonstrates that a relatively small feature set may be used for intrusion detection with machine learning classifiers
    corecore