APIC: A method for automated pattern identification and classification
Machine Learning (ML) is a transformative technology at the forefront of many modern research endeavours. The technology is generating a tremendous amount of attention from researchers and practitioners, providing new approaches to solving complex classification and regression tasks. While concepts such as Deep Learning have existed for many years, the computational power for realising the utility of these algorithms in real-world applications has only recently become available. This dissertation investigated the efficacy of a novel, general method for deploying ML in a variety of complex tasks, in which the best feature selection, data-set labelling, model definition and training processes were determined automatically. Models were developed iteratively and evaluated using both training and validation data sets. The proposed method was evaluated using three distinct case studies, each describing a complex classification task that often requires significant input from human experts. The results demonstrate that the proposed method compares with, and often outperforms, less general, comparable methods designed specifically for each task. Feature selection, data-set annotation, model design and training processes were optimised by the method, producing less complex, comparatively accurate classifiers with lower dependency on computational power and human expert intervention. In chapter 4, the proposed method demonstrated improved efficacy over comparable systems, automatically identifying and classifying complex application protocols traversing IP networks. In chapter 5, the proposed method was able to discriminate between normal and anomalous traffic, maintaining accuracy in excess of 99% while reducing false alarms to a mere 0.08%. Finally, in chapter 6, the proposed method discovered better classifiers than those implemented by comparable methods, with classification scores rivalling those achieved by state-of-the-art systems.
This research concluded that it was possible to develop a fully automated, general method that exhibits efficacy in a wide variety of complex classification tasks with minimal expert intervention. The method and various artefacts produced in each case study of this dissertation are thus significant contributions to the field of ML.
Hierarchical TCP network traffic classification with adaptive optimisation
With the increasing deployment of modern packet-switching networks,
traffic classification plays an important role in network administration.
Identifying the kinds of traffic crossing a network can improve network
management in various ways, such as traffic shaping, differentiated services and
enhanced security. By applying different policies to different kinds of traffic, Quality
of Service (QoS) can be achieved with a granularity as fine as flow level.
Because illegal traffic can be identified and filtered, advanced traffic
classification can also enhance network security.
There are various traditional techniques for traffic classification. However,
some of them cannot handle traffic generated by applications using non-registered
or forged ports, some cannot deal with encrypted traffic, and some
require too many computational resources. A more recent technique proposed
by other researchers, which uses statistical methods, offers an alternative
approach. It requires fewer resources, does not rely on ports and can deal with encrypted
traffic. Nevertheless, the performance of classification using statistical
methods can be further improved.
In this thesis, we aim to optimise network traffic classification based
on the statistical approach. Because of the popularity of TCP, and
the difficulties that TCP traffic control introduces for classification, our work
focuses on classifying TCP-based network traffic. An architecture is
proposed for improving classification performance in terms of accuracy
and response time. Experiments were conducted and their results evaluated
to demonstrate the improved performance of the proposed optimised classifier.
In our work, network packets are reassembled into TCP flows. Then, the
statistical characteristics of the flows are extracted. Finally, the classes of input flows
are determined by comparing them with profiled samples. Instead of using
only one algorithm to classify all traffic flows, our proposed system employs
a series of binary classifiers, which use optimised algorithms to detect different
traffic classes separately. A decision-making mechanism deals with
conflicting results from the binary classifiers. Machine learning algorithms
including k-nearest neighbour, decision trees and artificial neural networks have
been considered, together with a non-parametric statistical
test, the Kolmogorov-Smirnov test. Besides algorithms, some parameters are
also optimised locally, such as detection windows and acceptance thresholds. This
hierarchical architecture gives the traffic classifier more flexibility, higher accuracy
and a shorter response time.
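To make the idea of per-class binary detectors concrete, the following is a minimal sketch and not the thesis's implementation. It assumes each traffic class is profiled by a sample of one flow feature (packet sizes here), that each detector accepts a flow when the two-sample Kolmogorov-Smirnov statistic is at or below a per-class acceptance threshold, and that conflicts are resolved by picking the smallest statistic. The names `ks_statistic`, `classify_flow`, the profiles and the thresholds are all hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs,
    evaluated at every observed value (where the step functions jump).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def classify_flow(feature_values, profiles, thresholds):
    """Run one binary detector per class, then resolve conflicts.

    A detector accepts the flow when its KS statistic against the class
    profile does not exceed that class's threshold (assumed rule).
    If several detectors accept, the closest profile wins (assumed rule).
    Returns None when no detector accepts the flow.
    """
    accepted = {}
    for name, profile in profiles.items():
        distance = ks_statistic(feature_values, profile)
        if distance <= thresholds[name]:
            accepted[name] = distance
    if not accepted:
        return None
    return min(accepted, key=accepted.get)


# Hypothetical packet-size profiles (bytes) and acceptance thresholds.
profiles = {"web": [60, 1400, 1500, 1500], "voip": [160, 160, 160, 172]}
thresholds = {"web": 0.5, "voip": 0.5}
label = classify_flow([160, 172, 160, 160], profiles, thresholds)
```

Keeping the threshold per class is what allows each detector to be tuned locally, in the spirit of the locally optimised acceptance thresholds mentioned in the abstract.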
Recognition of traffic generated by WebRTC communication
Network traffic recognition is a basic prerequisite for network operators to differentiate and prioritize traffic for a number of purposes, from guaranteeing Quality of Service (QoS) to monitoring safety and detecting anomalies. Web Real-Time Communication (WebRTC) is an open-source project that enables real-time audio, video, and text communication among browsers. Since WebRTC does not include any characteristic pattern for semantically based traffic recognition, this paper proposes models for recognizing traffic generated during WebRTC audio and video communication based on statistical characteristics and machine learning in the Weka tool. Five classification algorithms were used for model development: Naive Bayes, J48, Random Forest, REP tree, and BayesNet. The results show that J48 and BayesNet perform best in this experimental case of WebRTC traffic recognition. Future work will focus on comparing a wider range of machine learning algorithms using a dataset large enough to improve the significance of the results.
Network intrusion detection system for DDoS attacks in ICS using deep autoencoders
Anomaly detection in industrial control and cyber-physical systems has gained much attention over the past years due to the increasing modernisation and exposure of industrial environments. Current dangers to the connected industry include the theft of industrial intellectual property, denial of service, or the compromise of cloud components; all of which might result in a cyber-attack across the operational network. However, most scientific work employs device logs, which necessitate substantial understanding and preprocessing before they can be used in anomaly detection. In this paper, we propose a network intrusion detection system (NIDS) architecture based on a deep autoencoder trained on network flow data, which has the advantage of not requiring prior knowledge of the network topology or its underlying architecture. Experimental results show that the proposed model can detect anomalies, caused by distributed denial of service attacks, providing a high detection rate and low false alarms, outperforming the state-of-the-art and a baseline model in an unsupervised learning environment. Furthermore, the deep autoencoder model can detect abnormal behaviour in legitimate devices after an attack. We also demonstrate the suitability of the proposed NIDS in a real industrial plant from the alimentary sector, analysing the false positive rate and the viability of the data generation, filtering and preprocessing procedure for a near real time scenario. The suggested NIDS architecture is a low-cost solution that uses only fifteen network-based features, requires minimal processing, operates in unsupervised mode, and is straightforward to deploy in real-world scenarios.
Axencia Galega de Innovación | Ref. IN854A 2019/15; Centro para el Desarrollo Tecnológico Industrial | Ref. CER-20191012; Agencia Estatal de Investigación | Ref. MTM2017-89422-P; funded for open-access publication: Universidade de Vigo/CISU
Anomaly Detection in BACnet/IP managed Building Automation Systems
Building Automation Systems (BAS) are a collection of devices and software which manage the operation of building services. The BAS market is expected to be a $19.25 billion USD industry by 2023, as a core feature of both the Internet of Things and Smart City technologies. However, securing these systems from cyber security threats is an emerging research area. Since initial deployment, BAS have evolved from isolated standalone networks to heterogeneous, interconnected networks allowing external connectivity through the Internet. The most prominent BAS protocol is BACnet/IP, which is estimated to hold 54.6% of world market share. BACnet/IP security features are often not implemented in BAS deployments, leaving systems unprotected against known network threats. This research investigated methods of detecting anomalous network traffic in BACnet/IP managed BAS in an effort to combat threats posed to these systems.
This research explored the threats facing BACnet/IP devices, through analysis of Internet accessible BACnet devices, vendor-defined device specifications, investigation of the BACnet specification, and known network attacks identified in the surrounding literature. The collected data were used to construct a threat matrix, which was applied to models of BACnet devices to evaluate potential exposure. Further, two potential unknown vulnerabilities were identified and explored using state modelling and device simulation.
A simulation environment and attack framework were constructed to generate both normal and malicious network traffic to explore the application of machine learning algorithms to identify both known and unknown network anomalies. To identify network patterns between the generated normal and malicious network traffic, unsupervised clustering, graph analysis with an unsupervised community detection algorithm, and time series analysis were used. The explored methods identified distinguishable network patterns for frequency-based known network attacks when compared to normal network traffic. However, as stand-alone methods for anomaly detection, these methods were found insufficient. Subsequently, Artificial Neural Networks and Hidden Markov Models were explored and found capable of detecting known network attacks. Further, Hidden Markov Models were also capable of detecting unknown network attacks in the generated datasets.
The classification accuracy of the Hidden Markov Models was evaluated using the Matthews Correlation Coefficient, which accounts for imbalanced class sizes and assesses both positive and negative classification ability. The Hidden Markov Models were found capable of repeatedly detecting both known and unknown BACnet/IP attacks with True Positive Rates greater than 0.99 and Matthews Correlation Coefficients greater than 0.8 for five of six evaluated hosts.
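As an illustration of why the Matthews Correlation Coefficient suits imbalanced data, the sketch below computes it from the four confusion-matrix counts using the standard formula. The function name and the example counts are illustrative assumptions and are not taken from the thesis; returning 0 when the denominator is zero is a common convention.

```python
from math import sqrt


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when any marginal total is zero (conventional choice).
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical imbalanced case: 990 normal samples, 10 attacks, and a
# detector that labels everything as normal.
tp, fp, tn, fn = 0, 0, 990, 10
accuracy = (tp + tn) / (tp + fp + tn + fn)   # high despite missing every attack
mcc = matthews_corrcoef(tp, fp, tn, fn)      # no better than chance
```

In this made-up case accuracy is 0.99 while the coefficient is 0, which is the kind of gap that makes a balance-aware metric worth reporting alongside the True Positive Rate.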
This research identified and evaluated a range of methods capable of identifying anomalies in simulated BACnet/IP network traffic. Further, this research found that Hidden Markov Models were accurate at classifying both known and unknown attacks in the evaluated BACnet/IP managed BAS network.
Security related self-protected networks: Autonomous threat detection and response (ATDR)
Magister Scientiae - MSc
Cybersecurity defense tools, techniques and methodologies are constantly faced with increasing
challenges, including the evolution of highly intelligent and powerful new-generation threats. The
main challenge posed by these modern digital multi-vector attacks is their ability to adapt with
machine learning. Research shows that many existing defense systems fail to provide adequate
protection against these latest threats. Hence, there is an ever-growing need for self-learning technologies
that can autonomously adjust according to the behaviour and patterns of the offensive
actors and systems. The accuracy and effectiveness of existing methods are dependent on decision
making and manual input by human experts. This dependence causes 1) administration
overhead, 2) variable and potentially limited accuracy and 3) delayed response time.
Security related self-protected networks: autonomous threat detection and response (ATDR)
Doctor Educationis
Cybersecurity defense tools, techniques and methodologies are constantly faced with increasing
challenges, including the evolution of highly intelligent and powerful new-generation threats. The
main challenge posed by these modern digital multi-vector attacks is their ability to adapt with
machine learning. Research shows that many existing defense systems fail to provide adequate
protection against these latest threats. Hence, there is an ever-growing need for self-learning technologies
that can autonomously adjust according to the behaviour and patterns of the offensive
actors and systems. The accuracy and effectiveness of existing methods are dependent on decision
making and manual input by human experts. This dependence causes 1) administration overhead,
2) variable and potentially limited accuracy and 3) delayed response time.
In this thesis, Autonomous Threat Detection and Response (ATDR) is proposed as a general method
aimed at contributing toward security related self-protected networks. Through a combination
of unsupervised machine learning and deep learning, ATDR is designed as an intelligent and
autonomous decision-making system that uses big data processing requirements and data frame
pattern identification layers to learn sequences of patterns and derive real-time data formations.
This system enhances threat detection and response capabilities, accuracy and speed. Research
provided a solid foundation for the proposed method around the scope of existing methods and
the unanimous problem statements and findings by other authors.
Numerical Analysis for Relevant Features in Intrusion Detection (NARFid)
Identification of cyber attacks and network services is a robust field of study in the machine learning community. Less effort has been focused on understanding the domain space of real network data in identifying important features for cyber attack and network service classification. Such work is motivated by the prospect of anomaly detection systems that place fewer requirements on data “sniffed” off the network and on feature extraction from the traffic, reduce the learning time of algorithms, and ideally improve classification performance on anomalous behavior. This thesis evaluates the usefulness of a good feature subset for the general classification task of identifying cyber attacks and network services. The generality of the selected features elucidates the relevance or irrelevance of the feature set for the classification task of intrusion detection. Additionally, the thesis provides an extension to the Bhattacharyya method, which selects features by means of inter-class separability (Bhattacharyya coefficient). The extension for multiple class problems selects a minimal set of features with the best separability across all class pairs. Several feature selection algorithms (e.g., accuracy rate with genetic algorithm, RELIEF-F, GRLVQI, median Bhattacharyya and minimum surface Bhattacharyya methods) create feature subsets that describe the decision boundary for intrusion detection problems. The selected feature subsets maintain or improve the classification performance for at least three out of the four anomaly detectors (i.e., classifiers) under test. The feature subsets, which illustrate generality for the intrusion detection problem, range in size from 12 to 27 features. The original feature set consists of 248 features. Of the feature subsets demonstrating generality, the extension to the Bhattacharyya method generates the second smallest feature subset.
This thesis quantitatively demonstrates that a relatively small feature set may be used for intrusion detection with machine learning classifiers.
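For readers unfamiliar with separability-based feature selection, here is a minimal sketch of the standard two-class Bhattacharyya coefficient on histogram estimates. It is not the thesis's multi-class extension. The binning scheme, the function names (`histogram`, `bhattacharyya_coefficient`, `rank_features`) and the toy feature values are assumptions made for illustration.

```python
from math import sqrt


def histogram(values, edges):
    """Normalised histogram over half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def bhattacharyya_coefficient(p, q):
    """Overlap between two discrete distributions: sum of sqrt(p_i * q_i).

    1.0 means identical distributions, 0.0 means no overlap at all.
    """
    return sum(sqrt(pi * qi) for pi, qi in zip(p, q))


def rank_features(class_a, class_b, edges):
    """Order feature names from most to least separable (lowest overlap first)."""
    overlap = {
        name: bhattacharyya_coefficient(
            histogram(class_a[name], edges), histogram(class_b[name], edges)
        )
        for name in class_a
    }
    return sorted(overlap, key=overlap.get)


# Toy data: "duration" separates the two classes, "ttl" does not.
normal = {"duration": [1, 2, 1, 2], "ttl": [5, 6, 5, 6]}
attack = {"duration": [8, 9, 8, 9], "ttl": [5, 6, 5, 6]}
ranking = rank_features(normal, attack, edges=[0, 4, 7, 10])
```

A selection procedure in this style would keep the features at the front of the ranking; how many to keep, and how to combine scores across more than two classes, are the questions the thesis's extension addresses.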