15,456 research outputs found

    Machine learning approach for detection of nonTor traffic

    Get PDF
    Intrusion detection has attracted a considerable interest from researchers and industry. After many years of research the community still faces the problem of building reliable and efficient intrusion detection systems (IDS) capable of handling large quantities of data with changing patterns in real time situations. The Tor network is popular in providing privacy and security to end user by anonymizing the identity of internet users connecting through a series of tunnels and nodes. This work identifies two problems; classification of Tor traffic and nonTor traffic to expose the activities within Tor traffic that minimizes the protection of users in using the UNB-CIC Tor Network Traffic dataset and classification of the Tor traffic flow in the network. This paper proposes a hybrid classifier; Artificial Neural Network in conjunction with Correlation feature selection algorithm for dimensionality reduction and improved classification performance. The reliability and efficiency of the propose hybrid classifier is compared with Support Vector Machine and naĂŻve Bayes classifiers in detecting nonTor traffic in UNB-CIC Tor Network Traffic dataset. Experimental results show the hybrid classifier, ANN-CFS proved a better classifier in detecting nonTor traffic and classifying the Tor traffic flow in UNB-CIC Tor Network Traffic dataset

    Anomaly detection using adaptive resonance theory

    Full text link
    Thesis (M.S.)--Boston UniversityThis thesis focuses on the problem of anomaly detection in computer networks. Anomalies are often malicious intrusion attempts that represent a serious threat to network security. Adaptive Resonance Theory (ART) is used as a classification scheme for identifying malicious network traffic. ART was originally developed as a theory to explain how the human eye categorizes visual patterns. For network intrusion detection, the core ART algorithm is implemented as a clustering algorithm that groups network traffic into clusters. A machine learning process allows the number of clusters to change over time to best conform to the data. Network traffic is characterized by network flows, which represent a packet, or series of packets, between two distinct nodes on a network. These flows can contain a number of attributes, including IP addresses, ports, size, and duration. These attributes form a multi-dimensional vector that is used in the clustering process. Once data is clustered along the defined dimensions, anomalies are identified as data points that do not match known good or nominal network traffic. The ART clustering algorithm is tested on a realistic network environment that was generated using the network flow simulation tool FS. The clustering results for this simulation show very promising detection rates for the ART clustering algorithm

    Dendritic Cell Algorithm with Optimised Parameters using Genetic Algorithm

    Get PDF
    Intrusion detection systems are developed with the abilities to discriminate between normal and anomalous traffic behaviours. The core challenge in implementing an intrusion detection systems is to determine and stop anomalous traffic behavior precisely before it causes any adverse effects to the network, information systems, or any other hardware and digital assets which forming or in the cyberspace. Inspired by the biological immune system, Dendritic Cell Algorithm (DCA) is a classification algorithm developed for the purpose of anomaly detection based on the danger theory and the functioning of human immune dendritic cells. In its core operation, DCA uses a weighted sum function to derive the output cumulative values from the input signals. The weights used in this function are either derived empirically from the data or defined by users. Due to this, the algorithm opens the doors for users to specify the weights that may not produce optimal result (often accuracy). This paper proposes a weight optimisation approach implemented using the popular stochastic search tool, genetic algorithm. The approach is validated and evaluated using the KDD99 dataset with promising results generated

    An efficient intrusion detection model based on hybridization of artificial bee colony and dragonfly algorithms for training multilayer perceptrons

    Get PDF
    One of the most persistent challenges concerning network security is to build a model capable of detecting intrusions in network systems. The issue has been extensively addressed in uncountable researches and using various techniques, of which a commonly used technique is that based on detecting intrusions in contrast to normal network traffic and the classification of network packets as either normal or abnormal. However, the problem of improving the accuracy and efficiency of classification models remains open and yet to be resolved. This study proposes a new binary classification model for intrusion detection, based on hybridization of Artificial Bee Colony algorithm (ABC) and Dragonfly algorithm (DA) for training an artificial neural network (ANN) in order to increase the classification accuracy rate for malicious and non-malicious traffic in networks. At first the model selects the suitable biases and weights utilizing a hybrid (ABC) and (DA). Next, the neural network is retrained using these ideal values in order for the intrusion detection model to be able to recognize new attacks. Ten other metaheuristic algorithms were adapted to train the neural network and their performances were compared with that of the proposed model. In addition, four types of intrusion detection evaluation datasets were applied to evaluate the proposed model in comparison to the others. The results of our experiments have demonstrated a significant improvement in inefficient network intrusion detection over other classification methods

    In-depth comparative evaluation of supervised machine learning approaches for detection of cybersecurity threats

    Get PDF
    This paper describes the process and results of analyzing CICIDS2017, a modern, labeled data set for testing intrusion detection systems. The data set is divided into several days, each pertaining to different attack classes (Dos, DDoS, infiltration, botnet, etc.). A pipeline has been created that includes nine supervised learning algorithms. The goal was binary classification of benign versus attack traffic. Cross-validated parameter optimization, using a voting mechanism that includes five classification metrics, was employed to select optimal parameters. These results were interpreted to discover whether certain parameter choices were dominant for most (or all) of the attack classes. Ultimately, every algorithm was retested with optimal parameters to obtain the final classification scores. During the review of these results, execution time, both on consumerand corporate-grade equipment, was taken into account as an additional requirement. The work detailed in this paper establishes a novel supervised machine learning performance baseline for CICIDS2017

    Building an Intrusion Detection System Using a Filter-Based Feature Selection Algorithm

    Get PDF
    Redundant and irrelevant features in data have caused a long-term problem in network traffic classification. These features not only slow down the process of classification but also prevent a classifier from making accurate decisions, especially when coping with big data. In this paper, we propose a mutual information based algorithm that analytically selects the optimal feature for classification. This mutual information based feature selection algorithm can handle linearly and nonlinearly dependent data features. Its effectiveness is evaluated in the cases of network intrusion detection. An Intrusion Detection System (IDS), named Least Square Support Vector Machine based IDS (LSSVM-IDS), is built using the features selected by our proposed feature selection algorithm. The performance of LSSVM-IDS is evaluated using three intrusion detection evaluation datasets, namely KDD Cup 99, NSL-KDD and Kyoto 2006+ dataset. The evaluation results show that our feature selection algorithm contributes more critical features for LSSVM-IDS to achieve better accuracy and lower computational cost compared with the state-of-the-art methods

    Unsupervised feature selection for anomaly-based network intrusion detection using cluster validity indices.

    Get PDF
    Master of Science in Computer Engineering. University of KwaZulu-Natal, Durban 2016.In recent years, there has been a rapid increase in Internet usage, which has in turn led to a rise in malicious network activity. Network Intrusion Detection Systems (NIDS) are tools that monitor network traffic with the purpose of rapidly and accurately detecting malicious activity. These systems provide a time window for responding to emerging threats and attacks aimed at exploiting vulnerabilities that arise from issues such as misconfigured firewalls and outdated software. Anomaly-based network intrusion detection systems construct a profile of legitimate or normal traffic patterns using machine learning techniques, and monitor network traffic for deviations from the profile, which are subsequently classified as threats or intrusions. Due to the richness of information contained in network traffic, it is possible to define large feature vectors from network packets. This often leads to redundant or irrelevant features being used in network intrusion detection systems, which typically reduces the detection performance of the system. The purpose of feature selection is to remove unnecessary or redundant features in a feature space, thereby improving the performance of learning algorithms and as a result the classification accuracy. Previous approaches have performed feature selection via optimization techniques, using the classification accuracy of the NIDS on a subset of the data as an objective function. While this approach has been shown to improve the performance of the system, it is unrealistic to assume that labelled training data is available in operational networks, which precludes the use of classification accuracy as an objective function in a practical system. This research proposes a method for feature selection in network intrusion detection that does not require any access to labelled data. The algorithm uses normalized cluster validity indices as an objective function that is optimized over the search space of candidate feature subsets via a genetic algorithm. Feature subsets produced by the algorithm are shown to improve the classification performance of an anomaly{based network intrusion detection system over the NSL-KDD dataset. Despite not requiring access to labelled data, the classification performance of the proposed system approaches that of efective feature subsets that were derived using labelled training data

    Radio frequency traffic classification over WLAN

    Get PDF
    Network traffic classification is the process of analyzing traffic flows and associating them to different categories of network applications. Network traffic classification represents an essential task in the whole chain of network security. Some of the most important and widely spread applications of traffic classification are the ability to classify encrypted traffic, the identification of malicious traffic flows, and the enforcement of security policies on the use of different applications. Passively monitoring a network utilizing low-cost and low-complexity wireless local area network (WLAN) devices is desirable. Mobile devices can be used or existing office desktops can be temporarily utilized when their computational load is low. This reduces the burden on existing network hardware. The aim of this paper is to investigate traffic classification techniques for wireless communications. To aid with intrusion detection, the key goal is to passively monitor and classify different traffic types over WLAN to ensure that network security policies are adhered to. The classification of encrypted WLAN data poses some unique challenges not normally encountered in wired traffic. WLAN traffic is analyzed for features that are then used as an input to six different machine learning (ML) algorithms for traffic classification. One of these algorithms (a Gaussian mixture model incorporating a universal background model) has not been applied to wired or wireless network classification before. The authors also propose a ML algorithm that makes use of the well-known vector quantization algorithm in conjunction with a decision tree—referred to as a TRee Adaptive Parallel Vector Quantiser. This algorithm has a number of advantages over the other ML algorithms tested and is suited to wireless traffic classification. An average F-score (harmonic mean of precision and recall) > 0.84 was achieved when training and testing on the same day across six distinct traffic types

    Numerical Analysis for Relevant Features in Intrusion Detection (NARFid)

    Get PDF
    Identification of cyber attacks and network services is a robust field of study in the machine learning community. Less effort has been focused on understanding the domain space of real network data in identifying important features for cyber attack and network service classification. Motivations for such work allow for anomaly detection systems with less requirements on data “sniffed” off the network, extraction of features from the traffic, reduced learning time of algorithms, and ideally increased classification performance of anomalous behavior. This thesis evaluates the usefulness of a good feature subset for the general classification task of identifying cyber attacks and network services. The generality of the selected features elucidates the relevance or irrelevance of the feature set for the classification task of intrusion detection. Additionally, the thesis provides an extension to the Bhattacharyya method, which selects features by means of inter-class separability (Bhattacharyya coefficient). The extension for multiple class problems selects a minimal set of features with the best separability across all class pairs. Several feature selection algorithms (e.g., accuracy rate with genetic algorithm, RELIEF-F, GRLVQI, median Bhattacharyya and minimum surface Bhattacharyya methods) create feature subsets that describe the decision boundary for intrusion detection problems. The selected feature subsets maintain or improve the classification performance for at least three out of the four anomaly detectors (i.e., classifiers) under test. The feature subsets, which illustrate generality for the intrusion detection problem, range in size from 12 to 27 features. The original feature set consists of 248 features. Of the feature subsets demonstrating generality, the extension to the Bhattacharyya method generates the second smallest feature subset. This thesis quantitatively demonstrates that a relatively small feature set may be used for intrusion detection with machine learning classifiers
    • …
    corecore