2,367 research outputs found

    Machine Learning DDoS Detection for Consumer Internet of Things Devices

    Full text link
    An increasing number of Internet of Things (IoT) devices are connecting to the Internet, yet many of these devices are fundamentally insecure, exposing the Internet to a variety of attacks. Botnets such as Mirai have used insecure consumer IoT devices to conduct distributed denial of service (DDoS) attacks on critical Internet infrastructure. This motivates the development of new techniques to automatically detect consumer IoT attack traffic. In this paper, we demonstrate that using IoT-specific network behaviors (e.g. limited number of endpoints and regular time intervals between packets) to inform feature selection can result in high accuracy DDoS detection in IoT network traffic with a variety of machine learning algorithms, including neural networks. These results indicate that home gateway routers or other network middleboxes could automatically detect local IoT device sources of DDoS attacks using low-cost machine learning algorithms and traffic data that is flow-based and protocol-agnostic.Comment: 7 pages, 3 figures, 3 tables, appears in the 2018 Workshop on Deep Learning and Security (DLS '18

    Reviewing Traffic ClassificationData Traffic Monitoring and Analysis

    Get PDF
    Traffic classification has received increasing attention in the last years. It aims at offering the ability to automatically recognize the application that has generated a given stream of packets from the direct and passive observation of the individual packets, or stream of packets, flowing in the network. This ability is instrumental to a number of activities that are of extreme interest to carriers, Internet service providers and network administrators in general. Indeed, traffic classification is the basic block that is required to enable any traffic management operations, from differentiating traffic pricing and treatment (e.g., policing, shaping, etc.), to security operations (e.g., firewalling, filtering, anomaly detection, etc.). Up to few years ago, almost any Internet application was using well-known transport layer protocol ports that easily allowed its identification. More recently, the number of applications using random or non-standard ports has dramatically increased (e.g. Skype, BitTorrent, VPNs, etc.). Moreover, often network applications are configured to use well-known protocol ports assigned to other applications (e.g. TCP port 80 originally reserved for Web traffic) attempting to disguise their presence. For these reasons, and for the importance of correctly classifying traffic flows, novel approaches based respectively on packet inspection, statistical and machine learning techniques, and behavioral methods have been investigated and are becoming standard practice. In this chapter, we discuss the main trend in the field of traffic classification and we describe some of the main proposals of the research community. We complete this chapter by developing two examples of behavioral classifiers: both use supervised machine learning algorithms for classifications, but each is based on different features to describe the traffic. After presenting them, we compare their performance using a large dataset, showing the benefits and drawback of each approac

    Numerical Analysis for Relevant Features in Intrusion Detection (NARFid)

    Get PDF
    Identification of cyber attacks and network services is a robust field of study in the machine learning community. Less effort has been focused on understanding the domain space of real network data in identifying important features for cyber attack and network service classification. Motivations for such work allow for anomaly detection systems with less requirements on data “sniffed” off the network, extraction of features from the traffic, reduced learning time of algorithms, and ideally increased classification performance of anomalous behavior. This thesis evaluates the usefulness of a good feature subset for the general classification task of identifying cyber attacks and network services. The generality of the selected features elucidates the relevance or irrelevance of the feature set for the classification task of intrusion detection. Additionally, the thesis provides an extension to the Bhattacharyya method, which selects features by means of inter-class separability (Bhattacharyya coefficient). The extension for multiple class problems selects a minimal set of features with the best separability across all class pairs. Several feature selection algorithms (e.g., accuracy rate with genetic algorithm, RELIEF-F, GRLVQI, median Bhattacharyya and minimum surface Bhattacharyya methods) create feature subsets that describe the decision boundary for intrusion detection problems. The selected feature subsets maintain or improve the classification performance for at least three out of the four anomaly detectors (i.e., classifiers) under test. The feature subsets, which illustrate generality for the intrusion detection problem, range in size from 12 to 27 features. The original feature set consists of 248 features. Of the feature subsets demonstrating generality, the extension to the Bhattacharyya method generates the second smallest feature subset. This thesis quantitatively demonstrates that a relatively small feature set may be used for intrusion detection with machine learning classifiers

    Denial of service attack detection through machine learning for the IoT

    Get PDF
    Sustained Internet of Things (IoT) deployment and functioning are heavily reliant on the use of effective data communication protocols. In the IoT landscape, the publish/subscribe-based Message Queuing Telemetry Transport (MQTT) protocol is popular. Cyber security threats against the MQTT protocol are anticipated to increase at par with its increasing use by IoT manufacturers. In particular, IoT is vulnerable to protocol-based Application layer Denial of Service (DoS) attacks, which have been known to cause widespread service disruption in legacy systems. In this paper, we propose an Application layer DoS attack detection framework for the MQTT protocol and test the scheme on legitimate and protocol compliant DoS attack scenarios. To protect the MQTT message brokers from such attacks, we propose a machine learning-based detection framework developed for the MQTT protocol. Through experiments, we demonstrate the impact of such attacks on various MQTT brokers and evaluate the effectiveness of the proposed framework to detect these malicious attacks. The results obtained indicate that the attackers can overwhelm the server resources even when legitimate access was denied to MQTT brokers and resources have been restricted. In addition, the MQTT features we have identified showed high attack detection accuracy. The field size and length-based features drastically reduced the false-positive rates and are suitable in detecting IoT based attacks

    Can Passive Mobile Application Traffic be Identified using Machine Learning Techniques

    Get PDF
    Mobile phone applications (apps) can generate background traffic when the end-user is not actively using the app. If this background traffic could be accurately identified, network operators could de-prioritise this traffic and free up network bandwidth for priority network traffic. The background app traffic should have IP packet features that could be utilised by a machine learning algorithm to identify app-generated (passive) traffic as opposed to user-generated (active) traffic. Previous research in the area of IP traffic classification focused on classifying high level network traffic types originating on a PC device. This research was concerned with classifying low level app traffic originating on mobile phone device. An innovative experiment setup was designed in order to answer the research question. A mobile phone running Android OS was configured to capture app network data. Three specific data trace procedures where then designed to comprehensively capture sample active and passive app traffic data. Feature generation in previous research recommend computing new features based on IP packet data. This research proposes a different approach. Feature generation was enabled by exposing inherent IP packet attributes as opposed to computing new features. Specific evaluation metrics were also designed in order to quantify the accuracy of the machine learning models at classifying active and passive app traffic. Three decision tree models were implemented; C5.0, C&R tree and CHAID tree. Each model was built using a standard implementation and with boosting. The findings indicate that passive app network traffic can be classified with an accuracy up to 84.8% using a CHAID decision tree algorithm with model boosting enabled. The finding also suggested that features derived from the inherent IP packet attributes, such as time frame delta and bytes in flight, had significant predictive value

    Cross Dataset Evaluation for IoT Network Intrusion Detection

    Get PDF
    With the advent of Internet of Things (IOT) technology, the need to ensure the security of an IOT network has become important. There are several intrusion detection systems (IDS) that are available for analyzing and predicting network anomalies and threats. However, it is challenging to evaluate them to realistically estimate their performance when deployed. A lot of research has been conducted where the training and testing is done using the same simulated dataset. However, realistically, a network on which an intrusion detection model is deployed will be very different from the network on which it was trained. The aim of this research is to perform a cross-dataset evaluation using different machine learning models for IDS. This helps ensure that a model that performs well when evaluated on one dataset will also perform well when deployed. Two publicly available simulation datasets., IOTID20 and Bot-IoT datasets created to capture IOT networks for different attacks such as DoS and Scanning were used for training and testing. Machine learning models applied to these datasets were evaluated within each dataset followed by cross -dataset evaluation. A significant difference was observed between the results obtained using the two datasets. Supervised machine learning models were built and evaluated for binary classification to classify between normal and anomaly attack instances as well as for multiclass classification to also categorize the type of attack on the IoT network

    Deep fused flow and topology features for botnet detection basing on pretrained GCN

    Full text link
    Nowadays, botnets have become one of the major threats to cyber security. The characteristics of botnets are mainly reflected in bots network behavior and their intercommunication relationships. Existing botnet detection methods use flow features or topology features individually, which overlook the other type of feature. This affects model performance. In this paper, we propose a botnet detection model which uses graph convolutional network (GCN) to deeply fuse flow features and topology features for the first time. We construct communication graphs from network traffic and represent nodes with flow features. Due to the imbalance of existing public traffic flow datasets, it is impossible to train a GCN model on these datasets. Therefore, we use a balanced public communication graph dataset to pretrain a GCN model, thereby guaranteeing its capacity for identify topology features. We then feed the communication graph with flow features into the pretrained GCN. The output from the last hidden layer is treated as the fusion of flow and topology features. Additionally, by adjusting the number of layers in the GCN network, the model can effectively detect botnets under both C2 and P2P structures. Validated on the public ISCX2014 dataset, our approach achieves a remarkable recall rate 92.90% and F1-score 92.76% for C2 botnets, alongside recall rate 94.66% and F1-score of 92.35% for P2P botnets. These results not only demonstrate the effectiveness of our method, but also outperform the performance of the currently leading detection models

    Models versus Datasets: Reducing Bias through Building a Comprehensive IDS Benchmark

    Get PDF
    Today, deep learning approaches are widely used to build Intrusion Detection Systems for securing IoT environments. However, the models’ hidden and complex nature raises various concerns, such as trusting the model output and understanding why the model made certain decisions. Researchers generally publish their proposed model’s settings and performance results based on a specific dataset and a classification model but do not report the proposed model’s output and findings. Similarly, many researchers suggest an IDS solution by focusing only on a single benchmark dataset and classifier. Such solutions are prone to generating inaccurate and biased results. This paper overcomes these limitations in previous work by analyzing various benchmark datasets and various individual and hybrid deep learning classifiers towards finding the best IDS solution for IoT that is efficient, lightweight, and comprehensive in detecting network anomalies. We also showed the model’s localized predictions and analyzed the top contributing features impacting the global performance of deep learning models. This paper aims to extract the aggregate knowledge from various datasets and classifiers and analyze the commonalities to avoid any possible bias in results and increase the trust and transparency of deep learning models. We believe this paper’s findings will help future researchers build a comprehensive IDS based on well-performing classifiers and utilize the aggregated knowledge and the minimum set of significantly contributing features

    Detection of encrypted traffic generated by peer-to-peer live streaming applications using deep packet inspection

    Get PDF
    The number of applications using the peer-to-peer (P2P) networking paradigm and their popularity has substantially grown over the last decade. They evolved from the le-sharing applications to media streaming ones. Nowadays these applications commonly encrypt the communication contents or employ protocol obfuscation techniques. In this dissertation, it was conducted an investigation to identify encrypted traf c ows generated by three of the most popular P2P live streaming applications: TVUPlayer, Livestation and GoalBit. For this work, a test-bed that could simulate a near real scenario was created, and traf c was captured from a great variety of applications. The method proposed resort to Deep Packet Inspection (DPI), so we needed to analyse the payload of the packets in order to nd repeated patterns, that later were used to create a set of SNORT rules that can be used to detect key network packets generated by these applications. The method was evaluated experimentally on the test-bed created for that purpose, being shown that its accuracy is of 97% for GoalBit.A popularidade e o número de aplicações que usam o paradigma de redes par-a-par (P2P) têm crescido substancialmente na última década. Estas aplicações deixaram de serem usadas simplesmente para partilha de ficheiros e são agora usadas também para distribuir conteúdo multimédia. Hoje em dia, estas aplicações têm meios de cifrar o conteúdo da comunicação ou empregar técnicas de ofuscação directamente no protocolo. Nesta dissertação, foi realizada uma investigação para identificar fluxos de tráfego encriptados, que foram gerados por três aplicações populares de distribuição de conteúdo multimédia em redes P2P: TVUPlayer, Livestation e GoalBit. Para este trabalho, foi criada uma plataforma de testes que pretendia simular um cenário quase real, e o tráfego que foi capturado, continha uma grande variedade de aplicações. O método proposto nesta dissertação recorre à técnica de Inspecção Profunda de Pacotes (DPI), e por isso, foi necessário 21nalisar o conteúdo dos pacotes a fim de encontrar padrões que se repetissem, e que iriam mais tarde ser usados para criar um conjunto de regras SNORT para detecção de pacotes chave· na rede, gerados por estas aplicações, afim de se poder correctamente classificar os fluxos de tráfego. Após descobrir que a aplicação Livestation deixou de funcionar com P2P, apenas as duas regras criadas até esse momento foram usadas. Quanto à aplicação TVUPlayer, foram criadas várias regras a partir do tráfego gerado por ela mesma e que tiveram uma boa taxa de precisão. Várias regras foram também criadas para a aplicação GoalBit em que foram usados quatro cenários: com e sem encriptação usando a opção de transmissão tracker, e com e sem encriptação usando a opção de transmissão sem necessidade de tracker (aqui foi usado o protocolo Kademlia). O método foi avaliado experimentalmente na plataforma de testes criada para o efeito, sendo demonstrado que a precisão do conjunto de regras para a aplicação GoallBit é de 97%.Fundação para a Ciência e a Tecnologia (FCT
    corecore