12,245 research outputs found

    iTeleScope: Intelligent Video Telemetry and Classification in Real-Time using Software Defined Networking

    Full text link
    Video continues to dominate network traffic, yet operators today have poor visibility into the number, duration, and resolutions of the video streams traversing their domain. Current approaches are inaccurate, expensive, or unscalable, as they rely on statistical sampling, middle-box hardware, or packet inspection software. We present {\em iTelescope}, the first intelligent, inexpensive, and scalable SDN-based solution for identifying and classifying video flows in real-time. Our solution is novel in combining dynamic flow rules with telemetry and machine learning, and is built on commodity OpenFlow switches and open-source software. We develop a fully functional system, train it in the lab using multiple machine learning algorithms, and validate its performance to show over 95\% accuracy in identifying and classifying video streams from many providers including Youtube and Netflix. Lastly, we conduct tests to demonstrate its scalability to tens of thousands of concurrent streams, and deploy it live on a campus network serving several hundred real users. Our system gives unprecedented fine-grained real-time visibility of video streaming performance to operators of enterprise and carrier networks at very low cost.Comment: 12 pages, 16 figure

    An Experimental Evaluation of the Computational Cost of a DPI Traffic Classifier

    Get PDF
    A common belief in the scientific community is that traffic classifiers based on deep packet inspection (DPI) are far more expensive in terms of computational complexity compared to statistical classifiers. In this paper we counter this notion by defining accurate models for a deep packet inspection classifier and a statistical one based on support vector machines, and by evaluating their actual processing costs through experimental analysis. The results suggest that, contrary to the common belief, a DPI classifier and an SVM-based one can have comparable computational costs. Although much work is left to prove that our results apply in more general cases, this preliminary analysis is a first indication of how DPI classifiers might not be as computationally complex, compared to other approaches, as we previously though

    SENATUS: An Approach to Joint Traffic Anomaly Detection and Root Cause Analysis

    Full text link
    In this paper, we propose a novel approach, called SENATUS, for joint traffic anomaly detection and root-cause analysis. Inspired from the concept of a senate, the key idea of the proposed approach is divided into three stages: election, voting and decision. At the election stage, a small number of \nop{traffic flow sets (termed as senator flows)}senator flows are chosen\nop{, which are used} to represent approximately the total (usually huge) set of traffic flows. In the voting stage, anomaly detection is applied on the senator flows and the detected anomalies are correlated to identify the most possible anomalous time bins. Finally in the decision stage, a machine learning technique is applied to the senator flows of each anomalous time bin to find the root cause of the anomalies. We evaluate SENATUS using traffic traces collected from the Pan European network, GEANT, and compare against another approach which detects anomalies using lossless compression of traffic histograms. We show the effectiveness of SENATUS in diagnosing anomaly types: network scans and DoS/DDoS attacks

    APIC: A method for automated pattern identification and classification

    Get PDF
    Machine Learning (ML) is a transformative technology at the forefront of many modern research endeavours. The technology is generating a tremendous amount of attention from researchers and practitioners, providing new approaches to solving complex classification and regression tasks. While concepts such as Deep Learning have existed for many years, the computational power for realising the utility of these algorithms in real-world applications has only recently become available. This dissertation investigated the efficacy of a novel, general method for deploying ML in a variety of complex tasks, where best feature selection, data-set labelling, model definition and training processes were determined automatically. Models were developed in an iterative fashion, evaluated using both training and validation data sets. The proposed method was evaluated using three distinct case studies, describing complex classification tasks often requiring significant input from human experts. The results achieved demonstrate that the proposed method compares with, and often outperforms, less general, comparable methods designed specifically for each task. Feature selection, data-set annotation, model design and training processes were optimised by the method, where less complex, comparatively accurate classifiers with lower dependency on computational power and human expert intervention were produced. In chapter 4, the proposed method demonstrated improved efficacy over comparable systems, automatically identifying and classifying complex application protocols traversing IP networks. In chapter 5, the proposed method was able to discriminate between normal and anomalous traffic, maintaining accuracy in excess of 99%, while reducing false alarms to a mere 0.08%. Finally, in chapter 6, the proposed method discovered more optimal classifiers than those implemented by comparable methods, with classification scores rivalling those achieved by state-of-the-art systems. The findings of this research concluded that developing a fully automated, general method, exhibiting efficacy in a wide variety of complex classification tasks with minimal expert intervention, was possible. The method and various artefacts produced in each case study of this dissertation are thus significant contributions to the field of ML

    A Survey of Methods for Encrypted Traffic Classification and Analysis

    Get PDF
    With the widespread use of encrypted data transport network traffic encryption is becoming a standard nowadays. This presents a challenge for traffic measurement, especially for analysis and anomaly detection methods which are dependent on the type of network traffic. In this paper, we survey existing approaches for classification and analysis of encrypted traffic. First, we describe the most widespread encryption protocols used throughout the Internet. We show that the initiation of an encrypted connection and the protocol structure give away a lot of information for encrypted traffic classification and analysis. Then, we survey payload and feature-based classification methods for encrypted traffic and categorize them using an established taxonomy. The advantage of some of described classification methods is the ability to recognize the encrypted application protocol in addition to the encryption protocol. Finally, we make a comprehensive comparison of the surveyed feature-based classification methods and present their weaknesses and strengths.Šifrování síťového provozu se v dnešní době stalo standardem. To přináší vysoké nároky na monitorování síťového provozu, zejména pak na analýzu provozu a detekci anomálií, které jsou závislé na znalosti typu síťového provozu. V tomto článku přinášíme přehled existujících způsobů klasifikace a analýzy šifrovaného provozu. Nejprve popisujeme nejrozšířenější šifrovací protokoly, a ukazujeme, jakým způsobem lze získat informace pro analýzu a klasifikaci šifrovaného provozu. Následně se zabýváme klasifikačními metodami založenými na obsahu paketů a vlastnostech síťového provozu. Tyto metody klasifikujeme pomocí zavedené taxonomie. Výhodou některých popsaných klasifikačních metod je schopnost rozeznat nejen šifrovací protokol, ale také šifrovaný aplikační protokol. Na závěr porovnáváme silné a slabé stránky všech popsaných klasifikačních metod

    A traffic classification method using machine learning algorithm

    Get PDF
    Applying concepts of attack investigation in IT industry, this idea has been developed to design a Traffic Classification Method using Data Mining techniques at the intersection of Machine Learning Algorithm, Which will classify the normal and malicious traffic. This classification will help to learn about the unknown attacks faced by IT industry. The notion of traffic classification is not a new concept; plenty of work has been done to classify the network traffic for heterogeneous application nowadays. Existing techniques such as (payload based, port based and statistical based) have their own pros and cons which will be discussed in this literature later, but classification using Machine Learning techniques is still an open field to explore and has provided very promising results up till now

    Multitask Learning for Network Traffic Classification

    Full text link
    Traffic classification has various applications in today's Internet, from resource allocation, billing and QoS purposes in ISPs to firewall and malware detection in clients. Classical machine learning algorithms and deep learning models have been widely used to solve the traffic classification task. However, training such models requires a large amount of labeled data. Labeling data is often the most difficult and time-consuming process in building a classifier. To solve this challenge, we reformulate the traffic classification into a multi-task learning framework where bandwidth requirement and duration of a flow are predicted along with the traffic class. The motivation of this approach is twofold: First, bandwidth requirement and duration are useful in many applications, including routing, resource allocation, and QoS provisioning. Second, these two values can be obtained from each flow easily without the need for human labeling or capturing flows in a controlled and isolated environment. We show that with a large amount of easily obtainable data samples for bandwidth and duration prediction tasks, and only a few data samples for the traffic classification task, one can achieve high accuracy. We conduct two experiment with ISCX and QUIC public datasets and show the efficacy of our approach
    corecore