9 research outputs found

    Waterfall Traffic Classification: A Quick Approach to Optimizing Cascade Classifiers

    Heterogeneous wireless communication networks, like 4G LTE, transport diverse kinds of IP traffic: voice, video, Internet data, and more. To manage such networks effectively, administrators need adequate tools, of which traffic classification is the basis for visualizing, shaping, and filtering the broad streams of IP packets observed nowadays. In this paper, we describe a modular, cascading traffic classification system, the Waterfall architecture, and we extensively describe a novel technique for optimizing it in terms of CPU time, number of errors, and percentage of unrecognized flows. We show how to significantly accelerate the process of exhaustive search for the best performing cascade. We employ five datasets of real Internet transmissions and seven traffic analysis methods to demonstrate that our proposal yields valid results and outperforms a greedy optimizer.
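
To make the idea of a cascade and its three optimization criteria concrete, the following is a minimal sketch, not the paper's method. It assumes two toy modules (`by_port`, `by_payload`), made-up per-flow costs, and made-up weights, all hypothetical. It scores every ordering of every non-empty subset of modules by plain exhaustive search; the acceleration the abstract mentions is not reproduced here.

```python
from itertools import permutations

# Hypothetical toy modules: each returns a label, or None for "unrecognized".
def by_port(flow):
    return {53: "dns", 80: "http"}.get(flow["dst_port"])

def by_payload(flow):
    return "http" if flow.get("payload", b"").startswith(b"GET ") else None

# Module name -> (function, assumed CPU cost per flow). Costs are invented.
MODULES = {"port": (by_port, 1.0), "payload": (by_payload, 5.0)}

# Invented labeled flows used only to exercise the sketch.
EXAMPLE_FLOWS = [
    {"dst_port": 53, "payload": b"", "truth": "dns"},
    {"dst_port": 80, "payload": b"GET / HTTP/1.1", "truth": "http"},
    {"dst_port": 8080, "payload": b"GET /x", "truth": "http"},
]

def classify(flow, order):
    """Run modules in the given order; the first label returned wins."""
    cost = 0.0
    for name in order:
        func, unit_cost = MODULES[name]
        cost += unit_cost
        label = func(flow)
        if label is not None:
            return label, cost
    return None, cost

def evaluate(flows, order):
    """Return (total cost, number of errors, number of unrecognized flows)."""
    total_cost, errors, unknown = 0.0, 0, 0
    for flow in flows:
        label, cost = classify(flow, order)
        total_cost += cost
        if label is None:
            unknown += 1
        elif label != flow["truth"]:
            errors += 1
    return total_cost, errors, unknown

def best_cascade(flows, weights=(1.0, 10.0, 10.0)):
    """Naive exhaustive search over all orderings of all non-empty subsets."""
    best = None
    names = list(MODULES)
    for size in range(1, len(names) + 1):
        for order in permutations(names, size):
            cost, errors, unknown = evaluate(flows, order)
            score = weights[0] * cost + weights[1] * errors + weights[2] * unknown
            if best is None or score < best[0]:
                best = (score, order)
    return best
```

Calling `best_cascade(EXAMPLE_FLOWS)` returns the lowest-scoring cascade as a `(score, order)` pair. The number of candidate cascades grows factorially with the number of modules, which is why speeding up the search matters.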

    Clustering and Analysis of Internet Traffic Using Fuzzy C-Means with Data Feature Extraction (KLASTERISASI DAN ANALISIS TRAFIK INTERNET MENGGUNAKAN FUZZY C MEAN DENGAN EKSTRAKSI FITUR DATA)

    Internet access is an important part of campus infrastructure and of teaching and learning activities. A key element of that access is bandwidth, which certain departments often find insufficient at particular times, especially during active lecture hours. Addressing this requires analyzing and clustering the internet traffic at each point where bandwidth is distributed, so that the results can support decisions about how much bandwidth to allocate to each point. One clustering algorithm suited to this task is Fuzzy C-Means. Before clustering, the internet bandwidth usage data for one period is collected and used as input to the Fuzzy C-Means algorithm, which groups bandwidth usage by the applications that use the internet and by network users. However, the initial dataset is not optimal for Fuzzy C-Means, so it is first optimized through feature extraction so that the resulting clusters are accurate. The expected result of this study is the set of extracted features that is most appropriate for clustering and analyzing internet traffic by user application and by the amount of capacity each user consumes; the clustering results can then be used to optimize internet bandwidth.
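
For readers unfamiliar with Fuzzy C-Means, the following is a minimal sketch of the standard algorithm in plain Python, not the study's implementation. The function name, parameters, and any input features (for example, average throughput per user) are assumptions made for illustration. Each point receives a membership degree in every cluster rather than a single hard label.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means. Returns (centers, U), where U[i][j] is the
    membership of point i in cluster j; m > 1 is the fuzzifier."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [U[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum
                 for d in range(dim)]
            )

        # Membership update: closer centres receive higher membership.
        shift = 0.0
        for i in range(n):
            dists = [max(math.dist(points[i], centers[j]), 1e-12)
                     for j in range(c)]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row

        # Stop once memberships have stabilised.
        if shift < tol:
            break
    return centers, U
```

A hard assignment, when one is needed, can be read off as the index of the largest membership in each row of `U`. In practice the input features would usually be scaled first, since the algorithm relies on Euclidean distance.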

    Independent comparison of popular DPI tools for traffic classification

    Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, either commercial or open-source, rely on some sort of DPI for traffic classification. However, the actual performance of DPI is still unclear to the research community, since the lack of public datasets prevents the comparison and reproducibility of results. This paper presents a comprehensive comparison of 6 well-known DPI tools, which are commonly used in the traffic classification literature. Our study includes 2 commercial products (PACE and NBAR) and 4 open-source tools (OpenDPI, L7-filter, nDPI, and Libprotoident). We studied their performance in various scenarios (including packet and flow truncation) and at different classification levels (application protocol, application and web service). We carefully built a labeled dataset with more than 750 K flows, which contains traffic from popular applications. We used the Volunteer-Based System (VBS), developed at Aalborg University, to guarantee the correct labeling of the dataset. We released this dataset, including full packet payloads, to the research community. We believe this dataset could become a common benchmark for the comparison and validation of network traffic classifiers. Our results present PACE, a commercial tool, as the most accurate solution. Surprisingly, we find that some open-source tools, such as nDPI and Libprotoident, also achieve very high accuracy.

    Towards the Deployment of Machine Learning Solutions in Network Traffic Classification: A Systematic Survey

    Traffic analysis is a set of strategies intended to find relationships, patterns, anomalies, and misconfigurations, among other things, in Internet traffic. In particular, traffic classification is a subgroup of strategies in this field that aims at identifying the application's name or the type of Internet traffic. Nowadays, traffic classification has become a challenging task due to the rise of new technologies, such as traffic encryption and encapsulation, which decrease the performance of classical traffic classification strategies. Machine Learning is gaining interest as a new direction in this field, showing signs of future success, such as knowledge extraction from encrypted traffic and more accurate Quality of Service management. Machine Learning is fast becoming a key tool for building traffic classification solutions in real network traffic scenarios; in this sense, the purpose of this investigation is to explore the elements that allow this technique to work in the traffic classification field. Therefore, a systematic review is introduced based on the steps to achieve traffic classification by using Machine Learning techniques. The main aim is to understand and identify the procedures followed by existing works to achieve their goals. As a result, this survey paper finds a set of trends derived from the analysis performed on this domain; in this manner, the authors expect to outline future directions for Machine Learning based traffic classification.

    Network traffic classification: from theory to practice

    Since its inception, the Internet has been in constant transformation. The analysis and monitoring of data networks try to shed some light on this huge black box of interconnected computers. In particular, the classification of network traffic has become crucial for understanding the Internet. In recent years, the research community has proposed many solutions to accurately identify and classify network traffic. However, the continuous evolution of Internet applications and their techniques to avoid detection make their identification a very challenging task, which is far from being completely solved. This thesis addresses the network traffic classification problem from a more practical point of view, filling the gap between the real-world requirements of the network industry and the research carried out. The first block of this thesis aims to facilitate the deployment of existing techniques in production networks. To achieve this goal, we study the viability of using NetFlow, a monitoring protocol already implemented in most routers, as input to our classification technique. Since the application of packet sampling has become almost mandatory in large networks, we also study its impact on the classification and propose a method to improve the accuracy in this scenario. Our results show that it is possible to achieve high accuracy with both sampled and unsampled NetFlow data, despite the limited information provided by NetFlow. Once the classification solution is deployed, it is important to maintain its accuracy over time. Current network traffic classification techniques have to be regularly updated to adapt them to traffic changes. The second block of this thesis focuses on this issue with the goal of automatically maintaining the classification solution without human intervention.
Using the knowledge of the first block, we propose a classification solution that combines several techniques using only Sampled NetFlow as input for the classification. Then, we show that classification models suffer from temporal and spatial obsolescence and, therefore, we design an autonomic retraining system that is able to automatically update the models and keep the classifier accurate over time. Going one step further, we next introduce the use of stream-based Machine Learning techniques for network traffic classification. In particular, we propose a classification solution based on Hoeffding Adaptive Trees. Apart from the features of stream-based techniques (i.e., processing one instance at a time and inspecting it only once, with a predefined amount of memory and a bounded amount of time), our technique is able to automatically adapt to changes in the traffic by using only NetFlow data as input for the classification. The third block of this thesis aims to be a first step towards the impartial validation of state-of-the-art classification techniques. The wide range of techniques, datasets, and ground-truth generators makes the comparison of different traffic classifiers a very difficult task. To achieve this goal, we evaluate the reliability of different Deep Packet Inspection (DPI)-based techniques commonly used in the literature for ground-truth generation. The results we obtain show that some well-known DPI techniques present several limitations that make them unsuitable as ground-truth generators in their current state. In addition, we publish some of the datasets used in our evaluations to address the lack of publicly available datasets and make the comparison and validation of existing techniques easier.

    Intelligent Circuits and Systems

    ICICS-2020 is the third conference initiated by the School of Electronics and Electrical Engineering at Lovely Professional University. It explored recent innovations by researchers working on the development of smart and green technologies in the fields of Energy, Electronics, Communications, Computers, and Control. ICICS gives innovators a platform to identify new opportunities for the social and economic benefit of society. The conference bridges the gap between academics and R&D institutions, social visionaries, and experts from all strata of society, allowing them to present their ongoing research activities and foster research relations. It provides opportunities for exchanging new ideas, applications, and experiences in the field of smart technologies and for finding global partners for future collaboration. ICICS-2020 was conducted in two broad categories: Intelligent Circuits & Intelligent Systems, and Emerging Technologies in Electrical Engineering.