2,476 research outputs found

    Towards the Deployment of Machine Learning Solutions in Network Traffic Classification: A Systematic Survey

    Traffic analysis is a set of strategies intended to find relationships, patterns, anomalies, and misconfigurations, among other things, in Internet traffic. In particular, traffic classification is a subgroup of strategies in this field that aims to identify the application name or type of Internet traffic. Traffic classification has become a challenging task due to the rise of new technologies, such as traffic encryption and encapsulation, which decrease the performance of classical traffic classification strategies. Machine Learning is gaining interest as a new direction in this field, showing signs of future success, such as knowledge extraction from encrypted traffic and more accurate Quality of Service management. Machine Learning is fast becoming a key tool for building traffic classification solutions in real network traffic scenarios; the purpose of this investigation is therefore to explore the elements that allow this technique to work in the traffic classification field. To that end, a systematic review is introduced based on the steps needed to achieve traffic classification using Machine Learning techniques. The main aim is to understand and identify the procedures followed by existing works to achieve their goals. As a result, this survey finds a set of trends derived from the analysis performed on this domain; in this manner, the authors expect to outline future directions for Machine Learning based traffic classification.

    A Survey of Methods for Encrypted Traffic Classification and Analysis

    With the widespread use of encrypted data transport, network traffic encryption is becoming a standard. This presents a challenge for traffic measurement, especially for analysis and anomaly detection methods that depend on the type of network traffic. In this paper, we survey existing approaches for classification and analysis of encrypted traffic. First, we describe the most widespread encryption protocols used throughout the Internet. We show that the initiation of an encrypted connection and the protocol structure give away a lot of information for encrypted traffic classification and analysis. Then, we survey payload and feature-based classification methods for encrypted traffic and categorize them using an established taxonomy. The advantage of some of the described classification methods is the ability to recognize the encrypted application protocol in addition to the encryption protocol. Finally, we make a comprehensive comparison of the surveyed feature-based classification methods and present their weaknesses and strengths.

    A survey on the application of deep learning for code injection detection

    Code injection is one of the top cyber security attack vectors in the modern world. To overcome the limitations of conventional signature-based detection techniques, and to complement them when appropriate, multiple machine learning approaches have been proposed. When analysing these approaches, existing surveys focus predominantly on general intrusion detection, which can be further applied to specific vulnerabilities. In addition, among the machine learning steps, data preprocessing, although highly critical in the data analysis process, appears to be the least researched in the context of Network Intrusion Detection, and in code injection in particular. The goal of this survey is to fill this gap by analysing and classifying the existing machine learning techniques applied to code injection attack detection, with special attention to Deep Learning. Our analysis reveals that the way the input data is preprocessed considerably impacts the performance and attack detection rate. The proposed full preprocessing cycle demonstrates how various machine-learning-based approaches for detection of code injection attacks take advantage of different input data preprocessing techniques. The most used machine learning methods and preprocessing stages have also been identified.

    APIC: A method for automated pattern identification and classification

    Machine Learning (ML) is a transformative technology at the forefront of many modern research endeavours. The technology is generating a tremendous amount of attention from researchers and practitioners, providing new approaches to solving complex classification and regression tasks. While concepts such as Deep Learning have existed for many years, the computational power for realising the utility of these algorithms in real-world applications has only recently become available. This dissertation investigated the efficacy of a novel, general method for deploying ML in a variety of complex tasks, where best feature selection, data-set labelling, model definition and training processes were determined automatically. Models were developed in an iterative fashion, evaluated using both training and validation data sets. The proposed method was evaluated using three distinct case studies, describing complex classification tasks often requiring significant input from human experts. The results achieved demonstrate that the proposed method compares with, and often outperforms, less general, comparable methods designed specifically for each task. Feature selection, data-set annotation, model design and training processes were optimised by the method, where less complex, comparatively accurate classifiers with lower dependency on computational power and human expert intervention were produced. In chapter 4, the proposed method demonstrated improved efficacy over comparable systems, automatically identifying and classifying complex application protocols traversing IP networks. In chapter 5, the proposed method was able to discriminate between normal and anomalous traffic, maintaining accuracy in excess of 99%, while reducing false alarms to a mere 0.08%. Finally, in chapter 6, the proposed method discovered more optimal classifiers than those implemented by comparable methods, with classification scores rivalling those achieved by state-of-the-art systems. 
This research concluded that it is possible to develop a fully automated, general method that exhibits efficacy in a wide variety of complex classification tasks with minimal expert intervention. The method and various artefacts produced in each case study of this dissertation are thus significant contributions to the field of ML.

    Salattujen komento- ja ohjauskanavien havaitseminen verkkosormenjälkien avulla [Detecting encrypted command and control channels using network fingerprints]

    The threat landscape of the Internet has evolved drastically into an environment where malware is increasingly developed by financially motivated cybercriminal groups that mirror legitimate businesses in their structure and processes. These groups develop sophisticated malware with the aim of transforming persistent control over large numbers of infected machines into profit. Recent developments have shown that malware authors seek to hide their Command and Control channels by implementing custom application layer protocols and using custom encryption algorithms. This technique effectively thwarts conventional pattern-based detection mechanisms. This thesis presents network fingerprints, a novel way of performing network-based detection of encrypted Command and Control channels. The goal of the work was to produce a proof of concept system that is able to generate accurate and reliable network signatures for this purpose. The thesis presents and explains the individual phases of an analysis pipeline that was built to process and analyze malware network traffic and to produce network fingerprint signatures. The analysis system was used to generate network fingerprints that were deployed to an intrusion detection system in real-world networks for a test period of 17 days. The experimental phase produced 71 true positive detections and 9 false positive detections, showing that the technique is capable of detecting targeted encrypted Command and Control channels. Furthermore, the effects on the performance of the underlying intrusion detection system were measured. 
These results showed that network fingerprints induce an increase of 2-9% in packet loss and a small increase in the overall computational load of the intrusion detection system.

    A traffic classification method using machine learning algorithm

    Applying concepts of attack investigation from the IT industry, this work designs a traffic classification method that uses data mining and machine learning techniques to classify normal and malicious traffic. This classification will help to learn about the unknown attacks faced by the IT industry. Traffic classification is not a new concept; plenty of work has been done to classify network traffic for heterogeneous applications. Existing techniques (payload-based, port-based and statistical) have their own pros and cons, which are discussed later in the paper, but classification using Machine Learning techniques is still an open field to explore and has provided very promising results so far.