
    NetSentry: A deep learning approach to detecting incipient large-scale network attacks

    Machine Learning (ML) techniques are increasingly adopted to tackle ever-evolving, high-profile network attacks, including DDoS, botnet, and ransomware, due to their unique ability to extract complex patterns hidden in data streams. These approaches are, however, routinely validated with data collected in the same environment, and, as we uncover, their performance degrades when deployed in different network topologies and/or applied to previously unseen traffic. This suggests that malicious/benign behaviors are largely learned superficially, and that ML-based Network Intrusion Detection Systems (NIDS) need revisiting to be effective in practice. In this paper we dive into the mechanics of large-scale network attacks, with a view to understanding how to use ML for Network Intrusion Detection (NID) in a principled way. We reveal that, although cyberattacks vary significantly in terms of payloads, vectors and targets, their early stages, which are critical to successful attack outcomes, share many similarities and exhibit important temporal correlations. Therefore, we treat NID as a time-sensitive task and propose NetSentry, perhaps the first NIDS of its kind, which builds on Bidirectional Asymmetric LSTM (Bi-ALSTM), an original ensemble of sequential neural models, to detect network threats before they spread. We cross-evaluate NetSentry using two practical datasets, training on one and testing on the other, and demonstrate F1 score gains above 33% over the state of the art, as well as up to 3 times higher detection rates for attacks such as XSS and web bruteforce. Further, we put forward a novel data augmentation technique that boosts the generalization abilities of a broad range of supervised deep learning algorithms, leading to average F1 score gains above 35%.
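
    As a rough illustration of the sequential treatment the abstract describes (modelling short windows of flow records with a bidirectional recurrent network), the sketch below uses a plain bidirectional LSTM in PyTorch; it is not the paper's Bi-ALSTM ensemble, and all dimensions are assumptions for illustration.

```python
# Sketch: bidirectional LSTM over sequences of flow feature vectors
# (illustrative stand-in for NetSentry's Bi-ALSTM ensemble, not the actual model).
import torch
import torch.nn as nn

class SeqNIDS(nn.Module):
    def __init__(self, n_features=80, hidden=64, n_classes=2):
        super().__init__()
        # Bidirectional recurrence captures temporal correlations in both directions.
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):              # x: (batch, time_steps, n_features)
        out, _ = self.lstm(x)          # per-step hidden states
        return self.head(out[:, -1])   # classify from the final step's state

# Usage: a batch of 32 windows, each holding 10 consecutive flow records.
model = SeqNIDS()
logits = model(torch.randn(32, 10, 80))
```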

    NLP Methods in Host-based Intrusion Detection Systems: A Systematic Review and Future Directions

    A Host-based Intrusion Detection System (HIDS) is an effective last line of defense against cyber security attacks after perimeter defenses (e.g., Network-based Intrusion Detection Systems and firewalls) have failed or been bypassed. HIDS is widely adopted in industry, ranking among the top two most used security tools in organizations' Security Operation Centers (SOC). Although an effective and efficient HIDS is highly desirable for industrial organizations, the evolution of increasingly complex attack patterns causes several challenges that degrade HIDS performance (e.g., high false alert rates creating alert fatigue for SOC staff). Since Natural Language Processing (NLP) methods are well suited to identifying complex attack patterns, an increasing number of HIDS leverage advances in NLP that have shown effective and efficient performance in precisely detecting low-footprint, zero-day attacks and predicting the next steps of attackers. This active research trend of using NLP in HIDS demands a synthesized and comprehensive body of knowledge on NLP-based HIDS. Thus, we conducted a systematic review of the literature on the end-to-end pipeline of NLP use in HIDS development. For this pipeline, we identify, taxonomically categorize and systematically compare the state of the art of NLP method usage in HIDS, the attacks detected by these methods, and the datasets and evaluation metrics used to evaluate NLP-based HIDS. We highlight the relevant prevalent practices, considerations, advantages and limitations to support HIDS developers, and outline future research directions for NLP-based HIDS development.
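
    As a concrete illustration of the kind of pipeline such reviews cover, the sketch below treats a system-call trace as a "sentence" and builds n-gram features for a downstream classifier; the example traces, labels and model choice are assumptions for illustration, not taken from the reviewed papers.

```python
# Sketch: treating system-call traces as text and extracting n-gram features,
# a common NLP-style representation for host-based intrusion detection.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical traces: each string is one process's sequence of system calls.
traces = [
    "open read read write close",
    "open mmap execve socket connect write",
]
labels = [0, 1]  # 0 = benign, 1 = malicious (illustrative only)

# Bigrams over call names capture short-range ordering information.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(traces)

clf = LogisticRegression().fit(X, labels)
print(clf.predict(vectorizer.transform(["open read write close"])))
```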

    Developing reliable anomaly detection system for critical hosts: a proactive defense paradigm

    Current host-based anomaly detection systems have limited accuracy and incur high processing costs. This is due to the need to process massive audit data of the critical host(s) while detecting complex zero-day attacks which can leave minor, stealthy and dispersed artefacts. In this research study, this observation is validated using existing datasets and state-of-the-art algorithms related to the construction of features from a host's audit data, such as the popular semantic-based extraction, and decision engines including Support Vector Machines, Extreme Learning Machines and Hidden Markov Models. There is a challenging trade-off between achieving accuracy at a minimum processing cost and processing massive amounts of audit data that can include complex attacks. There is also a lack of a realistic experimental dataset that reflects the normal and abnormal activities of current real-world computers. This thesis investigates new methodologies for host-based anomaly detection systems with the specific aim of improving accuracy at a minimum processing cost, while considering challenges such as: complex attacks which, in some cases, are visible only via a quantified computing resource (for example, the execution times of programs); the processing of massive amounts of audit data; the unavailability of a realistic experimental dataset; and the automatic minimization of the false positive rate while dealing with the dynamics of normal activities. This study provides three original and significant contributions to this field of research which represent a marked advance in its body of knowledge. The first major contribution is the generation and release of a realistic intrusion detection dataset, as well as the development of a metric based on fuzzy qualitative modeling for embedding the quality of realism in a dataset's design process and assessing this quality in existing or future datasets. The second key contribution is constructing and evaluating hidden host features to identify the subtle differences between the normal and abnormal artefacts of hosts' activities at a minimum processing cost. Linux-centric features include the frequencies and ranges, frequency-domain representations, and Gaussian interpretations of system call identifiers with execution times, while for Windows a count of the distinct core Dynamic Link Library calls is identified as a hidden host feature. The final key contribution is the development of two new anomaly-based statistical decision engines that capitalize on the potential of some of the suggested hidden features to reliably detect anomalies. The first engine, which has a forensic module, is based on stochastic theories including hierarchical hidden Markov models, and the second is modeled using Gaussian Mixture Modeling and Correntropy. The results demonstrate that the proposed host features and engines are competent to meet the identified challenges.
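
    To make the feature/engine idea more concrete, the sketch below combines a frequency-domain representation of a window of system-call identifiers with a Gaussian-mixture decision engine; it is only in the spirit of the features and engines the thesis describes, and all window sizes, thresholds and data are assumptions.

```python
# Sketch: frequency-domain features of a system-call ID sequence plus a
# Gaussian-mixture decision engine (illustrative, not the thesis' exact design).
import numpy as np
from sklearn.mixture import GaussianMixture

def fft_features(syscall_ids, n_bins=16):
    """Magnitude spectrum of a window of system-call identifiers."""
    spectrum = np.abs(np.fft.rfft(np.asarray(syscall_ids, dtype=float)))
    return np.resize(spectrum, n_bins)  # fixed-length feature vector

# Train the engine on feature vectors from benign activity only.
rng = np.random.default_rng(0)
benign_windows = [rng.integers(0, 300, size=64) for _ in range(200)]
X_benign = np.stack([fft_features(w) for w in benign_windows])
engine = GaussianMixture(n_components=3, random_state=0).fit(X_benign)

# Flag a new window as anomalous when its log-likelihood is unusually low.
threshold = np.percentile(engine.score_samples(X_benign), 1)
new_window = rng.integers(0, 300, size=64)
is_anomaly = engine.score_samples(fft_features(new_window)[None, :])[0] < threshold
print(is_anomaly)
```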

    Xanthus: Push-button Orchestration of Host Provenance Data Collection

    Host-based anomaly detectors generate alarms by inspecting audit logs for suspicious behavior. Unfortunately, evaluating these anomaly detectors is hard. There are few high-quality, publicly available audit logs, and there are no pre-existing frameworks that enable push-button creation of realistic system traces. To make trace generation easier, we created Xanthus, an automated tool that orchestrates virtual machines to generate realistic audit logs. Using Xanthus' simple management interface, administrators select a base VM image, configure a particular tracing framework to use within that VM, and define post-launch scripts that collect and save trace data. Once data collection is finished, Xanthus creates a self-describing archive, which contains the VM, its configuration parameters, and the collected trace data. We demonstrate that Xanthus hides many of the tedious (yet subtle) orchestration tasks that humans often get wrong; Xanthus avoids mistakes that lead to non-replicable experiments.
    Comment: 6 pages, 1 figure, 7 listings, 1 table, workshop
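
    The abstract does not show Xanthus' actual interface or file formats, so the snippet below is a purely hypothetical sketch of the workflow it describes: a job that names a base VM image, a tracing framework, and post-launch collection scripts, bundled into a self-describing archive. Every name and field here is an assumption for illustration.

```python
# Purely hypothetical sketch of the orchestration workflow the abstract describes;
# none of these names come from Xanthus itself.
from dataclasses import dataclass, field
import tarfile, json, pathlib

@dataclass
class TraceJob:
    base_image: str                     # VM image to boot
    tracer: str                         # tracing framework configured inside the VM
    post_launch_scripts: list = field(default_factory=list)  # collect/save traces
    output_dir: str = "run-output"

def archive_run(job: TraceJob, trace_files: list) -> str:
    """Bundle the job description and collected traces so the experiment
    is self-describing and replicable."""
    out = pathlib.Path(job.output_dir)
    out.mkdir(exist_ok=True)
    (out / "manifest.json").write_text(json.dumps(job.__dict__, indent=2))
    archive = out / "experiment.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(out / "manifest.json", arcname="manifest.json")
        for f in trace_files:
            tar.add(f, arcname=pathlib.Path(f).name)
    return str(archive)

job = TraceJob(base_image="ubuntu-20.04.qcow2", tracer="camflow",
               post_launch_scripts=["collect_audit.sh"])
print(archive_run(job, []))  # no trace files in this toy run
```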

    Command & Control: Understanding, Denying and Detecting - A review of malware C2 techniques, detection and defences

    In this survey, we first briefly review the current state of cyber attacks, highlighting significant recent changes in how and why such attacks are performed. We then investigate the mechanics of malware command and control (C2) establishment: we provide a comprehensive review of the techniques used by attackers to set up such a channel and to hide its presence from the attacked parties and the security tools they use. We then switch to the defensive side of the problem, and review approaches that have been proposed for the detection and disruption of C2 channels. We also map such techniques to widely adopted security controls, emphasizing gaps or limitations (and success stories) in current best practices.
    Comment: Work commissioned by CPNI, available at c2report.org. 38 pages. Listing abstract compressed from the version appearing in the report.

    Deteção de ataques de negação de serviços distribuídos na origem (Detection of distributed denial-of-service attacks at the source)

    From year to year, new records for the amount of traffic in an attack are established, which demonstrates not only the constant presence of distributed denial-of-service attacks, but also their evolution, setting them apart from other network threats. The importance of resource availability keeps growing, and the debate on the security of network devices and infrastructures is ongoing, given their preponderant role in both the home and corporate domains. In the face of this constant threat, the latest network security systems have been applying pattern recognition techniques to infer, detect, and react more quickly and assertively. This dissertation proposes methodologies to infer patterns of network activity from its traffic: whether it follows a behavior previously defined as normal, or whether there are deviations that raise suspicion about the normality of the action on the network. Everything indicates that the future of network defense systems will continue in this direction, drawing not only on the increasing amount of traffic, but also on the diversity of actions, services and entities that reflect distinct patterns, thus contributing to the detection of anomalous activities on the network. The methodologies propose the collection of metadata, up to the transport layer of the OSI model, which is then processed by machine learning algorithms in order to classify the underlying action. Intending to contribute beyond denial-of-service attacks and the network domain, the methodologies are described in a generic way, so that they can be applied to other scenarios of greater or lesser complexity. The third chapter presents a proof of concept with attack vectors that marked history, along with a few evaluation metrics that allow the different classifiers to be compared in terms of their success rate, given the various activities on the network and their inherent dynamics. The various tests show the flexibility, speed and accuracy of the various classification algorithms, setting the bar between 90 and 99 percent.
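
    A minimal sketch of the general approach described above: collect transport-layer metadata per flow and feed it to a machine-learning classifier. The feature names, the toy data and the random-forest choice are assumptions for illustration, not taken from the dissertation.

```python
# Sketch: classify network activity from transport-layer flow metadata.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-flow metadata: [packets/s, bytes/s, mean packet size,
# SYN ratio, distinct destination ports in the window]
X_train = np.array([
    [12,    9_000,   750, 0.05, 3],   # benign browsing
    [9_500, 600_000,  64, 0.98, 1],   # SYN flood toward one port
    [40,    30_000,  800, 0.04, 5],   # benign bulk transfer
    [8_000, 500_000,  60, 0.97, 2],   # SYN flood
])
y_train = [0, 1, 0, 1]  # 0 = normal, 1 = DDoS-like

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
print(clf.predict([[7_000, 450_000, 62, 0.95, 1]]))  # expected: [1]
```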

    TOWARDS A HOLISTIC EFFICIENT STACKING ENSEMBLE INTRUSION DETECTION SYSTEM USING NEWLY GENERATED HETEROGENEOUS DATASETS

    With the exponential growth of network-based applications globally, there has been a transformation in organizations' business models. Furthermore, cost reductions in both computational devices and internet access have led people to become more technology dependent. Consequently, due to the inordinate use of computer networks, new risks have emerged. Therefore, improving the speed and accuracy of security mechanisms has become crucial. Although abundant new security tools have been developed, the rapid growth of malicious activities continues to be a pressing issue, as ever-evolving attacks continue to create severe threats to network security. Classical security techniques, for instance firewalls, are used as a first line of defense against security problems but remain unable to detect internal intrusions or adequately provide security countermeasures. Thus, network administrators tend to rely predominantly on Intrusion Detection Systems to detect such intrusive network activities. Machine Learning is one of the practical approaches to intrusion detection that learns from data to differentiate between normal and malicious traffic. Although Machine Learning approaches are used frequently, an in-depth analysis of Machine Learning algorithms in the context of intrusion detection has received less attention in the literature. Moreover, adequate datasets are necessary to train and evaluate anomaly-based network intrusion detection systems. A number of such datasets, such as DARPA, KDDCUP, and NSL-KDD, have been widely adopted by researchers to train and evaluate the performance of their proposed intrusion detection approaches. Based on several studies, many such datasets are outdated and unreliable to use. Furthermore, some of these datasets suffer from a lack of traffic diversity and volume, do not cover the variety of attacks, have anonymized packet information and payload that cannot reflect current trends, or lack a feature set and metadata. This thesis provides a comprehensive analysis of some of the existing Machine Learning approaches for identifying network intrusions. Specifically, it analyzes the algorithms along various dimensions, namely feature selection, sensitivity to hyper-parameter selection, and class imbalance problems, that are inherent to intrusion detection. It also produces a new, reliable dataset labeled Game Theory and Cyber Security (GTCS) that matches real-world criteria, contains normal traffic and different classes of attacks, and reflects current network traffic trends. The GTCS dataset is used to evaluate the performance of the different approaches, and a detailed experimental evaluation summarizing the effectiveness of each approach is presented. Finally, the thesis proposes an ensemble classifier model composed of multiple classifiers with different learning paradigms to address the issue of detection accuracy and false alarm rate in intrusion detection systems.
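
    A minimal sketch of the stacking idea the thesis proposes, combining base classifiers with different learning paradigms under a meta-learner; the particular base learners, meta-learner and synthetic data below are illustrative choices, not the thesis' exact setup.

```python
# Sketch: a stacking ensemble over heterogeneous base classifiers.
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

base_learners = [
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]
# The meta-learner combines the base learners' predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))

# Stand-in data; in practice this would be flow features from a dataset like GTCS.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
print("accuracy:", stack.fit(X_tr, y_tr).score(X_te, y_te))
```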

    The effectiveness of evasion techniques against intrusion prevention systems

    Evasions and evasion combinations are used to disguise attacks so that security appliances do not detect them. This thesis evaluates the effectiveness of these techniques against state-of-the-art intrusion prevention systems. In total, 11 intrusion prevention systems were studied, 10 commercial and 1 free solution. Four experiments were conducted in this study. Each experiment comprised a million attacks performed with randomized evasions and evasion combinations against each intrusion prevention system. The attack used stayed the same during a single experiment, but each attack was disguised with different evasion techniques. Standardized configurations were used in order to produce comparable results. The results indicate that evasion techniques are effective against the majority of the tested intrusion prevention systems. Even though some of the techniques date from the 1990s, they can be fine-tuned to fool most of the tested appliances. One evasion technique is not always enough to avoid detection, but combining multiple techniques increases the likelihood of finding a way to evade detection.
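
    The sketch below illustrates the randomized aspect of the experiments described above: drawing a different combination of evasion techniques for each attack attempt. The technique names and combination sizes are assumptions for illustration, not the thesis' actual evasion catalogue.

```python
# Sketch: pick a random combination of evasion techniques per attack attempt.
import random

EVASIONS = [
    "ip_fragmentation",
    "tcp_segmentation",
    "tcp_overlap",
    "payload_obfuscation",
    "protocol_decoy",
]

def random_evasion_combo(rng: random.Random, max_size: int = 3) -> list[str]:
    """Pick between one and max_size distinct evasions to layer on an attack."""
    return rng.sample(EVASIONS, k=rng.randint(1, max_size))

rng = random.Random(42)
for attempt in range(5):          # one disguised attack attempt per iteration
    print(attempt, random_evasion_combo(rng))
```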