
    Formalizing evasion attacks against machine learning security detectors

    Recent work has shown that adversarial examples can bypass machine learning-based threat detectors relying on static analysis by applying minimal perturbations. To preserve malicious functionality, previous attacks either apply trivial manipulations (e.g., padding), potentially limiting their effectiveness, or require computationally demanding validation steps to discard adversarial variants that do not execute correctly in sandbox environments. Moreover, while machine learning systems for detecting SQL injections have been proposed in the literature, no attacks have been tested against them to assess their effectiveness and robustness. In this thesis, we overcome these limitations by developing RAMEn, a unifying framework that (i) can express attacks for different domains, (ii) generalizes previous attacks against machine learning models, and (iii) uses functions that preserve the functionality of the manipulated objects. We provide new attacks for both Windows malware and SQL injection detection by exploiting the formats used to represent these objects. To show the efficacy of RAMEn, we report experimental results for our strategies in both white-box and black-box settings. The white-box attacks against Windows malware detectors show that perturbing only 2% of the input size of the target is enough to evade detection with ease. To make black-box attacks both query-efficient and functionality-preserving, we present a novel family of attacks, encoded in an algorithm called GAMMA, that injects benign content, which is never executed, either at the end of the malicious file or within newly-created sections. We also evaluate whether GAMMA transfers to commercial antivirus solutions and, surprisingly, find that it evades many commercial antivirus engines. For evading SQLi detectors, we create WAF-A-MoLE, a mutational fuzzer that applies random mutations to input samples, keeping only the most promising ones alive. WAF-A-MoLE is capable of defeating detectors built with different architectures by using the novel practical manipulations we propose. To facilitate reproducibility and future work, we open-source our framework and the corresponding attack implementations. We conclude by discussing the limitations of current machine learning-based malware detectors, along with potential mitigation strategies based on naturally embedding domain knowledge from subject-matter experts into the learning process.
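
    As an informal illustration of the black-box strategy behind GAMMA, the sketch below appends benign content to a malware file and keeps the variant that most lowers the detector's score. Here score_malware is a hypothetical stand-in for the target detector; the real GAMMA uses a genetic optimizer and can also inject content into newly-created PE sections rather than only appending it.

```python
import random
from typing import Callable, List

def gamma_padding_attack(
    malware: bytes,
    benign_chunks: List[bytes],                  # content harvested from benign files
    score_malware: Callable[[bytes], float],     # hypothetical black-box detector, score in [0, 1]
    max_queries: int = 100,
    threshold: float = 0.5,
) -> bytes:
    best, best_score = malware, score_malware(malware)
    for _ in range(max_queries):
        # Append a randomly chosen benign chunk: the injected content is never
        # executed, so the malicious functionality is preserved by construction.
        candidate = best + random.choice(benign_chunks)
        score = score_malware(candidate)
        if score < best_score:
            best, best_score = candidate, score
        if best_score < threshold:               # detector evaded
            break
    return best
```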

    Practical Attacks on Machine Learning: A Case Study on Adversarial Windows Malware

    While machine learning is vulnerable to adversarial examples, the field still lacks systematic procedures and tools for evaluating its security in different application contexts. In this article, we discuss how to develop automated and scalable security evaluations of machine learning using practical attacks, reporting a use case on Windows malware detection.

    Explaining Machine Learning DGA Detectors from DNS Traffic Data

    One of the most common causes of loss of continuity in online systems is a widely popular cyber attack known as Distributed Denial of Service (DDoS), in which a network of infected devices (a botnet) is exploited to flood the computational capacity of services at the command of an attacker. This attack often leverages the Domain Name System (DNS) through Domain Generation Algorithms (DGAs), a stealthy connection strategy that nonetheless leaves suspicious data patterns. To detect such threats, most recent advances rely on Machine Learning (ML), which can be highly effective in analyzing and classifying massive amounts of data. Although they perform strongly, ML models retain a certain degree of obscurity in their decision-making process. To cope with this problem, a branch of ML known as Explainable ML tries to break down the black-box nature of classifiers and make them interpretable and human-readable. This work addresses Explainable ML in the context of botnet and DGA detection and is, to the best of our knowledge, the first to concretely break down the decisions of ML classifiers devised for botnet/DGA detection, providing both global and local explanations.
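
    As a purely illustrative sketch (not the pipeline evaluated in this work), the snippet below trains a linear classifier on simple lexical features of domain names and reads its coefficients as a global explanation of which characteristics push a decision towards "DGA"; the feature set and the toy data are placeholders.

```python
import math
from collections import Counter
import numpy as np
from sklearn.linear_model import LogisticRegression

FEATURES = ["length", "entropy", "digit_ratio", "vowel_ratio"]

def lexical_features(domain: str) -> list:
    name = domain.split(".")[0].lower()
    counts = Counter(name)
    entropy = -sum((c / len(name)) * math.log2(c / len(name)) for c in counts.values())
    return [
        len(name),
        entropy,
        sum(ch.isdigit() for ch in name) / len(name),
        sum(ch in "aeiou" for ch in name) / len(name),
    ]

# Toy data: (domain, label) with 1 = DGA-generated, 0 = benign.
samples = [("google.com", 0), ("wikipedia.org", 0),
           ("xjw9fk2qpl0a.net", 1), ("qzkd83mfo2vb.com", 1)]
X = np.array([lexical_features(d) for d, _ in samples])
y = np.array([label for _, label in samples])

clf = LogisticRegression().fit(X, y)
for name, coef in zip(FEATURES, clf.coef_[0]):
    print(f"{name:>12}: {coef:+.3f}")    # sign/magnitude act as a global explanation
```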

    WAF-A-MoLE: An adversarial tool for assessing ML-based WAFs

    Web Application Firewalls (WAFs) are plug-and-play security gateways that promise to enhance the security of a (potentially vulnerable) system with minimal cost and configuration. In recent years, machine learning-based WAFs have been catching up with traditional, signature-based ones. They are competitive because they do not require predefined rules; instead, they infer their rules through a learning process. In this paper, we present WAF-A-MoLE, a WAF breaching tool that uses guided mutational fuzzing to generate adversarial examples. Its main applications include WAF (i) penetration testing, (ii) benchmarking, and (iii) hardening.
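
    The sketch below shows the guided mutational fuzzing loop in miniature, with a handful of hypothetical mutation operators and a placeholder waf_score classifier; the actual tool ships a richer, carefully semantics-preserving set of operators.

```python
import random
from typing import Callable

def mutate(payload: str) -> str:
    # Hypothetical mutation operators; WAF-A-MoLE implements many more
    # (comment injection, case swapping, whitespace substitution, ...).
    ops = [
        lambda p: p.replace(" ", "/**/", 1),
        lambda p: "".join(c.swapcase() if random.random() < 0.3 else c for c in p),
        lambda p: p + " -- " + "".join(random.choices("abcdef", k=4)),
    ]
    return random.choice(ops)(payload)

def fuzz(seed: str, waf_score: Callable[[str], float],
         rounds: int = 200, pool_size: int = 10) -> str:
    pool = [seed]
    for _ in range(rounds):
        pool.append(mutate(random.choice(pool)))
        pool.sort(key=waf_score)            # keep only the most promising variants alive
        pool = pool[:pool_size]
        if waf_score(pool[0]) < 0.5:        # classified as benign: evasion found
            break
    return pool[0]
```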

    Robust Machine Learning for Malware Detection over Time

    The presence and persistence of Android malware is an ongoing threat in this information era, and machine learning technologies are now extensively used to deploy more effective detectors that can block the majority of these malicious programs. However, these algorithms have not been developed to track the natural evolution of malware, and their performance significantly degrades over time because of this concept drift. Currently, state-of-the-art techniques only focus on detecting the presence of such drift, or address it by relying on frequent model updates. Hence, there is a lack of knowledge regarding the causes of concept drift, and ad-hoc solutions that can counter the passing of time are still under-investigated. In this work, we begin to address these issues by proposing (i) a drift-analysis framework to identify which characteristics of the data are causing the drift, and (ii) SVM-CB, a time-aware classifier that leverages the drift-analysis information to slow down the performance drop. We highlight the efficacy of our contribution by comparing its degradation over time with that of a state-of-the-art classifier, and we show that SVM-CB better withstands the distribution changes that naturally characterize the malware domain. We conclude by discussing the limitations of our approach and how our contribution can be taken as a first step towards more time-resistant classifiers that not only tackle, but also understand, the concept drift that affects data.
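
    The snippet below sketches the kind of time-aware evaluation such a study relies on: train on the oldest window of samples and measure how detection degrades on later ones. The data loading and feature extraction are placeholders, and SVM-CB's use of drift-analysis information during training is not modeled here.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

def evaluate_over_time(X, y, timestamps, n_windows=5):
    # Sort samples chronologically, train on the oldest window, test on the rest.
    order = np.argsort(timestamps)
    X, y = X[order], y[order]
    windows = np.array_split(np.arange(len(y)), n_windows)
    clf = LinearSVC().fit(X[windows[0]], y[windows[0]])
    scores = []
    for i, idx in enumerate(windows[1:], start=1):
        scores.append(f1_score(y[idx], clf.predict(X[idx])))
        print(f"window {i}: F1 = {scores[-1]:.3f}")   # concept drift shows up as a steady drop
    return scores
```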

    Adversarial ModSecurity: Countering Adversarial SQL Injections with Robust Machine Learning

    ModSecurity is widely recognized as the standard open-source Web Application Firewall (WAF), maintained by the OWASP Foundation. It detects malicious requests by matching them against the Core Rule Set (CRS), identifying well-known attack patterns. Each rule in the CRS is manually assigned a weight based on the severity of the corresponding attack, and a request is flagged as malicious if the sum of the weights of the firing rules exceeds a given threshold. In this work, we show that this simple strategy is largely ineffective for detecting SQL injection (SQLi) attacks, as it tends to block many legitimate requests while also being vulnerable to adversarial SQLi attacks, i.e., attacks intentionally manipulated to evade detection. To overcome these issues, we design a robust machine learning model, named AdvModSec, which uses the CRS rules as input features and is trained to detect adversarial SQLi attacks. Our experiments show that AdvModSec, trained on the traffic directed towards the protected web services, achieves a better trade-off between detection and false positive rates, improving the detection rate of the vanilla version of ModSecurity with the CRS by 21%. Moreover, our approach improves robustness against adversarial SQLi attacks by 42%, taking a step forward towards building more robust and trustworthy WAFs.
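
    A minimal sketch of the feature representation described above: each request is encoded as a binary vector recording which rules fired, and a classifier is trained on top of it. The rules below are toy regex stand-ins rather than actual CRS rules, and the adversarial training used by AdvModSec is omitted.

```python
import re
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy regex stand-ins for a handful of SQLi-related rules (the real CRS has hundreds).
RULES = {
    "sqli_keywords": re.compile(r"\b(union|select|insert|drop)\b", re.I),
    "sql_comment":   re.compile(r"(--|#|/\*)"),
    "tautology":     re.compile(r"\b(or|and)\b\s*\d+\s*=\s*\d+", re.I),
}

def rule_features(request: str) -> list:
    # One binary feature per rule: 1 if the rule fires on the request, 0 otherwise.
    return [int(bool(rule.search(request))) for rule in RULES.values()]

requests = ["id=5", "name=alice", "id=1 OR 1=1 --", "q=1 UNION SELECT pass /*"]
labels   = [0, 0, 1, 1]
clf = RandomForestClassifier(n_estimators=50).fit(
    np.array([rule_features(r) for r in requests]), labels)
print(clf.predict([rule_features("id=2 AND 1=1 -- ")]))   # rule activations drive the decision
```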

    Explaining Vulnerabilities of Deep Learning to Adversarial Malware Binaries

    Recent work has shown that deep-learning algorithms for malware detection are also susceptible to adversarial examples, i.e., carefully crafted perturbations to input malware that cause misclassification. Although this has called their suitability for the task into question, it is not yet clear why such algorithms are so easily fooled in this particular application domain. In this work, we take a first step towards tackling this issue by leveraging explainable machine-learning techniques developed to interpret the black-box decisions of deep neural networks. In particular, we use a technique known as feature attribution to identify the most influential input features contributing to each decision, and adapt it to provide meaningful explanations of the classification of malware binaries. We find that a recently proposed convolutional neural network does not learn any meaningful characteristics for malware detection from the data and text sections of executable files, but rather tends to discriminate between benign and malware samples based on characteristics found in the file header. Based on this finding, we propose a novel attack algorithm that generates adversarial malware binaries by changing only a few tens of bytes in the file header. Compared to other state-of-the-art attack algorithms, our attack does not require injecting any padding bytes at the end of the file and is much more efficient, as it manipulates far fewer bytes.
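
    As a hedged illustration of attributing a detector's decision to file regions, the sketch below uses simple occlusion rather than the gradient-based feature-attribution technique adopted in the paper; score_malware and the region offsets are hypothetical placeholders.

```python
from typing import Callable, Dict, Tuple

def region_attribution(
    pe_bytes: bytes,
    regions: Dict[str, Tuple[int, int]],        # region name -> (start, end) byte offsets
    score_malware: Callable[[bytes], float],    # hypothetical model scoring raw bytes
) -> Dict[str, float]:
    base = score_malware(pe_bytes)
    attributions = {}
    for name, (start, end) in regions.items():
        # Zero out one region and measure the score drop: a large drop means the
        # detector relies heavily on that region (e.g. the file header).
        occluded = pe_bytes[:start] + b"\x00" * (end - start) + pe_bytes[end:]
        attributions[name] = base - score_malware(occluded)
    return attributions

# Illustrative usage (offsets are not parsed from a real PE):
# attrs = region_attribution(data, {"header": (0, 1024), "text": (1024, 200_000)}, model)
```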

    ZenHackAdemy: Ethical Hacking @ DIBRIS

    Cybersecurity attacks are on the rise, and the current response is not effective enough. The need for a competent workforce able to face attackers is increasing. At the moment, the gap between academia and real-world skills is huge, and academia cannot provide students with skills that match those of an attacker. To pass on these skills, teachers have to train students in scenarios as close as possible to real-world ones. Capture the Flag (CTF) competitions are a great tool to achieve this goal, since they encourage students to think as an attacker does, thus creating more awareness of the modalities and consequences of an attack. We describe our experience in running an educational activity on ethical hacking, which we proposed to computer science and computer engineering students. We organized seminars outside formal classes and provided online support for the hands-on part of the training. We delivered different types of exercises and held a final CTF competition. These activities resulted in a growing community of students and researchers interested in cybersecurity, some of whom have formed ZenHack, an official CTF team.
