
    Adversarial machine learning for cyber security

    This master's thesis aims to take advantage of the state of the art and the tools developed in Adversarial Machine Learning (AML) and related research branches to strengthen Machine Learning (ML) models used in cyber security. First, it collects, organizes, and summarizes the most recent and most promising state-of-the-art techniques in AML, a research branch still in flux, with a great diversity of hard-to-compare proposals that evolve rapidly and are quickly superseded by attacks or defenses with greater potential. This summary matters because the AML literature is still far from producing defensive techniques that effectively protect an ML model from all possible attacks, so existing defenses must be analyzed in detail and with clear criteria before being applied in practice. It is also useful to identify biases in how the state of the art measures attack and defense effectiveness, and to propose methodologies and metrics that mitigate them. Additionally, AML should not be analyzed in isolation: a model's robustness to adversarial attacks is closely tied to its generalization on in-distribution cases, its robustness to out-of-distribution cases, and its susceptibility to overinterpretation, where spurious (but statistically valid) patterns give a false sense of high performance. This thesis therefore proposes a methodology to first evaluate a model's exposure along these dimensions, improving it stage by stage in a progressive order of priorities so as to guarantee satisfactory overall robustness. Based on this methodology, two case studies are explored in depth: their robustness to adversarial attacks is evaluated, attacks are performed to gain insights into their strengths and weaknesses, and improvements are finally proposed. Depending on the type of problem and its assumptions, a range of approaches is used: exploratory analysis, AML attacks and a discussion of their implications, improvements and defenses such as Adversarial Training, and finally a new methodology to correctly evaluate the effectiveness of a defense while avoiding the biases found in the state of the art. For each case study it is possible to create efficient adversarial attacks and to analyze the strengths of each model; in the second case study, the adversarial robustness of a convolutional neural network classifier is increased using Adversarial Training. This brings further positive effects, such as a better representation of the data, easier implementation of techniques that detect adversarial cases through anomaly analysis, and insights into the model's performance that help reinforce it from other viewpoints.
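
    The defensive technique the abstract names, Adversarial Training, amounts to fitting the classifier on adversarially perturbed inputs rather than only clean ones. The sketch below is a minimal illustration using a PGD attack in PyTorch; `model`, `train_loader`, and all hyperparameters are hypothetical placeholders, not details taken from the thesis.

        import torch
        import torch.nn.functional as F

        def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
            """Projected gradient ascent on the loss within an L-inf ball of radius eps."""
            # Random start inside the ball, then iterate ascent + projection.
            x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
            for _ in range(steps):
                x_adv.requires_grad_(True)
                loss = F.cross_entropy(model(x_adv), y)
                grad = torch.autograd.grad(loss, x_adv)[0]
                x_adv = x_adv.detach() + alpha * grad.sign()                       # ascend
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)  # project
            return x_adv.detach()

        def adversarial_training_epoch(model, train_loader, optimizer, device="cpu"):
            """One epoch of adversarial training: optimize the loss on PGD examples."""
            model.train()
            for x, y in train_loader:
                x, y = x.to(device), y.to(device)
                x_adv = pgd_attack(model, x, y)
                optimizer.zero_grad()
                F.cross_entropy(model(x_adv), y).backward()
                optimizer.step()

    Training on worst-case perturbations of each batch is what drives the side effects the abstract reports, such as smoother data representations that make anomaly-based detection of adversarial cases easier.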

    Formalizing evasion attacks against machine learning security detectors

    Recent work has shown that adversarial examples can bypass machine learning-based threat detectors that rely on static analysis by applying minimal perturbations. To preserve malicious functionality, previous attacks either apply trivial manipulations (e.g. padding), potentially limiting their effectiveness, or require computationally demanding validation steps to discard adversarial variants that do not execute correctly in sandbox environments. Moreover, while machine learning systems for detecting SQL injections have been proposed in the literature, no attacks have been tested against them to assess their effectiveness and robustness. In this thesis, we overcome these limitations by developing RAMEn, a unifying framework that (i) can express attacks for different domains, (ii) generalizes previous attacks against machine learning models, and (iii) uses functions that preserve the functionality of manipulated objects. We provide new attacks for both Windows malware and SQL injection detection by exploiting the formats used to represent these objects. To show the efficacy of RAMEn, we report experimental results for our strategies in both white-box and black-box settings. The white-box attacks against Windows malware detectors show that perturbing only 2% of the target's input size is enough to evade detection with ease. In the black-box setting, we overcome the issues mentioned above with a novel family of attacks that are both query-efficient and functionality-preserving: they rely on injecting benign content, which is never executed, either at the end of the malicious file or within newly created sections, a strategy encoded in an algorithm called GAMMA. We also evaluate whether GAMMA transfers to other commercial antivirus solutions, and surprisingly find that it evades many commercial antivirus engines. For evading SQLi detectors, we create WAF-A-MoLE, a mutational fuzzer that applies random mutations to input samples, keeping alive only the most promising ones. WAF-A-MoLE is capable of defeating detectors built with different architectures by using the novel practical manipulations we propose. To facilitate reproducibility and future work, we open-source our framework and the corresponding attack implementations. We conclude by discussing the limitations of current machine learning-based malware detectors, along with potential mitigation strategies based on naturally embedding domain knowledge from subject-matter experts into the learning process.
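
    As an illustration of the search strategy the abstract attributes to WAF-A-MoLE, the sketch below shows a generic mutational fuzzing loop that keeps a small pool of the most evasive variants alive. The mutation operators and the `classifier_score` callback are hypothetical stand-ins; WAF-A-MoLE's real operators are semantics-preserving SQL rewrites such as case swapping, comment injection, and whitespace substitution.

        import heapq
        import random

        # Hypothetical stand-ins for semantics-preserving SQL payload rewrites.
        MUTATIONS = [
            lambda p: p.replace(" ", "/**/", 1),                  # comment as whitespace
            lambda p: "".join(c.swapcase() if random.random() < 0.3 else c for c in p),
            lambda p: p.replace("=", " LIKE ", 1),                # operator swapping
        ]

        def fuzz(payload, classifier_score, max_rounds=1000, threshold=0.5, pool_size=32):
            """Hill-climb over random mutations, keeping alive only the variants
            the detector scores as least malicious (the most promising ones)."""
            pool = [(classifier_score(payload), payload)]          # min-heap by score
            for _ in range(max_rounds):
                score, best = pool[0]                  # lowest-scoring variant so far
                if score < threshold:
                    return best                        # detector now labels it benign
                for mutate in MUTATIONS:
                    variant = mutate(best)
                    heapq.heappush(pool, (classifier_score(variant), variant))
                pool = heapq.nsmallest(pool_size, pool)    # prune to the best pool
                heapq.heapify(pool)
            return None                                # query budget exhausted

    The same keep-the-most-promising-candidates principle underlies query-efficient black-box attacks like GAMMA, with the search space being which benign content to inject rather than which SQL rewrites to apply.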