Tracking Cyber Adversaries with Adaptive Indicators of Compromise
A forensics investigation after a breach often uncovers network and host
indicators of compromise (IOCs) that can be deployed to sensors to allow early
detection of the adversary in the future. Over time, the adversary will change
tactics, techniques, and procedures (TTPs), which will also change the data
generated. If the IOCs are not kept up-to-date with the adversary's new TTPs,
the adversary will no longer be detected once all of the IOCs become invalid.
Tracking the Known (TTK) is the problem of keeping IOCs, in this case regular
expressions (regexes), up-to-date with a dynamic adversary. Our framework
solves the TTK problem in an automated, cyclic fashion to bracket a previously
discovered adversary. This tracking is accomplished through a data-driven
approach of self-adapting a given model based on its own detection
capabilities.
In our initial experiments, we found that the true positive rate (TPR) of the
adaptive solution degrades much less significantly over time than the naive
solution, suggesting that self-updating the model allows the continued
detection of positives (i.e., adversaries). The cost for this performance is in
the false positive rate (FPR), which increases over time for the adaptive
solution, but remains constant for the naive solution. However, the difference
in overall detection performance, as measured by the area under the curve
(AUC), between the two methods is negligible. This result suggests that
self-updating the model over time should be done in practice to continue to
detect known, evolving adversaries.
Comment: This was presented at the 4th Annual Conf. on Computational Science & Computational Intelligence (CSCI'17), held Dec. 14-16, 2017, in Las Vegas, Nevada, US.
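The cyclic, self-adapting loop the abstract describes can be illustrated with a deliberately simplified sketch (this is a toy illustration, not the paper's actual framework): a naive detector keeps its original exact-match IOC, while an adaptive detector maintains per-position character classes, tolerates one drifted position to bracket the adversary, and folds every detection back into its own model.

```python
class NaiveDetector:
    """Fixed IOC: the indicator never changes after deployment."""
    def __init__(self, indicator):
        self.indicator = indicator

    def detect(self, sample):
        return sample == self.indicator


class AdaptiveDetector:
    """Self-updating IOC: per-position character classes that tolerate
    one drifted position (bracketing the adversary) and absorb every
    detection back into the model."""
    def __init__(self, indicator):
        self.allowed = [{c} for c in indicator]

    def detect(self, sample):
        if len(sample) != len(self.allowed):
            return False
        mismatches = sum(c not in ok for c, ok in zip(sample, self.allowed))
        return mismatches <= 1

    def update(self, sample):
        for c, ok in zip(sample, self.allowed):
            ok.add(c)


def drift(indicator, step):
    """Toy TTP change: the adversary rotates one character each step."""
    i = step % len(indicator)
    c = chr((ord(indicator[i]) - ord("a") + 1) % 26 + ord("a"))
    return indicator[:i] + c + indicator[i + 1:]


def simulate(ioc="baddomain", steps=10):
    """Track a drifting indicator with both detectors."""
    naive, adaptive = NaiveDetector(ioc), AdaptiveDetector(ioc)
    sample, hits = ioc, {"naive": 0, "adaptive": 0}
    for t in range(1, steps + 1):
        sample = drift(sample, t)
        if naive.detect(sample):
            hits["naive"] += 1
        if adaptive.detect(sample):
            hits["adaptive"] += 1
            adaptive.update(sample)  # fold the detection back into the IOC
    return hits
```

In this toy run the adaptive model keeps detecting the drifting indicator while the naive one stops matching after the first TTP change, mirroring the TPR result above; the ever-widening character classes are also exactly where the increased false-positive rate comes from.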
Towards Adversarial Malware Detection: Lessons Learned from PDF-based Attacks
Malware still constitutes a major threat in the cybersecurity landscape, also
due to the widespread use of infection vectors such as documents. These
infection vectors hide embedded malicious code from victim users,
facilitating the use of social engineering techniques to infect their machines.
Research showed that machine-learning algorithms provide effective detection
mechanisms against such threats, but the existence of an arms race in
adversarial settings has recently challenged such systems. In this work, we
focus on malware embedded in PDF files as a representative case of such an arms
race. We start by providing a comprehensive taxonomy of the different
approaches used to generate PDF malware, and of the corresponding
learning-based detection systems. We then categorize threats specifically
targeted against learning-based PDF malware detectors, using a well-established
framework in the field of adversarial machine learning. This framework allows
us to categorize known vulnerabilities of learning-based PDF malware detectors
and to identify novel attacks that may threaten such systems, along with the
potential defense mechanisms that can mitigate the impact of such threats. We
conclude the paper by discussing how such findings highlight promising research
directions towards tackling the more general challenge of designing robust
malware detectors in adversarial settings.
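As a rough illustration of the learning-based PDF detectors the taxonomy covers, the sketch below counts action-related PDF keywords as features, in the spirit of structural-feature systems such as PDFRate; the keyword list and threshold are illustrative assumptions, not any specific detector's design.

```python
# Hypothetical feature set: real detectors use hundreds of structural features.
KEYWORDS = [b"/JavaScript", b"/JS", b"/OpenAction",
            b"/Launch", b"/EmbeddedFile", b"/AA"]


def extract_features(pdf_bytes):
    """Count occurrences of action-related PDF keywords in the raw bytes."""
    return [pdf_bytes.count(k) for k in KEYWORDS]


def suspicious(pdf_bytes, threshold=2):
    """Toy linear rule: flag files with enough action-triggering keywords."""
    return sum(extract_features(pdf_bytes)) >= threshold
```

The attacks categorized in this line of work evade such detectors not by removing malicious behavior but by hiding it, e.g. by encoding or obfuscating the very keywords this sketch counts.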
Formalizing evasion attacks against machine learning security detectors
Recent work has shown that adversarial examples can bypass machine learning-based threat detectors relying on static analysis by applying minimal perturbations.
To preserve malicious functionality, previous attacks either apply trivial manipulations (e.g. padding), potentially limiting their effectiveness, or require running computationally-demanding validation steps to discard adversarial variants that do not correctly execute in sandbox environments.
While machine learning systems for detecting SQL injections have been proposed in the literature, no attacks have been tested against the proposed solutions to assess the effectiveness and robustness of these methods.
In this thesis, we overcome these limitations by developing RAMEn, a unifying framework that (i) can express attacks for different domains, (ii) generalizes previous attacks against machine learning models, and (iii) uses functions that preserve the functionality of manipulated objects.
We provide new attacks for both Windows malware and SQL injection detection scenarios by exploiting the format used for representing these objects.
To show the efficacy of RAMEn, we provide experimental results of our strategies in both white-box and black-box settings.
The white-box attacks against Windows malware detectors show that perturbing only 2% of the target's input size is enough to evade detection with ease.
In the black-box setting, we overcome the issues mentioned above by presenting a novel family of attacks, encoded in an algorithm called GAMMA, that are both query-efficient and functionality-preserving: they rely on injecting benign content, which is never executed, either at the end of the malicious file or within newly created sections.
We also evaluate whether GAMMA transfers to other commercial antivirus solutions, and surprisingly find that it can evade many commercial antivirus engines.
For evading SQLi detectors, we create WAF-A-MoLE, a mutational fuzzer that applies random mutations to the input samples, keeping alive only the most promising ones.
WAF-A-MoLE is capable of defeating detectors built with different architectures by using the novel practical manipulations we have proposed.
To facilitate reproducibility and future work, we open-source our framework and corresponding attack implementations.
We conclude by discussing the limitations of current machine learning-based malware detectors, along with potential mitigation strategies based on naturally embedding domain knowledge from subject-matter experts into the learning process.
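The semantics-preserving mutation step at the heart of a WAF-A-MoLE-style fuzzer can be sketched with two simplified operators, inline-comment injection and case flipping; these operators and the fixed mutation loop are illustrative assumptions, as the real fuzzer uses a richer operator set and keeps the variants that most reduce the detector's confidence.

```python
import random


def inject_comment(payload):
    """Replace the first space with an inline SQL comment: same query,
    different surface form."""
    return payload.replace(" ", "/**/", 1)


def flip_case(payload, rng):
    """Flip the case of one random letter. (A real operator would restrict
    this to SQL keywords so string-literal contents keep their meaning.)"""
    letters = [i for i, c in enumerate(payload) if c.isalpha()]
    if not letters:
        return payload
    i = rng.choice(letters)
    return payload[:i] + payload[i].swapcase() + payload[i + 1:]


def fuzz(payload, rounds=5, seed=0):
    """Toy mutational fuzzing loop: apply semantics-preserving mutations,
    guaranteeing at least one change via comment injection."""
    rng = random.Random(seed)
    mutated = inject_comment(payload)
    for _ in range(rounds):
        mutated = flip_case(mutated, rng)
    return mutated
```

Every variant normalizes back to the original query (strip comments, fold case), which is what makes the mutations functionality-preserving while still changing the byte sequence a detector sees.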
MDEA : malware detection with evolutionary adversarial learning
Many applications have used machine learning as a tool to detect malware. These
applications feed raw or processed binary data into neural network models to classify
files as benign or malicious. Even though this approach has proved effective against dynamic
changes, such as encryption, obfuscation, and packing techniques, it is vulnerable to
specific evasion attacks in which small changes to the input data cause
misclassification at test time. In this paper, I propose MDEA, an Adversarial Malware
Detection model that combines a neural network with evolutionary optimization of attack
samples to make the network robust against evasion attacks. By retraining the model with
the evolved malware samples, network performance improves by a large margin.
Computer Science
MDEA: Malware Detection with Evolutionary Adversarial Learning
Malware detection applications have used machine learning to detect malware in programs.
These applications feed raw or processed binary data into neural network
models to classify files as benign or malicious. Even though this approach has
proven effective against dynamic changes, such as encryption, obfuscation, and
packing techniques, it is vulnerable to specific evasion attacks in which
small changes in the input data cause misclassification at test time. This
paper proposes a new approach: MDEA, an Adversarial Malware Detection model that
uses evolutionary optimization to create attack samples and make the network
robust against evasion attacks. By retraining the model with the evolved
malware samples, its performance improves by a significant margin.
Comment: 8 pages, 6 figures
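The evolve-then-retrain loop that MDEA describes can be sketched against a toy token-density detector; the feature, the padding-based manipulation, and the GA parameters here are illustrative assumptions, not MDEA's actual design.

```python
import random


def density(sample):
    """Toy detector feature: density of a known-bad token."""
    return sample.count("EVIL") / max(len(sample), 1)


def detect(sample, threshold):
    return density(sample) >= threshold


def evolve_evasion(sample, threshold, seed=0, pop_size=20, generations=30):
    """Evolve an appended-padding length that pushes the detection score
    below the threshold -- a stand-in for MDEA's evolutionary attack step."""
    rng = random.Random(seed)
    score = lambda n: density(sample + "A" * n)  # lower score = fitter
    pop = [rng.randint(0, 50) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        if score(pop[0]) < threshold:  # best individual already evades
            break
        survivors = pop[: pop_size // 2]
        children = [max(0, n + rng.randint(-5, 10)) for n in survivors]
        pop = survivors + children
    return sample + "A" * min(pop, key=score)


def retrain(evolved_samples):
    """MDEA-style hardening: lower the threshold so every evolved
    variant is detected again."""
    return min(density(s) for s in evolved_samples)
```

The retraining step is where the robustness gain comes from: the evolved evasive samples are fed back as labeled malware, shifting the decision boundary they had slipped past.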
A NEAT Approach to Malware Classification
Current malware detection software often relies on machine learning, which is seen as an improvement over signature-based techniques. Problems with a machine learning based approach can arise when malware writers modify their code with the intent to evade detection. This leads to a cat-and-mouse situation where new models must constantly be trained to detect new malware variants. In this research, we experiment with genetic algorithms as a means of evolving machine learning models to detect malware. Genetic algorithms, which simulate natural selection, provide a way for models to adapt to continuous changes in malware families, and thereby improve detection rates. Specifically, we use the Neuro-Evolution of Augmenting Topologies
(NEAT) algorithm to optimize machine learning classifiers based on decision trees and neural networks. We compare the performance of our NEAT approach to standard models, including random forest and support vector machines.
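A minimal fixed-topology neuroevolution sketch in the spirit of this approach (full NEAT also evolves the network topology, which is omitted here); the two-feature "malware" dataset and all parameters are invented for illustration.

```python
import math
import random

# Invented toy dataset: (entropy-like score, suspicious-API ratio) -> label.
DATA = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.0), 0),
        ((0.8, 0.9), 1), ((0.9, 0.7), 1), ((0.7, 0.8), 1)]


def forward(w, x):
    """Single sigmoid neuron with weights w = (w1, w2, bias)."""
    z = w[0] * x[0] + w[1] * x[1] + w[2]
    return 1.0 / (1.0 + math.exp(-z))


def fitness(w):
    """Negative squared error: higher is better."""
    return -sum((forward(w, x) - y) ** 2 for x, y in DATA)


def evolve(pop_size=30, generations=100, seed=0):
    """Mutation-and-selection loop: keep the fittest third, refill the
    population with Gaussian-mutated copies."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 3]
        pop = parents + [[g + rng.gauss(0, 0.3) for g in p]
                         for p in parents for _ in range(2)]
    return max(pop, key=fitness)


def accuracy(w):
    return sum((forward(w, x) > 0.5) == bool(y) for x, y in DATA) / len(DATA)
```

Because fitness is evaluated on fresh data each generation, the same loop lets the classifier track a drifting malware family: the cat-and-mouse retraining happens through selection pressure rather than gradient descent.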
An Adaptive Feature Centric XG Boost Ensemble Classifier Model for Improved Malware Detection and Classification
Machine learning (ML) is often used to solve the problem of malware detection and classification, and various ML approaches have been adapted to it; yet many still achieve poor performance because of weak feature selection and classification. To address this issue, this paper presents an efficient Adaptive Feature Centric XG Boost Ensemble Learner Classifier, "AFC-XG Boost". The proposed model is designed to handle varying malware-detection data sets obtained from a Kaggle data set, and it organizes the XG Boost classifier into several stages to optimize performance. At the preprocessing stage, the data set is noise-removed, normalized, and tamper-removed using the Feature Base Optimizer (FBO) algorithm, which normalizes the data points and removes noise according to the feature values and their base information. Similarly, the performance of standard XG Boost is optimized by adapting feature selection with the Class Based Principal Component Analysis (CBPCA) algorithm, which selects features according to each feature's fitness for the different classes. Based on the selected features, the method generates a regression tree for each feature considered. From the generated trees, the method performs classification by computing Tree Level Ensemble Similarity (TLES) and Class Level Ensemble Similarity (CLES); using both, it computes a Class Match Similarity (CMS) value by which the malware is classified. The proposed approach achieves 97% accuracy in malware detection and classification with a low time complexity of 34 seconds for 75,000 samples.
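The staged, residual-fitting ensemble that XGBoost-style classifiers build can be sketched with decision stumps in plain Python; this is a generic gradient-boosting illustration of the underlying technique, not the paper's AFC-XG Boost pipeline.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split minimizing squared error."""
    best = None
    for f in range(len(X[0])):
        for s in sorted({x[f] for x in X}):
            left = [r for x, r in zip(X, residuals) if x[f] <= s]
            right = [r for x, r in zip(X, residuals) if x[f] > s]
            pl = sum(left) / len(left) if left else 0.0
            pr = sum(right) / len(right) if right else 0.0
            err = sum((r - (pl if x[f] <= s else pr)) ** 2
                      for x, r in zip(X, residuals))
            if best is None or err < best[0]:
                best = (err, f, s, pl, pr)
    return best[1:]


def boost(X, y, rounds=10, lr=0.5):
    """Fit stumps sequentially, each one on the residuals of the ensemble
    so far -- the core idea behind gradient-boosted trees."""
    preds = [0.0] * len(X)
    ensemble = []
    for _ in range(rounds):
        residuals = [yi - p for yi, p in zip(y, preds)]
        f, s, pl, pr = fit_stump(X, residuals)
        ensemble.append((f, s, pl, pr))
        preds = [p + lr * (pl if x[f] <= s else pr)
                 for x, p in zip(X, preds)]
    return ensemble


def predict(ensemble, x, lr=0.5):
    """Sum the shrunken contributions of every stump."""
    return sum(lr * (pl if x[f] <= s else pr) for f, s, pl, pr in ensemble)
```

Thresholding the summed score at 0.5 turns the regression ensemble into a binary malware/benign classifier; the learning rate `lr` plays the same shrinkage role as XGBoost's `eta` parameter.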
- …