4,169 research outputs found
Towards Adversarial Malware Detection: Lessons Learned from PDF-based Attacks
Malware still constitutes a major threat in the cybersecurity landscape, also
due to the widespread use of infection vectors such as documents. These
infection vectors hide embedded malicious code to the victim users,
facilitating the use of social engineering techniques to infect their machines.
Research showed that machine-learning algorithms provide effective detection
mechanisms against such threats, but the existence of an arms race in
adversarial settings has recently challenged such systems. In this work, we
focus on malware embedded in PDF files as a representative case of such an arms
race. We start by providing a comprehensive taxonomy of the different
approaches used to generate PDF malware, and of the corresponding
learning-based detection systems. We then categorize threats specifically
targeted against learning-based PDF malware detectors, using a well-established
framework in the field of adversarial machine learning. This framework allows
us to categorize known vulnerabilities of learning-based PDF malware detectors
and to identify novel attacks that may threaten such systems, along with the
potential defense mechanisms that can mitigate the impact of such threats. We
conclude the paper by discussing how such findings highlight promising research
directions towards tackling the more general challenge of designing robust
malware detectors in adversarial settings
CharBot: A Simple and Effective Method for Evading DGA Classifiers
Domain generation algorithms (DGAs) are commonly leveraged by malware to
create lists of domain names which can be used for command and control (C&C)
purposes. Approaches based on machine learning have recently been developed to
automatically detect generated domain names in real-time. In this work, we
present a novel DGA called CharBot which is capable of producing large numbers
of unregistered domain names that are not detected by state-of-the-art
classifiers for real-time detection of DGAs, including the recently published
methods FANCI (a random forest based on human-engineered features) and LSTM.MI
(a deep learning approach). CharBot is very simple, effective and requires no
knowledge of the targeted DGA classifiers. We show that retraining the
classifiers on CharBot samples is not a viable defense strategy. We believe
these findings show that DGA classifiers are inherently vulnerable to
adversarial attacks if they rely only on the domain name string to make a
decision. Designing a robust DGA classifier may, therefore, necessitate the use
of additional information besides the domain name alone. To the best of our
knowledge, CharBot is the simplest and most efficient black-box adversarial
attack against DGA classifiers proposed to date
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector
Machine Applications
Adversarial Detection of Flash Malware: Limitations and Open Issues
During the past four years, Flash malware has become one of the most
insidious threats to detect, with almost 600 critical vulnerabilities targeting
Adobe Flash disclosed in the wild. Research has shown that machine learning can
be successfully used to detect Flash malware by leveraging static analysis to
extract information from the structure of the file or its bytecode. However,
the robustness of Flash malware detectors against well-crafted evasion attempts
- also known as adversarial examples - has never been investigated. In this
paper, we propose a security evaluation of a novel, representative Flash
detector that embeds a combination of the prominent, static features employed
by state-of-the-art tools. In particular, we discuss how to craft adversarial
Flash malware examples, showing that it suffices to manipulate the
corresponding source malware samples slightly to evade detection. We then
empirically demonstrate that popular defense techniques proposed to mitigate
evasion attempts, including re-training on adversarial examples, may not always
be sufficient to ensure robustness. We argue that this occurs when the feature
vectors extracted from adversarial examples become indistinguishable from those
of benign data, meaning that the given feature representation is intrinsically
vulnerable. In this respect, we are the first to formally define and
quantitatively characterize this vulnerability, highlighting when an attack can
be countered by solely improving the security of the learning algorithm, or
when it requires also considering additional features. We conclude the paper by
suggesting alternative research directions to improve the security of
learning-based Flash malware detectors
- …