30 research outputs found
Evading Classifiers by Morphing in the Dark
Learning-based systems have been shown to be vulnerable to evasion through
adversarial data manipulation. These attacks have been studied under
assumptions that the adversary has certain knowledge of either the target model
internals, its training dataset or at least classification scores it assigns to
input samples. In this paper, we investigate a much more constrained and
realistic attack scenario wherein the target classifier is minimally exposed to
the adversary, revealing on its final classification decision (e.g., reject or
accept an input sample). Moreover, the adversary can only manipulate malicious
samples using a blackbox morpher. That is, the adversary has to evade the
target classifier by morphing malicious samples "in the dark". We present a
scoring mechanism that can assign a real-value score which reflects evasion
progress to each sample based on the limited information available. Leveraging
on such scoring mechanism, we propose an evasion method -- EvadeHC -- and
evaluate it against two PDF malware detectors, namely PDFRate and Hidost. The
experimental evaluation demonstrates that the proposed evasion attacks are
effective, attaining evasion rate on the evaluation dataset.
Interestingly, EvadeHC outperforms the known classifier evasion technique that
operates based on classification scores output by the classifiers. Although our
evaluations are conducted on PDF malware classifier, the proposed approaches
are domain-agnostic and is of wider application to other learning-based
systems
A Formalization of Robustness for Deep Neural Networks
Deep neural networks have been shown to lack robustness to small input
perturbations. The process of generating the perturbations that expose the lack
of robustness of neural networks is known as adversarial input generation. This
process depends on the goals and capabilities of the adversary, In this paper,
we propose a unifying formalization of the adversarial input generation process
from a formal methods perspective. We provide a definition of robustness that
is general enough to capture different formulations. The expressiveness of our
formalization is shown by modeling and comparing a variety of adversarial
attack techniques