Secure Detection of Image Manipulation by means of Random Feature Selection
We address the problem of data-driven image manipulation detection in the
presence of an attacker with limited knowledge about the detector.
Specifically, we assume that the attacker knows the architecture of the
detector, the training data and the class of features V the detector can rely
on. To gain an advantage in the arms race with the attacker, the analyst
designs the detector by relying on a subset of features chosen at random from
V. Since the adversary does not know the exact feature set, it
attacks a version of the detector based on the entire feature set. In this way,
the effectiveness of the attack diminishes since there is no guarantee that
attacking a detector working in the full feature space will result in a
successful attack against the reduced-feature detector. We theoretically prove
that, thanks to random feature selection, the security of the detector
increases significantly at the expense of a negligible loss of performance in
the absence of attacks. We also provide an experimental validation of the
proposed procedure by focusing on the detection of two specific kinds of image
manipulations, namely adaptive histogram equalization and median filtering. The
experiments confirm the gain in security at the expense of a negligible loss of
performance in the absence of attacks.
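As a rough illustration of the randomization defense described above, the sketch below trains two detectors: one on the full candidate feature set (the version the attacker can target) and one on a secret random subset (the version actually deployed). The synthetic data, the 50% subset size and the linear SVM are assumptions made for the example, not the authors' experimental setup.

    # Random-feature-selection defense: the deployed detector relies only on a
    # secret, randomly chosen subset of the candidate feature set V.
    import numpy as np
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)

    def train_detector(X, y, feature_idx):
        # Train a detector restricted to the given feature indices.
        clf = LinearSVC(dual=False)
        clf.fit(X[:, feature_idx], y)
        return clf

    # Synthetic stand-ins: X holds feature vectors from the class V,
    # y labels images as pristine (0) or manipulated (1).
    n, d = 2000, 300
    X = rng.normal(size=(n, d))
    y = rng.integers(0, 2, size=n)

    full_idx = np.arange(d)                                  # feature set known to the attacker
    secret_idx = rng.choice(d, size=d // 2, replace=False)   # analyst's secret random subset

    attacked_model = train_detector(X, y, full_idx)    # what the attacker optimizes against
    deployed_model = train_detector(X, y, secret_idx)  # what is actually deployed

    # An adversarial image crafted against attacked_model is then re-checked by
    # deployed_model, which only sees the secret subset of features, so the
    # attack is not guaranteed to transfer.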
Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning
Learning-based pattern classifiers, including deep networks, have shown
impressive performance in several application domains, ranging from computer
vision to cybersecurity. However, it has also been shown that adversarial input
perturbations carefully crafted either at training or at test time can easily
subvert their predictions. The vulnerability of machine learning to such wild
patterns (also referred to as adversarial examples), along with the design of
suitable countermeasures, has been investigated in the research field of
adversarial machine learning. In this work, we provide a thorough overview of
the evolution of this research area over the last ten years and beyond,
starting from pioneering, earlier work on the security of non-deep learning
algorithms up to more recent work aimed at understanding the security properties
of deep learning algorithms, in the context of computer vision and
cybersecurity tasks. We report interesting connections between these
apparently different lines of work, highlighting common misconceptions related
to the security evaluation of machine-learning algorithms. We review the main
threat models and attacks defined to this end, and discuss the main limitations
of current work, along with the corresponding future challenges towards the
design of more secure learning algorithms.
Comment: Accepted for publication in Pattern Recognition, 2018.
Adversarial Detection of Flash Malware: Limitations and Open Issues
During the past four years, Flash malware has become one of the most
insidious threats to detect, with almost 600 critical vulnerabilities targeting
Adobe Flash disclosed in the wild. Research has shown that machine learning can
be successfully used to detect Flash malware by leveraging static analysis to
extract information from the structure of the file or its bytecode. However,
the robustness of Flash malware detectors against well-crafted evasion attempts
- also known as adversarial examples - has never been investigated. In this
paper, we propose a security evaluation of a novel, representative Flash
detector that embeds a combination of the prominent, static features employed
by state-of-the-art tools. In particular, we discuss how to craft adversarial
Flash malware examples, showing that it suffices to manipulate the
corresponding source malware samples slightly to evade detection. We then
empirically demonstrate that popular defense techniques proposed to mitigate
evasion attempts, including re-training on adversarial examples, may not always
be sufficient to ensure robustness. We argue that this occurs when the feature
vectors extracted from adversarial examples become indistinguishable from those
of benign data, meaning that the given feature representation is intrinsically
vulnerable. In this respect, we are the first to formally define and
quantitatively characterize this vulnerability, highlighting when an attack can
be countered by solely improving the security of the learning algorithm, or
when it requires also considering additional features. We conclude the paper by
suggesting alternative research directions to improve the security of
learning-based Flash malware detectors.
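The feature-space sketch below illustrates, in a generic and simplified form, the two ingredients discussed above: an additive evasion attack that only inserts content (so the malware's functionality is preserved) and adversarial re-training on the resulting examples. The bag-of-features representation, the greedy attack and the logistic-regression detector are assumptions for illustration, not the detector or attacks evaluated in the paper.

    # Greedy additive evasion against a linear detector, plus adversarial re-training.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n, d = 1000, 50
    X = rng.integers(0, 5, size=(n, d)).astype(float)   # counts of static features
    y = rng.integers(0, 2, size=n)                       # 1 = malicious (synthetic stand-in)

    clf = LogisticRegression(max_iter=1000).fit(X, y)

    def additive_evasion(x, model, budget=10):
        # Only additions are allowed, mimicking content injection that keeps the
        # original sample functional; repeatedly add the feature whose weight
        # decreases the malicious score the most.
        x = x.copy()
        w = model.coef_[0]
        j = int(np.argmin(w))                # most 'benign-looking' feature
        for _ in range(budget):
            x[j] += 1.0
            if model.predict([x])[0] == 0:   # detector is evaded
                break
        return x

    # Adversarial re-training: augment the training set with evasive variants.
    adv = np.array([additive_evasion(x, clf) for x in X[y == 1][:100]])
    X_aug = np.vstack([X, adv])
    y_aug = np.concatenate([y, np.ones(len(adv))])
    clf_hardened = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)

When the evasive feature vectors become indistinguishable from benign ones, re-training of this kind can no longer help, which is the intrinsic vulnerability of the feature representation that the paper formalizes.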
Effectiveness of random deep feature selection for securing image manipulation detectors against adversarial examples
We investigate whether the random feature selection approach proposed in [1] to
improve the robustness of forensic detectors to targeted attacks can be
extended to detectors based on deep learning features. In particular, we study
the transferability of adversarial examples targeting an original CNN image
manipulation detector to other detectors (a fully connected neural network and
a linear SVM) that rely on a random subset of the features extracted from the
flatten layer of the original network. The results obtained by considering three
image manipulation detection tasks (resizing, median filtering and adaptive
histogram equalization), two original network architectures and three classes
of attacks show that feature randomization helps hinder attack transferability,
even though, in some cases, simply changing the architecture of the detector, or
even just retraining it, is enough to prevent the attacks from transferring.
Comment: Submitted to the ICASSP conference to be held in 2020, Barcelona,
Spain.
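A rough sketch of the reduced-feature setup studied above: deep features are taken from the flatten layer of a CNN detector, and a secondary detector is trained on a random subset of them. The tiny CNN, the 25% subset size and the linear SVM are illustrative assumptions, not the architectures or settings used in the paper.

    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.svm import LinearSVC

    class TinyCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Flatten(),                      # the 'flatten layer'
            )
            self.classifier = nn.Linear(16 * 16 * 16, 2)

        def forward(self, x):
            return self.classifier(self.features(x))

    net = TinyCNN().eval()                   # stands in for the original CNN detector
    images = torch.randn(64, 1, 64, 64)      # stand-ins for pristine/manipulated patches
    labels = np.random.randint(0, 2, size=64)

    with torch.no_grad():
        feats = net.features(images).numpy()  # flatten-layer features

    rng = np.random.default_rng(0)
    subset = rng.choice(feats.shape[1], size=feats.shape[1] // 4, replace=False)

    # Reduced-feature detector: a linear SVM on a random subset of deep features.
    svm = LinearSVC(dual=False).fit(feats[:, subset], labels)
    # Adversarial images crafted against `net` are then evaluated with `svm`
    # to measure how well the attack transfers.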
Spectral Signatures in Backdoor Attacks
A recent line of work has uncovered a new form of data poisoning: so-called
\emph{backdoor} attacks. These attacks are particularly dangerous because they
do not affect a network's behavior on typical, benign data. Rather, the network
only deviates from its expected output when triggered by a perturbation planted
by an adversary.
In this paper, we identify a new property of all known backdoor attacks,
which we call \emph{spectral signatures}. This property allows us to utilize
tools from robust statistics to thwart the attacks. We demonstrate the efficacy
of these signatures in detecting and removing poisoned examples on real image
sets and state-of-the-art neural network architectures. We believe that
understanding spectral signatures is a crucial first step towards designing ML
systems secure against such backdoor attacks.
Comment: 16 pages, accepted to NIPS 2018.
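A minimal sketch of the spectral-signature outlier score described above: center the learned representations of one class, project them onto the top singular direction, and flag the examples with the largest squared projections as likely poisoned. The synthetic representations and the assumed poisoning rate are placeholders for illustration.

    import numpy as np

    def spectral_scores(R):
        # R: n x d matrix of learned representations for one class label.
        M = R - R.mean(axis=0, keepdims=True)          # center the representations
        _, _, Vt = np.linalg.svd(M, full_matrices=False)
        return (M @ Vt[0]) ** 2                        # outlier score per example

    rng = np.random.default_rng(0)
    clean = rng.normal(size=(950, 128))
    poisoned = rng.normal(size=(50, 128)) + 3.0 * rng.normal(size=128)  # shifted cluster
    R = np.vstack([clean, poisoned])

    scores = spectral_scores(R)
    eps = 0.05                                # assumed upper bound on the poisoning rate
    k = int(1.5 * eps * len(R))               # remove the top-scoring examples
    suspects = np.argsort(scores)[-k:]        # indices flagged for removal before re-training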
Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks
Transferability captures the ability of an attack against a machine-learning
model to be effective against a different, potentially unknown, model.
Empirical evidence for transferability has been shown in previous work, but the
underlying reasons why an attack transfers or not are not yet well understood.
In this paper, we present a comprehensive analysis aimed to investigate the
transferability of both test-time evasion and training-time poisoning attacks.
We provide a unifying optimization framework for evasion and poisoning attacks,
and a formal definition of transferability of such attacks. We highlight two
main factors contributing to attack transferability: the intrinsic adversarial
vulnerability of the target model, and the complexity of the surrogate model
used to optimize the attack. Based on these insights, we define three metrics
that impact an attack's transferability. Interestingly, our results derived
from theoretical analysis hold for both evasion and poisoning attacks, and are
confirmed experimentally using a wide range of linear and non-linear
classifiers and datasets.
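The sketch below illustrates how the transferability of an evasion attack can be measured empirically: perturbations are optimized against a surrogate model and then evaluated on a separately trained target model. The linear models, synthetic data and perturbation budget are assumptions made for the example, not the paper's framework or experimental setup.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, d = 2000, 20
    X = rng.normal(size=(n, d))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

    surrogate = LogisticRegression(max_iter=1000).fit(X[:1000], y[:1000])
    target = LogisticRegression(C=0.1, max_iter=1000).fit(X[1000:], y[1000:])

    def evade(x, y_true, model, eps=0.5):
        # One-step gradient evasion on a linear model: move the sample against
        # its true class, within an L2 budget of eps.
        w = model.coef_[0]
        step = eps * w / np.linalg.norm(w)
        return x - step if y_true == 1 else x + step

    X_test, y_test = X[1000:1200], y[1000:1200]
    X_adv = np.array([evade(x, t, surrogate) for x, t in zip(X_test, y_test)])

    # Transferability: fraction of adversarial points misclassified by the target,
    # even though the perturbations were optimized only against the surrogate.
    transfer_rate = np.mean(target.predict(X_adv) != y_test)
    print(f"transfer rate: {transfer_rate:.2f}")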