Efficient Defenses Against Adversarial Attacks
Following the recent adoption of deep neural networks (DNNs) across a wide
range of applications, adversarial attacks against these models have proven to
be an indisputable threat. Adversarial samples are crafted with a deliberate
intention of undermining a system. In the case of DNNs, the lack of a deeper
understanding of their inner workings has hindered the development of efficient
defenses. In this paper, we propose a new defense method based on practical
observations which is easy to integrate into models and performs better than
state-of-the-art defenses. Our proposed solution is meant to reinforce the
structure of a DNN, making its prediction more stable and less likely to be
fooled by adversarial samples. We conduct an extensive experimental study
demonstrating the effectiveness of our method against multiple attacks, comparing it to
numerous defenses, both in white-box and black-box setups. Additionally, the
implementation of our method brings almost no overhead to the training
procedure, while maintaining the prediction performance of the original model
on clean samples.
Comment: 16 pages
Adversarial Attack on Radar-based Environment Perception Systems
Due to their robustness to degraded capturing conditions, radars are widely
used for environment perception, which is a critical task in applications like
autonomous vehicles. More specifically, Ultra-Wide Band (UWB) radars are
particularly efficient in short-range settings, as they carry rich information
on the environment. Recent UWB-based systems rely on Machine Learning (ML) to
exploit the rich signature of these sensors. However, ML classifiers are
susceptible to adversarial examples, which are created from raw data to fool
the classifier such that it assigns the input to the wrong class. These attacks
represent a serious threat to system integrity, especially for safety-critical
applications. In this work, we present a new adversarial attack on UWB radars
in which an adversary injects adversarial radio noise in the wireless channel
to cause an obstacle recognition failure. First, based on signals collected in
a real-life environment, we show that conventional attacks fail to generate
robust noise under realistic conditions. To overcome these issues, we propose
a-RNA, i.e., the Adversarial Radio Noise Attack. Specifically, a-RNA generates
adversarial noise that is effective without synchronization between the input
signal and the noise. Moreover, a-RNA-generated noise is, by design, robust
against pre-processing countermeasures such as filtering-based defenses.
Finally, in addition to meeting an undetectability objective through a limited
noise magnitude budget, a-RNA remains effective against sophisticated defenses
in the spectral domain by introducing a frequency budget. We believe this work
should raise awareness of the potentially critical implications of adversarial
attacks on radar systems, which should be taken seriously.
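The abstract does not spell out the optimization itself, so the following is only a rough sketch of how a synchronization-free noise attack with both a magnitude budget and a frequency budget might be implemented; the model interface, loss, and all hyper-parameters here are assumptions for illustration, not the authors' published code.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of an a-RNA-style attack. None of these names or
# hyper-parameters come from the paper; they only illustrate the three
# stated constraints: no synchronization, a magnitude budget, and a
# frequency budget.

def arna_sketch(model, signal, label, eps=0.05, freq_budget=0.25,
                steps=100, lr=0.01):
    """Craft radio noise against a 1-D signal classifier.

    signal: (1, L) clean radar signal; label: (1,) true class index.
    eps: L-inf magnitude budget; freq_budget: assumed fraction of
    low-frequency bins the noise may occupy.
    """
    L = signal.shape[-1]
    delta = torch.zeros_like(signal, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        # Random circular shift: the noise must work no matter when it
        # hits the channel, i.e., without synchronization.
        shift = int(torch.randint(0, L, (1,)))
        noisy = signal + torch.roll(delta, shifts=shift, dims=-1)
        loss = -F.cross_entropy(model(noisy), label)  # push off the true class
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            # Frequency budget: zero out spectral bins above the cutoff.
            spec = torch.fft.rfft(delta)
            cutoff = int(freq_budget * spec.shape[-1])
            spec[..., cutoff:] = 0
            delta.copy_(torch.fft.irfft(spec, n=L))
            # Magnitude budget: L-inf projection.
            delta.clamp_(-eps, eps)
    return delta.detach()
```

Projecting after every optimizer step keeps the noise inside both budgets throughout, rather than only at the end, which is the usual way such constraints are enforced in projected-gradient-style attacks.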
Blacklight: Defending Black-Box Adversarial Attacks on Deep Neural Networks
The vulnerability of deep neural networks (DNNs) to adversarial examples is
well documented. Under the strong white-box threat model, where attackers have
full access to DNN internals, recent work has produced continual advancements
in defenses, often followed by more powerful attacks that break them.
Meanwhile, research on the more realistic black-box threat model has focused
almost entirely on reducing the query-cost of attacks, making them increasingly
practical for ML models already deployed today.
This paper proposes and evaluates Blacklight, a new defense against black-box
adversarial attacks. Blacklight targets a key property of black-box attacks: to
compute adversarial examples, they produce sequences of highly similar images
while trying to minimize the distance from some initial benign input. To detect
an attack, Blacklight computes for each query image a compact set of one-way
hash values that form a probabilistic fingerprint. Variants of an image produce
nearly identical fingerprints, and fingerprint generation is robust against
manipulation. We evaluate Blacklight on 5 state-of-the-art black-box attacks,
across a variety of models and classification tasks. While the most efficient
attacks take thousands or tens of thousands of queries to complete, Blacklight
identifies them all, often after only a handful of queries. Blacklight is also
robust against several powerful countermeasures, including an optimal black-box
attack that approximates white-box attacks in efficiency. Finally, Blacklight
significantly outperforms the only known alternative in both detection coverage
of attack queries and resistance against persistent attackers.
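As a loose illustration of the fingerprinting idea summarized above: quantize each query, hash overlapping windows of the quantized bytes with a one-way hash, keep a small subset as the fingerprint, and flag queries whose fingerprints overlap heavily with an earlier one. The window size, quantization step, subset size, and matching threshold below are invented for the sketch, not Blacklight's published parameters.

```python
import hashlib
import numpy as np

def fingerprint(image, q=50, window=20, top_k=50):
    """Hash sliding windows of a quantized image into a compact set."""
    data = (np.asarray(image, dtype=np.float32) * 255 // q).astype(np.uint8)
    data = data.ravel().tobytes()
    hashes = set()
    for i in range(0, len(data) - window, window // 2):
        hashes.add(hashlib.sha256(data[i:i + window]).hexdigest())
    # Keep the lexicographically smallest top_k hashes so that similar
    # inputs select mostly the same subset (a min-hash-style trick).
    return set(sorted(hashes)[:top_k])

class Detector:
    def __init__(self, match_threshold=25):
        self.seen = []              # fingerprints of past queries
        self.t = match_threshold    # shared hashes that count as "same"

    def query(self, image):
        fp = fingerprint(image)
        attack = any(len(fp & old) >= self.t for old in self.seen)
        self.seen.append(fp)
        return attack               # True -> likely part of an attack
```

The key property this sketch mimics is that near-duplicate queries, which black-box attacks must issue in long sequences, collide on most hash windows, while unrelated benign images almost never do.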
Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks
We consider adversarial examples for image classification in the black-box
decision-based setting. Here, an attacker cannot access confidence scores, but
only the final label. Most attacks for this scenario are either unreliable or
inefficient. Focusing on the latter, we show that a specific class of attacks,
Boundary Attacks, can be reinterpreted as a biased sampling framework that
gains efficiency from domain knowledge. We identify three such biases, image
frequency, regional masks and surrogate gradients, and evaluate their
performance against an ImageNet classifier. We show that the combination of
these biases outperforms the state of the art by a wide margin. We also
showcase an efficient way to attack the Google Cloud Vision API, where we craft
convincing perturbations with just a few hundred queries. Finally, the methods
we propose have also been found to work very well against strong defenses: Our
targeted attack won second place in the NeurIPS 2018 Adversarial Vision
Challenge.
Comment: For source code and videos, see
https://github.com/ttbrunner/biased_boundary_attac
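A minimal sketch of what one biased proposal step could look like, using a low-pass filter as a stand-in for the frequency bias and an optional mask for the regional bias; the step sizes and update rule are simplifications for illustration, not the released implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def biased_step(x_adv, x_orig, mask=None, sigma=4.0, step=0.01):
    """Propose a candidate closer to x_orig, perturbed by biased noise.

    x_adv: current adversarial image in [0, 1]; x_orig: benign image;
    mask: optional 0/1 array concentrating noise on salient regions.
    """
    noise = np.random.randn(*x_adv.shape)
    noise = gaussian_filter(noise, sigma=sigma)      # frequency bias
    if mask is not None:
        noise *= mask                                # regional bias
    noise /= np.linalg.norm(noise) + 1e-12
    # Mix a contraction toward the source with the biased random direction.
    candidate = x_adv + step * (x_orig - x_adv) + step * noise
    return np.clip(candidate, 0.0, 1.0)

# Usage in a decision-based loop: keep `candidate` only if the black-box
# model still assigns it the adversarial label, e.g.
#   if label(candidate) != label(x_orig):
#       x_adv = candidate
```

Biasing the proposal distribution this way does not change the accept/reject logic of a Boundary Attack; it only makes each query more likely to be accepted, which is where the query-efficiency gain comes from.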
Mockingbird: Defending Against Deep-Learning-Based Website Fingerprinting Attacks with Adversarial Traces
Website Fingerprinting (WF) is a type of traffic analysis attack that enables
a local passive eavesdropper to infer the victim's activity, even when the
traffic is protected by a VPN or an anonymity system like Tor. Leveraging a
deep-learning classifier, a WF attacker can gain over 98% accuracy on Tor
traffic. In this paper, we explore a novel defense, Mockingbird, based on the
idea of adversarial examples that have been shown to undermine machine-learning
classifiers in other domains. Since the attacker gets to design and train their
attack classifier based on the defense, we first demonstrate that a
straightforward technique for generating adversarial-example-based traces fails
to protect against an attacker using adversarial training for robust
classification. We then propose Mockingbird, a technique for generating traces
that resists adversarial training by moving randomly in the space of viable
traces and not following more predictable gradients. The technique drops the
accuracy of the state-of-the-art attack hardened with adversarial training from
98% to 42-58% while incurring only 58% bandwidth overhead. The attack accuracy
is generally lower than state-of-the-art defenses, and much lower when
considering Top-2 accuracy, while incurring lower bandwidth overheads.
Comment: 18 pages, 13 figures, and 8 tables. Accepted in IEEE Transactions on
Information Forensics and Security (TIFS).
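A loose sketch of the random-walk idea described above: repeatedly move the defended trace toward a randomly chosen target trace from a different site, keeping only changes that add padding (so the trace stays viable) and that lower the classifier's confidence in the true site. The `confidence` oracle, step sizes, and viability rule are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def mockingbird_sketch(trace, true_site, pool, confidence,
                       rounds=100, alpha=0.3):
    """trace: burst-sequence vector; pool: traces from other sites;
    confidence(t, c): assumed oracle giving the detector's confidence
    that trace t belongs to class c."""
    defended = trace.astype(np.float64).copy()
    for _ in range(rounds):
        target = pool[np.random.randint(len(pool))]   # random direction
        step = alpha * np.random.rand()               # random step size
        move = step * (target - defended)
        # Viability: website traces can only be padded, never shrunk,
        # so keep only the components that grow the bursts.
        move = np.maximum(move, 0.0)
        candidate = defended + move
        if confidence(candidate, true_site) < confidence(defended, true_site):
            defended = candidate                      # accept the move
    return defended
```

Because the walk follows randomly drawn targets rather than the classifier's gradient, an adversarially trained attacker cannot easily anticipate the perturbation directions, which is the property the defense relies on.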