APRICOT: A Dataset of Physical Adversarial Attacks on Object Detection
Physical adversarial attacks threaten to fool object detection systems, but
reproducible research on the real-world effectiveness of physical patches and
how to defend against them requires a publicly available benchmark dataset. We
present APRICOT, a collection of over 1,000 annotated photographs of printed
adversarial patches in public locations. The patches target several object
categories for three COCO-trained detection models, and the photos represent
natural variation in position, distance, lighting conditions, and viewing
angle. Our analysis suggests that maintaining adversarial robustness in
uncontrolled settings is highly challenging, but it is still possible to
produce targeted detections under white-box and sometimes black-box settings.
We establish baselines for defending against adversarial patches through
several methods, including a detector supervised with synthetic data and
unsupervised methods such as kernel density estimation, Bayesian uncertainty,
and reconstruction error. Our results suggest that adversarial patches can be
effectively flagged, both in a high-knowledge, attack-specific scenario, and in
an unsupervised setting where patches are detected as anomalies in natural
images. This dataset and the described experiments provide a benchmark for
future research on the effectiveness of and defenses against physical
adversarial objects in the wild.
Comment: 23 pages, 14 figures, 3 tables. Updated version as accepted to ECCV 202
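To make the unsupervised defence baselines concrete, the following is a minimal sketch of anomaly flagging with kernel density estimation: fit a KDE on features of benign (patch-free) image regions and flag regions with unusually low density as candidate adversarial patches. The feature source, dimensionality, and threshold below are illustrative assumptions, not the paper's exact pipeline.

    # Minimal sketch: KDE over benign-region features, low log-density flags a patch.
    import numpy as np
    from sklearn.neighbors import KernelDensity

    def fit_benign_density(benign_features: np.ndarray, bandwidth: float = 0.8) -> KernelDensity:
        """Fit a Gaussian KDE on features from benign (patch-free) image regions."""
        return KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(benign_features)

    def flag_anomalies(kde: KernelDensity, features: np.ndarray, threshold: float) -> np.ndarray:
        """Return a boolean mask marking regions whose log-density falls below threshold."""
        log_density = kde.score_samples(features)
        return log_density < threshold

    # Usage with random stand-in features; real features would come from a detector backbone.
    rng = np.random.default_rng(0)
    benign = rng.normal(size=(500, 64))    # hypothetical 64-d features of benign regions
    queries = rng.normal(size=(10, 64))
    kde = fit_benign_density(benign)
    suspicious = flag_anomalies(kde, queries, threshold=-150.0)  # threshold is illustrative
    print(suspicious)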
SLAP: Improving Physical Adversarial Examples with Short-Lived Adversarial Perturbations
Research into adversarial examples (AE) has developed rapidly, yet static
adversarial patches are still the main technique for conducting attacks in the
real world, despite being obvious, semi-permanent and unmodifiable once
deployed.
In this paper, we propose Short-Lived Adversarial Perturbations (SLAP), a
novel technique that allows adversaries to realize physically robust real-world
AE by using a light projector. Attackers can project a specifically crafted
adversarial perturbation onto a real-world object, transforming it into an AE.
This gives the adversary greater control over the attack than adversarial
patches afford: (i) projections can be dynamically turned on and off or
modified at will, and (ii) projections do not suffer from the locality
constraint imposed by patches, making them harder to detect.
We study the feasibility of SLAP in the self-driving scenario, targeting both
object detection and traffic sign recognition tasks, focusing on the detection
of stop signs. We conduct experiments in a variety of ambient light conditions,
including outdoors, showing how in non-bright settings the proposed method
generates AE that are extremely robust, causing misclassifications on
state-of-the-art networks with up to 99% success rate for a variety of angles
and distances. We also demonstrate that SLAP-generated AE do not exhibit the
detectable behaviours seen in adversarial patches and therefore bypass
SentiNet, a physical AE detection method. We evaluate other defences, including
an adaptive defender using adversarial learning, which can reduce the attack's
effectiveness by up to 80% even in favourable attacker conditions.
Comment: 13 pages, to be published in USENIX Security 2021, project page
https://github.com/ssloxford/short-lived-adversarial-perturbation
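As a rough illustration of the projector-based attack, the sketch below optimises a perturbation that is composited onto the sign region of an image so that a victim model's stop-sign score drops. The additive-blend projection model, the mask, and the victim_model callable are simplified stand-ins; SLAP itself uses a calibrated projector/camera colour model and real detection pipelines.

    # Sketch: optimise a "projected" light perturbation to suppress the stop-sign score.
    import torch

    def project(image: torch.Tensor, delta: torch.Tensor, mask: torch.Tensor,
                strength: float = 0.4) -> torch.Tensor:
        """Composite the projected light `delta` onto the masked sign region (simplified model)."""
        return torch.clamp(image + strength * mask * torch.tanh(delta), 0.0, 1.0)

    def craft_slap(victim_model, image, mask, stop_sign_class: int,
                   steps: int = 200, lr: float = 0.05) -> torch.Tensor:
        delta = torch.zeros_like(image, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            adv = project(image, delta, mask)
            logits = victim_model(adv.unsqueeze(0))   # (1, num_classes)
            loss = logits[0, stop_sign_class]         # push the stop-sign score down
            opt.zero_grad()
            loss.backward()
            opt.step()
        return project(image, delta, mask).detach()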
Self-Supervised Feature Learning by Learning to Spot Artifacts
We introduce a novel self-supervised learning method based on adversarial
training. Our objective is to train a discriminator network to distinguish real
images from images with synthetic artifacts, and then to extract features from
its intermediate layers that can be transferred to other data domains and
tasks. To generate images with artifacts, we pre-train a high-capacity
autoencoder and then we use a damage and repair strategy: First, we freeze the
autoencoder and damage the output of the encoder by randomly dropping its
entries. Second, we augment the decoder with a repair network, and train it in
an adversarial manner against the discriminator. The repair network helps
generate more realistic images by inpainting the dropped feature entries. To
make the discriminator focus on the artifacts, we also make it predict what
entries in the feature were dropped. We demonstrate experimentally that
features learned by creating and spotting artifacts achieve state-of-the-art
performance on several benchmarks.
Comment: CVPR 2018 (spotlight)
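The core of the damage-and-repair strategy can be summarised in a few lines: randomly drop entries of the frozen encoder's output, let a repair network inpaint only the dropped entries, and train a discriminator against the resulting images. The sketch below shows the damage step and the masked repair step with placeholder layer sizes; it is an illustration of the idea, not the paper's architecture.

    # Sketch: damage encoder features, repair only the dropped entries.
    import torch
    import torch.nn as nn

    class Repair(nn.Module):
        def __init__(self, channels: int = 64):
            super().__init__()
            self.net = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                     nn.ReLU(),
                                     nn.Conv2d(channels, channels, 3, padding=1))

        def forward(self, feats: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
            # Keep surviving entries; replace dropped entries with the repair network's output.
            return mask * feats + (1 - mask) * self.net(feats)

    def damage(feats: torch.Tensor, drop_prob: float = 0.5):
        """Randomly zero out feature entries; return damaged features and the keep-mask."""
        mask = (torch.rand_like(feats) > drop_prob).float()
        return feats * mask, mask

    # Hypothetical usage with a frozen autoencoder:
    #   feats = frozen_encoder(images); damaged, mask = damage(feats)
    #   fake = frozen_decoder(Repair()(damaged, mask))
    #   the discriminator then scores `fake` vs real images and predicts `mask`.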
MeshAdv: Adversarial Meshes for Visual Recognition
Highly expressive models such as deep neural networks (DNNs) have been widely
applied to various applications. However, recent studies show that DNNs are
vulnerable to adversarial examples, which are carefully crafted inputs aiming
to mislead the predictions. Currently, the majority of these studies have
focused on perturbations added to image pixels, but such manipulations are not
physically realistic. Some works have tried to overcome this limitation by
attaching printable 2D patches or painting patterns onto surfaces, but such
attacks can potentially be defended against because the objects' 3D shape
features remain intact. In this paper, we
propose meshAdv to generate "adversarial 3D meshes" from objects that have rich
shape features but minimal textural variation. To manipulate the shape or
texture of the objects, we make use of a differentiable renderer to compute
accurate shading on the shape and propagate the gradient. Extensive experiments
show that the generated 3D meshes are effective in attacking both classifiers
and object detectors. We evaluate the attack under different viewpoints. In
addition, we design a pipeline to perform black-box attack on a photorealistic
renderer with unknown rendering parameters.
Comment: Published in IEEE CVPR201
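At a high level, the attack optimises a vertex perturbation by rendering the perturbed mesh with a differentiable renderer and backpropagating a targeted classification loss to the vertices. In the sketch below, render and classifier are assumed stand-ins for a differentiable renderer and the victim model, and the L2 regulariser on the perturbation is an illustrative choice rather than the paper's exact objective.

    # Sketch: targeted attack on mesh vertices through a differentiable renderer.
    import torch
    import torch.nn.functional as F

    def attack_mesh(vertices, faces, textures, render, classifier,
                    target_class: int, steps: int = 100, lr: float = 1e-3, reg: float = 10.0):
        delta = torch.zeros_like(vertices, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            image = render(vertices + delta, faces, textures)   # (1, 3, H, W), differentiable
            logits = classifier(image)                          # (1, num_classes)
            loss = F.cross_entropy(logits, torch.tensor([target_class]))
            loss = loss + reg * delta.pow(2).mean()             # keep the shape change small
            opt.zero_grad()
            loss.backward()
            opt.step()
        return (vertices + delta).detach()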