SADA: Semantic Adversarial Diagnostic Attacks for Autonomous Applications

Authors: Bernard Ghanem, Abdullah Hamdi, Matthias Müller
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 02/12/2019

One major factor impeding more widespread adoption of deep neural networks (DNNs) is their lack of robustness, which is essential for safety-critical applications such as autonomous driving. This has motivated much recent work on adversarial attacks for DNNs, which mostly focuses on pixel-level perturbations void of semantic meaning. In contrast, we present a general framework for adversarial attacks on trained agents, which covers semantic perturbations to the environment of the agent performing the task as well as pixel-level attacks. To do this, we re-frame the adversarial attack problem as learning a distribution of parameters that always fools the agent. In the semantic case, our proposed adversary (denoted as BBGAN) is trained to sample parameters that describe the environment with which the black-box agent interacts, such that the agent performs its dedicated task poorly in this environment. We apply BBGAN on three different tasks, primarily targeting aspects of autonomous navigation: object detection, self-driving, and autonomous UAV racing. On these tasks, BBGAN can generate failure cases that consistently fool a trained agent.

Comment: Accepted at AAAI'2
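The idea of learning a distribution over environment parameters that makes a black-box agent fail can be illustrated with a toy sketch. This is not the authors' BBGAN: the agent, its score function, the two-dimensional parameter space, and every name below are hypothetical, and a per-dimension Gaussian fit stands in for a trained generator.

```python
import random
import statistics

def agent_score(params):
    """Hypothetical black-box agent: returns a task score in [0, 1].
    This toy agent performs worst when both parameters are near 0.8."""
    x, y = params
    return min(1.0, ((x - 0.8) ** 2 + (y - 0.8) ** 2) ** 0.5)

def fit_failure_distribution(score_fn, n_samples=2000, keep=100, seed=0):
    """Query the agent on random environments and model the worst ones."""
    rng = random.Random(seed)
    # Sample environment parameters uniformly from the unit square.
    samples = [(rng.random(), rng.random()) for _ in range(n_samples)]
    # Keep the parameters on which the agent scores lowest.
    worst = sorted(samples, key=score_fn)[:keep]
    # Fit an independent Gaussian per dimension (stand-in for a generator).
    means = [statistics.mean(dim) for dim in zip(*worst)]
    stds = [statistics.stdev(dim) for dim in zip(*worst)]
    return means, stds

def sample_adversarial(means, stds, rng):
    """Draw one parameter vector and clamp it to the valid range [0, 1]."""
    return tuple(min(1.0, max(0.0, rng.gauss(m, s))) for m, s in zip(means, stds))

means, stds = fit_failure_distribution(agent_score)
rng = random.Random(1)
adversarial = [sample_adversarial(means, stds, rng) for _ in range(200)]
adv_mean_score = statistics.mean(agent_score(p) for p in adversarial)
```

The agent is only ever queried for a score, never inspected, which is what makes the setting black-box; a lower `adv_mean_score` means the fitted distribution concentrates on environments where the toy agent fails.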

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Generative Adversarial Perturbations

Authors: Serge Belongie, Bicheng Gao, Isay Katsman, Omid Poursaeed
Publication date: 06/07/2018

In this paper, we propose novel generative models for creating adversarial examples, slightly perturbed images resembling natural images but maliciously crafted to fool pre-trained models. We present trainable deep neural networks for transforming images to adversarial perturbations. Our proposed models can produce image-agnostic and image-dependent perturbations for both targeted and non-targeted attacks. We also demonstrate that similar architectures can achieve impressive results in fooling classification and semantic segmentation models, obviating the need for hand-crafting attack methods for each task. Using extensive experiments on challenging high-resolution datasets such as ImageNet and Cityscapes, we show that our perturbations achieve high fooling rates with small perturbation norms. Moreover, our attacks are considerably faster than current iterative methods at inference time.

Comment: CVPR 2018, camera-ready versio
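The abstract does not say how the perturbation norm is kept small. One common approach, assumed here for illustration only, is to rescale the raw output of a generator to a fixed L-infinity budget, add it to the image, and clip the result to the valid pixel range. The function names, the flat-list image representation, and the budget `eps` are all hypothetical; the raw values stand in for a generator's output.

```python
def scale_to_linf(perturbation, epsilon):
    """Rescale a raw perturbation so its largest absolute entry equals epsilon."""
    peak = max(abs(v) for v in perturbation)
    if peak == 0:
        return list(perturbation)
    return [v * epsilon / peak for v in perturbation]

def apply_perturbation(image, perturbation, epsilon):
    """Add the scaled perturbation and clip pixels back to [0, 1]."""
    delta = scale_to_linf(perturbation, epsilon)
    return [min(1.0, max(0.0, p + d)) for p, d in zip(image, delta)]

# Stand-in for a generator's raw output (assumed values, not from the paper).
raw = [0.5, -2.0, 1.0, 0.0]
eps = 0.04

# Image-agnostic use: the same perturbation is applied to different images.
image_a = [0.1, 0.5, 0.9, 1.0]
image_b = [0.0, 0.02, 0.5, 0.3]
adv_a = apply_perturbation(image_a, raw, eps)
adv_b = apply_perturbation(image_b, raw, eps)
```

In an image-dependent variant, `raw` would instead be recomputed from each input image before scaling; the rescale-and-clip step stays the same.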

arXiv.org e-Print Archive

Crossref