Search CORE

113 research outputs found

The Adversarial Attack and Detection under the Fisher Information Metric

Author: Fletcher P. Thomas
Peng Yaxin
Shen Chaomin
Yu Mixue
Zhang Guixu
Zhao Chenxiao
Publication venue
Publication date: 08/02/2019
Field of study

Many deep learning models are vulnerable to the adversarial attack, i.e., imperceptible but intentionally-designed perturbations to the input can cause incorrect output of the networks. In this paper, using information geometry, we provide a reasonable explanation for the vulnerability of deep learning models. By considering the data space as a non-linear space with the Fisher information metric induced from a neural network, we first propose an adversarial attack algorithm termed one-step spectral attack (OSSA). The method is described by a constrained quadratic form of the Fisher information matrix, where the optimal adversarial perturbation is given by the first eigenvector, and the model vulnerability is reflected by the eigenvalues. The larger an eigenvalue is, the more vulnerable the model is to be attacked by the corresponding eigenvector. Taking advantage of the property, we also propose an adversarial detection method with the eigenvalues serving as characteristics. Both our attack and detection algorithms are numerically optimized to work efficiently on large datasets. Our evaluations show superior performance compared with other methods, implying that the Fisher information is a promising approach to investigate the adversarial attacks and defenses.Comment: Accepted as an AAAI-2019 oral pape

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Adversarial Attacks on Deep Neural Networks for Time Series Classification

Author: Fawaz Hassan Ismail
Forestier Germain
Idoumghar Lhassane
Muller Pierre-Alain
Weber Jonathan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/04/2019
Field of study

Time Series Classification (TSC) problems are encountered in many real life data mining tasks ranging from medicine and security to human activity recognition and food safety. With the recent success of deep neural networks in various domains such as computer vision and natural language processing, researchers started adopting these techniques for solving time series data mining problems. However, to the best of our knowledge, no previous work has considered the vulnerability of deep learning models to adversarial time series examples, which could potentially make them unreliable in situations where the decision taken by the classifier is crucial such as in medicine and security. For computer vision problems, such attacks have been shown to be very easy to perform by altering the image and adding an imperceptible amount of noise to trick the network into wrongly classifying the input image. Following this line of work, we propose to leverage existing adversarial attack mechanisms to add a special noise to the input time series in order to decrease the network's confidence when classifying instances at test time. Our results reveal that current state-of-the-art deep learning time series classifiers are vulnerable to adversarial attacks which can have major consequences in multiple domains such as food safety and quality assurance.Comment: Accepted at IJCNN 201

arXiv.org e-Print Archive

Crossref

Hal-Diderot

Adversarial-Playground: A Visualization Suite Showing How Adversarial Examples Fool Deep Learning

Author: Norton Andrew P.
Qi Yanjun
Publication venue
Publication date: 01/08/2017
Field of study

Recent studies have shown that attackers can force deep learning models to misclassify so-called "adversarial examples": maliciously generated images formed by making imperceptible modifications to pixel values. With growing interest in deep learning for security applications, it is important for security experts and users of machine learning to recognize how learning systems may be attacked. Due to the complex nature of deep learning, it is challenging to understand how deep models can be fooled by adversarial examples. Thus, we present a web-based visualization tool, Adversarial-Playground, to demonstrate the efficacy of common adversarial methods against a convolutional neural network (CNN) system. Adversarial-Playground is educational, modular and interactive. (1) It enables non-experts to compare examples visually and to understand why an adversarial example can fool a CNN-based image classifier. (2) It can help security experts explore more vulnerability of deep learning as a software module. (3) Building an interactive visualization is challenging in this domain due to the large feature space of image classification (generating adversarial examples is slow in general and visualizing images are costly). Through multiple novel design choices, our tool can provide fast and accurate responses to user requests. Empirically, we find that our client-server division strategy reduced the response time by an average of 1.5 seconds per sample. Our other innovation, a faster variant of JSMA evasion algorithm, empirically performed twice as fast as JSMA and yet maintains a comparable evasion rate. Project source code and data from our experiments available at: https://github.com/QData/AdversarialDNN-PlaygroundComment: 5 pages. {I.2.6}{Artificial Intelligence} ; {K.6.5}{Management of Computing and Information Systems}{Security and Protection}. arXiv admin note: substantial text overlap with arXiv:1706.0176

arXiv.org e-Print Archive

Crossref

Towards the Transferable Audio Adversarial Attack via Ensemble Methods

Author: Chen Yuxuan
Guo Feng
Ju Lei
Sun Zheng
Publication venue
Publication date: 18/04/2023
Field of study

In recent years, deep learning (DL) models have achieved significant progress in many domains, such as autonomous driving, facial recognition, and speech recognition. However, the vulnerability of deep learning models to adversarial attacks has raised serious concerns in the community because of their insufficient robustness and generalization. Also, transferable attacks have become a prominent method for black-box attacks. In this work, we explore the potential factors that impact adversarial examples (AEs) transferability in DL-based speech recognition. We also discuss the vulnerability of different DL systems and the irregular nature of decision boundaries. Our results show a remarkable difference in the transferability of AEs between speech and images, with the data relevance being low in images but opposite in speech recognition. Motivated by dropout-based ensemble approaches, we propose random gradient ensembles and dynamic gradient-weighted ensembles, and we evaluate the impact of ensembles on the transferability of AEs. The results show that the AEs created by both approaches are valid for transfer to the black box API.Comment: Submitted to Cybersecurity journal 202

arXiv.org e-Print Archive