A new method for countering evasion adversarial attacks on information systems based on artificial intelligence
Modern artificial intelligence (AI) technologies are being used in a variety of fields, from science to everyday life.
However, the widespread use of AI-based systems has highlighted a problem with their vulnerability to adversarial
attacks. These attacks include methods of fooling or misleading an artificial neural network, disrupting its operations, and
causing it to make incorrect predictions. This study focuses on protecting image recognition models against adversarial
evasion attacks, which are recognized as the most challenging and dangerous. In these attacks, adversaries
craft adversarial data containing minor perturbations relative to the original image and then send it to a trained
model in an attempt to steer its response toward a desired outcome. These distortions can involve adding noise or even
changing a few pixels. In this paper, we consider the most relevant methods for generating adversarial data: the Fast
Gradient Sign Method (FGSM), the Square Attack (SQ), Projected Gradient Descent (PGD), the Basic
Iterative Method (BIM), the Carlini-Wagner method (CW), and the Jacobian-based Saliency Map Attack (JSMA). We also study
modern techniques for defending against evasion attacks, both through model modification, such as adversarial training and
defensive distillation, and through pre-processing of incoming data, including spatial smoothing, feature squeezing, JPEG
compression, and total variation minimization. While these methods are effective against certain types of attacks, to date, there is
no single method that can be used as a universal defense. Instead, we propose a new method that combines adversarial
training with image pre-processing. We suggest performing adversarial training on adversarial samples
generated by common attack methods, which the model can then defend against effectively. The image pre-processing aims
to counter attacks that were not considered during adversarial training, which makes it possible to protect the system from new
types of attacks. We propose using JPEG compression and feature squeezing at the pre-processing stage. This reduces
the impact of adversarial perturbations and effectively counteracts all of the considered attack types. We evaluated the
performance metrics of an image recognition model based on a convolutional neural network. The
experimental data included original images and adversarial images created using the FGSM, PGD, BIM, SQ, CW, and
JSMA attack methods. At the same time, adversarial training of the model was performed in experiments on data containing
only adversarial examples for the FGSM, PGD, and BIM attacks. The dataset used in the experiments was balanced.
The average image recognition accuracy was estimated on the crafted adversarial image datasets. It was concluded
that adversarial training is effective only in countering the attacks that were used during model training, while
pre-processing of incoming data is effective only against simpler attacks. The average recognition accuracy using
the developed method was 0.94, significantly higher than that of the other countermeasures considered. It has been
shown that the accuracy without any counteraction methods is approximately 0.19, while with adversarial training
it is 0.79. Spatial smoothing provides an accuracy of 0.58, and feature squeezing results in an accuracy of 0.88. JPEG
compression provides an accuracy of 0.37, total variation minimization 0.58, and defensive distillation 0.44. At
the same time, the image recognition accuracy provided by the developed method for the FGSM, PGD, BIM, SQ, CW, and JSMA
attacks is 0.99, 0.99, 0.98, 0.98, 0.99, and 0.73, respectively. The developed method is a more universal solution for
countering all of the considered attack types and works quite effectively against complex adversarial attacks such as CW and JSMA.
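As a minimal sketch of the feature-squeezing pre-processing step used in the developed method, bit-depth reduction quantizes pixel values so that small adversarial perturbations collapse back to the clean values. The bit depth (4 bits) and the pixel values below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def squeeze_bit_depth(image, bits=4):
    """Reduce color bit depth by quantizing [0, 1] pixel values to
    2**bits discrete levels (4 bits is an illustrative choice)."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels

clean = np.array([0.50, 0.20, 0.80])   # made-up pixel values
adv = clean + 0.02                     # a small adversarial perturbation
# After squeezing, the perturbed pixels collapse to the same quantized values.
print(np.allclose(squeeze_bit_depth(clean), squeeze_bit_depth(adv)))  # prints True
```

Perturbations smaller than half a quantization step are removed entirely, which is why this squeeze is effective against low-magnitude attacks.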
The developed method makes it possible to increase the accuracy of the image recognition model on adversarial images.
Unlike adversarial training alone, it also increases recognition accuracy on adversarial data generated by attacks not used
at the training stage. The results are useful for researchers and practitioners in the field of machine learning.
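For reference, the single-step FGSM attack listed among the methods above perturbs an input along the sign of the loss gradient. The sketch below uses a toy logistic-regression model with made-up weights (not the paper's CNN) so the input gradient has a closed form:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """Single-step FGSM against a logistic-regression 'model':
    x_adv = x + eps * sign(dL/dx), where L is the logistic loss."""
    p = sigmoid(w @ x + b)      # predicted probability of class 1
    grad_x = (p - y) * w        # closed-form input gradient of the loss
    return x + eps * np.sign(grad_x)

# Toy weights and input (illustrative values only).
w = np.array([2.0, -1.5])
b = 0.1
x = np.array([1.0, -1.0])       # true label y = 1
y = 1.0

p_clean = sigmoid(w @ x + b)
x_adv = fgsm(x, y, w, b, eps=0.5)
p_adv = sigmoid(w @ x_adv + b)
# The perturbation lowers the model's confidence in the true class.
print(round(p_clean, 3), round(p_adv, 3))  # prints 0.973 0.864
```

Iterating this step with a small step size and projecting back into an epsilon-ball gives the BIM and PGD attacks also considered in the paper.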
Adversarial Machine Learning For Advanced Medical Imaging Systems
Although deep neural networks (DNNs) have achieved significant advances in various challenging computer vision tasks, they are also known to be vulnerable to so-called adversarial attacks. With only imperceptibly small perturbations added to a clean image, adversarial samples can drastically change a model's predictions, resulting in a significant drop in DNN performance. This phenomenon poses a serious threat to security-critical applications of DNNs, such as medical imaging, autonomous driving, and surveillance systems. In this dissertation, we present adversarial machine learning approaches for natural image classification and advanced medical imaging systems.
We start by describing our advanced medical imaging systems, which tackle the major challenges of on-device deployment: automation, uncertainty, and resource constraints. This is followed by novel unsupervised and semi-supervised robust training schemes to enhance the adversarial robustness of these medical imaging systems. These methods are designed to tackle the unique challenges of defending medical imaging systems against adversarial attacks and are sufficiently flexible to generalize to various medical imaging modalities and problems. We continue by developing a novel training scheme to enhance the adversarial robustness of general DNN-based natural image classification models. Based on a unique insight into the predictive behavior of DNNs, namely that they tend to misclassify adversarial samples into the most probable false classes, we propose a new loss function as a drop-in replacement for the cross-entropy loss to improve DNNs' adversarial robustness. Specifically, it enlarges the probability gaps between the true class and the false classes and prevents those gaps from being closed by small perturbations. Finally, we conclude the dissertation by summarizing the original contributions and discussing future work that leverages a DNN interpretability constraint on adversarial training to tackle the central machine learning problem of the generalization gap.
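The abstract does not give the loss function itself, so the sketch below is a hypothetical stand-in for a cross-entropy replacement that enlarges the gap between the true-class probability and the top false-class probability; the hinge form and the `margin` value are assumptions for illustration only:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def probability_gap_loss(logits, true_class, margin=0.5):
    """Cross-entropy plus a hinge penalty that activates when the gap
    between the true-class probability and the largest false-class
    probability falls below `margin`. This formulation is an illustrative
    stand-in, not the dissertation's actual loss."""
    p = softmax(logits)
    p_true = p[true_class]
    p_false = np.max(np.delete(p, true_class))
    return -np.log(p_true) + max(0.0, margin - (p_true - p_false))

# A confident prediction keeps a large gap and incurs little penalty;
# a marginal one is penalized, pushing training to widen the gap.
confident = probability_gap_loss(np.array([4.0, 0.0, 0.0]), true_class=0)
marginal = probability_gap_loss(np.array([1.0, 0.9, 0.0]), true_class=0)
print(confident < marginal)  # prints True
```

Widening this gap means a perturbation must move more probability mass before the most probable false class overtakes the true class, which is the intuition the abstract describes.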