A new method for countering evasion adversarial attacks on information systems based on artificial intelligence
Modern artificial intelligence (AI) technologies are being used in a variety of fields, from science to everyday life.
However, the widespread use of AI-based systems has highlighted a problem with their vulnerability to adversarial
attacks. These attacks include methods of fooling or misleading an artificial neural network, disrupting its operations, and
causing it to make incorrect predictions. This study focuses on protecting image recognition models against adversarial
evasion attacks, which are recognized as the most challenging and dangerous. In these attacks, adversaries
craft adversarial data containing minor perturbations relative to the original image and then send it to a trained
model in an attempt to steer its response toward a desired outcome. These distortions can involve adding noise or even
changing a few pixels. In this paper, we consider the most relevant methods for generating adversarial data: the Fast
Gradient Sign Method (FGSM), the Square Attack (SQ), Projected Gradient Descent (PGD), the Basic
Iterative Method (BIM), the Carlini-Wagner method (CW), and the Jacobian-based Saliency Map Attack (JSMA). We also study
modern techniques for defending against evasion attacks, both through model modification, such as adversarial training and
defensive distillation, and through pre-processing of incoming data, including spatial smoothing, feature squeezing, JPEG
compression, and total variation minimization. While these methods are effective against certain types of attacks, to date, there is
no single method that can be used as a universal defense. Instead, we propose a new method that combines adversarial
training with image pre-processing. We suggest performing adversarial training on adversarial samples
generated by common attack methods, which the model can then defend against effectively. The image pre-processing aims
to counter attacks that were not considered during adversarial training, which makes it possible to protect the system from new
types of attacks. We propose using JPEG compression and feature squeezing at the pre-processing stage. This reduces
the impact of adversarial perturbations and effectively counteracts all of the considered attack types. We evaluated the
performance metrics of an image recognition model based on a convolutional neural network. The
experimental data included original images and adversarial images created using the FGSM, PGD, BIM, SQ, CW, and
JSMA attack methods. At the same time, adversarial training of the model was performed in experiments on data containing
only adversarial examples for the FGSM, PGD, and BIM attacks. The dataset used in the experiments was balanced.
The average image recognition accuracy was estimated on the crafted adversarial image datasets. It was concluded
that adversarial training is effective only in countering the attacks that were used during model training, while
pre-processing of incoming data is effective only against simpler attacks. The average recognition accuracy using
the developed method was 0.94, significantly higher than that of the other countermeasures considered. It has been
shown that the accuracy without any counteraction methods is approximately 0.19, while with adversarial training
it is 0.79. Spatial smoothing provides an accuracy of 0.58, and feature squeezing results in an accuracy of 0.88. JPEG
compression provides an accuracy of 0.37, total variation minimization 0.58, and defensive distillation 0.44. At
the same time, the image recognition accuracy provided by the developed method for the FGSM, PGD, BIM, SQ, CW, and JSMA
attacks is 0.99, 0.99, 0.98, 0.98, 0.99, and 0.73, respectively. The developed method is a more universal solution for
countering all of the considered attack types and works quite effectively against complex adversarial attacks such as CW and JSMA.
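As a minimal sketch of the feature-squeezing pre-processing step used in the developed method, bit-depth reduction quantizes pixel values so that small adversarial perturbations collapse back to the clean values. The bit depth (4 bits) and the pixel values below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def squeeze_bit_depth(image, bits=4):
    """Reduce color bit depth by quantizing [0, 1] pixel values to
    2**bits discrete levels (4 bits is an illustrative choice)."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels

clean = np.array([0.50, 0.20, 0.80])   # made-up pixel values
adv = clean + 0.02                     # a small adversarial perturbation
# After squeezing, the perturbed pixels collapse to the same quantized values.
print(np.allclose(squeeze_bit_depth(clean), squeeze_bit_depth(adv)))  # prints True
```

Perturbations smaller than half a quantization step are removed entirely, which is why this squeeze is effective against low-magnitude attacks.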
The developed method makes it possible to increase the accuracy of the image recognition model on adversarial images.
Unlike adversarial training alone, it also increases recognition accuracy on adversarial data generated by attacks not used
at the training stage. The results are useful for researchers and practitioners in the field of machine learning.
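For reference, the single-step FGSM attack listed among the methods above perturbs an input along the sign of the loss gradient. The sketch below uses a toy logistic-regression model with made-up weights (not the paper's CNN) so the input gradient has a closed form:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """Single-step FGSM against a logistic-regression 'model':
    x_adv = x + eps * sign(dL/dx), where L is the logistic loss."""
    p = sigmoid(w @ x + b)      # predicted probability of class 1
    grad_x = (p - y) * w        # closed-form input gradient of the loss
    return x + eps * np.sign(grad_x)

# Toy weights and input (illustrative values only).
w = np.array([2.0, -1.5])
b = 0.1
x = np.array([1.0, -1.0])       # true label y = 1
y = 1.0

p_clean = sigmoid(w @ x + b)
x_adv = fgsm(x, y, w, b, eps=0.5)
p_adv = sigmoid(w @ x_adv + b)
# The perturbation lowers the model's confidence in the true class.
print(round(p_clean, 3), round(p_adv, 3))  # prints 0.973 0.864
```

Iterating this step with a small step size and projecting back into an epsilon-ball gives the BIM and PGD attacks also considered in the paper.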
Adversarial Machine Learning For Advanced Medical Imaging Systems
Although deep neural networks (DNNs) have achieved significant advances in various challenging computer vision tasks, they are also known to be vulnerable to so-called adversarial attacks. With only imperceptibly small perturbations added to a clean image, adversarial samples can drastically change a model's predictions, resulting in a significant drop in DNN performance. This phenomenon poses a serious threat to security-critical applications of DNNs, such as medical imaging, autonomous driving, and surveillance systems. In this dissertation, we present adversarial machine learning approaches for natural image classification and advanced medical imaging systems.
We start by describing our advanced medical imaging systems, which tackle the major challenges of on-device deployment: automation, uncertainty, and resource constraints. This is followed by novel unsupervised and semi-supervised robust training schemes to enhance the adversarial robustness of these medical imaging systems. These methods are designed to tackle the unique challenges of defending medical imaging systems against adversarial attacks and are sufficiently flexible to generalize to various medical imaging modalities and problems. We continue by developing a novel training scheme to enhance the adversarial robustness of general DNN-based natural image classification models. Based on a unique insight into the predictive behavior of DNNs, namely that they tend to misclassify adversarial samples into the most probable false classes, we propose a new loss function as a drop-in replacement for the cross-entropy loss to improve DNNs' adversarial robustness. Specifically, it enlarges the probability gaps between the true class and the false classes and prevents those gaps from being closed by small perturbations. Finally, we conclude the dissertation by summarizing the original contributions and discussing future work that leverages a DNN interpretability constraint on adversarial training to tackle the central machine learning problem of the generalization gap.
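The abstract does not give the loss function itself, so the sketch below is a hypothetical stand-in for a cross-entropy replacement that enlarges the gap between the true-class probability and the top false-class probability; the hinge form and the `margin` value are assumptions for illustration only:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def probability_gap_loss(logits, true_class, margin=0.5):
    """Cross-entropy plus a hinge penalty that activates when the gap
    between the true-class probability and the largest false-class
    probability falls below `margin`. This formulation is an illustrative
    stand-in, not the dissertation's actual loss."""
    p = softmax(logits)
    p_true = p[true_class]
    p_false = np.max(np.delete(p, true_class))
    return -np.log(p_true) + max(0.0, margin - (p_true - p_false))

# A confident prediction keeps a large gap and incurs little penalty;
# a marginal one is penalized, pushing training to widen the gap.
confident = probability_gap_loss(np.array([4.0, 0.0, 0.0]), true_class=0)
marginal = probability_gap_loss(np.array([1.0, 0.9, 0.0]), true_class=0)
print(confident < marginal)  # prints True
```

Widening this gap means a perturbation must move more probability mass before the most probable false class overtakes the true class, which is the intuition the abstract describes.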