Generating Black-Box Adversarial Examples in Sparse Domain
Applications of machine learning (ML) models and convolutional neural
networks (CNNs) have increased rapidly. Although state-of-the-art CNNs
provide high accuracy in many applications, recent investigations show that
such networks are highly vulnerable to adversarial attacks. The black-box
adversarial adversarial attack is one in which the attacker has no
knowledge of the model or the training dataset but has access to some input
data and their labels. In this paper, we propose a novel approach to generating
a black-box attack in the sparse domain, where the most important information of an
image can be observed. Our investigation shows that large sparse (LaS)
components play a critical role in the performance of image classifiers. Under
this presumption, to generate an adversarial example, we transform an image into
the sparse domain and apply a threshold to select only the k LaS components. In
contrast to very recent works that randomly perturb k low-frequency (LoF)
components, we perturb the k LaS components either randomly (query-based) or in the
direction of the most correlated sparse signal from a different class. We show
that LaS components carry information from middle and higher frequency
components, which allows fooling image classifiers with fewer
queries. We demonstrate the effectiveness of this approach by fooling six
state-of-the-art image classifiers, the TensorFlow Lite (TFLite) model of the
Google Cloud Vision platform, and the YOLOv5 object detection model. Mean
squared error (MSE) and peak signal-to-noise ratio (PSNR) are
used as quality metrics. We also present a theoretical proof to connect these
metrics to the level of perturbation in the sparse domain.
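
A minimal sketch of the core idea described above, assuming a 2-D DCT as the sparsifying transform; the transform choice, the helper names (dct2, idct2, perturb_las), and parameter values are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(x):
    # 2-D DCT (type-II, orthonormal) used here as an assumed sparsifying transform
    return dct(dct(x, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(c):
    # Inverse 2-D DCT back to the pixel domain
    return idct(idct(c, axis=0, norm='ortho'), axis=1, norm='ortho')

def perturb_las(image, k=100, eps=0.05, rng=None):
    """Randomly perturb the k largest-magnitude sparse (LaS) components
    of a grayscale image (the query-based mode sketched in the abstract)."""
    rng = np.random.default_rng() if rng is None else rng
    coeffs = dct2(image.astype(np.float64))

    # Threshold: keep the indices of the k largest-magnitude coefficients (LaS components)
    flat_mag = np.abs(coeffs).ravel()
    las_idx = np.argpartition(flat_mag, -k)[-k:]

    # Random perturbation applied only to the selected LaS components,
    # scaled relative to each coefficient's magnitude
    coeffs.ravel()[las_idx] += eps * rng.standard_normal(k) * flat_mag[las_idx]

    adv = np.clip(idct2(coeffs), 0.0, 255.0)

    # Quality metrics used in the paper: MSE and PSNR of the adversarial image
    mse = np.mean((adv - image) ** 2)
    psnr = 10.0 * np.log10(255.0 ** 2 / mse) if mse > 0 else np.inf
    return adv, mse, psnr
```

In a query-based loop, one would repeat this perturbation and query the target classifier until its predicted label flips, while monitoring MSE/PSNR to keep the distortion small; the directed variant would instead push the selected components toward the most correlated sparse signal from a different class.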