A Closer Look at Evaluating the Bit-Flip Attack Against Deep Neural Networks
Deep neural network models are massively deployed on a wide variety of
hardware platforms. This results in the appearance of new attack vectors that
significantly extend the standard attack surface, extensively studied by the
adversarial machine learning community. One of the first attacks that aims at
drastically dropping a model's performance by targeting its parameters
(weights) stored in memory is the Bit-Flip Attack (BFA). In this work, we
point out several evaluation challenges related to the BFA. First of all, the
lack of an adversary's budget in the standard threat model is problematic,
especially when dealing with physical attacks. Moreover, since the BFA presents
critical variability, we discuss the influence of some training parameters and
the importance of the model architecture. This work is the first to present the
impact of the BFA against fully-connected architectures that present different
behaviors compared to convolutional neural networks. These results highlight
the importance of defining robust and sound evaluation methodologies to
properly evaluate the dangers of parameter-based attacks as well as measure the
real level of robustness offered by a defense.
Comment: Extended version of the IEEE IOLTS'2022 short paper
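To illustrate why parameter-targeting bit flips are so damaging, here is a minimal sketch (not taken from the paper) of what a single flip does to an int8-quantized weight stored in two's complement:

```python
def flip_bit_int8(weight: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit weight stored in two's complement."""
    u = weight & 0xFF                    # view the stored byte as unsigned
    u ^= 1 << bit                        # flip the chosen bit
    return u - 256 if u >= 128 else u    # reinterpret as signed int8

# Flipping the most significant bit of a small positive weight turns it
# into a large negative value, which is why a handful of well-chosen
# flips can collapse a quantized model's accuracy.
print(flip_bit_int8(23, 7))   # MSB flip: 23 -> -105
print(flip_bit_int8(23, 0))   # LSB flip: 23 -> 22 (nearly harmless)
```

The asymmetry between the two flips is the core intuition behind the BFA: the attack searches for the few most-significant bits whose corruption maximizes the loss.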
Mind the Scaling Factors: Resilience Analysis of Quantized Adversarially Robust CNNs
As more deep learning algorithms enter safety-critical application domains, the importance of analyzing their resilience against hardware faults cannot be overstated. Most existing works focus on bit-flips in memory, fewer focus on compute errors, and almost none study the effect of hardware faults on adversarially trained convolutional neural networks (CNNs). In this work, we show that adversarially trained CNNs are more susceptible to failure due to hardware errors than vanilla-trained models. We identify large differences between the quantization scaling factors of the CNNs that are resilient to hardware faults and those that are not. As adversarially trained CNNs learn robustness against input attack perturbations, their internal weight and activation distributions open a backdoor for injecting large-magnitude hardware faults. We propose a simple weight-decay remedy for adversarially trained models to maintain adversarial robustness and hardware resilience in the same CNN. We improve the fault resilience of an adversarially trained ResNet56 by 25% for large-scale bit-flip benchmarks on activation data while gaining slightly improved accuracy and adversarial robustness.
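The role of the scaling factor can be sketched as follows. Assuming symmetric per-tensor int8 quantization (the scale values below are illustrative, not from the paper), the real-valued error caused by one activation bit flip is the integer error multiplied by the scale, so tensors with large scales amplify the same fault:

```python
# Sketch: effect of the per-tensor quantization scale on the real-valued
# error caused by flipping the MSB of one int8 activation.

def dequant(q: int, scale: float) -> float:
    """Symmetric dequantization: real value = scale * quantized integer."""
    return q * scale

def msb_flip(q: int) -> int:
    """Flip the sign bit of an int8 value (two's complement)."""
    u = (q & 0xFF) ^ 0x80
    return u - 256 if u >= 128 else u

q = 10
for scale in (0.02, 0.5):   # small vs. large (illustrative) scaling factor
    err = abs(dequant(msb_flip(q), scale) - dequant(q, scale))
    print(f"scale={scale}: fault magnitude {err}")
```

The integer error of an MSB flip is always 128 levels; a tensor with a 25x larger scale therefore injects a 25x larger real-valued fault, which matches the paper's observation that large scaling factors open the door to large-magnitude faults.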
Invading The Integrity of Deep Learning (DL) Models Using LSB Perturbation & Pixel Manipulation
The use of deep learning (DL) models for solving classification and recognition-related problems is expanding at an exponential rate. However, these models are computationally expensive in terms of both time and resources. This imposes an entry barrier for low-profile businesses and scientific research projects with limited resources. Therefore, many organizations prefer to use fully outsourced trained models, cloud computing services, pre-trained models available for download, and transfer learning. This ubiquitous adoption of DL has unlocked numerous opportunities but has also brought forth potential threats to its prospects. Among the security threats, backdoor attacks and adversarial attacks have emerged as significant concerns and have attracted considerable research attention in recent years, since they pose a serious threat to the integrity and confidentiality of DL systems and highlight the need for robust security mechanisms to safeguard them. In this research, the proposed methodology comprises two primary components: a backdoor attack and an adversarial attack. For the backdoor attack, the Least Significant Bit (LSB) perturbation technique is employed to subtly alter image pixels by flipping their least significant bits. Extensive experimentation determined that 3-bit flips strike an optimal balance between accuracy and covertness. For the adversarial attack, the Pixel Perturbation approach directly manipulates pixel values to maximize misclassifications, with the optimal number of pixel changes found to be 4-5. Experimental evaluations were conducted using the MNIST, Fashion MNIST, and CIFAR-10 datasets. The results showcased high success rates for the attacks while maintaining a relatively covert profile. Comparative analyses revealed that the proposed techniques exhibited greater imperceptibility than prior works such as BadNets and One-Pixel attacks.
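The abstract does not spell out exactly which pixels receive the LSB perturbation; a minimal sketch, assuming the 3 least significant bits of every uint8 pixel are flipped, shows why the change stays visually covert:

```python
import numpy as np

def lsb_perturb(image: np.ndarray, n_bits: int = 3) -> np.ndarray:
    """Flip the n least significant bits of every uint8 pixel."""
    mask = (1 << n_bits) - 1   # e.g. 0b111 for 3 bits
    return image ^ mask

img = np.array([[200, 7], [128, 0]], dtype=np.uint8)
# Each pixel moves by at most 2**n_bits - 1 = 7 of 255 gray levels,
# which is imperceptible to the eye yet consistent across images,
# making it usable as a backdoor trigger pattern.
print(lsb_perturb(img))
```

With 3 bits the worst-case per-pixel change is 7/255 (about 2.7%), which is consistent with the paper's finding that 3-bit flips balance attack effectiveness against covertness.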
One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
Deep neural networks (DNNs) are widely deployed on real-world devices.
Concerns regarding their security have gained great attention from researchers.
Recently, a new weight modification attack called bit flip attack (BFA) was
proposed, which exploits memory fault injection techniques such as Rowhammer to
attack quantized models in the deployment stage. With only a few bit flips, the
target model can be rendered useless as a random guesser or even be implanted
with malicious functionalities. In this work, we seek to further reduce the
number of bit flips. We propose a training-assisted bit flip attack, in which
the adversary is involved in the training stage to build a high-risk model to
release. This high-risk model, crafted jointly with a corresponding malicious
model, behaves normally and can escape various detection methods. The results
on benchmark datasets show that an adversary can easily convert this high-risk
but normal model to a malicious one on the victim's side by \textbf{flipping only
one critical bit} on average in the deployment stage. Moreover, our attack
still poses a significant threat even when defenses are employed. The codes for
reproducing main experiments are available at
\url{https://github.com/jianshuod/TBA}.
Comment: Accepted by ICCV 2023. 14 pages