Vulnerability of deep neural networks for detecting COVID-19 cases from chest X-ray images to universal adversarial attacks
Under the epidemic of the novel coronavirus disease 2019 (COVID-19), chest
X-ray and computed tomography (CT) imaging are being used to effectively
screen patients for COVID-19. Computer-aided systems based on deep neural
networks (DNNs) have been developed to detect COVID-19 cases rapidly and
accurately, because the limited number of expert radiologists forms a
bottleneck for the screening. However, so far, the
vulnerability of DNN-based systems has been poorly evaluated, although DNNs are
vulnerable to a single input-agnostic perturbation, called a universal
adversarial perturbation (UAP), which can cause a DNN to fail on most
inputs. Thus, we
focus on representative DNN models for detecting COVID-19 cases from chest
X-ray images and evaluate their vulnerability to UAPs generated using simple
iterative algorithms. We consider nontargeted UAPs, which cause an input to
be assigned an incorrect label, and targeted UAPs,
which cause the DNN to classify an input into a specific class. The results
demonstrate that the models are vulnerable to nontargeted and targeted UAPs,
even when the UAPs are small. In particular, a UAP whose norm is only 2% of
the average image norm in the dataset achieves success rates of >85% for
nontargeted attacks and >90% for targeted attacks. Due to the nontargeted
UAPs, the DNN models judge most chest X-ray images as COVID-19 cases. The
targeted UAPs make the DNN models classify most chest X-ray images into a given
target class. The results indicate that careful consideration is required in
practical applications of DNNs to COVID-19 diagnosis; in particular, they
emphasize the need for strategies to address security concerns. As an example,
we show that iteratively fine-tuning the DNN models on UAPs improves their
robustness against UAPs.
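To make the attack concrete, here is a minimal PyTorch sketch of the kind of simple iterative algorithm the abstract alludes to for generating a nontargeted UAP. The `model` and `loader` interfaces, the sign-gradient update, and the L2 projection radius are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def nontargeted_uap(model, loader, eps, lr=0.01, n_epochs=5):
    """Accumulate a single perturbation v that degrades accuracy on most
    inputs, projecting v back onto an L2 ball of radius eps (e.g., 2% of
    the average image norm) after every update."""
    model.eval()
    x0, _ = next(iter(loader))
    v = torch.zeros_like(x0[0])  # one perturbation shared by all images
    for _ in range(n_epochs):
        for x, y in loader:
            x_adv = (x + v).clamp(0, 1).requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            # Ascend the batch-averaged loss to find a shared direction.
            v = v + lr * grad.mean(dim=0).sign()
            # Project back onto the L2 ball of radius eps.
            v = v * torch.clamp(eps / v.norm(), max=1.0)
    return v
```

A targeted UAP follows the same loop but descends the loss toward a fixed target label instead of ascending it.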
Beware the Black-Box: on the Robustness of Recent Defenses to Adversarial Examples
Many defenses have recently been proposed at venues like NIPS, ICML, ICLR and
CVPR. These defenses are mainly focused on mitigating white-box attacks. They
do not properly examine black-box attacks. In this paper, we expand upon the
analysis of these defenses to include adaptive black-box adversaries. Our
evaluation is done on nine defenses including Barrage of Random Transforms,
ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error
Correcting Codes, Distribution Classifier Defense, K-Winner Take All and Buffer
Zones. Our investigation is done using two black-box adversarial models and six
widely studied adversarial attacks on the CIFAR-10 and Fashion-MNIST datasets. Our
analyses show most recent defenses (7 out of 9) provide only marginal
improvements in security compared to undefended networks. For
every defense, we also show the relationship between the amount of data the
adversary has at their disposal, and the effectiveness of adaptive black-box
attacks. Overall, our results paint a clear picture: defenses need both
thorough white-box and black-box analyses to be considered secure. We provide
this large scale study and analyses to motivate the field to move towards the
development of more robust black-box defenses.
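For readers unfamiliar with the threat model, the following PyTorch sketch shows one common form of adaptive black-box attack consistent with the setup above: the adversary labels its own data by querying the defended model, fits a surrogate to those labels, and transfers a white-box attack (PGD here) crafted on the surrogate. All interfaces (`defended_model`, `surrogate`, `adv_loader`) and hyperparameters are assumptions for illustration, not the paper's exact protocol.

```python
import torch
import torch.nn.functional as F

def train_surrogate(defended_model, surrogate, adv_loader, epochs=10):
    """Model extraction: label the adversary's data by querying the
    black-box defense, then fit the surrogate to those labels."""
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, _ in adv_loader:
            with torch.no_grad():
                pseudo_y = defended_model(x).argmax(dim=1)  # black-box queries
            opt.zero_grad()
            F.cross_entropy(surrogate(x), pseudo_y).backward()
            opt.step()
    return surrogate

def transfer_pgd(surrogate, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """White-box PGD on the surrogate; the resulting examples are then
    fed to the defended model to measure black-box success."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(surrogate(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```

The "amount of data the adversary has at their disposal" then corresponds to the size of `adv_loader`: larger extraction sets yield a more faithful surrogate and stronger transfer.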
IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks
We introduce a novel approach to counter adversarial attacks, namely, image
resampling. Image resampling transforms a discrete image into a new one,
simulating the process of scene recapturing or rerendering as specified by a
geometrical transformation. The underlying rationale behind our idea is that
image resampling can alleviate the influence of adversarial perturbations while
preserving essential semantic information, thereby conferring an inherent
advantage in defending against adversarial attacks. To validate this concept,
we present a comprehensive study on leveraging image resampling to defend
against adversarial attacks. We have developed basic resampling methods that
employ different interpolation strategies and coordinate-shifting magnitudes. Our
analysis reveals that these basic methods can partially mitigate adversarial
attacks. However, they come with apparent limitations: the accuracy of clean
images noticeably decreases, while the improvement in accuracy on adversarial
examples is not substantial. We propose implicit representation-driven image
resampling (IRAD) to overcome these limitations. First, we construct an
implicit continuous representation that enables us to represent any input image
within a continuous coordinate space. Second, we introduce SampleNet, which
automatically generates pixel-wise shifts for resampling in response to
different inputs. Furthermore, we can extend our approach to the
state-of-the-art diffusion-based method, accelerating it with fewer time steps
while preserving its defense capability. Extensive experiments demonstrate that
our method significantly enhances the adversarial robustness of diverse deep
models against various attacks while maintaining high accuracy on clean images.
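As a concrete illustration of the "basic" resampling idea (not IRAD itself), the sketch below re-renders an image by sampling it at randomly shifted coordinates with bilinear interpolation; IRAD would instead predict per-pixel shifts with SampleNet from an implicit continuous representation. The `shift_magnitude` knob is an assumption.

```python
import torch
import torch.nn.functional as F

def resample_defense(x, shift_magnitude=0.01):
    """Re-render a batch of images (N, C, H, W) by sampling them at
    slightly shifted coordinates with bilinear interpolation."""
    n, _, h, w = x.shape
    # Identity sampling grid in normalized [-1, 1] coordinates.
    ys = torch.linspace(-1, 1, h)
    xs = torch.linspace(-1, 1, w)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    grid = torch.stack((gx, gy), dim=-1).expand(n, h, w, 2)
    # Shift every sampling coordinate by a small random amount; this
    # disrupts the adversary's carefully placed pixel values while
    # leaving semantic content essentially intact.
    grid = grid + shift_magnitude * (2 * torch.rand_like(grid) - 1)
    return F.grid_sample(x, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)
```

A classifier is then run on the resampled image. The limitation noted in the abstract is visible here: random shifts blur clean inputs too, which is what motivates learning the shifts instead.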
DISCO: Adversarial Defense with Local Implicit Functions
The problem of adversarial defenses for image classification, where the goal
is to robustify a classifier against adversarial examples, is considered.
Inspired by the hypothesis that these examples lie beyond the natural image
manifold, a novel aDversarIal defenSe with local impliCit functiOns (DISCO) is
proposed to remove adversarial perturbations by localized manifold projections.
DISCO consumes an adversarial image and a query pixel location and outputs a
clean RGB value at the location. It is implemented with an encoder and a local
implicit module, where the former produces per-pixel deep features and the
latter uses the features in the neighborhood of the query pixel to predict the
clean RGB value. Extensive experiments demonstrate that both DISCO and its
cascade version outperform prior defenses, regardless of whether the defense is
known to the attacker. DISCO is also shown to be data and parameter efficient
and to mount defenses that transfer across datasets, classifiers, and attacks.
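A minimal PyTorch sketch of the architecture described above, with illustrative layer sizes (the actual DISCO encoder and implicit module are more elaborate):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalImplicitPurifier(nn.Module):
    """Sketch of a DISCO-style defense: an encoder produces per-pixel
    deep features, and a small MLP (the local implicit module) maps the
    feature sampled at a query location, together with the query
    coordinate, to a clean RGB value. Layer sizes are assumptions."""

    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
        )
        self.implicit = nn.Sequential(
            nn.Linear(feat_dim + 2, 256), nn.ReLU(),
            nn.Linear(256, 3),
        )

    def forward(self, x, coords):
        # x: (N, 3, H, W) adversarial image.
        # coords: (N, Q, 2) query locations in normalized [-1, 1] space.
        feats = self.encoder(x)                                   # (N, C, H, W)
        grid = coords.unsqueeze(1)                                # (N, 1, Q, 2)
        sampled = F.grid_sample(feats, grid, align_corners=True)  # (N, C, 1, Q)
        sampled = sampled.squeeze(2).transpose(1, 2)              # (N, Q, C)
        return self.implicit(torch.cat([sampled, coords], -1))    # (N, Q, 3)
```

At inference time, one would query every pixel location to reconstruct a purified image before passing it to the classifier.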
Rethinking Adversarial Training with A Simple Baseline
We report competitive results on RobustBench for CIFAR and SVHN using a
simple yet effective baseline approach. Our approach involves a training
protocol that integrates rescaled square loss, cyclic learning rates, and
erasing-based data augmentation. The outcomes we have achieved are comparable
to those of the model trained with state-of-the-art techniques, which is
currently the predominant choice for adversarial training. Our baseline,
referred to as SimpleAT, yields three novel empirical insights. (i) By
switching to square loss, the accuracy is comparable to that obtained with
the de facto training protocol plus data augmentation. (ii) A single cyclic
learning-rate schedule is a good choice, as it effectively reduces the risk of
robust overfitting. (iii) Employing rescaled square loss during model training
can yield a favorable balance between adversarial and natural accuracy. In
general, our experimental results show that SimpleAT effectively mitigates
robust overfitting and consistently achieves the best performance at the end of
training. For example, on CIFAR-10 with ResNet-18, SimpleAT achieves
approximately 52% adversarial accuracy against the current strong AutoAttack.
Furthermore, SimpleAT exhibits robust performance on various image corruptions,
including those found in the CIFAR-10-C dataset. Finally, we assess the
effectiveness of these insights through two techniques: bias-variance analysis
and logit penalty methods. Our findings demonstrate that all of these simple
techniques are capable of reducing the variance of model predictions, which is
regarded as the primary contributor to robust overfitting. In addition, our
analysis also uncovers connections with various advanced state-of-the-art
methods.
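To ground the three ingredients, here is a sketch of a SimpleAT-style training loop combining rescaled square loss, a cyclic learning rate, and erasing-based augmentation. The target scale, optimizer settings, and `attack` interface are assumptions, not the paper's configuration.

```python
import torch
import torch.nn.functional as F
from torchvision.transforms import RandomErasing

def rescaled_square_loss(logits, y, target_scale=10.0):
    """Square (MSE) loss against rescaled one-hot targets; the scale
    is an assumed hyperparameter, not the paper's setting."""
    one_hot = F.one_hot(y, num_classes=logits.size(1)).float()
    return ((logits - target_scale * one_hot) ** 2).mean()

def train_simpleat_style(model, loader, epochs=30, attack=None):
    """Adversarial training with the abstract's three ingredients:
    rescaled square loss, one cyclic learning-rate schedule, and
    erasing-based data augmentation."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    sched = torch.optim.lr_scheduler.OneCycleLR(
        opt, max_lr=0.2, total_steps=epochs * len(loader))
    erase = RandomErasing(p=0.5)
    for _ in range(epochs):
        for x, y in loader:
            x = torch.stack([erase(img) for img in x])  # erasing augmentation
            if attack is not None:
                x = attack(model, x, y)                 # inner maximization, e.g. PGD
            opt.zero_grad()
            rescaled_square_loss(model(x), y).backward()
            opt.step()
            sched.step()                                # cyclic LR, stepped per batch
```

The key design choice is the loss: square loss shrinks the variance of the model's predictions across training, which the abstract identifies as the primary contributor to robust overfitting.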