Robust Proxy: Improving Adversarial Robustness by Robust Proxy Learning
It is widely known that deep neural networks are highly vulnerable to
adversarial attacks and easily broken by them. To mitigate this adversarial
vulnerability, many defense algorithms have been proposed. Recently, to improve
adversarial robustness, many works have tried to enhance feature representations
by imposing more direct supervision on discriminative features. However,
existing approaches lack an understanding of how to learn adversarially robust
feature representations. In this paper, we propose a novel
training framework called Robust Proxy Learning. In the proposed method, the
model explicitly learns robust feature representations with robust proxies. To
this end, we first demonstrate that we can generate class-representative
robust features by adding class-wise robust perturbations. Then, we use the
class representative features as robust proxies. With the class-wise robust
features, the model explicitly learns adversarially robust features through the
proposed robust proxy learning framework. Through extensive experiments, we
verify that we can manually generate robust features and that the proposed
learning framework increases the robustness of DNNs.
Comment: Accepted at IEEE Transactions on Information Forensics and Security (TIFS)
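For intuition, a rough sketch of what proxy-based robust feature learning can look like follows; this is not the authors' exact formulation, and `model.features`, the precomputed class-wise perturbations `class_deltas`, and the InfoNCE-style proxy loss are all illustrative assumptions:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def make_robust_proxies(model, class_examples, class_deltas):
    """Form one class-representative robust proxy per class by adding a
    precomputed class-wise robust perturbation to a clean example and
    extracting its feature. How the perturbations are found is the paper's
    contribution and is omitted here."""
    proxies = [model.features(x + d) for x, d in zip(class_examples, class_deltas)]
    return torch.stack(proxies).squeeze(1)  # (num_classes, feat_dim)

def proxy_loss(model, x, y, proxies, tau=0.1):
    """Pull each sample's feature toward its class proxy and away from the
    other proxies (an InfoNCE-style surrogate, assumed for illustration)."""
    feats = F.normalize(model.features(x), dim=1)          # (N, feat_dim)
    logits = feats @ F.normalize(proxies, dim=1).T / tau   # (N, num_classes)
    return F.cross_entropy(logits, y)
```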
Why Clean Generalization and Robust Overfitting Both Happen in Adversarial Training
Adversarial training is a standard method to train deep neural networks to be
robust to adversarial perturbations. Similar to the surprising generalization
ability in the standard deep learning setting, neural networks trained by
adversarial training also generalize well on unseen clean data. However, in
contrast with clean generalization, while adversarial training is able to
achieve low robust training error, there still exists a significant robust
generalization gap, which prompts us to explore what mechanism leads to both
clean generalization and robust overfitting (CGRO) during the learning process.
In this paper, we provide
a theoretical understanding of this CGRO phenomenon in adversarial training.
First, we propose a theoretical framework of adversarial training, where we
analyze the feature learning process to explain how adversarial training
leads the network learner to the CGRO regime. Specifically, we prove that, under our
patch-structured dataset, the CNN model provably partially learns the true
feature but exactly memorizes the spurious features from training-adversarial
examples, which thus results in clean generalization and robust overfitting.
For a more general data assumption, we then show the efficiency of the CGRO
classifier. On the
empirical side, to verify our theoretical analysis on a real-world vision
dataset, we investigate the dynamics of the loss landscape during training.
Moreover, inspired by our experiments, we prove a robust generalization bound
based on the flatness of the loss landscape, which may be of independent interest.
Comment: 27 pages, comments welcome
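The CGRO phenomenon itself is easy to probe empirically. A minimal sketch, assuming a PyTorch classifier and a standard L-infinity PGD attack (none of this is the paper's code): clean test accuracy stays high, while robust accuracy is far higher on the training set than on the test set.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD, used here only to measure robust accuracy."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad, = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)
        x_adv = x + (x_adv + alpha * grad.sign() - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def accuracy(model, loader, attack=None):
    """Clean accuracy, or robust accuracy if an attack is supplied."""
    correct = total = 0
    for x, y in loader:
        if attack is not None:
            x = attack(model, x, y)
        with torch.no_grad():
            correct += (model(x).argmax(1) == y).sum().item()
        total += y.numel()
    return correct / total

# CGRO in numbers: accuracy(model, test_loader) is high (clean
# generalization), while accuracy(model, train_loader, pgd_attack) far
# exceeds accuracy(model, test_loader, pgd_attack) (robust overfitting).
```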
Adversarial Momentum-Contrastive Pre-Training for Robust Feature Extraction
Recently proposed adversarial self-supervised learning methods usually
require large batches and long training epochs to extract robust features, which
limits their practicality. In this paper, we present a novel
adversarial momentum-contrastive learning approach that leverages two memory
banks to track the invariant features across different mini-batches. These
memory banks can be efficiently incorporated into each iteration and help the
network to learn more robust feature representations with smaller batches and
far fewer epochs. Furthermore, after fine-tuning on the classification tasks,
the proposed approach can meet or exceed the performance of some
state-of-the-art supervised baselines on real-world datasets. Our code is
available at \url{https://github.com/MTandHJ/amoc}.
Comment: 16 pages; 6 figures; preprint
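As a hedged illustration of the two core pieces in any MoCo-style approach, a momentum (EMA) key encoder and an InfoNCE loss against a FIFO memory bank, the skeleton below may help; the paper's two banks (e.g., separate banks for clean and adversarial keys) and its adversarial view generation are omitted, and all names are assumptions:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    """EMA update of the key encoder from the query encoder (MoCo-style)."""
    for pq, pk in zip(encoder_q.parameters(), encoder_k.parameters()):
        pk.data.mul_(m).add_(pq.data, alpha=1.0 - m)

def info_nce(q, k, queue, tau=0.2):
    """InfoNCE with the memory bank `queue` (K, dim) as negatives; the
    positive for each query is its momentum-encoded key."""
    q, k = F.normalize(q, dim=1), F.normalize(k, dim=1)
    l_pos = (q * k).sum(dim=1, keepdim=True)   # (N, 1)
    l_neg = q @ queue.T                        # (N, K)
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)

@torch.no_grad()
def enqueue(queue, k):
    """FIFO memory bank update: append the new keys, drop the oldest."""
    return torch.cat([queue[k.size(0):], F.normalize(k, dim=1)], dim=0)
```

The memory bank is what decouples the number of negatives from the batch size, which is why such methods can work with smaller batches.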
Evaluating the Robustness of Off-Road Autonomous Driving Segmentation against Adversarial Attacks: A Dataset-Centric Analysis
This study investigates the vulnerability of semantic segmentation models to
adversarial input perturbations, in the domain of off-road autonomous driving.
Despite good performance under generic conditions, state-of-the-art
classifiers are often susceptible to even small perturbations, ultimately
producing inaccurate predictions with high confidence. Prior research has
focused on making models more robust by modifying the architecture and
training with noisy input images, but has not explored the influence of the
dataset itself on adversarial robustness. Our study aims to address this gap by
examining the impact of non-robust features in off-road datasets and comparing
the effects of adversarial attacks on different segmentation network
architectures. To enable this, we create a robust dataset consisting of only
robust features and train the networks on this robustified dataset. We
present both qualitative and quantitative analysis of our findings, which have
important implications for improving the robustness of machine learning models
in off-road autonomous driving applications. Additionally, this work
contributes to the safe navigation of autonomous robot Unimog U5023 in rough
off-road unstructured environments by evaluating the robustness of segmentation
outputs. The code is publicly available at
https://github.com/rohtkumar/adversarial_attacks_on_segmentation
Comment: 8 pages
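As a rough sketch of the attack side of such an evaluation (not the authors' code; the budget values and `ignore_index=255` are assumptions), an L-infinity PGD attack against a per-pixel cross-entropy objective looks like:

```python
import torch
import torch.nn.functional as F

def pgd_segmentation(model, x, y, eps=4/255, alpha=1/255, steps=10):
    """L-infinity PGD against a segmentation model. `x` is (N, C, H, W)
    in [0, 1]; `y` holds per-pixel class indices of shape (N, H, W)."""
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y, ignore_index=255)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x + (x_adv + alpha * grad.sign() - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# Robustness is then read off as the drop in per-class IoU / mIoU between
# predictions on x and on pgd_segmentation(model, x, y).
```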
Robust Ranking Explanations
Robust explanations of machine learning models are critical to establish
human trust in the models. Due to limited cognitive capability, most humans can
only interpret the top few salient features. It is therefore critical to make the
top salient features robust to adversarial attacks, especially attacks against
the more vulnerable gradient-based explanations. Existing defenses measure
robustness using $\ell_p$-norms, which offer weaker protection. We define
explanation thickness for measuring the ranking stability of salient features,
and derive
tractable surrogate bounds of the thickness to design the \textit{R2ET}
algorithm to efficiently maximize the thickness and anchor top salient
features. Theoretically, we prove a connection between R2ET and adversarial
training. Experiments with a wide spectrum of network architectures and data
modalities, including brain networks, demonstrate that R2ET attains higher
explanation robustness under stealthy attacks while retaining accuracy.
Comment: Accepted to the IMLH (Interpretable ML in Healthcare) workshop at ICML
2023. arXiv admin note: substantial text overlap with arXiv:2212.1410
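To make the quantity at stake concrete, here is a generic sketch of measuring how stable a top-k saliency ranking is under a small perturbation; this is a simple proxy under assumed definitions, not the paper's thickness measure or the R2ET algorithm:

```python
import torch
import torch.nn.functional as F

def saliency(model, x, y):
    """Simple gradient-based feature importance |d loss / d x|."""
    x = x.detach().requires_grad_(True)
    grad, = torch.autograd.grad(F.cross_entropy(model(x), y), x)
    return grad.abs().flatten(1)  # (N, num_features)

def topk_overlap(s_clean, s_pert, k=10):
    """Average fraction of top-k salient features preserved under the
    perturbation; higher means a more stable ranking."""
    top_c = s_clean.topk(k, dim=1).indices
    top_p = s_pert.topk(k, dim=1).indices
    fracs = [len(set(a.tolist()) & set(b.tolist())) / k
             for a, b in zip(top_c, top_p)]
    return sum(fracs) / len(fracs)

# Usage: compare saliency(model, x, y) with saliency(model, x + delta, y)
# for a small delta. Ranking-aware methods like R2ET aim to keep this
# overlap high by enlarging the margin between top-k and non-top features.
```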