127 research outputs found
Addressing Mistake Severity in Neural Networks with Semantic Knowledge
Robustness in deep neural networks and machine learning algorithms in general
is an open research challenge. In particular, it is difficult to ensure
algorithmic performance is maintained on out-of-distribution inputs or
anomalous instances that cannot be anticipated at training time. Embodied
agents will be deployed in these conditions, and are likely to make incorrect
predictions. An agent will be viewed as untrustworthy unless it can maintain
its performance in dynamic environments. Most robust training techniques aim to
improve model accuracy on perturbed inputs; as an alternate form of robustness,
we aim to reduce the severity of mistakes made by neural networks in
challenging conditions. We leverage current adversarial training methods to
generate targeted adversarial attacks during the training process in order to
increase the semantic similarity between a model's predictions and true labels
of misclassified instances. Results demonstrate that our approach performs
better with respect to mistake severity compared to standard and adversarially
trained models. We also find that non-robust features play an intriguing role
with regard to semantic similarity.
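As a rough illustration of the idea (a minimal sketch, not the authors' exact procedure), the snippet below runs a targeted PGD attack that pushes each training example toward a semantically similar class and then trains on those perturbed inputs with the true labels, so that residual mistakes tend to land on nearby classes. The `semantic_neighbor` lookup is a hypothetical placeholder, e.g. a class-to-class mapping derived from a semantic hierarchy such as WordNet.

```python
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=8/255, alpha=2/255, steps=10):
    # Targeted PGD: descend on the loss w.r.t. the *target* label, moving toward it.
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() - alpha * grad.sign()
        # Project back into the Linf ball around x and the valid pixel range.
        x_adv = (x.detach() + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv.detach()

def train_step(model, optimizer, x, y, semantic_neighbor):
    # semantic_neighbor: hypothetical tensor mapping each class to a semantically close class.
    target = semantic_neighbor[y]
    x_adv = targeted_pgd(model, x, target)
    optimizer.zero_grad()
    # Train on the perturbed inputs with the *true* labels, as in adversarial training.
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```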
PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition
Notwithstanding the prominent performance achieved in various applications,
point cloud recognition models have often suffered from natural corruptions and
adversarial perturbations. In this paper, we delve into boosting the general
robustness of point cloud recognition models and propose Point-Cloud
Contrastive Adversarial Training (PointCAT). The main intuition of PointCAT is
to encourage the target recognition model to narrow the decision gap between
clean point clouds and corrupted point clouds. Specifically, we leverage a
supervised contrastive loss to facilitate the alignment and uniformity of the
hypersphere features extracted by the recognition model, and design a pair of
centralizing losses with dynamic prototype guidance to keep these features from
deviating from the clusters of their respective categories. To provide more
challenging corrupted point clouds, we adversarially train a noise generator
along with the recognition model from scratch, instead of using a
gradient-based attack as the inner loop as in previous adversarial training
methods. Comprehensive experiments show that the proposed PointCAT outperforms
the baseline methods and dramatically boosts the robustness of different point
cloud recognition models under a variety of corruptions, including isotropic
point noise, LiDAR-simulated noise, random point dropping, and adversarial
perturbations.
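The supervised contrastive component can be sketched as follows: features are l2-normalized onto the unit hypersphere and a standard supervised contrastive (SupCon) loss pulls together features that share a label. The centralizing losses and the adversarial noise generator from the paper are omitted, so treat this as a minimal sketch under those assumptions.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    # features: (N, D) embeddings; l2-normalized onto the unit hypersphere.
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / temperature                      # pairwise cosine similarities
    logits_mask = ~torch.eye(len(z), dtype=torch.bool, device=z.device)  # drop self-pairs
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & logits_mask  # same-label pairs
    sim = sim.masked_fill(~logits_mask, float('-inf'))
    # Log-probability of each other sample under a softmax over the anchor's similarities.
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average log-probability over each anchor's positives (anchors without positives are skipped).
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob.masked_fill(~pos_mask, 0).sum(dim=1) / pos_count)
    return loss[pos_mask.sum(dim=1) > 0].mean()
```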
Comprehensive Assessment of the Performance of Deep Learning Classifiers Reveals a Surprising Lack of Robustness
Reliable and robust evaluation methods are a necessary first step towards
developing machine learning models that are themselves robust and reliable.
Unfortunately, current evaluation protocols typically used to assess
classifiers fail to comprehensively evaluate performance as they tend to rely
on limited types of test data, and ignore others. For example, using the
standard test data fails to evaluate the predictions the classifier makes for
samples from classes it was not trained on. On the other hand, testing with
data containing samples from unknown classes fails to evaluate how well the
classifier can predict the labels for known classes. This article advocates
benchmarking performance using a wide range of different types of data and
using a single metric that can be applied to all such data types to produce a
consistent evaluation of performance. Using such a benchmark it is found that
current deep neural networks, including those trained with methods that are
believed to produce state-of-the-art robustness, are extremely vulnerable to
making mistakes on certain types of data. This means that such models will be
unreliable in real-world scenarios where they may encounter data from many
different domains, and that they are insecure as they can easily be fooled into
making the wrong decisions. It is hoped that these results will motivate the
wider adoption of more comprehensive testing methods that will, in turn, lead
to the development of more robust machine learning methods in the future.
Code is available at:
\url{https://codeberg.org/mwspratling/RobustnessEvaluation}
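A minimal sketch of the kind of unified evaluation the article advocates: every test sample, whether from a known class, an unknown class, or a shifted/perturbed source, is scored with the same rule. The confidence-threshold rejection rule below is an assumption made for illustration, not the article's specific metric.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def unified_accuracy(model, loader, known_classes, reject_threshold=0.5):
    """One rule for all data types: known-class samples must be classified
    correctly with sufficient confidence; unknown-class samples must be rejected."""
    correct, total = 0, 0
    for x, y in loader:
        probs = F.softmax(model(x), dim=1)
        conf, pred = probs.max(dim=1)
        for yi, ci, pi in zip(y, conf, pred):
            if yi.item() in known_classes:
                correct += int(ci >= reject_threshold and pi == yi)
            else:
                correct += int(ci < reject_threshold)   # unknown class: rejection is correct
            total += 1
    return correct / total
```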
RobustBench: a standardized adversarial robustness benchmark
As a research community, we are still lacking a systematic understanding of
the progress on adversarial robustness which often makes it hard to identify
the most promising ideas in training robust models. A key challenge in
benchmarking robustness is that its evaluation is often error-prone leading to
robustness overestimation. Our goal is to establish a standardized benchmark of
adversarial robustness, which as accurately as possible reflects the robustness
of the considered models within a reasonable computational budget. To this end,
we start by considering the image classification task and introduce
restrictions (possibly loosened in the future) on the allowed models. We
evaluate adversarial robustness with AutoAttack, an ensemble of white- and
black-box attacks, which was recently shown in a large-scale study to improve
almost all robustness evaluations compared to the original publications. To
prevent overadaptation of new defenses to AutoAttack, we welcome external
evaluations based on adaptive attacks, especially where AutoAttack flags a
potential overestimation of robustness. Our leaderboard, hosted at
https://robustbench.github.io/, contains evaluations of 120+ models and aims at
reflecting the current state of the art in image classification on a set of
well-defined tasks in $\ell_\infty$- and $\ell_2$-threat models and on common
corruptions, with possible extensions in the future. Additionally, we
open-source the library https://github.com/RobustBench/robustbench that
provides unified access to 80+ robust models to facilitate their downstream
applications. Finally, based on the collected models, we analyze the impact of
robustness on the performance on distribution shifts, calibration,
out-of-distribution detection, fairness, privacy leakage, smoothness, and
transferability.

Comment: The camera-ready version accepted at the NeurIPS'21 Datasets and
Benchmarks Track: 120+ evaluations, 80+ models, 7 leaderboards (Linf, L2,
common corruptions; CIFAR-10, CIFAR-100, ImageNet), significantly expanded
analysis part (calibration, fairness, privacy leakage, smoothness,
transferability).
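For reference, loading a leaderboard model through the open-sourced robustbench library and evaluating it with AutoAttack typically looks like the sketch below; the model name, sample count, and epsilon are illustrative, and the install commands are assumptions (check the repository README for the exact instructions).

```python
# Assumed installs: pip install git+https://github.com/RobustBench/robustbench.git
#                   pip install git+https://github.com/fra31/auto-attack
from robustbench.data import load_cifar10
from robustbench.utils import load_model
from autoattack import AutoAttack

# Load a robust model from the model zoo by its leaderboard identifier (illustrative entry).
model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')
model.eval()

x_test, y_test = load_cifar10(n_examples=100)

# Evaluate with AutoAttack under the Linf threat model at the standard eps = 8/255.
adversary = AutoAttack(model, norm='Linf', eps=8/255, version='standard')
x_adv = adversary.run_standard_evaluation(x_test, y_test)
```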
Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
Adversarial training is by far the most successful strategy for improving
robustness of neural networks to adversarial attacks. Despite its success as a
defense mechanism, adversarial training fails to generalize well to the
unperturbed test set. We hypothesize that this poor generalization is a
consequence of adversarial training with a uniform perturbation radius around
every training sample. Samples close to the decision boundary can be morphed
into a different class under a small perturbation budget, and enforcing large
margins around these samples produces decision boundaries that generalize
poorly. Motivated by this hypothesis, we propose instance adaptive adversarial
training -- a technique that enforces sample-specific perturbation margins
around every training sample. We show that with our approach, test accuracy on
unperturbed samples improves with only a marginal drop in robustness. Extensive
experiments on CIFAR-10, CIFAR-100 and ImageNet datasets demonstrate the
effectiveness of our proposed approach.
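A minimal sketch of the core change relative to standard PGD-based adversarial training: the perturbation budget becomes a per-sample tensor rather than a single scalar. How each sample's epsilon is adapted during training is the paper's contribution and is not reproduced here.

```python
import torch
import torch.nn.functional as F

def pgd_per_sample_eps(model, x, y, eps, steps=10):
    # eps: tensor of shape (N,) giving a sample-specific Linf budget for each input.
    eps = eps.view(-1, 1, 1, 1)                      # broadcast over (C, H, W)
    alpha = eps / 4                                  # step size scaled to each budget
    x_adv = (x + torch.empty_like(x).uniform_(-1, 1) * eps).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into each sample's own Linf ball, then into the valid pixel range.
        delta = torch.min(torch.max(x_adv - x, -eps), eps)
        x_adv = (x + delta).clamp(0, 1)
    return x_adv.detach()
```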
On the benefits of defining vicinal distributions in latent space
The vicinal risk minimization (VRM) principle is an empirical risk
minimization (ERM) variant that replaces Dirac masses with vicinal functions.
There is strong numerical and theoretical evidence showing that VRM outperforms
ERM in terms of generalization if appropriate vicinal functions are chosen.
Mixup Training (MT), a popular choice of vicinal distribution, improves the
generalization performance of models by introducing globally linear behavior in
between training examples. Apart from generalization, recent works have shown
that mixup trained models are relatively robust to input
perturbations/corruptions and at the same time are calibrated better than their
non-mixup counterparts. In this work, we investigate the benefits of defining
vicinal distributions such as mixup in the latent space of generative models
rather than in the input space itself. We propose a new approach - \textit{VarMixup
(Variational Mixup)} - to better sample mixup images by using the latent
manifold underlying the data. Our empirical studies on CIFAR-10, CIFAR-100, and
Tiny-ImageNet demonstrate that models trained by performing mixup in the latent
manifold learned by VAEs are inherently more robust to various input
corruptions/perturbations, are significantly better calibrated, and exhibit
more local-linear loss landscapes.

Comment: Accepted at Elsevier Pattern Recognition Letters (2021), Best Paper
Award at CVPR 2021 Workshop on Adversarial Machine Learning in Real-World
Computer Vision (AML-CV); also accepted at ICLR 2021 Workshops on
Robust-Reliable Machine Learning (Oral) and Generalization beyond the
training distribution (Abstract).
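The core idea can be sketched as mixup performed on VAE latent codes rather than on raw pixels. The `encode`/`decode` interface below is a generic assumption for a pre-trained VAE; details such as how VarMixup samples from the latent manifold follow the paper rather than this simplification.

```python
import torch
import torch.nn.functional as F

def latent_mixup_batch(vae, x1, y1, x2, y2, num_classes, alpha=1.0):
    """Mixup in the latent space of a (pre-trained) VAE: encode, interpolate the
    latent codes, decode, and mix the one-hot labels with the same lambda."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    with torch.no_grad():
        z1 = vae.encode(x1)                               # assumed encoder interface
        z2 = vae.encode(x2)
        x_mix = vae.decode(lam * z1 + (1 - lam) * z2)     # decode the interpolated latents
    y_mix = lam * F.one_hot(y1, num_classes).float() + (1 - lam) * F.one_hot(y2, num_classes).float()
    return x_mix, y_mix
```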
Lipschitz Continuity Retained Binary Neural Network
Relying on the premise that the performance of a binary neural network can be
largely restored by eliminating the quantization error between full-precision
weight vectors and their corresponding binary vectors, existing works on
network binarization frequently adopt the idea of model robustness to reach the
aforementioned objective. However, robustness remains an ill-defined concept
without solid theoretical support. In this work, we introduce Lipschitz
continuity, a well-defined functional property, as the rigorous criterion for
defining the model robustness of BNNs. We then propose to retain the Lipschitz
continuity as a regularization term to improve the model robustness. In
particular, while popular Lipschitz-based regularization methods often collapse
in BNNs due to their extreme sparsity, we design Retention Matrices to
approximate the spectral norms of the targeted weight matrices, which can be
deployed as an approximation of the Lipschitz constant of BNNs without computing
the exact Lipschitz constant (an NP-hard problem). Our experiments show that our
BNN-specific regularization method effectively strengthens the robustness of
BNNs (verified on ImageNet-C), achieving state-of-the-art performance on CIFAR
and ImageNet.

Comment: Paper accepted to ECCV 2022.
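As a rough illustration of Lipschitz-oriented regularization (not the paper's Retention Matrices), the spectral norm of each weight matrix can be approximated with power iteration and added to the training loss as a penalty:

```python
import torch
import torch.nn.functional as F

def spectral_norm_estimate(weight, num_iters=5):
    """Approximate the largest singular value of a weight tensor (flattened to 2-D)
    by power iteration; the result upper-bounds the layer's Lipschitz behavior."""
    w = weight.reshape(weight.shape[0], -1)
    v = torch.randn(w.shape[1], device=w.device)
    for _ in range(num_iters):
        u = F.normalize(w @ v, dim=0)
        v = F.normalize(w.t() @ u, dim=0)
    return torch.dot(u, w @ v)

def lipschitz_penalty(model, coeff=1e-4):
    """Sum of estimated spectral norms over weight matrices, used as a regularizer."""
    penalty = sum(spectral_norm_estimate(p) for p in model.parameters() if p.dim() >= 2)
    return coeff * penalty
```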
Assessment Framework for Deepfake Detection in Real-world Situations
Detecting digital face manipulation in images and video has attracted
extensive attention due to the potential risk to public trust. To counteract
the malicious usage of such techniques, deep learning-based deepfake detection
methods have been employed and have exhibited remarkable performance. However,
the performance of such detectors is often assessed on related benchmarks that
hardly reflect real-world situations. For example, the impact of various image
and video processing operations and typical workflow distortions on detection
accuracy has not been systematically measured. In this paper, a more reliable
assessment framework is proposed to evaluate the performance of learning-based
deepfake detectors in more realistic settings. To the best of our
knowledge, it is the first systematic assessment approach for deepfake
detectors that not only reports the general performance under real-world
conditions but also quantitatively measures their robustness toward different
processing operations. To demonstrate the effectiveness and usage of the
framework, extensive experiments and detailed analysis of three popular
deepfake detection methods are further presented in this paper. In addition, a
stochastic degradation-based data augmentation method driven by realistic
processing operations is designed, which significantly improves the robustness
of deepfake detectors.
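A minimal sketch of a stochastic degradation-style augmentation under assumed operations (additive noise, blur, down-/up-sampling, and a crude quantization stand-in for compression artifacts); the paper's actual operation set and parameter ranges may differ.

```python
import random
import torch
import torch.nn.functional as F

def stochastic_degradation(x):
    """Randomly apply a chain of realistic processing distortions to an image batch
    in [0, 1] of shape (N, C, H, W). Operations and ranges here are assumptions."""
    if random.random() < 0.5:                       # additive Gaussian noise
        x = (x + 0.02 * torch.randn_like(x)).clamp(0, 1)
    if random.random() < 0.5:                       # blur via local averaging
        x = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    if random.random() < 0.5:                       # down-/up-sampling (resolution loss)
        h, w = x.shape[-2:]
        scale = random.uniform(0.5, 0.9)
        x = F.interpolate(x, scale_factor=scale, mode='bilinear', align_corners=False)
        x = F.interpolate(x, size=(h, w), mode='bilinear', align_corners=False)
    if random.random() < 0.5:                       # quantization as a proxy for compression
        levels = random.choice([16, 32, 64])
        x = torch.round(x * levels) / levels
    return x
```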
Generative Classifiers as a Basis for Trustworthy Image Classification
With the maturing of deep learning systems, trustworthiness is becoming
increasingly important for model assessment. We understand trustworthiness as
the combination of explainability and robustness. Generative classifiers (GCs)
are a promising class of models that are said to naturally accomplish these
qualities. However, this has mostly been demonstrated on simple datasets such
as MNIST and CIFAR in the past. In this work, we first develop an
architecture and training scheme that allows GCs to operate on a more relevant
level of complexity for practical computer vision, namely the ImageNet
challenge. Secondly, we demonstrate the immense potential of GCs for
trustworthy image classification. Explainability and some aspects of robustness
are vastly improved compared to feed-forward models, even when the GCs are just
applied naively. While not all trustworthiness problems are solved completely,
we observe that GCs are a highly promising basis for further algorithms and
modifications. We release our trained model for download in the hope that it
serves as a starting point for other generative classification tasks, in much
the same way as pretrained ResNet architectures do for discriminative
classification.
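A generative classifier in this sense predicts via Bayes' rule from class-conditional densities. The sketch below assumes a model exposing per-class log-likelihoods; the `class_log_likelihood` interface is hypothetical and stands in for whatever density the released model provides.

```python
import math
import torch

@torch.no_grad()
def generative_classify(model, x, num_classes, log_prior=None):
    """Predict argmax_y [ log p(x | y) + log p(y) ] with a class-conditional
    generative model. model.class_log_likelihood(x, y) is an assumed interface
    returning a (N,) tensor of log-densities for label tensor y."""
    n = x.shape[0]
    if log_prior is None:
        log_prior = torch.full((num_classes,), -math.log(num_classes))  # uniform prior
    scores = torch.stack([
        model.class_log_likelihood(x, torch.full((n,), y, dtype=torch.long)) + log_prior[y]
        for y in range(num_classes)
    ], dim=1)                                     # (N, num_classes)
    return scores.argmax(dim=1), scores
```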
Boosting Adversarial Training with Hypersphere Embedding
Adversarial training (AT) is one of the most effective defenses against
adversarial attacks for deep learning models. In this work, we advocate
incorporating the hypersphere embedding (HE) mechanism into the AT procedure by
regularizing the features onto compact manifolds, which constitutes a
lightweight yet effective module to blend in the strength of representation
learning. Our extensive analyses reveal that AT and HE are well coupled to
benefit the robustness of the adversarially trained models from several
aspects. We validate the effectiveness and adaptability of HE by embedding it
into the popular AT frameworks including PGD-AT, ALP, and TRADES, as well as
the FreeAT and FastAT strategies. In the experiments, we evaluate our methods
under a wide range of adversarial attacks on the CIFAR-10 and ImageNet
datasets, which verifies that integrating HE can consistently enhance the model
robustness for each AT framework with little extra computation.

Comment: NeurIPS 2020.
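The hypersphere embedding mechanism can be sketched as normalizing both features and classifier weights so that logits become scaled cosine similarities; plugging this head into a standard PGD-AT loop is then straightforward. The scale value below is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypersphereHead(nn.Module):
    """Classifier head with feature and weight normalization: logits are scaled
    cosine similarities, so features are regularized onto a compact hypersphere."""
    def __init__(self, feat_dim, num_classes, scale=15.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale = scale

    def forward(self, features):
        f = F.normalize(features, dim=1)           # feature normalization
        w = F.normalize(self.weight, dim=1)        # weight normalization
        return self.scale * (f @ w.t())            # scaled cosine logits

# Usage sketch: logits = HypersphereHead(512, 10)(backbone(x_adv)), trained with
# cross-entropy on adversarial examples as in ordinary PGD-AT.
```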
- …