On the Adversarial Robustness of Subspace Learning
In this paper, we study the adversarial robustness of subspace learning
problems. In contrast to existing work on robust subspace learning, where data
samples are contaminated by gross sparse outliers or small dense noise, we
consider a more powerful adversary who can first
observe the data matrix and then intentionally modify the whole data matrix. We
first characterize the optimal rank-one attack strategy that maximizes the
subspace distance between the subspace learned from the original data matrix
and that learned from the modified data matrix. We then generalize the study to
the scenario without the rank constraint and characterize the corresponding
optimal attack strategy. Our analysis shows that the optimal strategies depend
on the singular values of the original data matrix and the adversary's energy
budget. Finally, we provide numerical experiments and practical applications to
demonstrate the efficiency of the attack strategies.
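A minimal numerical sketch of the setting described above, assuming a toy Gaussian data matrix: it forms a Frobenius-norm-bounded rank-one modification and measures how far the learned top-k subspace moves, using the sine of the largest principal angle as the subspace distance. The particular perturbation direction below (mixing the k-th left and (k+1)-th right singular vectors) is only an illustrative heuristic, not the paper's derived optimal strategy.

import numpy as np

def principal_subspace(X, k):
    # The learned subspace is spanned by the top-k right singular vectors of X.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T                                   # (n_features, k)

def subspace_distance(U, V):
    # Sine of the largest principal angle between two k-dimensional subspaces.
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - np.min(cosines) ** 2)))

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
k, budget = 3, 5.0                 # subspace rank and adversary's energy budget (Frobenius norm)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
# Illustrative rank-one modification: mix the k-th left and (k+1)-th right singular
# directions; the paper derives the optimal choice from the singular values and the budget.
delta = budget * np.outer(U[:, k - 1], Vt[k])        # rank one, Frobenius norm equal to budget
X_adv = X + delta

W0 = principal_subspace(X, k)
W1 = principal_subspace(X_adv, k)
print("subspace distance after the rank-one attack:", subspace_distance(W0, W1))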
Understanding Robustness in Teacher-Student Setting: A New Perspective
Adversarial examples have emerged as a ubiquitous property of machine
learning models: a bounded adversarial perturbation can mislead a model into
making arbitrarily incorrect predictions. Such examples provide a way to
assess the robustness of machine learning models as well as a proxy for
understanding the model training process. Extensive studies try to explain the
existence of adversarial examples and provide ways to improve model robustness
(e.g. adversarial training). While they mostly focus on models trained on
datasets with predefined labels, we leverage the teacher-student framework and
assume a teacher model, or oracle, to provide the labels for given instances.
We extend Tian (2019) to the case of low-rank input data and show that student
specialization (a trained student neuron becomes highly correlated with some
teacher neuron at the same layer) still happens within the input subspace, but
the teacher and student nodes can differ wildly outside the data subspace,
which we conjecture leads to adversarial examples. Extensive experiments show
that student specialization correlates strongly with model robustness in
different scenarios, including students trained via standard training,
adversarial training, confidence-calibrated adversarial training, and training
with a robust feature dataset. Our studies could shed light on future
exploration of adversarial examples and on enhancing model robustness via
principled data augmentation.
Comment: Proceedings of the 24th International Conference on Artificial
Intelligence and Statistics (AISTATS) 2021, San Diego, California, USA. PMLR:
Volume 130.
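As a rough illustration of the specialization measure discussed above (training is omitted and the weights below are random stand-ins), the sketch compares a student neuron's first-layer weights to the teacher's after projecting both onto the low-rank data subspace and onto its orthogonal complement; all sizes and the projection-based comparison are assumptions for illustration only.

import numpy as np

rng = np.random.default_rng(1)
d, r, m_teacher, m_student = 100, 10, 20, 40        # input dim, data rank, hidden widths (illustrative)

# Low-rank input subspace spanned by the columns of B.
B, _ = np.linalg.qr(rng.standard_normal((d, r)))
P_in = B @ B.T                                      # projector onto the data subspace
P_out = np.eye(d) - P_in                            # projector onto its orthogonal complement

W_teacher = rng.standard_normal((m_teacher, d))     # teacher first-layer weights
W_student = rng.standard_normal((m_student, d))     # stand-in for trained student weights

def max_cosine(w, W):
    # Highest |cosine similarity| between one student neuron and any teacher neuron.
    sims = (W @ w) / (np.linalg.norm(W, axis=1) * np.linalg.norm(w) + 1e-12)
    return float(np.max(np.abs(sims)))

# After real training, specialization (a high score) is expected inside the data
# subspace, while the two scores can differ wildly outside of it.
for j in range(3):
    w = W_student[j]
    inside = max_cosine(P_in @ w, W_teacher @ P_in)
    outside = max_cosine(P_out @ w, W_teacher @ P_out)
    print(f"student neuron {j}: specialization inside {inside:.3f}, outside {outside:.3f}")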
Estimating Principal Components under Adversarial Perturbations
Robustness is a key requirement for widespread deployment of machine learning
algorithms, and has received much attention in both statistics and computer
science. We study a natural model of robustness for high-dimensional
statistical estimation problems that we call the adversarial perturbation
model. An adversary can perturb every sample arbitrarily up to a specified
magnitude $\delta$ measured in some $\ell_q$ norm, say $\ell_\infty$. Our model
is motivated by emerging paradigms such as low precision machine learning and
adversarial training.
We study the classical problem of estimating the top-$k$ principal subspace
of the Gaussian covariance matrix in high dimensions, under the adversarial
perturbation model. We design a computationally efficient algorithm that given
corrupted data, recovers an estimate of the top-$k$ principal subspace with
error that depends on a robustness parameter that we identify. This
parameter corresponds to the operator norm of the projector onto the
principal subspace, and generalizes well-studied analytic notions of sparsity.
Additionally, in the absence of corruptions, our algorithmic guarantees recover
existing bounds for problems such as sparse PCA and its higher rank analogs. We
also prove that the above dependence on the parameter is almost
optimal asymptotically, not just in a minimax sense, but remarkably for every
instance of the problem. This instance-optimal guarantee shows that the operator norm of the subspace essentially characterizes the estimation error
under adversarial perturbations.
Comment: To appear at COLT 2020.
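A small sketch of the adversarial perturbation model itself, under a spiked Gaussian assumption: every sample is moved by at most delta in the l-infinity norm, and the error of a plain (non-robust) top-k PCA estimate is measured. The paper's robust estimator and its guarantees are not reproduced here.

import numpy as np

rng = np.random.default_rng(2)
n, d, k, delta = 2000, 100, 2, 0.05                 # samples, dimension, subspace rank, l_inf budget

# Spiked Gaussian model: extra variance along a planted k-dimensional subspace.
U_true, _ = np.linalg.qr(rng.standard_normal((d, k)))
X = rng.standard_normal((n, k)) @ (3.0 * U_true.T) + rng.standard_normal((n, d))

# Adversarial perturbation model: every sample is moved by at most delta in l_inf.
# A simple worst-case-flavoured choice: push each sample away from the planted subspace.
residual = X - (X @ U_true) @ U_true.T
X_adv = X + delta * np.sign(residual)               # every entry changes by at most delta

def top_k_subspace(X, k):
    _, _, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
    return Vt[:k].T

def subspace_error(U, V):
    # Frobenius distance between the two projectors, a standard subspace error.
    return float(np.linalg.norm(U @ U.T - V @ V.T))

U_hat = top_k_subspace(X_adv, k)                    # plain PCA, no robustness
print("subspace error under l_inf perturbations:", subspace_error(U_true, U_hat))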
Spanning Attack: Reinforce Black-box Attacks with Unlabeled Data
Adversarial black-box attacks aim to craft adversarial perturbations by
querying input-output pairs of machine learning models. They are widely used to
evaluate the robustness of pre-trained models. However, black-box attacks often
suffer from the issue of query inefficiency due to the high dimensionality of
the input space, and therefore incur a false sense of model robustness. In this
paper, we relax the conditions of the black-box threat model, and propose a
novel technique called the spanning attack. By constraining adversarial
perturbations in a low-dimensional subspace via spanning an auxiliary unlabeled
dataset, the spanning attack significantly improves the query efficiency of a
wide variety of existing black-box attacks. Extensive experiments show that the
proposed method works favorably in both soft-label and hard-label black-box
attacks. Our code is available at https://github.com/wangwllu/spanning_attack
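A hedged sketch of the core idea: principal directions of an auxiliary unlabeled dataset span a low-dimensional subspace, and a simple random-search black-box attack is restricted to that subspace. The model_loss function is a stand-in for real black-box queries, and the base attack here is deliberately simplistic rather than any of the attacks used in the paper.

import numpy as np

rng = np.random.default_rng(3)

def model_loss(x):
    # Stand-in for a query-only (black-box) model loss; replace with real model queries.
    return float(np.sum(x ** 2))

d, m, k, eps, queries = 3072, 500, 64, 8.0 / 255, 200   # illustrative sizes

# Span a low-dimensional perturbation subspace with an auxiliary unlabeled dataset.
X_unlabeled = rng.standard_normal((m, d))
_, _, Vt = np.linalg.svd(X_unlabeled - X_unlabeled.mean(0), full_matrices=False)
basis = Vt[:k]                                      # (k, d): perturbations live in span(basis)

x = rng.random(d)                                   # the input under attack
best_delta, best_loss = np.zeros(d), model_loss(x)

# Simple random-search black-box attack restricted to the spanned subspace.
for _ in range(queries):
    step = rng.standard_normal(k) @ basis
    cand = best_delta + 0.01 * step / np.linalg.norm(step)
    cand = np.clip(cand, -eps, eps)                 # enforce the l_inf budget (may leave the subspace slightly)
    loss = model_loss(x + cand)
    if loss > best_loss:                            # maximise the loss to push toward misclassification
        best_delta, best_loss = cand, loss

print("loss increase found within the spanned subspace:", best_loss - model_loss(x))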
Global Robustness Evaluation of Deep Neural Networks with Provable Guarantees for the $L_0$ Norm
Deployment of deep neural networks (DNNs) in safety- or security-critical
systems requires provable guarantees on their correct behaviour. A common
requirement is robustness to adversarial perturbations in a neighbourhood
around an input. In this paper we focus on the $L_0$ norm and aim to compute,
for a trained DNN and an input, the maximal radius of a safe $L_0$ norm ball around
the input within which there are no adversarial examples. Then we define global
robustness as an expectation of the maximal safe radius over a test data set.
We first show that the problem is NP-hard, and then propose an approximate
approach to iteratively compute lower and upper bounds on the network's
robustness. The approach is \emph{anytime}, i.e., it returns intermediate
bounds and robustness estimates that are gradually, but strictly, improved as
the computation proceeds; \emph{tensor-based}, i.e., the computation is
conducted over a set of inputs simultaneously, instead of one by one, to enable
efficient GPU computation; and has \emph{provable guarantees}, i.e., both the
bounds and the robustness estimates can converge to their optimal values.
Finally, we demonstrate the utility of the proposed approach in practice to
compute tight bounds by applying and adapting the anytime algorithm to a set of
challenging problems, including global robustness evaluation, competitive
attacks, test case generation for DNNs, and local robustness evaluation on
large-scale ImageNet DNNs. We release the code of all case studies via GitHub.
Comment: 42 pages. GitHub: https://github.com/TrustAI/L0-TR
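A toy sketch of one side of the computation, assuming a stand-in linear classifier: a greedy search changes one input coordinate at a time until the prediction flips, and the number of changed coordinates upper-bounds the maximal safe L0 radius at that input. The paper's anytime, tensor-based algorithm and its certified lower bounds are not reproduced.

import numpy as np

rng = np.random.default_rng(4)
w = np.linspace(-1.0, 1.0, 64)                      # stand-in linear classifier; replace with a real DNN

def score(x):                                       # positive-class score of the stand-in model
    return float(w @ x)

x = rng.random(64)
label = score(x) > 0

# Greedy upper bound on the maximal safe L0 radius at x: change one coordinate at a time
# (to 0 or 1) so as to push the score toward the opposite class as fast as possible; the
# number of changes when the prediction flips bounds the safe radius from above.
x_adv = x.copy()
changed = set()
for t in range(1, x.size + 1):
    best_i, best_v, best_s = None, None, None
    for i in range(x.size):
        if i in changed:
            continue
        for v in (0.0, 1.0):
            cand = x_adv.copy()
            cand[i] = v
            s = score(cand)
            if best_s is None or (s < best_s if label else s > best_s):
                best_i, best_v, best_s = i, v, s
    x_adv[best_i] = best_v
    changed.add(best_i)
    if (best_s > 0) != label:
        print(f"adversarial example after changing {t} coordinates "
              f"(upper bound on the safe L0 radius)")
        break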
On the Effect of Low-Rank Weights on Adversarial Robustness of Neural Networks
Recently, there has been an abundance of works on designing Deep Neural
Networks (DNNs) that are robust to adversarial examples. In particular, a
central question is which features of DNNs influence adversarial robustness
and, therefore, can be used to design robust DNNs. In this work, this
problem is studied through the lens of compression which is captured by the
low-rank structure of weight matrices. It is first shown that adversarial
training tends to promote simultaneously low-rank and sparse structure in the
weight matrices of neural networks. This is measured through the notions of
effective rank and effective sparsity. In the reverse direction, when low-rank
structure is promoted by nuclear norm regularization and combined with
sparsity-inducing regularization, neural networks show significantly improved
adversarial robustness. The effect of nuclear norm regularization on
adversarial robustness is most pronounced when it is applied to convolutional
neural networks. Although still not competing with adversarial training, this
result contributes to understanding the key properties of robust classifiers.
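A short sketch of the quantities involved, with an illustrative definition of effective rank (exponential of the entropy of the normalized singular values) and a schematic nuclear-norm penalty; the paper's exact measures and training setup may differ.

import numpy as np

def effective_rank(W):
    # Exponential of the entropy of the normalized singular values (Roy & Vetterli);
    # the paper's exact notions of effective rank/sparsity may differ slightly.
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()
    return float(np.exp(-(p * np.log(p + 1e-12)).sum()))

def nuclear_norm(W):
    # Sum of singular values, used as a penalty that promotes low-rank weights.
    return float(np.linalg.svd(W, compute_uv=False).sum())

rng = np.random.default_rng(5)
W_dense = rng.standard_normal((256, 256))
W_lowrank = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 256))

print("effective rank (dense):   ", round(effective_rank(W_dense), 1))
print("effective rank (low-rank):", round(effective_rank(W_lowrank), 1))

# Schematic regularized objective of the kind described above:
#   total_loss = task_loss + lam_nuc * nuclear_norm(W) + lam_l1 * np.abs(W).sum()
# In practice the nuclear norm is handled with a subgradient or proximal step
# rather than by differentiating through the SVD.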
GraCIAS: Grassmannian of Corrupted Images for Adversarial Security
Input transformation based defense strategies fall short in defending against
strong adversarial attacks. Some successful defenses adopt approaches that
either increase the randomness within the applied transformations, or make the
defense computationally intensive, making it substantially more challenging for
the attacker. However, this limits the applicability of such defenses as a
pre-processing step, much like computationally heavy approaches that use
retraining and network modifications to achieve robustness to perturbations. In
this work, we propose a defense strategy that applies random image corruptions
to the input image alone, constructs a self-correlation-based subspace, and then
applies a projection operation to suppress the adversarial perturbation. Due to its
simplicity, the proposed defense is computationally efficient as compared to
the state-of-the-art, and yet can withstand huge perturbations. Further, we
develop proximity relationships between the projection operator of a clean
image and of its adversarially perturbed version, via bounds relating geodesic
distance on the Grassmannian to matrix Frobenius norms. We empirically show
that our strategy is complementary to other weak defenses like JPEG compression
and can be seamlessly integrated with them to create a stronger defense. We
present extensive experiments on the ImageNet dataset across four different
models, namely InceptionV3, ResNet50, VGG16 and MobileNet, with the
perturbation magnitude set to $\epsilon$ = 16. Unlike state-of-the-art
approaches, even without any retraining, the proposed strategy achieves an
absolute improvement of ~4.5% in defense accuracy on ImageNet.
Comment: 16 pages
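A simplified, single-channel sketch of the kind of defense described above (not the paper's exact construction): several randomly corrupted copies of the input define a self-correlation matrix, and the input is projected onto its top eigen-subspace to suppress perturbation energy.

import numpy as np

rng = np.random.default_rng(6)

def corruption_subspace_defense(img, n_corrupt=10, noise=0.1, r=20):
    # Randomly corrupt the input several times, build a self-correlation matrix from the
    # corrupted copies, and project the input onto its top-r eigen-subspace. Structure shared
    # across corruptions is kept; high-frequency adversarial noise is largely suppressed.
    H, W = img.shape
    copies = [np.clip(img + noise * rng.standard_normal((H, W)), 0.0, 1.0)
              for _ in range(n_corrupt)]
    C = sum(c @ c.T for c in copies) / n_corrupt    # (H, H) self-correlation matrix
    _, U = np.linalg.eigh(C)                        # eigenvalues in ascending order
    U_r = U[:, -r:]                                 # top-r eigenvectors
    return U_r @ (U_r.T @ img)                      # projection of the (possibly attacked) input

x = rng.random((224, 224))                          # stand-in grayscale image in [0, 1]
x_adv = np.clip(x + (16 / 255) * np.sign(rng.standard_normal(x.shape)), 0.0, 1.0)
x_def = corruption_subspace_defense(x_adv)
ratio = np.linalg.norm(x_def - x) / np.linalg.norm(x_adv - x)
print("residual perturbation energy after projection:", round(float(ratio), 3))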
Convolutional Neural Networks with Transformed Input based on Robust Tensor Network Decomposition
Tensor network decomposition, which originated in quantum physics to model
entangled many-particle quantum systems, has turned out to be a promising
mathematical technique for efficiently representing and processing big data in a
parsimonious manner. In this study, we show that tensor networks can
systematically partition structured data, e.g. color images, for distributed
storage and communication in a privacy-preserving manner. Leveraging the sea of
big data and metadata privacy, empirical results show that neighbouring
subtensors with implicit information stored in tensor network formats cannot be
identified for data reconstruction. This technique complements existing
encryption and randomization techniques, which store explicit data
representations in one place and are highly susceptible to adversarial attacks
such as side-channel attacks and de-anonymization. Furthermore, we propose a theory
for adversarial examples that mislead convolutional neural networks to
misclassification using subspace analysis based on singular value decomposition
(SVD). The theory is extended to analyze higher-order tensors using
tensor-train SVD (TT-SVD); it helps to explain the level of susceptibility of
different datasets to adversarial attacks, the structural similarity of
different adversarial attacks including global and localized attacks, and the
efficacy of different adversarial defenses based on input transformation. An
efficient and adaptive algorithm based on robust TT-SVD is then developed to
detect strong and static adversarial attacks.
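For reference, a minimal implementation of the standard TT-SVD algorithm that the analysis above builds on, plus a reconstruction check; the detection algorithm itself and the robust variant are not reproduced here, and the tensor sizes are illustrative.

import numpy as np

def tt_svd(tensor, max_rank):
    # Tensor-train decomposition via sequential truncated SVDs (the standard TT-SVD algorithm).
    dims = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        mat = (np.diag(s[:r]) @ Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.squeeze(axis=(0, -1))

rng = np.random.default_rng(7)
X = rng.random((8, 8, 8, 3))                        # stand-in image folded into a higher-order tensor
X_hat = tt_reconstruct(tt_svd(X, max_rank=4))
# Reconstruction error grows when perturbation energy falls outside the retained TT subspaces,
# which is the kind of signal a detection scheme can monitor.
print("relative TT reconstruction error:", round(float(np.linalg.norm(X - X_hat) / np.linalg.norm(X)), 3))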
Robustness of classifiers to universal perturbations: a geometric perspective
Deep networks have recently been shown to be vulnerable to universal
perturbations: there exist very small image-agnostic perturbations that cause
most natural images to be misclassified by such classifiers. In this paper, we
propose the first quantitative analysis of the robustness of classifiers to
universal perturbations, and draw a formal link between the robustness to
universal perturbations, and the geometry of the decision boundary.
Specifically, we establish theoretical bounds on the robustness of classifiers
under two decision boundary models (flat and curved models). We show in
particular that the robustness of deep networks to universal perturbations is
driven by a key property of their curvature: there exist shared directions
along which the decision boundary of deep networks is systematically positively
curved. Under such conditions, we prove the existence of small universal
perturbations. Our analysis further provides a novel geometric method for
computing universal perturbations, in addition to explaining their properties.
Comment: Published at ICLR 2018.
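A hedged sketch of the geometric intuition: if per-sample boundary directions share a common component, the top singular vector of their stacked matrix is a natural candidate universal direction. The per_sample_direction function is a stand-in for a real boundary-normal computation (e.g. one DeepFool step), and this is not the paper's exact algorithm.

import numpy as np

rng = np.random.default_rng(8)
d, n, eps = 1024, 200, 10.0                         # input dim, number of samples, perturbation budget

def per_sample_direction(x):
    # Stand-in for the direction of the minimal perturbation sending x across the decision
    # boundary (e.g. one DeepFool step on a real classifier).
    g = np.sin(np.arange(d)) + 0.1 * x              # fixed fake boundary normal plus a sample-dependent term
    return g / np.linalg.norm(g)

X = rng.standard_normal((n, d))
D = np.stack([per_sample_direction(x) for x in X])  # (n, d) matrix of per-sample boundary directions

# If the boundary is positively curved along directions shared across inputs, those directions
# dominate the spectrum of D, and its top right singular vector is a natural candidate for a
# small universal perturbation.
_, s, Vt = np.linalg.svd(D, full_matrices=False)
v_universal = eps * Vt[0]
print("fraction of spectral energy in the shared direction:", round(float(s[0] ** 2 / np.sum(s ** 2)), 3))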
Robust Subspace Recovery with Adversarial Outliers
We study the problem of robust subspace recovery (RSR) in the presence of
adversarial outliers. That is, we seek a subspace that contains a large portion
of a dataset when some fraction of the data points are arbitrarily corrupted.
We first examine a theoretical estimator that is intractable to calculate and
use it to derive information-theoretic bounds of exact recovery. We then
propose two tractable estimators: a variant of RANSAC and a simple relaxation
of the theoretical estimator. The two estimators are fast to compute and
achieve state-of-the-art theoretical performance in a noiseless RSR setting
with adversarial outliers. The former estimator achieves better theoretical
guarantees in the noiseless case, while the latter estimator is robust to small
noise, and its guarantees significantly improve with non-adversarial models of
outliers. We give a complete comparison of guarantees for the adversarial RSR
problem, as well as a short discussion on the estimation of affine subspaces.
Comment: 21 pages, 1 table.
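A minimal sketch of a RANSAC-style estimator for this problem, under a noiseless synthetic model: repeatedly fit a candidate subspace to d randomly sampled points and keep the one containing the most points. The sampling scheme, threshold, and guarantees of the paper's actual variant are not reproduced.

import numpy as np

rng = np.random.default_rng(9)
D, d, n_in, n_out = 20, 3, 200, 100                 # ambient dim, subspace dim, inliers, outliers

# Inliers lie exactly on a planted d-dimensional subspace; outliers are arbitrary (here random).
U_true, _ = np.linalg.qr(rng.standard_normal((D, d)))
X = np.vstack([rng.standard_normal((n_in, d)) @ U_true.T,
               5.0 * rng.standard_normal((n_out, D))])

def fit_subspace(points, d):
    _, _, Vt = np.linalg.svd(points, full_matrices=False)
    return Vt[:d].T                                 # orthonormal basis, shape (D, d)

def ransac_rsr(X, d, trials=500, tol=1e-6):
    # Keep the candidate subspace (fit to d random points) that contains the most points.
    best_basis, best_count = None, -1
    for _ in range(trials):
        sample = X[rng.choice(len(X), size=d, replace=False)]
        B = fit_subspace(sample, d)
        dist = np.linalg.norm(X - (X @ B) @ B.T, axis=1)
        count = int(np.sum(dist < tol))
        if count > best_count:
            best_basis, best_count = B, count
    return best_basis, best_count

B_hat, inliers = ransac_rsr(X, d)
print("inliers captured:", inliers, "of", n_in)
print("projector distance to the planted subspace:", round(float(np.linalg.norm(B_hat @ B_hat.T - U_true @ U_true.T)), 6))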