Certified Training: Small Boxes are All You Need
We propose SABR, a novel certified training method that outperforms
existing methods across perturbation magnitudes on MNIST, CIFAR-10, and
TinyImageNet in terms of both standard and certifiable accuracy. The key
insight behind SABR is that propagating interval bounds for a small but
carefully selected subset of the adversarial input region is sufficient to
approximate the worst-case loss over the whole region while significantly
reducing approximation errors. SABR not only establishes a new state of the
art on all commonly used benchmarks but, more importantly, points to a new
class of certified training methods that promise to overcome the
robustness-accuracy trade-off.
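As a rough illustration of the key insight (not the authors' implementation),
the following NumPy sketch propagates interval bounds through one affine+ReLU
layer over a small box of radius tau placed inside the full eps-ball around
the input; the centre-selection rule below is a placeholder, since the paper
picks this small region carefully.

    import numpy as np

    def ibp_affine(lo, hi, W, b):
        # Interval bound propagation through an affine layer.
        mid, rad = (lo + hi) / 2, (hi - lo) / 2
        new_mid = W @ mid + b
        new_rad = np.abs(W) @ rad
        return new_mid - new_rad, new_mid + new_rad

    def small_box_bounds(x, eps, tau, W, b, centre=None):
        # Propagate a box of radius tau <= eps instead of the full eps-box.
        # 'centre' would normally be selected carefully inside the eps-ball;
        # defaulting to x is an assumption made for brevity.
        if centre is None:
            centre = x
        centre = np.clip(centre, x - (eps - tau), x + (eps - tau))
        lo, hi = ibp_affine(centre - tau, centre + tau, W, b)
        return np.maximum(lo, 0.0), np.maximum(hi, 0.0)  # ReLU is monotone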
Learning Deep Neural Networks by Iterative Linearisation
The excellent real-world performance of deep neural networks has received
increasing attention. Despite having the capacity to overfit significantly,
such large models work better than smaller ones. This phenomenon is often
referred to as the scaling law by practitioners. It is of fundamental
interest to study why the scaling law exists and how it avoids/controls
overfitting. One approach has been to study infinite-width limits of neural
networks (e.g., Neural Tangent Kernels, Gaussian Processes); however, in
practice, these do not fully explain finite networks, as their infinite
counterparts do not learn features. Furthermore, the empirical kernel of
finite networks (i.e., the inner product of feature vectors) changes
significantly during training, in contrast to infinite-width networks. In
this work, we derive an iterative linearised training method. We justify
iterative linearisation as an interpolation between finite analogues of the
infinite-width regime, which do not learn features, and standard gradient
descent training, which does. We show preliminary results where iterative
linearised training works well, noting in particular how much feature
learning is required to achieve comparable performance. We also provide
novel insights into the training behaviour of neural networks.
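A toy NumPy sketch of the interpolation described above, under assumed
details (squared loss, a finite-difference Jacobian, and a tiny fixed
architecture, none of which come from the paper): the model is re-linearised
around its current parameters every K steps, so K = 1 recovers ordinary
gradient descent while a very large K approximates a fixed-feature,
NTK-like regime.

    import numpy as np

    def f(theta, X):
        # Tiny two-layer net with scalar output per row of X (toy model).
        W1, w2 = theta[:8].reshape(4, 2), theta[8:12]
        return np.tanh(X @ W1.T) @ w2

    def jacobian(theta, X, h=1e-5):
        # Finite-difference Jacobian df/dtheta, shape (n_samples, n_params).
        J = np.zeros((X.shape[0], theta.size))
        for i in range(theta.size):
            d = np.zeros_like(theta); d[i] = h
            J[:, i] = (f(theta + d, X) - f(theta - d, X)) / (2 * h)
        return J

    def iterative_linearised_training(X, y, theta, K=50, steps=500, lr=1e-2):
        for t in range(steps):
            if t % K == 0:                      # re-linearise at the anchor
                anchor = theta.copy()
                f0, J = f(anchor, X), jacobian(anchor, X)
            pred = f0 + J @ (theta - anchor)    # linearised prediction
            grad = J.T @ (pred - y) / len(y)    # gradient of squared loss
            theta = theta - lr * grad
        return theta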
From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI
The rising popularity of explainable artificial intelligence (XAI) for
understanding high-performing black boxes has also raised the question of
how to evaluate explanations of machine learning (ML) models. While
interpretability and explainability are often presented as a subjectively
validated binary property, we consider explanation quality a multi-faceted
concept. We identify 12 conceptual properties, such as Compactness and
Correctness, that should be evaluated for comprehensively assessing the
quality of an explanation. Our so-called Co-12 properties serve as a
categorization scheme for systematically reviewing the evaluation practice
of more than 300 papers published in the last 7 years at major AI and ML
conferences that introduce an XAI method. We find that 1 in 3 papers
evaluate exclusively with anecdotal evidence, and 1 in 5 papers evaluate
with users. We also contribute to the call for objective, quantifiable
evaluation methods by presenting an extensive overview of quantitative XAI
evaluation methods. This systematic collection of evaluation methods
provides researchers and practitioners with concrete tools to thoroughly
validate, benchmark, and compare new and existing XAI methods. This also
opens up opportunities to include quantitative metrics as optimization
criteria during model training in order to optimize for accuracy and
interpretability simultaneously.
Comment: Link to website added: https://utwente-dmb.github.io/xai-papers
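As one hedged illustration of treating a quantitative property as an
optimization criterion (the concrete choice below is ours, not the paper's),
a linear classifier can be trained with a Compactness-style sparsity penalty
added to the accuracy objective:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_accurate_and_compact(X, y, lam=0.1, lr=0.1, steps=1000):
        # Joint objective: cross-entropy (accuracy) + lam * L1 norm of the
        # weights, using weight sparsity as a stand-in for a Compactness-style
        # explanation metric on a linear model (illustrative choice only).
        w = np.zeros(X.shape[1])
        for _ in range(steps):
            p = sigmoid(X @ w)
            w -= lr * (X.T @ (p - y) / len(y) + lam * np.sign(w))
        return w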
Unifying Gradients to Improve Real-world Robustness for Deep Networks
The wide deployment of deep neural networks (DNNs) demands increasing
attention to their real-world robustness, i.e., whether a DNN resists
black-box adversarial attacks. Among these, score-based query attacks (SQAs)
are the most threatening, since they can effectively hurt a victim network
with access only to model outputs. Defending against SQAs requires a slight
but artful variation of outputs, because legitimate users of the service see
the same output information as the attacker. In this paper, we propose a
real-world defense that Unifies Gradients (UniG) of different data points,
so that SQAs can only probe a much weaker attack direction that is similar
across samples. Since such universal attack perturbations have been shown to
be less aggressive than input-specific ones, UniG protects real-world DNNs
by presenting attackers with a twisted and less informative attack
direction. We implement UniG efficiently as a plug-and-play Hadamard product
module. In extensive experiments on 5 SQAs, 2 adaptive attacks, and 7
defense baselines, UniG significantly improves real-world robustness without
hurting clean accuracy on CIFAR10 and ImageNet. For instance, on CIFAR10,
UniG maintains 77.80% accuracy under a 2,500-query Square attack, whereas a
state-of-the-art adversarially trained model achieves only 67.34%.
Simultaneously, UniG outperforms all compared baselines in terms of clean
accuracy and achieves the smallest modification of the model output. The
code is released at https://github.com/snowien/UniG-pytorch
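A schematic PyTorch reconstruction of the idea (not the released UniG code):
an element-wise (Hadamard) scaling module is inserted into the network, and
its weights are tuned so that per-sample input gradients collapse toward a
shared direction while outputs stay close to the originals. The loss terms
and hyper-parameters below are assumptions.

    import torch
    import torch.nn as nn

    class HadamardScale(nn.Module):
        # Plug-and-play element-wise scaling of a chosen feature map.
        def __init__(self, feat_dim):
            super().__init__()
            self.scale = nn.Parameter(torch.ones(feat_dim))

        def forward(self, feats):
            return feats * self.scale  # Hadamard product

    def unify_gradients(model, module, x, steps=10, lr=0.01, beta=1.0):
        # Tune 'module.scale' (assumed to sit inside 'model') so that the
        # input gradients of the top score look alike across the batch,
        # while outputs stay close to their original values.
        opt = torch.optim.Adam(module.parameters(), lr=lr)
        with torch.no_grad():
            ref = model(x)
        for _ in range(steps):
            x_in = x.clone().requires_grad_(True)
            out = model(x_in)
            score = out.max(dim=1).values.sum()
            g = torch.autograd.grad(score, x_in, create_graph=True)[0].flatten(1)
            unify = (g - g.mean(dim=0, keepdim=True)).pow(2).mean()
            fidelity = (out - ref).pow(2).mean()
            opt.zero_grad()
            (unify + beta * fidelity).backward()
            opt.step()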
Depth uncertainty in neural networks
Existing methods for estimating uncertainty in deep learning tend to require multiple forward passes, making them unsuitable for applications where computational resources are limited. To address this, we perform probabilistic reasoning over the depth of neural networks. Different depths correspond to subnetworks which share weights and whose predictions are combined via marginalisation, yielding model uncertainty. By exploiting the sequential structure of feed-forward networks, we are able to both evaluate our training objective and make predictions with a single forward pass. We validate our approach on real-world regression and image classification tasks. Our approach provides uncertainty calibration, robustness to dataset shift, and accuracies competitive with more computationally expensive baselines.
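A minimal NumPy sketch of the single-pass idea, with illustrative shapes and
parameter layout (assumptions, not the paper's architecture): every depth
contributes a prediction from the shared forward pass, and the final output
marginalises these predictions under a categorical distribution over depths.

    import numpy as np

    def softmax(z, axis=-1):
        z = z - z.max(axis=axis, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=axis, keepdims=True)

    def depth_marginal_predict(x, blocks, heads, depth_logits):
        # blocks: list of (W, b) shared hidden layers; heads: per-depth
        # output layers (V, c); depth_logits: beliefs over network depth.
        q = softmax(depth_logits)
        h, per_depth = x, []
        for (W, b), (V, c) in zip(blocks, heads):
            h = np.maximum(h @ W + b, 0.0)           # shared block
            per_depth.append(softmax(h @ V + c))     # prediction at this depth
        per_depth = np.stack(per_depth)              # (n_depths, n_classes)
        return (q[:, None] * per_depth).sum(axis=0)  # marginalise over depth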
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Our theoretical understanding of deep learning has not kept pace with its empirical success. While network architecture is known to be critical, we do not yet understand its effect on learned representations and network behavior, or how this architecture should reflect task structure. In this work, we begin to address this gap by introducing the Gated Deep Linear Network framework, which schematizes how pathways of information flow impact learning dynamics within an architecture. Crucially, because of the gating, these networks can compute nonlinear functions of their input. We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning. Our analysis demonstrates that the learning dynamics in structured networks can be conceptualized as a neural race with an implicit bias towards shared representations, which then govern the model's ability to systematically generalize, multi-task, and transfer. We validate our key insights on naturalistic datasets and with relaxed assumptions. Taken together, our work gives rise to general hypotheses relating neural architecture to learning and provides a mathematical approach towards understanding the design of more complex architectures and the role of modularity and compositionality in solving real-world problems. The code and results are available at this https URL
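A small NumPy sketch of the kind of network the framework studies (the
gating rule in the example is a placeholder, not the paper's): each layer is
a linear map whose units are multiplied by a gate pattern, so the network is
linear conditional on the gates but nonlinear overall when the gates depend
on the input or task context.

    import numpy as np

    def gated_deep_linear(x, weights, gates):
        # Forward pass: gate each layer's linear output element-wise.
        h = x
        for W, g in zip(weights, gates):
            h = g * (W @ h)
        return h

    # Toy usage: gates switched by the sign of the first input coordinate.
    rng = np.random.default_rng(0)
    Ws = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
    x = rng.standard_normal(3)
    g1 = (np.arange(4) < 2).astype(float) if x[0] > 0 else (np.arange(4) >= 2).astype(float)
    print(gated_deep_linear(x, Ws, [g1, np.ones(2)]))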