Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff
This paper addresses the tradeoff between standard accuracy on clean examples
and robustness against adversarial examples in deep neural networks (DNNs).
Although adversarial training (AT) improves robustness, it degrades the
standard accuracy, thus yielding the tradeoff. To mitigate this tradeoff, we
propose a novel AT method called ARREST, which comprises three components: (i)
adversarial finetuning (AFT), (ii) representation-guided knowledge distillation
(RGKD), and (iii) noisy replay (NR). AFT trains a DNN on adversarial examples
by initializing its parameters with a DNN that is standardly pretrained on
clean examples. RGKD and NR respectively entail a regularization term and an
algorithm to preserve latent representations of clean examples during AFT. RGKD
penalizes the distance between the representations of the standardly pretrained
and AFT DNNs. NR switches input adversarial examples to nonadversarial ones
when the representation changes significantly during AFT. By combining these
components, ARREST achieves both high standard accuracy and robustness.
Experimental results demonstrate that ARREST mitigates the tradeoff more
effectively than previous AT-based methods do.
Comment: Accepted by International Conference on Computer Vision (ICCV) 202
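To make the RGKD and NR components above concrete, here is a minimal PyTorch-style sketch of one adversarial-finetuning step. The feature-extractor handle (`model.features`), the MSE distance, the loss weighting, and the batch-level switching rule are assumptions made for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def arrest_finetune_loss(model, pretrained, x_clean, x_adv, y,
                         rgkd_weight=1.0, nr_threshold=1.0):
    """One adversarial-finetuning step in the spirit of ARREST (illustrative sketch)."""
    with torch.no_grad():
        # Frozen representation from the standardly pretrained network
        # (`.features` is an assumed feature-extractor attribute).
        z_pre = pretrained.features(x_clean)

    # Current representation of the clean inputs under the finetuned model.
    z_clean = model.features(x_clean)

    # RGKD: penalize drift of clean-example representations away from the
    # standardly pretrained network.
    rgkd = F.mse_loss(z_clean, z_pre)

    # NR: if the representation has drifted too far, replay clean
    # (nonadversarial) inputs for this batch instead of adversarial ones.
    x_train = x_clean if rgkd.item() > nr_threshold else x_adv

    ce = F.cross_entropy(model(x_train), y)
    return ce + rgkd_weight * rgkd
```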
With False Friends Like These, Who Can Notice Mistakes?
Adversarial examples crafted by an explicit adversary have attracted
significant attention in machine learning. However, the security risk posed by
a potential false friend has been largely overlooked. In this paper, we unveil
the threat of hypocritical examples -- inputs that are originally misclassified
yet perturbed by a false friend to force correct predictions. While such
perturbed examples seem harmless, we point out for the first time that they
could be maliciously used to conceal the mistakes of a substandard (i.e., not
as good as required) model during an evaluation. Once a deployer trusts the
hypocritical performance and applies the "well-performed" model in real-world
applications, unexpected failures may happen even in benign environments. More
seriously, this security risk seems to be pervasive: we find that many types of
substandard models are vulnerable to hypocritical examples across multiple
datasets. Furthermore, we provide the first attempt to characterize the threat
with a metric called hypocritical risk and try to circumvent it via several
countermeasures. Results demonstrate the effectiveness of the countermeasures,
while the risk remains non-negligible even after adaptive robust training.
Comment: AAAI 202
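As an illustration of the threat model, the following is a minimal sketch of how a false friend might craft a hypocritical example: a PGD-style perturbation that minimizes, rather than maximizes, the loss on the true label so that an originally misclassified input comes out correct. The step size, budget, and projection are placeholder choices, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def hypocritical_example(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft a false-friend perturbation by *descending* the loss (illustrative)."""
    x_hyp = x.clone().detach()
    for _ in range(steps):
        x_hyp.requires_grad_(True)
        loss = F.cross_entropy(model(x_hyp), y)
        grad = torch.autograd.grad(loss, x_hyp)[0]
        # Opposite sign to a standard adversarial attack: push toward the true label.
        x_hyp = x_hyp.detach() - alpha * grad.sign()
        # Project back into the epsilon-ball and the valid pixel range.
        x_hyp = torch.min(torch.max(x_hyp, x - eps), x + eps).clamp(0, 1)
    return x_hyp.detach()
```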
Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples
Evaluating robustness of machine-learning models to adversarial examples is a
challenging problem. Many defenses have been shown to provide a false sense of
security by causing gradient-based attacks to fail, and they have been broken
under more rigorous evaluations. Although guidelines and best practices have
been suggested to improve current adversarial robustness evaluations, the lack
of automatic testing and debugging tools makes it difficult to apply these
recommendations in a systematic manner. In this work, we overcome these
limitations by (i) defining a set of quantitative indicators which unveil
common failures in the optimization of gradient-based attacks, and (ii)
proposing specific mitigation strategies within a systematic evaluation
protocol. Our extensive experimental analysis shows that the proposed
indicators of failure can be used to visualize, debug, and improve current
adversarial robustness evaluations, providing a first concrete step towards
automating and systematizing them. Our
open-source code is available at:
https://github.com/pralab/IndicatorsOfAttackFailure
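For intuition, a toy sketch of what such failure indicators might look like is given below; these particular statistics and thresholds are illustrative guesses, not the authors' definitions (the linked repository contains the actual indicators).

```python
import torch

def attack_failure_indicators(loss_trace, grad_norms, success_mask):
    """Toy diagnostics for a gradient-based attack run (illustrative only)."""
    loss_trace = torch.as_tensor(loss_trace, dtype=torch.float)
    grad_norms = torch.as_tensor(grad_norms, dtype=torch.float)
    success_mask = torch.as_tensor(success_mask, dtype=torch.float)
    return {
        # The attack loss barely moved across iterations: optimization is stuck.
        "stalled_optimization": bool(loss_trace[-1] >= loss_trace[0] - 1e-6),
        # Near-zero gradients often signal gradient masking / obfuscation.
        "vanishing_gradients": bool(grad_norms.mean() < 1e-8),
        # Fraction of samples the attack failed to flip.
        "failure_rate": float(1.0 - success_mask.mean()),
    }
```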
Robust sound event detection in bioacoustic sensor networks
Bioacoustic sensors, sometimes known as autonomous recording units (ARUs),
can record sounds of wildlife over long periods of time in scalable and
minimally invasive ways. Deriving per-species abundance estimates from these
sensors requires detection, classification, and quantification of animal
vocalizations as individual acoustic events. Yet, variability in ambient noise,
both over time and across sensors, hinders the reliability of current automated
systems for sound event detection (SED), such as convolutional neural networks
(CNN) in the time-frequency domain. In this article, we develop, benchmark, and
combine several machine listening techniques to improve the generalizability of
SED models across heterogeneous acoustic environments. As a case study, we
consider the problem of detecting avian flight calls from a ten-hour recording
of nocturnal bird migration, recorded by a network of six ARUs in the presence
of heterogeneous background noise. Starting from a CNN yielding
state-of-the-art accuracy on this task, we introduce two noise adaptation
techniques, respectively integrating short-term (60 milliseconds) and long-term
(30 minutes) context. First, we apply per-channel energy normalization (PCEN)
in the time-frequency domain, which applies short-term automatic gain control
to every subband in the mel-frequency spectrogram. Second, we replace the
last dense layer in the network with a context-adaptive neural network (CA-NN)
layer. Combining them yields state-of-the-art results that are unmatched by
artificial data augmentation alone. We release a pre-trained version of our
best performing system under the name of BirdVoxDetect, a ready-to-use detector
of avian flight calls in field recordings.
Comment: 32 pages, in English. Submitted to PLOS ONE journal in February 2019;
revised August 2019; published October 201
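As a pointer to the short-term adaptation step described above, here is a minimal sketch of PCEN applied to a mel spectrogram with librosa; the file name and parameter values are placeholders rather than the settings used in the study.

```python
import librosa

# Load a (hypothetical) field recording and compute a mel-frequency spectrogram.
y, sr = librosa.load("field_recording.wav", sr=22050)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, hop_length=512)

# Per-channel energy normalization: short-term automatic gain control applied
# independently to every mel subband (parameter values are illustrative).
pcen = librosa.pcen(mel * (2 ** 31), sr=sr, hop_length=512,
                    gain=0.8, bias=10, power=0.25, time_constant=0.06)
```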
Strength-Adaptive Adversarial Training
Adversarial training (AT) has been shown to reliably improve a network's
robustness against adversarial data. However, current AT with a pre-specified
perturbation budget has limitations in learning a robust network. First,
applying a pre-specified perturbation budget to networks of varying model
capacity yields divergent degrees of disparity between natural and robust
accuracy, which deviates from the desideratum of a robust network. Second, the
attack strength of the adversarial training data, constrained by the
pre-specified perturbation budget, fails to increase as network robustness
grows, which leads to robust overfitting and further degrades adversarial
robustness. To overcome these limitations, we propose Strength-Adaptive
Adversarial Training (SAAT). Specifically, the adversary employs an adversarial loss
constraint to generate adversarial training data. Under this constraint, the
perturbation budget will be adaptively adjusted according to the training state
of adversarial data, which can effectively avoid robust overfitting. In addition,
SAAT explicitly constrains the attack strength of training data through the
adversarial loss, which manipulates model capacity scheduling during training,
and thereby can flexibly control the degree of robustness disparity and adjust
the tradeoff between natural accuracy and robustness. Extensive experiments
show that our proposal boosts the robustness of adversarial training
- …
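To illustrate the idea of an adversarial loss constraint, here is a minimal PGD-style sketch in which the adversary keeps strengthening the perturbation until a prescribed loss cap is reached, instead of stopping at a fixed perturbation budget. The function name, stopping rule, and hyperparameters are assumptions for illustration, not the exact SAAT algorithm.

```python
import torch
import torch.nn.functional as F

def loss_constrained_attack(model, x, y, loss_cap=2.0, alpha=2 / 255, max_steps=20):
    """Generate adversarial training data under an adversarial loss constraint
    (illustrative sketch): take gradient-sign steps until the cross-entropy
    loss reaches `loss_cap`, so attack strength adapts to the model's state."""
    x_adv = x.clone().detach()
    for _ in range(max_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        if loss.item() >= loss_cap:  # adversarial loss constraint satisfied
            break
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv.detach() + alpha * grad.sign()).clamp(0, 1)
    return x_adv.detach()
```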