Adversarial robustness of amortized Bayesian inference
Bayesian inference usually requires running potentially costly inference
procedures separately for every new observation. In contrast, the idea of
amortized Bayesian inference is to initially invest computational cost in
training an inference network on simulated data, which can subsequently be used
to rapidly perform inference (i.e., to return estimates of posterior
distributions) for new observations. This approach has been applied to many
real-world models in the sciences and engineering, but it is unclear how robust
the approach is to adversarial perturbations in the observed data. Here, we
study the adversarial robustness of amortized Bayesian inference, focusing on
simulation-based estimation of multi-dimensional posterior distributions. We
show that almost unrecognizable, targeted perturbations of the observations can
lead to drastic changes in the predicted posterior and highly unrealistic
posterior predictive samples, across several benchmark tasks and a real-world
example from neuroscience. We propose a computationally efficient
regularization scheme based on penalizing the Fisher information of the
conditional density estimator, and show how it improves the adversarial
robustness of amortized Bayesian inference.
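To make the quantity behind the regularizer concrete, here is a minimal sketch that estimates the trace of the Fisher information of a conditional density with respect to the observation. It is not the authors' implementation: the linear-Gaussian estimator q(theta | x) = N(W x, sigma^2 I), the matrix W, and both function names are assumptions chosen because the exact value is known in closed form (||W||_F^2 / sigma^2).

```python
import numpy as np

def fisher_trace_mc(W, sigma, x, n_samples=20000, seed=0):
    """Monte Carlo estimate of tr I(x) for the toy estimator
    q(theta | x) = N(W x, sigma^2 I) (a hypothetical stand-in for a network)."""
    rng = np.random.default_rng(seed)
    mean = W @ x
    # Draw theta from the estimator's own predicted posterior.
    theta = mean + sigma * rng.standard_normal((n_samples, W.shape[0]))
    # Score with respect to the observation x: W^T (theta - W x) / sigma^2.
    scores = (theta - mean) @ W / sigma**2  # shape (n_samples, dim_x)
    # tr E[s s^T] = E[||s||^2].
    return float(np.mean(np.sum(scores**2, axis=1)))

def fisher_trace_exact(W, sigma):
    """Closed form for the same toy model: ||W||_F^2 / sigma^2."""
    return float(np.sum(W**2) / sigma**2)
```

In this toy model a large trace means that small changes in x move the predicted posterior a lot, so adding such a term to the training loss discourages sensitivity to perturbed observations.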
Learning Provably Robust Estimators for Inverse Problems via Jittering
Deep neural networks provide excellent performance for inverse problems such
as denoising. However, neural networks can be sensitive to adversarial or
worst-case perturbations. This raises the question of whether such networks can
be trained efficiently to be worst-case robust. In this paper, we investigate
whether jittering, a simple regularization technique that adds isotropic
Gaussian noise during training, is effective for learning worst-case robust
estimators for inverse problems. While well studied for prediction in
classification tasks, the effectiveness of jittering for inverse problems has
not been systematically investigated. In this paper, we present a novel
analytical characterization of the optimal $\ell_2$-worst-case robust estimator
for linear denoising and show that jittering yields optimal robust denoisers.
Furthermore, we examine jittering empirically via training deep neural networks
(U-nets) for natural image denoising, deconvolution, and accelerated magnetic
resonance imaging (MRI). The results show that jittering significantly enhances
the worst-case robustness, but can be suboptimal for inverse problems beyond
denoising. Moreover, our results imply that training on real data, which often
contains slight noise, is somewhat robustness-enhancing.
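As an illustration of jittering in the simplest setting, the sketch below fits a linear denoiser by least squares, optionally adding isotropic Gaussian noise to the training inputs. The data model (unit-variance Gaussian signals plus Gaussian measurement noise), the function names, and all parameter values are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def make_data(n=5000, dim=4, noise_std=0.3, seed=1):
    """Toy denoising data: y = x + z with Gaussian signal x and noise z."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, dim))
    y = x + noise_std * rng.standard_normal((n, dim))
    return x, y

def fit_linear_denoiser(x, y, jitter_std=0.0, seed=0):
    """Least-squares fit of H in x_hat = H y, trained on jittered inputs."""
    rng = np.random.default_rng(seed)
    # Jittering: perturb the training inputs with isotropic Gaussian noise.
    y_jit = y + jitter_std * rng.standard_normal(y.shape)
    # Rows are samples, so solve y_jit @ H^T ~= x for H^T.
    H_T, *_ = np.linalg.lstsq(y_jit, x, rcond=None)
    return H_T.T

x, y = make_data()
H_plain = fit_linear_denoiser(x, y, jitter_std=0.0)
H_jit = fit_linear_denoiser(x, y, jitter_std=0.5)
```

In this toy model the jittered fit shrinks the estimator toward zero (roughly I / (1 + 0.3^2 + 0.5^2) instead of I / (1 + 0.3^2)). Because ||H e|| <= ||H||_2 ||e|| for any perturbation e, a smaller spectral norm bounds how much a worst-case perturbation of the input can change the output.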
Gaussian class-conditional simplex loss for accurate, adversarially robust deep classifier training
In this work, we present the Gaussian Class-Conditional Simplex (GCCS) loss: a novel approach for training deep robust multiclass classifiers that improves over the state-of-the-art in terms of classification accuracy and adversarial robustness, with little extra cost for network training. The proposed method learns a mapping of the input classes onto Gaussian target distributions in a latent space such that a hyperplane can be used as the optimal decision surface. Instead of maximizing the likelihood of target labels for individual samples, our loss function pushes the network to produce feature distributions yielding high inter-class separation and low intra-class separation. The mean values of the learned distributions are centered on the vertices of a simplex such that each class is at the same distance from every other class. We show that the regularization of the latent space based on our approach yields excellent classification accuracy. Moreover, GCCS provides improved robustness against adversarial perturbations, outperforming models trained with conventional adversarial training (AT). In particular, our model learns a decision space that minimizes the presence of short paths toward neighboring decision regions. We provide a comprehensive empirical evaluation that shows how GCCS outperforms state-of-the-art approaches over challenging datasets for targeted and untargeted gradient-based, as well as gradient-free adversarial attacks, both in terms of classification accuracy and adversarial robustness.
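The equidistance property of simplex vertices can be checked with a short sketch. The construction below (centered one-hot vectors) is one standard way to obtain a regular simplex and is an assumption here; the abstract does not say how the authors place or scale their class means, and both function names are hypothetical.

```python
import numpy as np

def simplex_vertices(num_classes):
    """Return num_classes points with equal pairwise distances.

    Centered one-hot vectors form a regular simplex; this is an
    illustrative choice, not necessarily the construction used for GCCS.
    """
    eye = np.eye(num_classes)
    # Subtract the centroid so the class means are centered at the origin.
    return eye - eye.mean(axis=0, keepdims=True)

def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)
```

With this construction every pair of distinct class means is sqrt(2) apart, so no class is closer to one neighbor than to another.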