14 research outputs found
Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets
Deep neural net (DNN) compression is crucial for deployment on mobile
devices. Although many successful algorithms exist for compressing naturally
trained DNNs, developing efficient and stable compression algorithms for
robustly trained DNNs remains largely open. In this paper, we focus on a co-design of
efficient DNN compression algorithms and sparse neural architectures for robust
and accurate deep learning. Such a co-design enables us to advance the goal of
accommodating both sparsity and robustness. With this objective in mind, we
leverage relaxed augmented Lagrangian based algorithms to prune the weights
of adversarially trained DNNs at both structured and unstructured levels.
Using Feynman-Kac formalism principled robust and sparse DNNs, we can at
least double the channel sparsity of an adversarially trained ResNet20 for
CIFAR10 classification while improving the natural accuracy by \% and the
robust accuracy under the benchmark iterations of the IFGSM attack by \%.
The code is available at
\url{https://github.com/BaoWangMath/rvsm-rgsm-admm}.
Comment: 16 pages, 7 figures
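As context for the structured (channel-level) pruning described above, the following is a minimal NumPy sketch of a channel-wise soft-thresholding (group-Lasso proximal) step of the kind that relaxed-splitting / augmented-Lagrangian pruners apply to an auxiliary copy of a layer's weights. The function name, tensor shapes, and threshold value are illustrative assumptions, not the paper's implementation; see the linked repository for the authors' code.

```python
import numpy as np

def channel_soft_threshold(w, lam):
    """Channel-wise (group) soft-thresholding.

    A sketch of the proximal step that relaxed-splitting / augmented-Lagrangian
    style pruners apply to an auxiliary copy of a conv layer's weights:
    whole output channels whose norm falls below lam are zeroed out
    (structured sparsity); the remaining channels are shrunk toward zero.

    w   : weight tensor of shape (out_channels, in_channels, kh, kw)
    lam : shrinkage threshold (illustrative value, not a tuned hyperparameter)
    """
    out = np.zeros_like(w)
    for c in range(w.shape[0]):
        norm = np.linalg.norm(w[c])
        if norm > lam:
            out[c] = (1.0 - lam / norm) * w[c]
    return out

# Toy usage: prune a randomly initialised 16-channel 3x3 conv layer.
rng = np.random.default_rng(0)
w = 0.1 * rng.standard_normal((16, 3, 3, 3))
w_pruned = channel_soft_threshold(w, lam=0.5)
kept = int((np.abs(w_pruned).reshape(16, -1).sum(axis=1) > 0).sum())
print(f"channels kept: {kept} / {w.shape[0]}")
```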
Deep Limits of Residual Neural Networks
Neural networks have been very successful in many applications; however, we
often lack a theoretical understanding of what they are actually learning.
This problem becomes apparent when trying to generalise to new data
sets. The contribution of this paper is to show that, for the residual neural
network model, the deep layer limit coincides with a parameter estimation
problem for a nonlinear ordinary differential equation. In particular, whilst
it is known that the residual neural network model is a discretisation of an
ordinary differential equation, we show convergence in a variational sense.
This implies that optimal parameters converge in the deep layer limit. This is
a stronger statement than saying that, for a fixed parameter, the residual
neural network model converges (the latter does not in general imply the
former). Our
variational analysis provides a discrete-to-continuum $\Gamma$-convergence
result for the objective function of the residual neural network training step
to a variational problem constrained by a system of ordinary differential
equations; this rigorously connects the discrete setting to a continuum
problem.
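As a concrete illustration of the discretisation the abstract refers to (standard notation, not necessarily the paper's), a residual block with states $x_n$, parameters $\theta_n$ and depth $N$ can be read as one forward-Euler step of an ordinary differential equation,
\[
x_{n+1} = x_n + \tfrac{1}{N}\, f(x_n, \theta_n), \qquad n = 0, \dots, N-1,
\]
so that, as the depth $N \to \infty$, the layer index becomes a continuous time variable and the network approximates the trajectory of
\[
\dot{x}(t) = f(x(t), \theta(t)), \qquad t \in [0, 1].
\]
The $\Gamma$-convergence result concerns the objective functions built on these two models: minimisers of the discrete (deep network) training problem converge to minimisers of the variational problem constrained by this ODE.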