Efficient Neural Network Robustness Certification with General Activation Functions
Finding minimum distortion of adversarial examples and thus certifying
robustness in neural network classifiers for given data points is known to be a
challenging problem. Nevertheless, recently it has been shown to be possible to
give a non-trivial certified lower bound of minimum adversarial distortion, and
some recent progress has been made towards this direction by exploiting the
piece-wise linear nature of ReLU activations. However, a generic robustness
certification for general activation functions remains largely
unexplored. To address this issue, in this paper we introduce CROWN, a general
framework to certify robustness of neural networks with general activation
functions for given input data points. The novelty of our algorithm lies in
bounding a given activation function with linear and quadratic functions,
allowing it to tackle general activation functions including but not limited to
four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we
facilitate the search for a tighter certified lower bound by adaptively
selecting appropriate surrogates for each neuron activation. Experimental
results show that CROWN on ReLU networks can notably improve the certified
lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while
having comparable computational efficiency. Furthermore, CROWN also
demonstrates its effectiveness and flexibility on networks with general
activation functions, including tanh, sigmoid and arctan.

Comment: Accepted by NIPS 2018. Huan Zhang and Tsui-Wei Weng contributed equally.
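The core relaxation idea, sandwiching each activation between linear functions over its pre-activation interval, can be sketched for the ReLU case. This is a hedged illustration, not CROWN's exact procedure: the function name and the simple adaptive lower-slope heuristic are stand-ins for exposition.

```python
def relu_linear_bounds(l: float, u: float):
    """Linear relaxation of ReLU on a pre-activation interval [l, u].

    Returns (a_lo, b_lo, a_up, b_up) such that for every x in [l, u]:
        a_lo * x + b_lo <= max(x, 0) <= a_up * x + b_up
    """
    if u <= 0:            # neuron always inactive: relu(x) = 0 exactly
        return 0.0, 0.0, 0.0, 0.0
    if l >= 0:            # neuron always active: relu(x) = x exactly
        return 1.0, 0.0, 1.0, 0.0
    # Unstable neuron (l < 0 < u): the tightest linear upper bound is
    # the chord through the points (l, 0) and (u, u).
    a_up = u / (u - l)
    b_up = -a_up * l
    # Any slope in [0, 1] with zero intercept gives a valid lower bound;
    # choosing it per neuron based on the interval tightens the final
    # certificate, which is the spirit of CROWN's adaptive surrogate
    # selection (the rule below is a simplified heuristic).
    a_lo = 1.0 if u >= -l else 0.0
    return a_lo, 0.0, a_up, b_up
```

Propagating such per-neuron bounds backward through the layers yields a linear function of the input, from which a certified lower bound on the minimum adversarial distortion follows. For tanh, sigmoid, or arctan, the chord and tangent lines play the analogous roles.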
TSS: Transformation-Specific Smoothing for Robustness Certification
As machine learning (ML) systems become pervasive, safeguarding their
security is critical. However, recently it has been demonstrated that motivated
adversaries are able to mislead ML systems by perturbing test data using
semantic transformations. While there exists a rich body of research providing
provable robustness guarantees for ML models against norm-bounded
adversarial perturbations, guarantees against semantic perturbations remain
largely underexplored. In this paper, we provide TSS -- a unified framework for
certifying ML robustness against general adversarial semantic transformations.
First, depending on the properties of each transformation, we divide common
transformations into two categories, namely resolvable (e.g., Gaussian blur)
and differentially resolvable (e.g., rotation) transformations. For the former,
we propose transformation-specific randomized smoothing strategies and obtain
strong robustness certification. The latter category covers transformations
that involve interpolation errors, and we propose a novel approach based on
stratified sampling to certify the robustness. Our framework TSS leverages
these certification strategies and combines with consistency-enhanced training
to provide rigorous certification of robustness. We conduct extensive
experiments on over ten types of challenging semantic transformations and show
that TSS significantly outperforms the state of the art. Moreover, to the best
of our knowledge, TSS is the first approach that achieves nontrivial certified
robustness on the large-scale ImageNet dataset. For instance, our framework
achieves 30.4% certified robust accuracy against rotation attacks on ImageNet.
Moreover, to consider a broader range of transformations, we show TSS is also
robust against adaptive attacks and the unforeseen image corruptions in
CIFAR-10-C and ImageNet-C.

Comment: 2021 ACM SIGSAC Conference on Computer and Communications Security (CCS '21)
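For resolvable transformations, the certification step rests on the standard Gaussian randomized-smoothing certificate, applied in the transformation-parameter space rather than pixel space. The sketch below shows that certificate in its generic Cohen-et-al. form; the function name is illustrative, and TSS's actual per-transformation noise distributions and radius formulas differ in the details.

```python
from statistics import NormalDist


def smoothing_certificate(p_a_lower: float, sigma: float) -> float:
    """Certified radius for a Gaussian-smoothed classifier.

    p_a_lower: a lower confidence bound on the probability that the base
    classifier predicts the top class when the transformation parameter
    (e.g. blur kernel size) is perturbed by N(0, sigma^2) noise.
    sigma: the smoothing noise scale.

    Returns the radius in parameter space within which the smoothed
    classifier's prediction provably cannot change.
    """
    if p_a_lower <= 0.5:
        return 0.0  # top class not confident enough: no nontrivial certificate
    # Radius = sigma * Phi^{-1}(p_a_lower), Phi the standard normal CDF.
    return sigma * NormalDist().inv_cdf(p_a_lower)
```

In practice `p_a_lower` is estimated by Monte Carlo sampling with a binomial confidence bound; for differentially resolvable transformations such as rotation, the stratified-sampling step additionally bounds the interpolation error before this certificate is applied.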