788 research outputs found
On Certifying Non-Uniform Bounds against Adversarial Attacks
This work studies the robustness certification problem of neural network models, which aims to find certified adversary-free regions as large as possible around data points. In contrast to the existing approaches that seek regions bounded uniformly along all input features, we consider non-uniform bounds and use it to study the decision boundary of neural network models. We formulate our target as an optimization problem with nonlinear constraints. Then, a framework applicable for general feedforward neural networks is proposed to bound the output logits so that the relaxed problem can be solved by the augmented Lagrangian method. Our experiments show the non-uniform bounds have larger volumes than uniform ones and the geometric similarity of the non-uniform bounds gives a quantitative, dataagnostic metric of input features’ robustness. Further, compared with normal models, the robust models have even larger non-uniform bounds and better interpretability
Efficient Neural Network Robustness Certification with General Activation Functions
Finding minimum distortion of adversarial examples and thus certifying
robustness in neural network classifiers for given data points is known to be a
challenging problem. Nevertheless, recently it has been shown to be possible to
give a non-trivial certified lower bound of minimum adversarial distortion, and
some recent progress has been made towards this direction by exploiting the
piece-wise linear nature of ReLU activations. However, a generic robustness
certification for general activation functions still remains largely
unexplored. To address this issue, in this paper we introduce CROWN, a general
framework to certify robustness of neural networks with general activation
functions for given input data points. The novelty in our algorithm consists of
bounding a given activation function with linear and quadratic functions, hence
allowing it to tackle general activation functions including but not limited to
four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we
facilitate the search for a tighter certified lower bound by adaptively
selecting appropriate surrogates for each neuron activation. Experimental
results show that CROWN on ReLU networks can notably improve the certified
lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while
having comparable computational efficiency. Furthermore, CROWN also
demonstrates its effectiveness and flexibility on networks with general
activation functions, including tanh, sigmoid and arctan.Comment: Accepted by NIPS 2018. Huan Zhang and Tsui-Wei Weng contributed
equall
Trust, But Verify: A Survey of Randomized Smoothing Techniques
Machine learning models have demonstrated remarkable success across diverse
domains but remain vulnerable to adversarial attacks. Empirical defence
mechanisms often fall short, as new attacks constantly emerge, rendering
existing defences obsolete. A paradigm shift from empirical defences to
certification-based defences has been observed in response. Randomized
smoothing has emerged as a promising technique among notable advancements. This
study reviews the theoretical foundations, empirical effectiveness, and
applications of randomized smoothing in verifying machine learning classifiers.
We provide an in-depth exploration of the fundamental concepts underlying
randomized smoothing, highlighting its theoretical guarantees in certifying
robustness against adversarial perturbations. Additionally, we discuss the
challenges of existing methodologies and offer insightful perspectives on
potential solutions. This paper is novel in its attempt to systemise the
existing knowledge in the context of randomized smoothing
- …