Finding the minimum adversarial distortion, and thus certifying the robustness of neural network classifiers at given data points, is known to be a challenging problem. Nevertheless, it has recently been shown that a non-trivial certified lower bound on the minimum adversarial distortion can be obtained, and progress in this direction has been made by exploiting the piece-wise linear nature of ReLU activations. However, robustness certification for networks with general activation functions remains largely unexplored. To address this issue, in this paper we introduce CROWN, a general
framework for certifying the robustness of neural networks with general activation functions at given input data points. The novelty of our algorithm lies in bounding a given activation function with linear and quadratic functions, which allows it to tackle general activation functions, including but not limited to four popular choices: ReLU, tanh, sigmoid, and arctan.
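To make the linear-bounding idea concrete, the sketch below computes sound linear upper and lower bounds for a sigmoid activation on a pre-activation interval [l, u]. It is a simplified variant with illustrative function names, not CROWN's exact per-segment rule: both lines share the chord slope, and the intercepts are shifted just enough to enclose the curve.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def linear_bounds_sigmoid(l, u):
    """Sound linear bounds  a*x + b_lo <= sigmoid(x) <= a*x + b_up  on [l, u].

    Illustrative sketch, not CROWN's exact rule: both lines use the chord
    slope a; the intercepts come from the exact extrema of sigmoid(x) - a*x,
    which occur at the endpoints or where sigmoid'(x) = s(x)(1 - s(x)) = a.
    """
    a = (sigmoid(u) - sigmoid(l)) / (u - l)   # chord slope, in (0, 1/4]
    candidates = [l, u]                       # endpoint candidates
    disc = 1.0 - 4.0 * a                      # discriminant of s^2 - s + a = 0
    if disc > 0:                              # interior critical points exist
        for s in ((1 + np.sqrt(disc)) / 2, (1 - np.sqrt(disc)) / 2):
            x = np.log(s / (1 - s))           # logit: invert sigmoid(x) = s
            if l < x < u:
                candidates.append(x)
    gaps = [sigmoid(x) - a * x for x in candidates]
    return a, min(gaps), max(gaps)            # slope, lower/upper intercepts
```

CROWN then propagates such per-neuron linear bounds backward through the layers to obtain closed-form bounds on the network output.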
In addition, we facilitate the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation.
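For ReLU, one common adaptive rule, sketched below under the same caveats (illustrative names, not necessarily the exact heuristic in the paper), fixes the upper line to the chord and picks the lower line's slope per neuron so as to minimize the area between the line and the activation:

```python
def relu_bound_lines(l, u):
    """Linear bounds (slope, intercept) for ReLU on [l, u], lower and upper.

    Illustrative sketch. For an unstable neuron (l < 0 < u) the upper
    line must be the chord, while any lower slope alpha in [0, 1] is
    sound; choosing alpha adaptively per neuron tightens the bound.
    """
    if u <= 0:                        # always inactive: ReLU(x) = 0
        return (0.0, 0.0), (0.0, 0.0)
    if l >= 0:                        # always active: ReLU(x) = x
        return (1.0, 0.0), (1.0, 0.0)
    k = u / (u - l)                   # chord slope through (l, 0) and (u, u)
    upper = (k, -k * l)
    alpha = 1.0 if u >= -l else 0.0   # adaptive choice: smaller enclosed area
    return (alpha, 0.0), upper
```

By contrast, Fast-Lin fixes the lower slope to the chord slope k for every neuron; letting it vary per neuron is one source of CROWN's tighter bounds.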
Experimental results show that, on ReLU networks, CROWN notably improves the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while maintaining comparable computational efficiency. Furthermore, CROWN also
demonstrates its effectiveness and flexibility on networks with general
activation functions, including tanh, sigmoid, and arctan.

Comment: Accepted by NIPS 2018. Huan Zhang and Tsui-Wei Weng contributed equally.