Search CORE

5,519 research outputs found

Robustness Verification for Classifier Ensembles

Author: Gross Dennis
Jansen Nils
Pérez Guillermo A.
Raaijmakers Stephan
Publication venue
Publication date: 01/01/2020
Field of study

We give a formal verification procedure that decides whether a classifier ensemble is robust against arbitrary randomized attacks. Such attacks consist of a set of deterministic attacks and a distribution over this set. The robustness-checking problem consists of assessing, given a set of classifiers and a labelled data set, whether there exists a randomized attack that induces a certain expected loss against all classifiers. We show the NP-hardness of the problem and provide an upper bound on the number of attacks that is sufficient to form an optimal randomized attack. These results provide an effective way to reason about the robustness of a classifier ensemble. We provide SMT and MILP encodings to compute optimal randomized attacks or prove that there is no attack inducing a certain expected loss. In the latter case, the classifier ensemble is provably robust. Our prototype implementation verifies multiple neural-network ensembles trained for image-classification tasks. The experimental results using the MILP encoding are promising both in terms of scalability and the general applicability of our verification procedure

arXiv.org e-Print Archive

Institutional Repository Universiteit Antwerpen

Improved Generalization Bounds for Robust Learning

Author: Attias Idan
Kontorovich Aryeh
Mansour Yishay
Publication venue
Publication date: 02/03/2019
Field of study

We consider a model of robust learning in an adversarial environment. The learner gets uncorrupted training data with access to possible corruptions that may be affected by the adversary during testing. The learner's goal is to build a robust classifier that would be tested on future adversarial examples. We use a zero-sum game between the learner and the adversary as our game theoretic framework. The adversary is limited to

k

possible corruptions for each input. Our model is closely related to the adversarial examples model of Schmidt et al. (2018); Madry et al. (2017). Our main results consist of generalization bounds for the binary and multi-class classification, as well as the real-valued case (regression). For the binary classification setting, we both tighten the generalization bound of Feige, Mansour, and Schapire (2015), and also are able to handle an infinite hypothesis class

H

. The sample complexity is improved from

O(\frac{1}{\epsilon^4}\log(\frac{|H|}{\delta}))

O(\frac{1}{\epsilon^2}(k\log(k)VC(H)+\log\frac{1}{\delta}))

. Additionally, we extend the algorithm and generalization bound from the binary to the multiclass and real-valued cases. Along the way, we obtain results on fat-shattering dimension and Rademacher complexity of

k

-fold maxima over function classes; these may be of independent interest. For binary classification, the algorithm of Feige et al. (2015) uses a regret minimization algorithm and an ERM oracle as a blackbox; we adapt it for the multi-class and regression settings. The algorithm provides us with near-optimal policies for the players on a given training sample.Comment: Appearing at the 30th International Conference on Algorithmic Learning Theory (ALT 2019

arXiv.org e-Print Archive