Detecting adversarial manipulation using inductive Venn-ABERS predictors

Authors: Bart Goossens, Jonathan Peck, Yvan Saeys
Publication venue: Elsevier BV
Publication date: 01/01/2020

Inductive Venn-ABERS predictors (IVAPs) are a type of probabilistic predictor with the theoretical guarantee that their predictions are perfectly calibrated. In this paper, we propose to exploit this calibration property for the detection of adversarial examples in binary classification tasks. By rejecting predictions when the uncertainty of the IVAP is too high, we obtain an algorithm that is both accurate on the original test set and resistant to adversarial examples. This robustness is observed on adversarial examples generated for the underlying model as well as on those generated by taking the IVAP into account. The method appears to offer competitive robustness compared to the state of the art in adversarial defense, yet it is computationally much more tractable.
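
The rejection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the usual IVAP construction: the test score is added to a held-out calibration set once with label 0 and once with label 1, an isotonic regression is fitted each time, and the two fitted values at the test score form an interval [p0, p1]. Using the width p1 - p0 as the uncertainty measure, the threshold name `beta`, the merged probability p1 / (1 - p0 + p1), and all function names are assumptions made for illustration.

```python
def isotonic_value_at(points, x):
    """Least-squares non-decreasing fit (pool-adjacent-violators), evaluated at x.

    points: list of (score, label) pairs with labels in {0, 1}; x must be one
    of the scores. Tied scores are pooled before the fit.
    """
    # Pool tied scores: score -> (sum of labels, count).
    agg = {}
    for s, y in points:
        tot, n = agg.get(s, (0.0, 0))
        agg[s] = (tot + y, n + 1)

    # Each block holds [sum of labels, count, largest score in the block].
    blocks = []
    for s in sorted(agg):
        tot, n = agg[s]
        blocks.append([tot, n, s])
        # Merge while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            t, m, hi = blocks.pop()
            blocks[-1][0] += t
            blocks[-1][1] += m
            blocks[-1][2] = hi

    for tot, n, hi in blocks:
        if x <= hi:
            return tot / n
    raise ValueError("x is not among the fitted scores")


def ivap_interval(calibration, score):
    """Return (p0, p1): fitted values at `score` under each hypothetical label."""
    p0 = isotonic_value_at(calibration + [(score, 0)], score)
    p1 = isotonic_value_at(calibration + [(score, 1)], score)
    return p0, p1


def predict_or_reject(calibration, score, beta):
    """Return 0 or 1, or None when the interval is wider than beta (assumed rule)."""
    p0, p1 = ivap_interval(calibration, score)
    if p1 - p0 > beta:
        return None  # too uncertain: reject, e.g. as a suspected adversarial input
    p = p1 / (1.0 - p0 + p1)  # one common way to merge the pair into a single probability
    return 1 if p > 0.5 else 0


# Hypothetical calibration data: (score of the underlying model, true label).
calibration = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 1),
               (0.6, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
print(ivap_interval(calibration, 0.95))
print(predict_or_reject(calibration, 0.95, beta=0.3))
print(predict_or_reject(calibration, 0.95, beta=0.1))
```

With a calibration set this small the intervals are wide, so a strict threshold rejects almost everything. In practice the threshold would be tuned on held-out data, trading accuracy on clean inputs against the rejection rate.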

Ghent University Academic Bibliography