Property-driven Training: All You (N)Ever Wanted to Know About
Neural networks are known for their ability to detect general patterns in
noisy data. This makes them a popular tool for perception components in complex
AI systems. Paradoxically, they are also known for being vulnerable to
adversarial attacks. In response, various methods such as adversarial training,
data augmentation and Lipschitz robustness training have been proposed as means
of improving their robustness. However, as this paper explores, these training
methods each optimise for a different definition of robustness. We perform an
in-depth comparison of these different definitions, including their
relationship, assumptions, interpretability and verifiability after training.
We also look at constraint-driven training, a general approach designed to
encode arbitrary constraints, and show that not all of these definitions are
directly encodable. Finally, we perform experiments to compare the applicability
and efficacy of the training methods at ensuring the network obeys these
different definitions. These results highlight that encoding even a piece of
knowledge as simple as robustness into neural network training is
fraught with difficult choices and pitfalls.
Comment: 10 pages, under review