Inverse Abstraction of Neural Networks Using Symbolic Interpolation

Authors: Dathathri Sumanth; Gao Sicun; Murray Richard M.
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 01/02/2019

Neural networks in real-world applications have to satisfy critical properties such as safety and reliability. The analysis of such properties typically requires extracting information through computing pre-images of the network transformations, but it is well known that explicit computation of pre-images is intractable. We introduce new methods for computing compact symbolic abstractions of pre-images by computing their overapproximations and underapproximations through all layers. The abstraction of pre-images enables formal analysis and knowledge extraction without affecting standard learning algorithms. We use inverse abstractions to automatically extract simple control laws and compact representations for pre-images corresponding to unsafe outputs. We illustrate that the extracted abstractions are interpretable and can be used for analyzing complex properties.
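
To make the notion of a pre-image concrete, here is a minimal sketch of the one case where it is easy: the pre-image of a half-space of "unsafe" outputs under a single affine layer y = Wx + b. This is only an illustration and not the paper's method; nonlinear layers such as ReLU are where exact computation becomes intractable and over- and underapproximations are needed. The function name `preimage_halfspace` and all numeric values are hypothetical.

```python
# Toy illustration (assumed setup, not the paper's algorithm):
# the exact pre-image of {y : a.y <= c} under y = W x + b
# is the half-space {x : (W^T a).x <= c - a.b}.

def preimage_halfspace(W, b, a, c):
    """Return (g, d) with {x : g.x <= d} == {x : a.(W x + b) <= c}."""
    n_out, n_in = len(W), len(W[0])
    g = [sum(W[i][j] * a[i] for i in range(n_out)) for j in range(n_in)]  # W^T a
    d = c - sum(a[i] * b[i] for i in range(n_out))                        # c - a.b
    return g, d

def affine(W, b, x):
    """Apply the affine layer y = W x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical layer and unsafe output set y0 + y1 <= 0.
W = [[1.0, 2.0], [0.0, -1.0]]
b = [0.5, 1.0]
a, c = [1.0, 1.0], 0.0

g, d = preimage_halfspace(W, b, a, c)

# Membership in the pre-image agrees with membership of the image in the unsafe set.
for x in ([-1.0, -1.0], [0.0, 0.0], [3.0, -5.0]):
    assert (dot(g, x) <= d) == (dot(a, affine(W, b, x)) <= c)
```

Because the layer is affine, the pre-image is again a single half-space; stacking piecewise-linear layers splits it into many pieces, which is why compact abstractions are attractive.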

Caltech Authors

Association for the Advancement of Artificial Intelligence: AAAI Publications

Scalable Inference of Symbolic Adversarial Examples

Authors: Dimitrov Dimitar I.; Gehr Timon; Singh Gagandeep; Vechev Martin
Publication date: 26/07/2020

We present a novel method for generating symbolic adversarial examples: input regions guaranteed to only contain adversarial examples for the given neural network. These regions can generate real-world adversarial examples as they summarize trillions of adversarial examples. We theoretically show that computing optimal symbolic adversarial examples is computationally expensive. We present a method for approximating optimal examples in a scalable manner. Our method first selectively uses adversarial attacks to generate a candidate region and then prunes this region with hyperplanes that fit points obtained via specialized sampling. It iterates until arriving at a symbolic adversarial example for which it can prove, via state-of-the-art convex relaxation techniques, that the region only contains adversarial examples. Our experimental results demonstrate that our method is practically effective: it only needs a few thousand attacks to infer symbolic summaries guaranteed to contain $\approx 10^{258}$ adversarial examples.
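
The final certification step described above (proving that a region contains only adversarial examples) can be illustrated with a much simpler stand-in. The sketch below assumes a linear two-score model and a box-shaped region, and uses plain interval bounds rather than the convex relaxations the abstract refers to. All names (`margin_upper_bound`, `box_is_symbolic_adversarial`) and all numbers are hypothetical.

```python
# Toy illustration (assumed setup, not the paper's algorithm):
# certify that every x in the box [lo, hi] scores the adversarial class
# above the true class for a linear model score_k(x) = w_k.x + b_k.

def margin_upper_bound(w_true, b_true, w_adv, b_adv, lo, hi):
    """Upper bound over the box of score_true(x) - score_adv(x)."""
    ub = b_true - b_adv
    for wt, wa, l, h in zip(w_true, w_adv, lo, hi):
        coef = wt - wa
        # A linear term is maximised at the upper end if its coefficient is positive.
        ub += coef * (h if coef > 0 else l)
    return ub

def box_is_symbolic_adversarial(w_true, b_true, w_adv, b_adv, lo, hi):
    """True if the true class loses to the adversarial class everywhere in the box."""
    return margin_upper_bound(w_true, b_true, w_adv, b_adv, lo, hi) < 0

# Hypothetical model weights.
w_true, b_true = [1.0, -1.0], 0.0
w_adv, b_adv = [-1.0, 1.0], 0.0

# A small candidate region versus a larger one that is too loose to certify.
small = box_is_symbolic_adversarial(w_true, b_true, w_adv, b_adv,
                                    [-1.0, 0.5], [-0.5, 1.0])
large = box_is_symbolic_adversarial(w_true, b_true, w_adv, b_adv,
                                    [-1.0, -1.0], [1.0, 1.0])
```

In this toy setting the small box is certified and the large one is not, which mirrors why a candidate region has to be pruned before certification can succeed. For a linear model the interval bound is exact; for a real network it would only be a sound overestimate.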

arXiv.org e-Print Archive

Repository for Publications and Research Data