200 research outputs found

Regularisation of Neural Networks by Enforcing Lipschitz Continuity

Author: Cree Michael J.
Frank Eibe
Gouk Henry
Pfahringer Bernhard
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

We investigate the effect of explicitly enforcing the Lipschitz continuity of neural networks with respect to their inputs. To this end, we provide a simple technique for computing an upper bound to the Lipschitz constant, for multiple p-norms, of a feed forward neural network composed of commonly used layer types. Our technique is then used to formulate training a neural network with a bounded Lipschitz constant as a constrained optimisation problem that can be solved using projected stochastic gradient methods. Our evaluation study shows that the performance of the resulting models exceeds that of models trained with other common regularisers. We also provide evidence that the hyperparameters are intuitive to tune, demonstrate how the choice of norm for computing the Lipschitz constant impacts the resulting model, and show that the performance gains provided by our method are particularly noticeable when only a small amount of training data is available.
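As a rough illustration of the idea in this abstract, the sketch below bounds the Lipschitz constant of a stack of fully connected layers by the product of their induced matrix p-norms, and rescales a weight matrix back into the feasible set after a gradient step. It assumes 1-Lipschitz activations (such as ReLU) between layers and only covers dense layers. The function names and the hyperparameter `lam` are hypothetical, chosen for this example, and are not taken from the paper's code.

```python
import numpy as np

def operator_norm(W, p):
    """Induced matrix p-norm of a 2-D weight matrix, for p in {1, 2, np.inf}."""
    return float(np.linalg.norm(W, ord=p))

def lipschitz_upper_bound(weights, p):
    """Product of per-layer operator norms.

    This upper-bounds the Lipschitz constant of the whole network,
    assuming every activation between layers is 1-Lipschitz.
    """
    bound = 1.0
    for W in weights:
        bound *= operator_norm(W, p)
    return bound

def project_weights(W, lam, p):
    """Rescale W so that its operator norm is at most lam (assumed hyperparameter)."""
    return W / max(1.0, operator_norm(W, p) / lam)

def projected_sgd_step(weights, grads, lr, lam, p):
    """One plain gradient step per layer, followed by the rescaling above."""
    return [project_weights(W - lr * G, lam, p) for W, G in zip(weights, grads)]
```

With `lam` applied to each of `k` layers, the bound returned by `lipschitz_upper_bound` is at most `lam ** k`, which is one way to read the claim that the constraint is easy to tune.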

arXiv.org e-Print Archive

Research Commons@Waikato

Edinburgh Research Explorer

Optimising Network Architectures for Provable Adversarial Robustness

Author: Gouk Henry
Hospedales Timothy M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/11/2020

Crossref

Edinburgh Research Explorer

MaxGain: Regularisation of neural networks by constraining activation magnitudes

Author: Cree Michael J.
Frank Eibe
Gouk Henry
Pfahringer Bernhard
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2018

Effective regularisation of neural networks is essential to combat overfitting due to the large number of parameters involved. We present an empirical analogue to the Lipschitz constant of a feed-forward neural network, which we refer to as the maximum gain. We hypothesise that constraining the gain of a network will have a regularising effect, similar to how constraining the Lipschitz constant of a network has been shown to improve generalisation. A simple algorithm is provided that involves rescaling the weight matrix of each layer after each parameter update. We conduct a series of studies on common benchmark datasets, and also a novel dataset that we introduce to enable easier significance testing for experiments using convolutional networks. Performance on these datasets compares favourably with other common regularisation techniques. Data related to this paper is available at: https://www.cs.waikato.ac.nz/~ml/sins10/
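The rescaling step described in this abstract might look roughly like the sketch below. It treats a single linear layer, measures the largest ratio of output norm to input norm over a batch of inputs (the "gain" on that batch), and shrinks the weight matrix when that ratio exceeds a limit. The names `max_gain`, `rescale_to_gain`, and `gamma` are hypothetical, and the exact procedure in the paper may differ.

```python
import numpy as np

def max_gain(W, X, p=2, eps=1e-12):
    """Largest ||W x||_p / ||x||_p over the rows x of the batch X.

    X has shape (batch, in_features); W has shape (out_features, in_features).
    eps guards against division by zero for all-zero inputs.
    """
    Y = X @ W.T
    in_norms = np.maximum(np.linalg.norm(X, ord=p, axis=1), eps)
    out_norms = np.linalg.norm(Y, ord=p, axis=1)
    return float(np.max(out_norms / in_norms))

def rescale_to_gain(W, X, gamma, p=2):
    """Shrink W so that its gain on the batch X is at most gamma (assumed limit)."""
    return W / max(1.0, max_gain(W, X, p) / gamma)
```

Because the gain is measured only on observed inputs, it never exceeds the layer's operator norm, so this constraint is no tighter than the corresponding Lipschitz constraint for the same limit.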

arXiv.org e-Print Archive

Research Commons@Waikato

CLIP: Cheap Lipschitz Training of Neural Networks

Author: Bungert Leon
Raab René
Roith Tim
Schwinn Leo
Tenbrinck Daniel
Publication date: 23/03/2021

Despite the large success of deep neural networks (DNN) in recent years, most neural networks still lack mathematical guarantees in terms of stability. For instance, DNNs are vulnerable to small or even imperceptible input perturbations, so-called adversarial examples, that can cause false predictions. This instability can have severe consequences in applications which influence the health and safety of humans, e.g., biomedical imaging or autonomous driving. While bounding the Lipschitz constant of a neural network improves stability, most methods rely on restricting the Lipschitz constants of each layer, which gives a poor bound for the actual Lipschitz constant. In this paper we investigate a variational regularization method named CLIP for controlling the Lipschitz constant of a neural network, which can easily be integrated into the training procedure. We mathematically analyze the proposed model, in particular discussing the impact of the chosen regularization parameter on the output of the network. Finally, we numerically evaluate our method on both a nonlinear regression problem and the MNIST and Fashion-MNIST classification databases, and compare our results with a weight regularization approach. Comment: 12 pages, 2 figures, accepted at SSVM 202
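The gap between a layer-wise bound and the actual Lipschitz constant, which this abstract points to, can be seen on a toy two-layer ReLU network. The sketch below is not the CLIP algorithm; it only compares the product of spectral norms with a difference-quotient estimate computed on randomly sampled input pairs. The weights, sample sizes, and function names are all assumptions made for the example.

```python
import numpy as np

# Toy weights (assumed): each layer has spectral norm 2.
W1 = np.array([[1.0, 0.0], [0.0, 2.0]])
W2 = np.array([[2.0, 0.0], [0.0, 1.0]])

def f(X):
    """Two-layer ReLU network applied row-wise to X of shape (n, 2)."""
    return np.maximum(X @ W1.T, 0.0) @ W2.T

def layerwise_bound(weights):
    """Product of per-layer spectral norms (an upper bound for 1-Lipschitz activations)."""
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weights]))

def sampled_lipschitz_estimate(fn, X, Y):
    """Largest ||fn(x) - fn(y)|| / ||x - y|| over paired rows of X and Y.

    This is a lower bound on the true Lipschitz constant of fn.
    """
    num = np.linalg.norm(fn(X) - fn(Y), axis=1)
    den = np.linalg.norm(X - Y, axis=1)
    return float(np.max(num / den))

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
Y = rng.normal(size=(1000, 2))

bound = layerwise_bound([W1, W2])               # 2 * 2 = 4
estimate = sampled_lipschitz_estimate(f, X, Y)  # cannot exceed 2 for this network
```

Here the network computes `(2 * relu(x1), 2 * relu(x2))`, so its true Lipschitz constant in the Euclidean norm is 2, half of the layer-wise bound. A regulariser built on sampled pairs targets the smaller quantity directly.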

arXiv.org e-Print Archive