On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory
This paper tackles the problem of Lipschitz regularization of Convolutional
Neural Networks. Lipschitz regularity is now established as a key property of
modern deep learning with implications in training stability, generalization,
robustness against adversarial examples, etc. However, computing the exact
value of the Lipschitz constant of a neural network is known to be NP-hard.
Recent attempts from the literature introduce upper bounds to approximate this
constant that are either efficient but loose or accurate but computationally
expensive. In this work, by leveraging the theory of Toeplitz matrices, we
introduce a new upper bound for convolutional layers that is both tight and
easy to compute. Based on this result we devise an algorithm to train Lipschitz
regularized Convolutional Neural Networks.
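The Toeplitz/circulant connection behind such bounds can be illustrated with a toy numpy sketch (an illustration of the underlying linear algebra, not the paper's algorithm): for a *circular* convolution, the operator matrix is circulant, so its singular values are exactly the magnitudes of the DFT of the kernel, and the spectral norm is therefore cheap to compute.

```python
import numpy as np

def circulant(c):
    """Circulant matrix with first column c: C[i, j] = c[(i - j) % n]."""
    n = len(c)
    return np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])

# Toy 1-D kernel, zero-padded to the signal length n = 4 (arbitrary choice)
k = np.array([1.0, -2.0, 0.5, 0.0])
C = circulant(k)                     # matrix of the circular convolution

exact = np.linalg.norm(C, 2)         # exact spectral norm (largest singular value)
bound = np.abs(np.fft.fft(k)).max()  # DFT magnitudes = singular values (circulant is normal)

print(exact, bound)                  # the two values coincide
```

For true (zero-padded) convolutions the matrix is Toeplitz rather than circulant and the DFT values are no longer exact singular values, which is where bounds such as the paper's come in.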
Robust low-rank training via approximate orthonormal constraints
With the growth of model and data sizes, a broad effort has been made to
design pruning techniques that reduce the resource demand of deep learning
pipelines, while retaining model performance. In order to reduce both inference
and training costs, a prominent line of work uses low-rank matrix
factorizations to represent the network weights. Although able to retain
accuracy, we observe that low-rank methods tend to compromise model robustness
against adversarial perturbations. By modeling robustness in terms of the
condition number of the neural network, we argue that this loss of robustness
is due to the exploding singular values of the low-rank weight matrices. Thus,
we introduce a robust low-rank training algorithm that maintains the network's
weights on the low-rank matrix manifold while simultaneously enforcing
approximate orthonormal constraints. The resulting model reduces both training
and inference costs while ensuring well-conditioning and thus better
adversarial robustness, without compromising model accuracy. This is shown by
extensive numerical evidence and by our main approximation theorem that shows
the computed robust low-rank network well-approximates the ideal full model,
provided a high-performing low-rank sub-network exists.
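The core mechanism, keeping the low-rank factors (approximately) orthonormal so that conditioning is controlled entirely by the explicit singular values, can be sketched in numpy. This is a schematic illustration assuming a factorization W = U diag(s) Vᵀ; the QR-based retraction used here is a generic Stiefel-manifold retraction, not necessarily the paper's exact update rule.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 8, 3                         # toy layer size and rank (arbitrary choices)

def retract(A):
    """QR-based retraction onto the Stiefel manifold (orthonormal columns)."""
    Q, R = np.linalg.qr(A)
    return Q * np.sign(np.diag(R))  # fix column signs for a canonical representative

# Low-rank parameterization W = U @ diag(s) @ V.T with orthonormal U, V
U = retract(rng.normal(size=(n, r)))
V = retract(rng.normal(size=(n, r)))
s = np.array([2.0, 1.5, 1.0])       # singular values we control directly

W = U @ np.diag(s) @ V.T
# With orthonormal factors, the nonzero singular values of W are exactly s,
# so the condition number on its row space is s.max() / s.min() by construction.
top = np.linalg.svd(W, compute_uv=False)[:r]
```

Without the orthonormality constraint, gradient updates can inflate the singular values of the factors themselves, which is exactly the exploding-condition-number failure mode the abstract describes.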
Theoretical Perspectives on Deep Learning Methods in Inverse Problems
In recent years, there have been significant advances
in the use of deep learning methods in inverse problems such as
denoising, compressive sensing, inpainting, and super-resolution.
While this line of work has predominantly been driven by
practical algorithms and experiments, it has also given rise to
a variety of intriguing theoretical problems. In this paper, we
survey some of the prominent theoretical developments in this line
of work, focusing in particular on generative priors, untrained
neural network priors, and unfolding algorithms. In addition to
summarizing existing results on these topics, we highlight several
ongoing challenges and open problems.
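Unfolding algorithms, one of the surveyed topics, turn the iterations of a classical solver into the layers of a network. A minimal numpy sketch of the base iteration (ISTA for sparse recovery; in unfolded variants such as LISTA, the matrices and thresholds below become learned parameters — problem sizes here are arbitrary toy choices):

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(1)
m, n, k = 20, 50, 3                       # toy problem sizes (arbitrary choices)
A = rng.normal(size=(m, n)) / np.sqrt(m)  # random sensing matrix
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = [1.0, -2.0, 1.5]
y = A @ x_true                            # noiseless measurements

step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L, L = Lipschitz constant of the gradient
lam = 0.01                                # l1 regularization weight
x = np.zeros(n)
for _ in range(200):                      # each iteration = one "layer" when unrolled
    x = soft(x + step * (A.T @ (y - A @ x)), step * lam)
```

After enough iterations the estimate approaches the sparse ground truth, which is the behaviour an unfolded network reproduces in a fixed, small number of learned layers.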
Deep Learning for Inverse Problems: Performance Characterizations, Learning Algorithms, and Applications
Deep learning models have witnessed immense empirical success over the last decade. However, in spite of their widespread adoption, a profound understanding of the generalization behaviour of these over-parameterized architectures is still missing. In this thesis, we provide one such understanding via data-dependent characterizations of the generalization capability of deep neural networks based on data representations. In particular, by building on the algorithmic robustness framework, we offer a generalization error bound that encapsulates key ingredients of the learning problem, such as the complexity of the data space, the cardinality of the training set, and the Lipschitz properties of the deep neural network.
We then specialize our analysis to a specific class of model-based regression problems, namely inverse problems. These problems often come with well-defined forward operators that map the variables of interest to the observations. It is therefore natural to ask whether such knowledge of the forward operator can be exploited in the deep learning approaches increasingly used to solve inverse problems. We offer a generalization error bound that, apart from other factors, depends on the Jacobian of the composition of the forward operator with the neural network.
Motivated by our analysis, we then propose a 'plug-and-play' regularizer that leverages knowledge of the forward map to improve the generalization of the network. We also provide a method to tightly upper-bound the norms of the Jacobians of the relevant operators that is much more computationally efficient than existing ones. We demonstrate the efficacy of our model-aware regularized deep learning algorithms against other state-of-the-art approaches on inverse problems involving various sub-sampling operators, such as those used in the classical compressed sensing setup, as well as inverse problems of interest in biomedical imaging.
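Jacobian spectral norms of the kind bounded here can be estimated by power iteration. A toy numpy sketch for a two-layer ReLU network (a generic illustration of the estimation idea, not the thesis's specific bounding method; all sizes and weights are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(size=(16, 8))   # toy two-layer ReLU network (arbitrary sizes)
W2 = rng.normal(size=(4, 16))
x0 = rng.normal(size=8)

# Jacobian of x -> W2 @ relu(W1 @ x) at x0; exact because the network is
# piecewise linear: J = W2 @ diag(relu'(W1 @ x0)) @ W1
D = np.diag((W1 @ x0 > 0).astype(float))
J = W2 @ D @ W1

def spectral_norm(J, iters=1000):
    """Power iteration on J; returns an estimate of the top singular value."""
    v = np.ones(J.shape[1]) / np.sqrt(J.shape[1])
    for _ in range(iters):
        u = J @ v
        u /= np.linalg.norm(u)
        v = J.T @ u
        v /= np.linalg.norm(v)
    return float(u @ J @ v)

est = spectral_norm(J)
exact = np.linalg.svd(J, compute_uv=False)[0]
```

The Rayleigh-quotient estimate never exceeds the true spectral norm, so it approaches the exact value from below as the iteration converges.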
Spectral Bounding: Strictly Satisfying the 1-Lipschitz Property for Generative Adversarial Networks
Principles of Neural Network Architecture Design - Invertibility and Domain Knowledge
Neural network architectures allow a tremendous variety of design choices. In this work, we study two principles underlying these architectures: first, the design and application of invertible neural networks (INNs); second, the incorporation of domain knowledge into neural network architectures. After introducing the mathematical foundations of deep learning, we address the invertibility of standard feedforward neural networks from a mathematical perspective. These results serve as motivation for our proposed invertible residual networks (i-ResNets). This architecture class is then studied in two scenarios: first, we propose ways to use i-ResNets as normalizing flows and demonstrate their applicability to high-dimensional generative modeling; second, we study the excessive invariance of common deep image classifiers and discuss the consequences for adversarial robustness. We finish with a study of convolutional neural networks for tumor classification based on imaging mass spectrometry (IMS) data. For this application, we propose an adapted architecture guided by our knowledge of the domain of IMS data and show its superior performance on two challenging tumor classification datasets.
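The invertibility of i-ResNets rests on making the residual branch contractive: if Lip(g) < 1, then x ↦ x + g(x) is invertible, and the inverse is computable by fixed-point iteration. A minimal numpy sketch of this mechanism (toy weights and a single block, not the paper's full architecture):

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(5, 5))
W *= 0.9 / np.linalg.norm(W, 2)   # rescale so that ||W||_2 = 0.9 < 1

def g(x):
    """Residual branch; tanh is 1-Lipschitz, so Lip(g) <= ||W||_2 = 0.9 < 1."""
    return np.tanh(W @ x)

def forward(x):
    return x + g(x)               # one invertible residual block

def inverse(y, iters=200):
    """Banach fixed-point iteration x <- y - g(x); converges since Lip(g) < 1."""
    x = y.copy()
    for _ in range(iters):
        x = y - g(x)
    return x

x = rng.normal(size=5)
x_rec = inverse(forward(x))       # recovers x up to fixed-point tolerance
```

The same contraction argument is what lets i-ResNets be used as normalizing flows: the log-determinant of the Jacobian of x + g(x) admits a convergent series precisely because g is a contraction.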