Structure-preserving machine learning for inverse problems
Inverse problems naturally arise in many scientific settings, and the study of these problems has been crucial in the development of important technologies such as medical imaging. In inverse problems, the goal is to estimate an underlying ground truth u∗, typically an image, from corresponding measurements y, where u∗ and y are related by
y = N(A(u∗)) (1)
for some forward operator A and noise-generating process N (both of which are generally assumed to be known). Variational regularisation is a well-established approach that can be used to approximately solve inverse problems such as Problem (1). In this approach an image is reconstructed from measurements y by solving a minimisation problem such as
û = argmin_u d(A(u), y) + αJ(u). (2)
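As a concrete illustration of the variational approach in (2), the sketch below uses the squared ℓ2 distance for d and the Tikhonov regulariser J(u) = ½‖u‖² (illustrative choices, not the ones studied in the dissertation), for which the minimiser has a closed form that plain gradient descent should recover.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy forward operator A and noisy measurements y = A u* + noise.
n, m = 40, 25
A = rng.standard_normal((m, n)) / np.sqrt(m)
u_true = rng.standard_normal(n)
y = A @ u_true + 0.01 * rng.standard_normal(m)

# Variational reconstruction with d = squared l2 distance and
# Tikhonov regulariser J(u) = 0.5 * ||u||^2 (an illustrative choice):
#   u_hat = argmin_u 0.5 * ||A u - y||^2 + alpha * 0.5 * ||u||^2
alpha = 0.1
u = np.zeros(n)
step = 1.0 / (np.linalg.norm(A, 2) ** 2 + alpha)  # 1/L for this smooth objective
for _ in range(5000):
    grad = A.T @ (A @ u - y) + alpha * u
    u -= step * grad

# For this quadratic problem the minimiser has a closed form,
# which gradient descent approaches.
u_closed = np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)
print(np.linalg.norm(u - u_closed))  # small
```

For non-quadratic choices of d or J (e.g. total variation) no closed form exists, which is one reason the optimisation can require considerable computational effort.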
While this approach has proven very successful, it generally requires the parts that make up the optimisation problem to be carefully chosen, and the optimisation problem may require considerable computational effort to solve. There is an active line of research into overcoming these issues using data-driven approaches, which aim to use multiple instances of data to inform a method that can be used on similar data. In this dissertation we investigate ways in which favourable properties of the variational regularisation approach can be combined with a data-driven approach to solving inverse problems.
In the first chapter of the dissertation, we propose a bilevel optimisation framework that can be used to optimise sampling patterns and regularisation parameters for variational image reconstruction in accelerated magnetic resonance imaging. We use this framework to learn sampling patterns that result in better image reconstructions than standard random variable density sampling patterns that sample with the same rate.
In the second chapter of the dissertation, we study the use of group symmetries in learned reconstruction methods for inverse problems. We show that group invariance of a functional implies that the corresponding proximal operator satisfies a group equivariance property. Applying this idea to model proximal operators as roto-translationally equivariant in an unrolled iterative reconstruction method, we show that reconstruction performance is more robust when tested on images in orientations not seen during training (compared to similar methods that model proximal operators to just be translationally equivariant) and that good methods can be learned with less training data.
In the final chapter of the dissertation, we propose a ResNet-styled neural network architecture that is provably nonexpansive. This architecture can be thought of as composing discretisations of gradient flows along learnable convex potentials. Appealing to a classical result on the numerical integration of ODEs, we show that constraining the operator norms of the weight operators is sufficient to give nonexpansiveness, and additional analysis in the case that the numerical integrator is the forward Euler method shows that the neural network is an averaged operator. This guarantees that its fixed point iterations are convergent, and makes it a natural candidate for a learned denoiser in a Plug-and-Play approach to solving inverse problems.
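The nonexpansiveness mechanism can be sketched numerically. Below, one forward-Euler step of a gradient flow along a convex potential V(u) = Σ softplus(Wu) (an illustrative potential, not necessarily the architecture's exact parametrisation) is checked to never expand distances, using the step-size bound implied by the operator norm of W.

```python
import numpy as np

rng = np.random.default_rng(1)

# One forward-Euler step of the gradient flow  u' = -grad V(u)  for a
# convex potential V(u) = sum softplus(W u)  (illustrative choice).
# grad V(u) = W^T sigmoid(W u) is Lipschitz with constant L <= ||W||_2^2 / 4,
# so any step size h <= 2/L makes the resulting layer nonexpansive.
d = 30
W = rng.standard_normal((d, d))

L = np.linalg.norm(W, 2) ** 2 / 4.0
h = 2.0 / L  # largest nonexpansive step; h < 2/L gives an averaged operator

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer(u):
    return u - h * W.T @ sigmoid(W @ u)

# Empirical check: the layer never expands distances between inputs.
ratios = []
for _ in range(200):
    x, y = rng.standard_normal(d), rng.standard_normal(d)
    ratios.append(np.linalg.norm(layer(x) - layer(y)) / np.linalg.norm(x - y))
print(max(ratios))  # <= 1 up to rounding
```

The key point mirrors the abstract: the guarantee comes purely from constraining the operator norm of W, not from inspecting the learned weights themselves.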
Contractive Systems Improve Graph Neural Networks Against Adversarial Attacks
Graph Neural Networks (GNNs) have established themselves as a key component
in addressing diverse graph-based tasks. Despite their notable successes, GNNs
remain susceptible to input perturbations in the form of adversarial attacks.
This paper introduces an innovative approach to fortify GNNs against
adversarial perturbations through the lens of contractive dynamical systems.
Our method introduces graph neural layers based on differential equations with
contractive properties, which, as we show, improve the robustness of GNNs. A
distinctive feature of the proposed approach is the simultaneous learned
evolution of both the node features and the adjacency matrix, yielding an
intrinsic enhancement of model robustness to perturbations in the input
features and the connectivity of the graph. We mathematically derive the
underpinnings of our novel architecture and provide theoretical insights to
reason about its expected behavior. We demonstrate the efficacy of our method
through numerous real-world benchmarks, achieving on-par or improved performance
compared to existing methods.
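The intuition behind contractive dynamics can be seen in a toy feature-evolution system (a simplified stand-in, not the paper's actual graph layers): when the dynamics are contractive, the distance between a clean trajectory and an adversarially perturbed one shrinks over time.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy contractive node-feature dynamics (a simplified stand-in):
#   x' = -x + tanh(W x),  contractive when ||W||_2 < 1, since the
# Jacobian -I + diag(tanh') W then has logarithmic norm <= -1 + ||W||_2 < 0.
d = 16
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)  # rescale so ||W||_2 = 0.5

def step(x, h=0.1):
    return x + h * (-x + np.tanh(W @ x))

# Integrate a clean and a perturbed input side by side.
x = rng.standard_normal(d)
x_pert = x + 0.5 * rng.standard_normal(d)

d0 = np.linalg.norm(x - x_pert)
for _ in range(100):
    x, x_pert = step(x), step(x_pert)
d1 = np.linalg.norm(x - x_pert)
print(d1 / d0)  # < 1: the input perturbation has been damped
```

Per forward-Euler step the distance shrinks by at least a factor (1 − h) + h‖W‖₂ = 0.95 here, which is the sense in which contractivity yields robustness to input perturbations.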
Dynamical systems' based neural networks
Neural networks have gained much interest because of their effectiveness in
many applications. However, their mathematical properties are generally not
well understood. If there is some underlying geometric structure inherent to
the data or to the function to approximate, it is often desirable to take this
into account in the design of the neural network. In this work, we start with a
non-autonomous ODE and build neural networks using a suitable,
structure-preserving, numerical time-discretisation. The structure of the
neural network is then inferred from the properties of the ODE vector field.
Besides injecting more structure into the network architectures, this modelling
procedure allows a better theoretical understanding of their behaviour. We
present two universal approximation results and demonstrate how to impose some
particular properties on the neural networks. A particular focus is on
1-Lipschitz architectures including layers that are not 1-Lipschitz. These
networks are expressive and robust against adversarial attacks, as shown for
the CIFAR-10 and CIFAR-100 datasets.
Equivariant neural networks for inverse problems.
In recent years the use of convolutional layers to encode an inductive bias (translational equivariance) in neural networks has proven to be a very fruitful idea. The successes of this approach have motivated a line of research into incorporating other symmetries into deep learning methods, in the form of group equivariant convolutional neural networks. Much of this work has been focused on roto-translational symmetry of R^d, but other examples are the scaling symmetry of R^d and rotational symmetry of the sphere. In this work, we demonstrate that group equivariant convolutional operations can naturally be incorporated into learned reconstruction methods for inverse problems that are motivated by the variational regularisation approach. Indeed, if the regularisation functional is invariant under a group symmetry, the corresponding proximal operator will satisfy an equivariance property with respect to the same group symmetry. As a result of this observation, we design learned iterative methods in which the proximal operators are modelled as group equivariant convolutional neural networks. We use roto-translationally equivariant operations in the proposed methodology and apply it to the problems of low-dose computerised tomography reconstruction and subsampled magnetic resonance imaging reconstruction. The proposed methodology is demonstrated to improve the reconstruction quality of a learned reconstruction method with a little extra computational cost at training time but without any extra cost at test time.
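The invariance-implies-equivariance observation is easy to verify in a small case. Here the group is coordinate permutations rather than roto-translations (a simpler action chosen for illustration): J(u) = ‖u‖₁ is permutation-invariant, its proximal operator is soft-thresholding, and the prox indeed commutes with the group action.

```python
import numpy as np

rng = np.random.default_rng(3)

# If J is invariant under a group action, its proximal operator is
# equivariant. Illustration with J(u) = ||u||_1, invariant under
# coordinate permutations; its prox is soft-thresholding.
def prox_l1(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

u = rng.standard_normal(50)
perm = rng.permutation(50)  # group element: a permutation of coordinates

lhs = prox_l1(u[perm], 0.3)   # prox applied to the transformed signal
rhs = prox_l1(u, 0.3)[perm]   # transform applied after the prox
print(np.allclose(lhs, rhs))  # True: the prox commutes with the action
```

The same identity for roto-translations is what licenses modelling the proximal operators as roto-translationally equivariant networks in the learned iterative methods.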
Learning the Sampling Pattern for MRI.
The discovery of the theory of compressed sensing brought the realisation that many inverse problems can be solved even when measurements are "incomplete". This is particularly interesting in magnetic resonance imaging (MRI), where long acquisition times can limit its use. In this work, we consider the problem of learning a sparse sampling pattern that can be used to optimally balance acquisition time versus quality of the reconstructed image. We use a supervised learning approach, making the assumption that our training data is representative enough of new data acquisitions. We demonstrate that this is indeed the case, even if the training data consists of just 7 training pairs of measurements and ground-truth images; with a training set of brain images of size 192 by 192, for instance, one of the learned patterns samples only 35% of k-space, but results in reconstructions with mean SSIM 0.914 on a test set of similar images. The proposed framework is general enough to learn arbitrary sampling patterns, including common patterns such as Cartesian, spiral and radial sampling.
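The forward model behind sparse MRI sampling can be sketched in a few lines: measure only the k-space (Fourier) coefficients selected by a binary mask. The learned pattern from the paper plays the role of `mask`; the variable-density random mask below is a stand-in sampling roughly a third of k-space.

```python
import numpy as np

rng = np.random.default_rng(4)

# Subsampled MRI forward model: keep only the k-space (Fourier)
# coefficients selected by a binary sampling mask. A learned pattern
# would replace `mask`; here a random variable-density stand-in
# samples roughly a third of k-space.
n = 192
u = rng.standard_normal((n, n))  # stand-in for a ground-truth image

# Variable-density mask: keep low frequencies with higher probability.
fy, fx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
prob = np.clip(0.6 * np.exp(-40 * (fx**2 + fy**2)) + 0.3, 0.0, 1.0)
mask = rng.random((n, n)) < prob
print(mask.mean())  # achieved sampling rate

# Measurements, and the simplest (zero-filled) reconstruction.
y = mask * np.fft.fft2(u)
u_zf = np.real(np.fft.ifft2(y))
```

The learning problem in the paper is to choose the mask (jointly with regularisation parameters) so that the downstream variational reconstruction, rather than this naive zero-filled one, scores well against ground-truth training images.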
Learning the Sampling Pattern for MRI
The discovery of the theory of compressed sensing brought the realisation that many inverse problems can be solved even when measurements are "incomplete". This is particularly interesting in magnetic resonance imaging (MRI), where long acquisition times can limit its use. In this work, we consider the problem of learning a sparse sampling pattern that can be used to optimally balance acquisition time versus quality of the reconstructed image. We use a supervised learning approach, making the assumption that our training data is representative enough of new data acquisitions. We demonstrate that this is indeed the case, even if the training data consists of just 5 training pairs of measurements and ground-truth images; with a training set of brain images of size 192 by 192, for instance, one of the learned patterns samples only 32% of k-space, but results in reconstructions with mean SSIM 0.956 on a test set of similar images. The proposed framework is general enough to learn arbitrary sampling patterns, including common patterns such as Cartesian, spiral and radial sampling.