Equivariant Neural Networks for Indirect Measurements
In recent years, deep learning techniques have shown great success in
various tasks related to inverse problems, where a target quantity of interest
can only be observed through indirect measurements by a forward operator.
Common approaches apply deep neural networks in a post-processing step to the
reconstructions obtained by classical reconstruction methods. However, the
latter methods can be computationally expensive and introduce artifacts that
are not present in the measured data and, in turn, can deteriorate the
performance on the given task. To overcome these limitations, we propose a
class of equivariant neural networks that can be directly applied to the
measurements to solve the desired task. To this end, we build appropriate
network structures by developing layers that are equivariant with respect to
data transformations induced by well-known symmetries in the domain of the
forward operator. We rigorously analyze the relation between the measurement
operator and the resulting group representations and prove a representer
theorem that characterizes the class of linear operators that translate between
a given pair of group actions. Based on this theory, we extend the existing
concepts of Lie group equivariant deep learning to inverse problems and
introduce new representations that result from the involved measurement
operations. This allows us to efficiently solve classification, regression or
even reconstruction tasks based on indirect measurements also for very sparse
data problems, where a classical reconstruction-based approach may be hard or
even impossible. We illustrate the effectiveness of our approach in numerical
experiments and compare with existing methods.
Comment: 22 pages, 6 figures
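The core construction — layers that translate between a given pair of group actions — can be illustrated with a small sketch. This is my own toy example using a finite cyclic shift group and a group-averaging projection, not the paper's Lie group setting or its measurement operators:

```python
import numpy as np

def equivariant_projection(W, reps_in, reps_out):
    """Project a linear map W onto the space of maps intertwining two group
    representations: W_eq = (1/|G|) * sum_g rho_out(g)^{-1} @ W @ rho_in(g)."""
    return sum(np.linalg.inv(ro) @ W @ ri
               for ri, ro in zip(reps_in, reps_out)) / len(reps_in)

def shift_matrix(n, k):
    # permutation matrix implementing a circular shift by k positions
    return np.roll(np.eye(n), k, axis=0)

# cyclic group C_4 acting on R^4 by circular shifts
reps = [shift_matrix(4, g) for g in range(4)]

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
W_eq = equivariant_projection(W, reps, reps)

# verify equivariance: applying the group action commutes with the layer
x = rng.standard_normal(4)
g = reps[1]
assert np.allclose(W_eq @ (g @ x), g @ (W_eq @ x))
```

For shift groups this averaging produces a circulant (convolution-like) matrix, which is one way to see why convolutions arise as the translation-equivariant linear layers.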
Bayesian view on the training of invertible residual networks for solving linear inverse problems
Learning-based methods for inverse problems, adapting to the data's inherent
structure, have become ubiquitous in the last decade. Besides empirical
investigations of their often remarkable performance, an increasing number of
works addresses the issue of theoretical guarantees. Recently, [3] exploited
invertible residual networks (iResNets) to learn provably convergent
regularizations given reasonable assumptions. They enforced these guarantees by
approximating the linear forward operator with an iResNet. Supervised training
on relevant samples introduces data dependency into the approach. An open
question in this context is to what extent the data's inherent
influences the training outcome, i.e., the learned reconstruction scheme. Here
we address this delicate interplay of training design and data dependency from
a Bayesian perspective and shed light on opportunities and limitations. We
resolve these limitations by analyzing reconstruction-based training of the
inverses of iResNets, where we show that this optimization strategy introduces
a level of data-dependency that cannot be achieved by approximation training.
We further provide and discuss a series of numerical experiments underpinning
and extending the theoretical findings.
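The distinction the abstract draws can be illustrated with a scalar toy problem of my own construction (not the paper's setting): fitting a scalar model θ to a forward operator a ("approximation training") yields θ = a regardless of the data, whereas fitting the model's inverse to map noisy measurements back to ground truth ("reconstruction training") yields a minimizer shifted by the noise level and the data's second moment — a data-dependent scheme:

```python
import numpy as np

# Toy scalar comparison of the two training strategies (illustrative only).
rng = np.random.default_rng(1)
a, sigma = 2.0, 0.5                               # forward operator, noise level
x = rng.standard_normal(20_000)                   # training signals
y = a * x + sigma * rng.standard_normal(x.size)   # noisy measurements

thetas = np.linspace(1.0, 4.0, 3001)
# approximation training: fit theta * x to the forward operator output a * x
approx_loss = [np.mean((t * x - a * x) ** 2) for t in thetas]
# reconstruction training: fit the inverse y / theta to the ground truth x
recon_loss = [np.mean((y / t - x) ** 2) for t in thetas]

theta_approx = thetas[np.argmin(approx_loss)]
theta_recon = thetas[np.argmin(recon_loss)]

# closed form: approximation training recovers a itself, while reconstruction
# training recovers a + sigma^2 / (a * E[x^2]) -- a noise- and data-dependent shift
theta_recon_theory = a + sigma**2 / (a * np.mean(x**2))
```

Here `theta_approx` lands on `a = 2.0`, while `theta_recon` lands near `2.125`, matching the closed-form shift.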
Deep Learning Methods for Partial Differential Equations and Related Parameter Identification Problems
Recent years have witnessed a growth in mathematics for deep learning--which
seeks a deeper understanding of the concepts of deep learning with mathematics
and explores how to make it more robust--and deep learning for mathematics,
where deep learning algorithms are used to solve problems in mathematics. The
latter has popularised the field of scientific machine learning where deep
learning is applied to problems in scientific computing. Specifically, more and
more neural network architectures have been developed to solve specific classes
of partial differential equations (PDEs). Such methods exploit properties that
are inherent to PDEs and thus solve the PDEs better than standard feed-forward
neural networks, recurrent neural networks, or convolutional neural networks.
This has had a great impact in the area of mathematical modeling where
parametric PDEs are widely used to model most natural and physical processes
arising in science and engineering. In this work, we review such methods as
well as their extensions for parametric studies and for solving the related
inverse problems. We also demonstrate their relevance in some industrial
applications.
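The loss structure such PDE-aware methods use can be sketched without a neural network. The sketch below is my own illustration with a polynomial collocation ansatz standing in for a network: it minimizes the equation residual at collocation points plus a boundary term for the toy problem u'(x) = -u(x), u(0) = 1, the same two-term loss a physics-informed approach would assemble:

```python
import numpy as np

# Physics-informed collocation sketch: solve u' = -u, u(0) = 1 on [0, 1]
# with a degree-8 polynomial ansatz u(x) = sum_k c_k x^k.
deg = 8
xs = np.linspace(0.0, 1.0, 50)  # collocation points

# residual u' + u is linear in the coefficients: rows of A evaluate
# k x^{k-1} + x^k at each collocation point
A = np.stack([(k * xs**(k - 1) if k > 0 else np.zeros_like(xs)) + xs**k
              for k in range(deg + 1)], axis=1)
b = np.zeros_like(xs)

# boundary condition u(0) = 1, weighted to enforce it strongly
bc_row = np.array([[0.0**k for k in range(deg + 1)]])  # = [1, 0, 0, ...]
A_full = np.vstack([A, 100.0 * bc_row])
b_full = np.concatenate([b, [100.0]])

# least-squares minimization of residual + boundary loss
c, *_ = np.linalg.lstsq(A_full, b_full, rcond=None)
u = lambda x: sum(ck * x**k for k, ck in enumerate(c))
# the exact solution is exp(-x); the fit should match it closely on [0, 1]
```

A neural PINN replaces the polynomial with a network and the least-squares solve with gradient descent, but the residual-plus-boundary loss is the same idea.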
Invertible residual networks in the context of regularization theory for linear inverse problems
Learned inverse problem solvers exhibit remarkable performance in
applications like image reconstruction tasks. These data-driven reconstruction
methods often follow a two-step scheme. First, one trains the (often neural
network-based) reconstruction scheme on a dataset. Second, one applies the
scheme to new measurements to obtain reconstructions. We follow these steps but
parameterize the reconstruction scheme with invertible residual networks
(iResNets). We demonstrate that the invertibility enables investigating the
influence of the training and architecture choices on the resulting
reconstruction scheme. For example, assuming local approximation properties of
the network, we show that these schemes become convergent regularizations. In
addition, the investigations reveal a formal link to the linear regularization
theory of linear inverse problems and provide a nonlinear spectral
regularization for particular architecture classes. On the numerical side, we
investigate the local approximation property of selected trained architectures
and present a series of experiments on the MNIST dataset that underpin and
extend our theoretical findings.
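A minimal sketch of the central object, an invertible residual block f(x) = x + g(x) whose residual g has Lipschitz constant below 1, so f can be inverted by a Banach fixed-point iteration. This is a toy NumPy construction of mine, not the paper's trained architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 5))
W *= 0.9 / np.linalg.svd(W, compute_uv=False)[0]  # rescale so ||W||_2 = 0.9 < 1

def g(x):
    return np.tanh(W @ x)   # tanh is 1-Lipschitz, so Lip(g) <= ||W||_2 = 0.9

def forward(x):
    return x + g(x)         # residual block f(x) = x + g(x)

def inverse(y, iters=200):
    # solve x + g(x) = y via the contraction T(x) = y - g(x);
    # converges geometrically at rate Lip(g) < 1
    x = y.copy()
    for _ in range(iters):
        x = y - g(x)
    return x

x = rng.standard_normal(5)
assert np.allclose(inverse(forward(x)), x, atol=1e-6)
```

The spectral-norm constraint is what makes both the invertibility and, in the paper's analysis, the regularization properties tractable.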
Deeply Supervised UNet for Semantic Segmentation to Assist Dermatopathological Assessment of Basal Cell Carcinoma
Accurate and fast assessment of resection margins is an essential part of a dermatopathologist’s clinical routine. In this work, we successfully develop a deep learning method to assist dermatopathologists by marking critical regions that have a high probability of exhibiting pathological features in whole slide images (WSI). We focus on detecting basal cell carcinoma (BCC) through semantic segmentation using several models based on the UNet architecture. The study includes 650 WSI with 3443 tissue sections in total. Two clinical dermatopathologists annotated the data, marking the exact location of tumor tissue on 100 WSI. The rest of the data, with ground-truth section-wise labels, is used to further validate and test the models. We analyze two different encoders for the first part of the UNet network and two additional training strategies: (a) deep supervision and (b) a linear combination of decoder outputs, and draw some interpretations about what the network’s decoder does in each case. The best model achieves over 96% accuracy, sensitivity, and specificity on the test set.
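The two training strategies can be sketched at the loss level. The shapes, weights, and nearest-neighbour resampling below are illustrative assumptions of mine, not the paper's exact configuration: (a) deep supervision attaches a segmentation loss to every decoder stage against a downsampled mask, while (b) blends upsampled decoder outputs linearly before a single loss:

```python
import numpy as np

def bce(p, t, eps=1e-7):
    """Pixelwise binary cross-entropy between predicted probabilities and a mask."""
    p = np.clip(p, eps, 1 - eps)
    return float(-(t * np.log(p) + (1 - t) * np.log(1 - p)).mean())

def downsample(mask, factor):
    return mask[::factor, ::factor]  # nearest-neighbour, for auxiliary targets

def deep_supervision_loss(stage_probs, mask, weights=(0.25, 0.25, 0.5)):
    # (a) one loss per decoder stage, each against a matching-resolution mask
    total = 0.0
    for p, w in zip(stage_probs, weights):
        factor = mask.shape[0] // p.shape[0]
        total += w * bce(p, downsample(mask, factor))
    return total

def linear_combination_loss(stage_probs, mask, alphas=(0.2, 0.3, 0.5)):
    # (b) upsample coarse outputs (by repetition) and blend before a single loss
    full = mask.shape[0]
    blended = sum(a * np.kron(p, np.ones((full // p.shape[0],) * 2))
                  for p, a in zip(stage_probs, alphas))
    return bce(blended, mask)
```

In a real training loop `stage_probs` would be the sigmoid outputs of the UNet decoder stages, coarse to fine; the rest of the loss plumbing is unchanged.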