Is attention all you need in medical image analysis? A review
Medical imaging is a key component in clinical diagnosis, treatment planning
and clinical trial design, accounting for almost 90% of all healthcare data.
CNNs have achieved performance gains in medical image analysis (MIA) in recent
years. CNNs can efficiently model local pixel interactions and be trained on
small-scale MI data. The main disadvantage of typical CNN models is that they
ignore global pixel relationships within images, which limits their
generalisation ability to understand out-of-distribution data with different
'global' information. The recent progress of Artificial Intelligence gave rise
to Transformers, which can learn global relationships from data. However, full
Transformer models need to be trained on large-scale data and involve
tremendous computational complexity. Attention and Transformer compartments
(Transf/Attention), which retain the ability to model global
relationships, have been proposed as lighter alternatives to full Transformers.
Recently, there has been an increasing trend to cross-pollinate complementary
local-global properties from CNN and Transf/Attention architectures, which has led
to a new era of hybrid models. The past years have witnessed substantial growth
in hybrid CNN-Transf/Attention models across diverse MIA problems. In this
systematic review, we survey existing hybrid CNN-Transf/Attention models,
review and unravel key architectural designs, analyse breakthroughs, and
evaluate current and future opportunities as well as challenges. We also
introduce a comprehensive analysis framework on generalisation opportunities
of scientific and clinical impact, on the basis of which new data-driven domain
generalisation and adaptation methods can be stimulated.
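The contrast the abstract draws between local CNN interactions and global attention can be illustrated with a minimal sketch of scaled dot-product self-attention. This is an illustration only, not a model from the review: the identity query/key/value projections, the function names, and the toy `patches` values are all assumptions made for brevity.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections (assumption).

    tokens: list of n feature vectors, e.g. one per image patch.
    Returns (outputs, weights); weights[i][j] is how strongly patch i
    attends to patch j.
    """
    dim = len(tokens[0])
    weights = []
    for query in tokens:
        # Each query is compared against every key, near or far.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights.append(softmax(scores))
    # Each output is a weighted mix of all value vectors.
    outputs = [
        [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights

# Hypothetical toy input: four patches with two features each.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
outputs, weights = self_attention(patches)
```

Every row of `weights` is strictly positive and sums to one, so each output patch depends on all input patches, whereas a convolution would only mix a fixed local neighbourhood.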
Deep Learning in Cardiology
The medical field is creating large amounts of data that physicians are unable
to decipher and use efficiently. Moreover, rule-based expert systems are
inefficient in solving complicated medical tasks or for creating insights using
big data. Deep learning has emerged as a more accurate and effective technology
in a wide range of medical problems such as diagnosis, prediction and
intervention. Deep learning is a representation learning method that consists
of layers that transform the data non-linearly, thus revealing hierarchical
relationships and structures. In this review we survey deep learning
application papers that use structured data, signal and imaging modalities from
cardiology. We discuss the advantages and limitations of applying deep learning
in cardiology that also apply to medicine in general, and propose certain
directions as the most viable for clinical use.
Comment: 27 pages, 2 figures, 10 tables
Denoising method for dynamic contrast-enhanced CT perfusion studies using three-dimensional deep image prior as a simultaneous spatial and temporal regularizer
This study aimed to propose a denoising method for dynamic contrast-enhanced
computed tomography (DCE-CT) perfusion studies using a three-dimensional deep
image prior (DIP), and to investigate its usefulness in comparison with total
variation (TV)-based methods with different regularization parameter (alpha)
values through simulation studies. In the proposed DIP method, the DIP was
incorporated into the constrained optimization problem for image denoising as a
simultaneous spatial and temporal regularizer, which was solved using the
alternating direction method of multipliers. In the simulation studies, DCE-CT
images were generated using a digital brain phantom and their noise level was
varied using the X-ray exposure noise model with different exposures (15, 30,
50, 75, and 100 mAs). Cerebral blood flow (CBF) images were generated from the
original contrast enhancement (CE) images and those obtained by the DIP and TV
methods using block-circulant singular value decomposition. The quality of the
CE images was evaluated using the peak signal-to-noise ratio (PSNR) and
structural similarity index (SSIM). To compare the CBF images obtained by the
different methods and those generated from the ground truth images, linear
regression analysis was performed. When using the DIP method, the PSNR and SSIM
were not significantly dependent on the exposure, and the SSIM was the highest
for all exposures. With the TV methods, both metrics depended significantly
on the exposure and alpha values. The results of the linear regression analysis
suggested that the linearity of the CBF images obtained by the DIP method was
superior to that of the CBF images obtained from the original CE images and by the TV methods.
Our preliminary results suggest that the DIP method is useful for denoising
DCE-CT images at ultra-low to low exposures and for improving the accuracy of
the CBF images generated from them.
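For readers unfamiliar with the quantities named in this abstract, the following is a minimal sketch of PSNR and of a total-variation-regularized denoising objective with regularization parameter alpha. It is not the study's implementation: the 1-D flattened signals, the anisotropic TV form, the 0.5-weighted squared-error fidelity term, and all function names are assumptions chosen for illustration.

```python
import math

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length signals."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)

def total_variation(signal):
    """Anisotropic total variation of a 1-D signal: sum of absolute jumps."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))

def tv_objective(estimate, noisy, alpha):
    """Assumed objective form: 0.5 * ||estimate - noisy||^2 + alpha * TV(estimate)."""
    fidelity = 0.5 * sum((x - y) ** 2 for x, y in zip(estimate, noisy))
    return fidelity + alpha * total_variation(estimate)

# Hypothetical toy data: a clean step edge and a noisy observation of it.
clean = [0.0, 0.0, 1.0, 1.0]
noisy = [0.1, -0.1, 1.1, 0.9]

print(psnr(clean, noisy))
print(tv_objective(clean, noisy, alpha=0.5), tv_objective(noisy, noisy, alpha=0.5))
```

With alpha = 0.5 the clean step edge scores lower than the noisy signal itself, while with alpha = 0 the noisy signal is trivially optimal, which shows in miniature why TV results depend on the choice of alpha.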
Noise2Contrast: Multi-Contrast Fusion Enables Self-Supervised Tomographic Image Denoising
Self-supervised image denoising techniques have emerged as convenient methods that
allow training denoising models without requiring ground-truth noise-free data.
Existing methods usually optimize loss metrics that are calculated from
multiple noisy realizations of similar images, e.g., from neighboring
tomographic slices. However, those approaches fail to utilize the multiple
contrasts that are routinely acquired in medical imaging modalities like MRI or
dual-energy CT. In this work, we propose a new self-supervised training
scheme, Noise2Contrast, that combines information from multiple measured image
contrasts to train a denoising model. We stack denoising with domain-transfer
operators to utilize the independent noise realizations of different image
contrasts to derive a self-supervised loss. The trained denoising operator
achieves convincing quantitative and qualitative results, outperforming
state-of-the-art self-supervised methods by 4.7-11.0%/4.8-7.3% (PSNR/SSIM) on
brain MRI data and by 43.6-50.5%/57.1-77.1% (PSNR/SSIM) on dual-energy CT X-ray
microscopy data with respect to the noisy baseline. Our experiments on
different real measured data sets indicate that Noise2Contrast training
generalizes to other multi-contrast imaging modalities.
On the Benefit of Dual-domain Denoising in a Self-supervised Low-dose CT Setting
Computed tomography (CT) is routinely used for three-dimensional non-invasive
imaging. Numerous data-driven image denoising algorithms have been proposed to
restore image quality in low-dose acquisitions. However, considerably less
research investigates methods that intervene directly in the raw detector data, due
to limited access to suitable projection data or correct reconstruction
algorithms. In this work, we present an end-to-end trainable CT reconstruction
pipeline that contains denoising operators in both the projection and the image
domain, which are optimized simultaneously without requiring ground-truth
high-dose CT data. Our experiments demonstrate that including an additional
projection denoising operator improves the overall denoising performance by
82.4-94.1%/12.5-41.7% (PSNR/SSIM) on abdomen CT and 1.5-2.9%/0.4-0.5%
(PSNR/SSIM) on XRM data relative to the low-dose baseline. We make our entire
helical CT reconstruction framework publicly available; it contains a raw
projection rebinning step to render helical projection data suitable for
differentiable fan-beam reconstruction operators and end-to-end learning.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
Physics-Informed Computer Vision: A Review and Perspectives
Incorporation of physical information in machine learning frameworks is
opening and transforming many application domains. Here the learning process is
augmented through the induction of fundamental knowledge and governing physical
laws. In this work we explore their utility for computer vision tasks in
interpreting and understanding visual data. We present a systematic literature
review of formulations and approaches to computer vision tasks guided by
physical laws. We begin by decomposing the popular computer vision pipeline
into a taxonomy of stages and investigate approaches to incorporate governing
physical equations in each stage. Existing approaches in each task are analyzed
with regard to which governing physical processes are modeled and formulated, and
how they are incorporated, i.e., by modifying data (observation bias), modifying networks
(inductive bias), or modifying losses (learning bias). The taxonomy offers a
unified view of the application of the physics-informed capability,
highlighting where physics-informed learning has been conducted and where the
gaps and opportunities are. Finally, we highlight open problems and challenges
to inform future research. While still in its early days, the study of
physics-informed computer vision has the promise to develop better computer
vision models that can improve physical plausibility, accuracy, data efficiency
and generalization in increasingly realistic applications.
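The "learning bias" route mentioned in this abstract, in which a governing equation enters through the loss, can be sketched as follows. This is a toy example under stated assumptions, not a method from the review: the governing law is taken to be free fall (y'' = -g), the second derivative is approximated by central differences, and the function names, `weight` parameter, and trajectories are hypothetical.

```python
def physics_residual(heights, dt, gravity=9.81):
    """Mean squared residual of y'' = -g using central second differences."""
    residuals = [
        (heights[i + 1] - 2.0 * heights[i] + heights[i - 1]) / dt ** 2 + gravity
        for i in range(1, len(heights) - 1)
    ]
    return sum(r * r for r in residuals) / len(residuals)

def data_loss(predicted, observed):
    """Ordinary mean squared error against observations."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)

def physics_informed_loss(predicted, observed, dt, weight=1.0):
    """Data term plus a weighted penalty for violating the assumed physics."""
    return data_loss(predicted, observed) + weight * physics_residual(predicted, dt)

# Hypothetical trajectories sampled every 0.1 s.
dt = 0.1
times = [i * dt for i in range(6)]
free_fall = [10.0 - 0.5 * 9.81 * t * t for t in times]   # obeys y'' = -g
straight_line = [10.0 - 2.0 * t for t in times]          # violates it

print(physics_residual(free_fall, dt), physics_residual(straight_line, dt))
```

A prediction that obeys the assumed law incurs almost no penalty, while a physically implausible one is penalized even if it fits its own observations perfectly, which is the sense in which the loss steers a model toward physical plausibility.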