A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond
Over the past decade, deep learning technologies have greatly advanced the
field of medical image registration. The initial developments, such as
ResNet-based and U-Net-based networks, laid the groundwork for deep
learning-driven image registration. Subsequent progress has been made in
various aspects of deep learning-based registration, including similarity
measures, deformation regularizations, and uncertainty estimation. These
advancements have not only enriched the field of deformable image registration
but have also facilitated its application in a wide range of tasks, including
atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D
registration. In this paper, we present a comprehensive overview of the most
recent advancements in deep learning-based image registration. We begin with a
concise introduction to the core concepts of deep learning-based image
registration. Then, we delve into innovative network architectures, loss
functions specific to registration, and methods for estimating registration
uncertainty. Additionally, this paper explores appropriate evaluation metrics
for assessing the performance of deep learning models in registration tasks.
Finally, we highlight the practical applications of these novel techniques in
medical imaging and discuss the future prospects of deep learning-based image
registration.
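As a toy illustration of the core concept above (not any specific method from the survey), an unsupervised deformable registration objective combines an image similarity term with a deformation regularizer. The sketch below uses plain Python on 1D intensity lists; `mse`, `smoothness_penalty`, and `registration_loss` are illustrative names, and real pipelines operate on 2D/3D tensors with learned networks.

```python
def mse(warped, fixed):
    """Mean squared error as a simple intensity similarity measure."""
    return sum((w - f) ** 2 for w, f in zip(warped, fixed)) / len(fixed)

def smoothness_penalty(displacement):
    """Finite-difference penalty on a 1D displacement field (diffusion-style regularizer)."""
    return sum((displacement[i + 1] - displacement[i]) ** 2
               for i in range(len(displacement) - 1))

def registration_loss(warped, fixed, displacement, lam=0.1):
    """Weighted sum of similarity and deformation regularization."""
    return mse(warped, fixed) + lam * smoothness_penalty(displacement)

# A perfectly aligned pair under a constant (smooth) displacement has zero loss.
print(registration_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))  # prints 0.0
```

In a deep learning setting, a network predicts the displacement field and this loss is minimized by gradient descent, which is the training signal shared by most unsupervised registration methods the survey covers.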
Neural Network Pruning for Real-time Polyp Segmentation
Computer-assisted treatment has emerged as a viable application of medical
imaging, owing to the efficacy of deep learning models. Real-time inference
speed remains a key requirement for such applications to help medical
personnel. Even though there generally exists a trade-off between performance
and model size, impressive efforts have been made to retain near-original
performance while reducing model size. Neural network pruning has emerged as
an exciting area that aims to eliminate redundant parameters to make the
inference faster. In this study, we show an application of neural network
pruning in polyp segmentation. We compute an importance score for each
convolutional filter and remove the filters with the lowest scores; up to a
certain pruning ratio, this does not degrade performance. For computing the
importance score,
we use the Taylor First Order (TaylorFO) approximation of the change in network
output for the removal of certain filters. Specifically, we employ a
gradient-normalized backpropagation for the computation of the importance
score. Through experiments on polyp datasets, we validate that our approach
can significantly reduce the parameter count and FLOPs while retaining similar
performance.
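The TaylorFO criterion described above can be illustrated with a minimal sketch: a filter's importance is approximated by the magnitude of the first-order Taylor term, i.e. the sum of activation-gradient products, and the lowest-scoring filters are removed. The helper names below (`taylor_fo_importance`, `prune_lowest`) are illustrative, and the gradient-normalization step is omitted for brevity.

```python
def taylor_fo_importance(activation, gradient):
    """First-order Taylor approximation of the change in network output
    if this filter is removed: |sum of activation * gradient|."""
    return abs(sum(a * g for a, g in zip(activation, gradient)))

def prune_lowest(filters, scores, n_prune):
    """Drop the n_prune filters with the lowest importance scores."""
    order = sorted(range(len(filters)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    return [f for i, f in enumerate(filters) if i not in pruned]

# Toy example: filter "B" has activations and gradients that cancel,
# so its removal barely changes the output and it is pruned first.
acts = [[1.0, 1.0], [2.0, 2.0], [1.0, 0.0]]
grads = [[0.5, 0.5], [0.1, -0.1], [1.0, 0.0]]
scores = [taylor_fo_importance(a, g) for a, g in zip(acts, grads)]
print(prune_lowest(["A", "B", "C"], scores, 1))  # prints ['A', 'C']
```

In practice the activations and gradients come from backpropagation over a calibration batch, and scores are accumulated across the batch before ranking.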
AiAReSeg: Catheter Detection and Segmentation in Interventional Ultrasound using Transformers
To date, endovascular surgeries are performed using the gold standard of
fluoroscopy, which uses ionising radiation to visualise catheters and
vasculature. Prolonged fluoroscopic exposure is harmful to the patient and the
clinician, and may lead to severe post-operative sequelae such as the
development of cancer. Meanwhile, the use of interventional Ultrasound has
gained popularity due to its well-known benefits: small spatial footprint,
fast data acquisition, and higher tissue contrast. However, ultrasound
images are hard to interpret, and it is difficult to localise vessels,
catheters, and guidewires within them. This work proposes a solution using an
adaptation of a state-of-the-art machine learning transformer architecture to
detect and segment catheters in axial interventional Ultrasound image
sequences. The network architecture was inspired by the Attention in Attention
mechanism and temporal tracking networks, and introduces a novel 3D
segmentation head that performs 3D deconvolution across time. To facilitate
the training of such deep learning networks, we introduce a new data synthesis
pipeline that uses physics-based catheter insertion simulations, along with a
convolutional ray-casting ultrasound simulator, to produce synthetic ultrasound
images of endovascular interventions. The proposed method is validated on a
hold-out validation dataset, demonstrating robustness to ultrasound noise
and a wide range of scanning angles. It was also tested on data collected from
silicone-based aorta phantoms, demonstrating its potential for translation
from sim-to-real. This work represents a significant step towards safer and
more efficient endovascular surgery using interventional ultrasound.
Comment: This work has been submitted to the IEEE for possible publication.
CATS v2: Hybrid encoders for robust medical segmentation
Convolutional Neural Networks (CNNs) have exhibited strong performance in
medical image segmentation tasks by capturing low-level (local) information,
such as edges and textures. However, due to the limited field of view of the
convolution kernel, it is hard for CNNs to fully represent global information.
Recently, transformers have shown good performance for medical image
segmentation due to their ability to better model long-range dependencies.
Nevertheless, transformers struggle to capture fine-grained local features as
effectively as CNNs. A good segmentation model should learn a better
representation from both local and global features to be precise and
semantically accurate. In our previous work, we proposed CATS, which is a
U-shaped segmentation network augmented with a transformer encoder. In this
work, we further extend this model and propose CATS v2 with hybrid encoders.
Specifically, the hybrid encoders consist of a CNN-based encoder path in
parallel with a transformer path using shifted windows, which better leverages
both local and
global information to produce robust 3D medical image segmentation. We fuse the
information from the convolutional encoder and the transformer at the skip
connections of different resolutions to form the final segmentation. The
proposed method is evaluated on two public challenge datasets: Cross-Modality
Domain Adaptation (CrossMoDA) and task 5 of Medical Segmentation Decathlon
(MSD-5), to segment vestibular schwannoma (VS) and prostate, respectively.
Compared with state-of-the-art methods, our approach demonstrates superior
performance in terms of higher Dice scores.
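A minimal sketch of the skip-connection fusion idea above, assuming element-wise addition of same-shape feature maps at each resolution level (the actual fusion operation in CATS v2 may differ):

```python
def fuse_skip_features(cnn_feats, transformer_feats):
    """Fuse the two encoder paths at each skip-connection resolution.
    Each argument is a list of per-level feature vectors; fusion here
    is element-wise addition, one common choice for merging paths."""
    assert len(cnn_feats) == len(transformer_feats)
    return [[c + t for c, t in zip(c_level, t_level)]
            for c_level, t_level in zip(cnn_feats, transformer_feats)]

# Two resolution levels, fused element-wise before entering the decoder.
fused = fuse_skip_features([[1.0, 2.0], [3.0]], [[10.0, 20.0], [30.0]])
print(fused)  # prints [[11.0, 22.0], [33.0]]
```

The fused features at each resolution then feed the corresponding decoder stage, so the final segmentation sees both local (CNN) and global (transformer) evidence.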
Self-training with dual uncertainty for semi-supervised medical image segmentation
In the field of semi-supervised medical image segmentation, the shortage of
labeled data is the fundamental problem. How to effectively learn image
features from unlabeled images to improve segmentation accuracy is the main
research direction in this field. Traditional self-training methods can
partially solve the problem of insufficient labeled data by generating pseudo
labels for iterative training. However, noise generated due to the model's
uncertainty during training directly affects the segmentation results.
Therefore, building on the self-training framework, we add sample-level and
pixel-level uncertainty to stabilize the training process. Specifically, we
save several moments of the model during pre-training and use the difference
between their predictions on an unlabeled sample as the sample-level
uncertainty estimate for that sample. Then, we gradually add unlabeled samples
from easy to hard during training. At the same time, we add a decoder with a
different upsampling method to the segmentation network and use the difference
between the outputs of the two decoders as the pixel-level uncertainty. In
short, we selectively retrain unlabeled samples and assign pixel-level
uncertainty to pseudo labels to optimize the self-training process. We
compared the
segmentation results of our model with five semi-supervised approaches on the
public 2017 ACDC dataset and 2018 Prostate dataset. Our proposed method
achieves better segmentation performance on both datasets under the same
settings, demonstrating its effectiveness, robustness, and potential
transferability to other medical image segmentation tasks. Keywords: Medical
image segmentation, semi-supervised learning, self-training, uncertainty
estimation.
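The dual-uncertainty idea above can be sketched in plain Python: sample-level uncertainty as the average disagreement between predictions from checkpoints saved at different pre-training moments, pixel-level uncertainty as the per-pixel difference between the two decoders, and a curriculum that orders unlabeled samples from easy to hard. All function names are illustrative, and real implementations operate on probability maps rather than flat lists.

```python
def sample_uncertainty(checkpoint_preds):
    """Sample-level uncertainty: mean pairwise disagreement between the
    predictions of models saved at different pre-training moments."""
    n = len(checkpoint_preds)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum(abs(a - b) for a, b in
                         zip(checkpoint_preds[i], checkpoint_preds[j]))
            pairs += 1
    return total / pairs

def pixel_uncertainty(decoder_a, decoder_b):
    """Pixel-level uncertainty: per-pixel difference between two decoders
    that use different upsampling methods."""
    return [abs(a - b) for a, b in zip(decoder_a, decoder_b)]

def curriculum_order(samples, uncertainties):
    """Order unlabeled samples from easy (low uncertainty) to hard."""
    return [s for _, s in sorted(zip(uncertainties, samples))]
```

Easy samples (where the checkpoints agree) enter self-training first, and the pixel-level map down-weights unreliable pseudo-label pixels in the loss.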
Multi-level feature fusion network combining attention mechanisms for polyp segmentation
Clinically, automated polyp segmentation techniques have the potential to
significantly improve the efficiency and accuracy of medical diagnosis, thereby
reducing the risk of colorectal cancer in patients. Unfortunately, existing
methods suffer from two significant weaknesses that can impact the accuracy of
segmentation. Firstly, features extracted by encoders are not adequately
filtered and utilized. Secondly, semantic conflicts and information redundancy
caused by feature fusion are not attended to. To overcome these limitations, we
propose a novel approach for polyp segmentation, named MLFF-Net, which
leverages multi-level feature fusion and attention mechanisms. Specifically,
MLFF-Net comprises three modules: Multi-scale Attention Module (MAM),
High-level Feature Enhancement Module (HFEM), and Global Attention Module
(GAM). Among these, MAM is used to extract multi-scale information and polyp
details from the shallow output of the encoder. In HFEM, the deep features of
the encoders complement each other by aggregation. Meanwhile, the attention
mechanism redistributes the weight of the aggregated features, weakening the
conflicting redundant parts and highlighting the information useful to the
task. GAM combines encoder and decoder features and computes global
dependencies to mitigate the locality of the receptive field. Experimental
results on five public datasets show that the proposed method not only
segments multiple types of polyps but also outperforms current
state-of-the-art methods in both accuracy and generalization ability.
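The attention-based reweighting described for HFEM can be illustrated abstractly: softmax-normalized attention scores redistribute the weight of aggregated features, suppressing low-scoring (conflicting or redundant) parts. This is a generic channel-reweighting sketch, not the exact MLFF-Net formulation:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_reweight(features, scores):
    """Redistribute feature weight: channels with higher attention scores
    are amplified relative to low-scoring (redundant) channels."""
    weights = softmax(scores)
    return [w * f for w, f in zip(weights, features)]

# Equal scores leave the relative feature magnitudes unchanged (uniform weights).
print(attention_reweight([2.0, 4.0], [0.0, 0.0]))  # prints [1.0, 2.0]
```

In the network, the scores themselves are learned from the aggregated features, so training discovers which fused channels carry task-relevant information.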
Co-Learning Semantic-aware Unsupervised Segmentation for Pathological Image Registration
The registration of pathological images plays an important role in medical
applications. Despite its significance, most researchers in this field
primarily focus on the registration of normal tissue into normal tissue. The
negative impacts of focal tissue, such as the loss of spatial correspondence
information and the abnormal distortion of tissue, are rarely considered. In
this paper, we propose GIRNet, a novel unsupervised approach for pathological
image registration by incorporating segmentation and inpainting through the
principles of Generation, Inpainting, and Registration (GIR). The registration,
segmentation, and inpainting modules are trained simultaneously in a
co-learning manner so that the segmentation of the focal area and the
registration of inpainted pairs can improve collaboratively. Overall, the
registration of pathological images is achieved in a completely unsupervised
learning framework. Experimental results on multiple datasets, including
Magnetic Resonance Imaging (MRI) of T1 sequences, demonstrate the efficacy of
our proposed method. Our results show that our method can accurately achieve
the registration of pathological images and identify lesions even in
challenging imaging modalities. Our unsupervised approach offers a promising
solution for the efficient and cost-effective registration of pathological
images. Our code is available at
https://github.com/brain-intelligence-lab/GIRNet.
Comment: 13 pages, 7 figures, published in Medical Image Computing and
Computer Assisted Intervention (MICCAI) 202
Dynamic Data Augmentation via MCTS for Prostate MRI Segmentation
Medical image data are often limited due to the expensive acquisition and
annotation process. Hence, training a deep-learning model with only raw data
can easily lead to overfitting. One solution to this problem is to augment the
raw data with various transformations, improving the model's ability to
generalize to new data. However, manually configuring a generic augmentation
combination and parameters for different datasets is non-trivial due to
inconsistent acquisition approaches and data distributions. Therefore,
automatic data augmentation has been proposed to learn favorable augmentation
strategies for different datasets, though it incurs large GPU overhead. To this
end, we present a novel method, called Dynamic Data Augmentation (DDAug), which
is efficient and has negligible computation cost. Our DDAug develops a
hierarchical tree structure to represent various augmentations and utilizes an
efficient Monte-Carlo tree search algorithm to update, prune, and sample the
tree. As a result, the augmentation pipeline can be optimized for each dataset
automatically. Experiments on multiple Prostate MRI datasets show that our
method outperforms the current state-of-the-art data augmentation strategies.
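The tree-search step in DDAug can be illustrated with the standard UCT (Upper Confidence Bound for Trees) selection rule, where each child node represents an augmentation choice and `value`/`visits` accumulate rewards (e.g. validation Dice) over search iterations; the exact reward and update scheme in DDAug may differ from this sketch.

```python
import math

def uct_score(child_value, child_visits, parent_visits, c=1.4):
    """UCT: exploitation (mean reward) plus an exploration bonus that
    shrinks as a node is visited more often."""
    if child_visits == 0:
        return float("inf")  # always try unvisited augmentations first
    return (child_value / child_visits
            + c * math.sqrt(math.log(parent_visits) / child_visits))

def select_child(children, parent_visits):
    """Pick the child augmentation node with the highest UCT score."""
    return max(children, key=lambda ch:
               uct_score(ch["value"], ch["visits"], parent_visits))

# An unvisited augmentation is selected before a well-explored one.
children = [{"name": "flip", "value": 3.0, "visits": 10},
            {"name": "rotate", "value": 0.0, "visits": 0}]
print(select_child(children, 10)["name"])  # prints rotate
```

Repeating select/evaluate/backpropagate iterations concentrates visits on the augmentation subtrees that yield the best validation reward, which is how the pipeline adapts to each dataset.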