Spatiotemporal Modeling Encounters 3D Medical Image Analysis: Slice-Shift UNet with Multi-View Fusion
As a fundamental part of computational healthcare, Computed Tomography (CT)
and Magnetic Resonance Imaging (MRI) provide volumetric data, making the
development of algorithms for 3D image analysis a necessity. Despite being
computationally cheap, 2D Convolutional Neural Networks can only extract
spatial information. In contrast, 3D CNNs can extract three-dimensional
features, but they have higher computational costs and latency, which is a
limitation for clinical practice that requires fast and efficient models.
Inspired by the field of video action recognition, we propose a new 2D-based
model dubbed Slice SHift UNet (SSH-UNet), which encodes three-dimensional
features at a 2D CNN's complexity. More precisely, multi-view features are
collaboratively learned by performing 2D convolutions along the three
orthogonal planes of a volume and imposing a weights-sharing mechanism. The
third dimension, which is neglected by the 2D convolution, is reincorporated by
shifting a portion of the feature maps along the slices' axis. The
effectiveness of our approach is validated on the Multi-Modality Abdominal
Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial
Vault (BTCV) datasets, showing that SSH-UNet is more efficient while on par in
performance with state-of-the-art architectures.
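The slice-shift operation described above, borrowed from temporal-shift models in video recognition, can be sketched in a few lines. The channel fraction, the zero-padding at volume boundaries, and the `(C, D, H, W)` array layout below are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def slice_shift(feats, shift_frac=0.25):
    """Shift a fraction of the channels of a (C, D, H, W) feature map
    along the slice (depth) axis, so a subsequent 2D convolution sees
    information from neighbouring slices at no extra conv cost.
    shift_frac=0.25 is an illustrative choice."""
    c, d, h, w = feats.shape
    n = int(c * shift_frac)  # channels shifted in each direction
    out = feats.copy()
    # first n channels: shift one slice forward (zero-pad the boundary)
    out[:n, 1:] = feats[:n, :-1]
    out[:n, 0] = 0
    # next n channels: shift one slice backward
    out[n:2 * n, :-1] = feats[n:2 * n, 1:]
    out[n:2 * n, -1] = 0
    # remaining channels are left untouched
    return out
```

The shifted feature map has the same shape as the input, so it can be dropped in front of any 2D convolution without changing the layer's parameter count.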
RVD: A Handheld Device-Based Fundus Video Dataset for Retinal Vessel Segmentation
Retinal vessel segmentation is generally grounded in image-based datasets
collected with bench-top devices. Static images inherently lose the dynamic
characteristics of retinal fluctuation, diminishing dataset richness, while the
limited accessibility of bench-top devices further restricts dataset
scalability. Given these limitations, we
introduce the first video-based retinal dataset by employing handheld devices
for data acquisition. The dataset comprises 635 smartphone-based fundus videos
collected from four different clinics, involving 415 patients aged 50 to 75
years. It delivers comprehensive and precise annotations of retinal
structures in both spatial and temporal dimensions, aiming to advance the
landscape of vasculature segmentation. Specifically, the dataset provides three
levels of spatial annotations: binary vessel masks for overall retinal
structure delineation, general vein-artery masks for distinguishing the vein
and artery, and fine-grained vein-artery masks for further characterizing the
granularities of each artery and vein. In addition, the dataset offers temporal
annotations that capture the vessel pulsation characteristics, assisting in
detecting ocular diseases that require fine-grained recognition of hemodynamic
fluctuation. In application, our dataset exhibits a significant domain shift
with respect to data captured by bench-top devices, thus posing great
challenges to existing methods. In the experiments, we provide evaluation
metrics and benchmark results on our dataset, reflecting both the potential and
challenges it offers for vessel segmentation tasks. We hope this challenging
dataset will contribute significantly to the development of eye disease
diagnosis and early prevention.
Carotid artery lumen segmentation in 3D free-hand ultrasound images using surface graph cuts
We present a new approach for automated segmentation of the carotid lumen bifurcation from 3D free-hand ultrasound using a 3D surface graph cut method. The method requires only the manual selection of single seed points in the internal, external, and common carotid arteries. Subsequently, the centerline between these points is automatically traced, and the optimal lumen surface is found around the centerline using graph cuts. To refine the result, the latter process is iterated. The method was tested on twelve carotid arteries from six subjects, including three patients with moderate carotid artery stenosis. Our method successfully segmented the lumen in all cases. We obtained an average Dice overlap of 84% with respect to manual segmentation for healthy volunteers. For the patient data, we obtained a Dice overlap of 66.7%.
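The Dice overlap used to report the 84% and 66.7% results above is a standard measure of volume agreement between two binary masks; a minimal implementation:

```python
import numpy as np

def dice_overlap(a, b):
    """Dice coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|), in [0, 1]."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    # convention: two empty masks agree perfectly
    return 2.0 * inter / denom if denom else 1.0
```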
Skin Lesion Correspondence Localization in Total Body Photography
Longitudinal tracking of skin lesions - finding correspondence, changes in
morphology, and texture - is beneficial to the early detection of melanoma.
However, it has not been well investigated in the context of full-body imaging.
We propose a novel framework combining geometric and texture information to
localize skin lesion correspondence from a source scan to a target scan in
total body photography (TBP). Body landmarks or sparse correspondence are first
created on the source and target 3D textured meshes. Every vertex on each of
the meshes is then mapped to a feature vector characterizing the geodesic
distances to the landmarks on that mesh. Then, for each lesion of interest
(LOI) on the source, its corresponding location on the target is first coarsely
estimated using the geometric information encoded in the feature vectors and
then refined using the texture information. We evaluated the framework
quantitatively on both a public and a private dataset, on which our success
rates (at the 10 mm criterion) are comparable to those of the only reported
longitudinal study. As full-body 3D capture becomes more prevalent and its
quality improves, we expect the proposed method to constitute a valuable step
in the longitudinal tracking of skin lesions.
Comment: MICCAI-202
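The coarse geometric step above — mapping each vertex to a vector of distances to the body landmarks and matching lesions by comparing those vectors — can be sketched as follows. For simplicity this sketch uses Euclidean distance as a stand-in for the geodesic distance computed on the textured meshes in the paper; the function names are illustrative:

```python
import numpy as np

def landmark_features(verts, landmarks):
    """Map every vertex (V, 3) to a feature vector of its distances to
    each landmark (L, 3), giving a (V, L) feature matrix. Euclidean
    distance stands in for the paper's geodesic distance."""
    return np.linalg.norm(verts[:, None, :] - landmarks[None, :, :], axis=-1)

def coarse_match(src_feature, tgt_features):
    """Return the index of the target vertex whose landmark-distance
    feature vector is closest to the source lesion's feature vector."""
    return int(np.argmin(np.linalg.norm(tgt_features - src_feature, axis=1)))
```

Because the features are distances to shared landmarks, the matching is invariant to the rigid pose difference between the source and target scans; the texture-based refinement then resolves locally ambiguous matches.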
Bi-Modality Medical Image Synthesis Using Semi-Supervised Sequential Generative Adversarial Networks
In this paper, we propose a bi-modality medical image synthesis approach
based on sequential generative adversarial network (GAN) and semi-supervised
learning. Our approach consists of two generative modules that synthesize
images of the two modalities in a sequential order. A method for measuring the
synthesis complexity is proposed to automatically determine the synthesis order
in our sequential GAN. Images of the modality with a lower complexity are
synthesized first, and the counterparts with a higher complexity are generated
later. Our sequential GAN is trained end-to-end in a semi-supervised manner. In
supervised training, the joint distribution of bi-modality images is learned
from real paired images of the two modalities by explicitly minimizing the
reconstruction losses between the real and synthetic images. To avoid
overfitting the limited training images, in unsupervised training, the marginal
distribution of each modality is learned based on unpaired images by minimizing
the Wasserstein distance between the distributions of real and fake images. We
comprehensively evaluate the proposed model on two synthesis tasks using three
types of evaluation metrics and user studies. Visual and quantitative results
demonstrate the superiority of our method over state-of-the-art methods, as
well as its reasonable visual quality and clinical significance. Code is made
publicly available at
https://github.com/hustlinyi/Multimodal-Medical-Image-Synthesis
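The unsupervised objective above minimizes the Wasserstein distance between the distributions of real and fake images. For 1D samples of equal size, the 1-Wasserstein distance reduces to a mean absolute difference of sorted order statistics; the toy function below illustrates the quantity being minimized, not the critic-based estimate an actual Wasserstein GAN would compute on images:

```python
import numpy as np

def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equal-size 1D samples,
    computed from sorted order statistics. A toy illustration of the
    distributional distance a WGAN critic approximates for images."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    return float(np.mean(np.abs(x - y)))
```

Minimizing this distance pulls the generated (fake) sample distribution toward the real one without requiring paired examples, which is why it suits the unpaired, unsupervised half of the training.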
A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond
Over the past decade, deep learning technologies have greatly advanced the
field of medical image registration. The initial developments, such as
ResNet-based and U-Net-based networks, laid the groundwork for deep
learning-driven image registration. Subsequent progress has been made in
various aspects of deep learning-based registration, including similarity
measures, deformation regularizations, and uncertainty estimation. These
advancements have not only enriched the field of deformable image registration
but have also facilitated its application in a wide range of tasks, including
atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D
registration. In this paper, we present a comprehensive overview of the most
recent advancements in deep learning-based image registration. We begin with a
concise introduction to the core concepts of deep learning-based image
registration. Then, we delve into innovative network architectures, loss
functions specific to registration, and methods for estimating registration
uncertainty. Additionally, this paper explores appropriate evaluation metrics
for assessing the performance of deep learning models in registration tasks.
Finally, we highlight the practical applications of these novel techniques in
medical imaging and discuss the future prospects of deep learning-based image
registration.
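At the core of every deformable registration method surveyed above is the warping step: resampling a moving image with a dense displacement field. A minimal nearest-neighbour version is sketched below; a deep-learning registration network would predict the displacement field and warp through a differentiable (e.g. bilinear) sampler instead, so this is purely illustrative:

```python
import numpy as np

def warp_nn(img, disp):
    """Warp a 2D image with a dense displacement field disp of shape
    (2, H, W): output[y, x] = img[y + disp[0], x + disp[1]], sampled
    with nearest-neighbour rounding and edge clamping."""
    h, w = img.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + disp[0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + disp[1]).astype(int), 0, w - 1)
    return img[src_y, src_x]
```

A zero displacement field reproduces the input exactly; similarity measures such as those discussed in the survey are then evaluated between the warped moving image and the fixed image.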
Precision Monitoring for Disease Progression in Patients with Multiple Sclerosis: A Deep Learning Approach
Artificial intelligence has tremendous potential in a range of clinical applications. Leveraging recent advances in deep learning, the work in this thesis has generated a range of technologies for patients with Multiple Sclerosis (MS) that facilitate precision monitoring using routine MRI and clinical assessments, and contribute to realising the goal of personalised disease management.
MS is a chronic inflammatory demyelinating disease of the central nervous system (CNS), characterised by focal demyelinating plaques in the brain and spinal cord, and by progressive neurodegeneration. Despite success in cohort studies and clinical trials, the measurement of disease activity using conventional imaging biomarkers in real-world clinical practice is limited to qualitative assessment of lesion activity, which is time-consuming and prone to human error. Quantitative measures, such as T2 lesion load, volumetric assessment of lesion activity, and brain atrophy, are constrained by the challenges of handling real-world data variances. In this thesis, DeepBVC was developed for robust brain atrophy assessment through image synthesis, while a lesion segmentation model was developed using a novel federated learning framework, Fed-CoT, to leverage large data collaborations. Together with existing quantitative brain structural analyses, this work has developed an effective deep learning analysis pipeline that delivers a fully automated suite of MS-specific clinical imaging biomarkers to facilitate precision monitoring of patients with MS and of their response to disease-modifying therapy. The framework for individualised MRI-guided management in this thesis is complemented by a disease prognosis model, based on a Large Language Model, providing insights into the risk of clinical worsening over the subsequent 3 years. The value and performance of the MS biomarkers in this thesis are underpinned by extensive validation on real-world, multi-centre data from more than 1030 patients.