
    Spatiotemporal Modeling Encounters 3D Medical Image Analysis: Slice-Shift UNet with Multi-View Fusion

    As a fundamental part of computational healthcare, Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) provide volumetric data, making the development of algorithms for 3D image analysis a necessity. Despite being computationally cheap, 2D Convolutional Neural Networks can only extract in-plane spatial information. In contrast, 3D CNNs can extract three-dimensional features, but at higher computational cost and latency, which is a limitation for clinical practice, where fast and efficient models are required. Inspired by the field of video action recognition, we propose a new 2D-based model dubbed Slice SHift UNet (SSH-UNet), which encodes three-dimensional features at the computational complexity of a 2D CNN. More precisely, multi-view features are collaboratively learned by performing 2D convolutions along the three orthogonal planes of a volume and imposing a weight-sharing mechanism. The third dimension, which is neglected by the 2D convolution, is reincorporated by shifting a portion of the feature maps along the slice axis. The effectiveness of our approach is validated on the Multi-Modality Abdominal Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) datasets, showing that SSH-UNet is more efficient while on par in performance with state-of-the-art architectures.
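    The slice-shift idea described above can be illustrated with a minimal sketch, analogous to the temporal shift used in video action recognition: a fraction of the feature channels is shifted one slice forward, a fraction one slice backward, and the rest left untouched. The function name and the shift fraction are illustrative assumptions, not SSH-UNet's actual implementation.

    ```python
    import numpy as np

    def slice_shift(x, shift_frac=0.125):
        # x: feature map of shape (C, D, H, W). Shift a fraction of the
        # channels by one slice forward/backward along the depth axis D,
        # zero-padding at the boundaries, so that subsequent 2D
        # convolutions can mix information across neighboring slices.
        c = x.shape[0]
        n = max(1, int(c * shift_frac))
        out = np.zeros_like(x)
        out[:n, 1:] = x[:n, :-1]        # first n channels: shift forward
        out[n:2*n, :-1] = x[n:2*n, 1:]  # next n channels: shift backward
        out[2*n:] = x[2*n:]             # remaining channels: unchanged
        return out
    ```

    The shift itself adds no learnable parameters, which is why the overall model stays at 2D-CNN complexity.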

    RVD: A Handheld Device-Based Fundus Video Dataset for Retinal Vessel Segmentation

    Retinal vessel segmentation is generally grounded in image-based datasets collected with bench-top devices. The static images naturally lose the dynamic characteristics of retinal fluctuation, resulting in diminished dataset richness, and the use of bench-top devices further restricts dataset scalability due to their limited accessibility. Considering these limitations, we introduce the first video-based retinal dataset, acquired with handheld devices. The dataset comprises 635 smartphone-based fundus videos collected from four different clinics, involving 415 patients aged 50 to 75 years. It delivers comprehensive and precise annotations of retinal structures in both spatial and temporal dimensions, aiming to advance the landscape of vasculature segmentation. Specifically, the dataset provides three levels of spatial annotation: binary vessel masks for overall retinal structure delineation, general vein-artery masks for distinguishing veins from arteries, and fine-grained vein-artery masks for further characterizing individual arteries and veins. In addition, the dataset offers temporal annotations that capture vessel pulsation characteristics, assisting in the detection of ocular diseases that require fine-grained recognition of hemodynamic fluctuation. In application, our dataset exhibits a significant domain shift with respect to data captured by bench-top devices, thus posing great challenges to existing methods. In the experiments, we provide evaluation metrics and benchmark results on our dataset, reflecting both the potential and the challenges it offers for vessel segmentation tasks. We hope this challenging dataset will significantly contribute to the development of eye disease diagnosis and early prevention.
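    The three annotation levels form a hierarchy: the coarse binary vessel mask is simply the union of the vein and artery masks. A minimal sketch of that relationship follows; the function and argument names are assumptions for illustration, not the dataset's actual field names.

    ```python
    import numpy as np

    def binary_from_vein_artery(vein_mask, artery_mask):
        # Derive the overall binary vessel mask as the union of the
        # vein and artery masks (level 1 from level 2 of the hierarchy).
        return np.logical_or(vein_mask, artery_mask).astype(np.uint8)
    ```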

    Carotid artery lumen segmentation in 3D free-hand ultrasound images using surface graph cuts

    We present a new approach for automated segmentation of the carotid lumen bifurcation from 3D free-hand ultrasound using a 3D surface graph cut method. The method requires only the manual selection of a single seed point in each of the internal, external, and common carotid arteries. Subsequently, the centerline between these points is automatically traced, and the optimal lumen surface is found around the centerline using graph cuts. To refine the result, this process is iterated. The method was tested on twelve carotid arteries from six subjects, including three patients with a moderate carotid artery stenosis. Our method successfully segmented the lumen in all cases. We obtained an average Dice overlap with respect to a manual segmentation of 84% for healthy volunteers. For the patient data, we obtained a Dice overlap of 66.7%.
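    The Dice overlap reported above is a standard overlap measure between two binary masks. A minimal sketch of how it is typically computed (a generic helper, not the authors' evaluation code):

    ```python
    import numpy as np

    def dice_score(pred, gt):
        # Dice overlap between two binary segmentation masks:
        # 2 * |A ∩ B| / (|A| + |B|). Returns 1.0 when both masks
        # are empty, a common convention.
        pred, gt = pred.astype(bool), gt.astype(bool)
        inter = np.logical_and(pred, gt).sum()
        denom = pred.sum() + gt.sum()
        return 2.0 * inter / denom if denom else 1.0
    ```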

    Skin Lesion Correspondence Localization in Total Body Photography

    Longitudinal tracking of skin lesions (finding correspondences and changes in morphology and texture) is beneficial to the early detection of melanoma. However, it has not been well investigated in the context of full-body imaging. We propose a novel framework combining geometric and texture information to localize skin lesion correspondence from a source scan to a target scan in total body photography (TBP). Body landmarks, or sparse correspondences, are first created on the source and target 3D textured meshes. Every vertex on each mesh is then mapped to a feature vector characterizing the geodesic distances to the landmarks on that mesh. Then, for each lesion of interest (LOI) on the source, its corresponding location on the target is first coarsely estimated using the geometric information encoded in the feature vectors and then refined using the texture information. We evaluated the framework quantitatively on both a public and a private dataset, for which our success rates (at the 10 mm criterion) are comparable to the only reported longitudinal study. As full-body 3D capture becomes more prevalent and of higher quality, we expect the proposed method to constitute a valuable step in the longitudinal tracking of skin lesions.
    Comment: MICCAI-202
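    The coarse matching step can be sketched as a nearest-neighbor search in the landmark-distance feature space: given the geodesic-distance feature vector of a source lesion, find the target vertex with the most similar feature vector. This is a simplified stand-in under the assumption that the geodesic distances have already been precomputed; the function name is illustrative.

    ```python
    import numpy as np

    def coarse_correspondence(src_feat, tgt_feats):
        # src_feat:  (K,) geodesic distances from one source lesion to
        #            the K landmarks on the source mesh.
        # tgt_feats: (V, K) the same features for every vertex of the
        #            target mesh.
        # Returns the index of the target vertex whose feature vector
        # is closest in Euclidean distance (the coarse estimate, to be
        # refined with texture information).
        d = np.linalg.norm(tgt_feats - src_feat[None, :], axis=1)
        return int(np.argmin(d))
    ```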

    Bi-Modality Medical Image Synthesis Using Semi-Supervised Sequential Generative Adversarial Networks

    In this paper, we propose a bi-modality medical image synthesis approach based on a sequential generative adversarial network (GAN) and semi-supervised learning. Our approach consists of two generative modules that synthesize images of the two modalities in sequential order. A method for measuring synthesis complexity is proposed to automatically determine the synthesis order in our sequential GAN. Images of the modality with lower complexity are synthesized first, and the counterparts with higher complexity are generated later. Our sequential GAN is trained end-to-end in a semi-supervised manner. In supervised training, the joint distribution of bi-modality images is learned from real paired images of the two modalities by explicitly minimizing the reconstruction losses between the real and synthetic images. To avoid overfitting the limited training images, in unsupervised training, the marginal distribution of each modality is learned from unpaired images by minimizing the Wasserstein distance between the distributions of real and fake images. We comprehensively evaluate the proposed model on two synthesis tasks using three types of evaluation metrics and user studies. Visual and quantitative results demonstrate the superiority of our method over state-of-the-art methods, as well as its reasonable visual quality and clinical significance. Code is made publicly available at https://github.com/hustlinyi/Multimodal-Medical-Image-Synthesis
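    The two training regimes described above combine two kinds of terms: a paired reconstruction loss for supervised training and a Wasserstein-style critic objective for unpaired data. A minimal sketch of both terms follows; these are generic formulations for illustration, not the paper's exact losses.

    ```python
    import numpy as np

    def paired_recon_loss(real, fake):
        # Supervised term: L1 reconstruction error between a real image
        # and its synthetic counterpart from the paired modality.
        return np.abs(real - fake).mean()

    def w1_critic_objective(real_scores, fake_scores):
        # Unsupervised term: empirical estimate of the Wasserstein-1
        # critic objective, E[f(real)] - E[f(fake)], where f is the
        # critic's scalar output on real and generated images.
        return real_scores.mean() - fake_scores.mean()
    ```

    In a full semi-supervised setup, the generator would minimize a weighted sum of the reconstruction term (on paired batches) and the adversarial term (on unpaired batches).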

    A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond

    Over the past decade, deep learning technologies have greatly advanced the field of medical image registration. The initial developments, such as ResNet-based and U-Net-based networks, laid the groundwork for deep learning-driven image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regularizations, and uncertainty estimation. These advancements have not only enriched the field of deformable image registration but have also facilitated its application in a wide range of tasks, including atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D registration. In this paper, we present a comprehensive overview of the most recent advancements in deep learning-based image registration. We begin with a concise introduction to the core concepts of deep learning-based image registration. Then, we delve into innovative network architectures, loss functions specific to registration, and methods for estimating registration uncertainty. Additionally, this paper explores appropriate evaluation metrics for assessing the performance of deep learning models in registration tasks. Finally, we highlight the practical applications of these novel techniques in medical imaging and discuss the future prospects of deep learning-based image registration.
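    The similarity measures and deformation regularizations surveyed above typically combine into a composite training objective: an image-similarity term plus a smoothness penalty on the predicted displacement field. A minimal sketch of one common instance (MSE similarity with a diffusion regularizer) is shown below; this is an illustrative composite, not any specific method from the survey.

    ```python
    import numpy as np

    def registration_loss(fixed, warped, disp, lam=0.1):
        # fixed, warped: (H, W) fixed image and warped moving image.
        # disp: (2, H, W) displacement field predicted by the network.
        # Loss = image similarity (MSE) + lam * diffusion regularizer,
        # which penalizes spatial gradients of the displacement field
        # to encourage smooth deformations.
        sim = np.mean((fixed - warped) ** 2)
        grads = np.gradient(disp, axis=(1, 2))  # d(disp)/dy, d(disp)/dx
        smooth = sum(np.mean(g ** 2) for g in grads)
        return sim + lam * smooth
    ```

    The weight lam trades off alignment accuracy against deformation smoothness, one of the central design choices in deformable registration.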

    Precision Monitoring for Disease Progression in Patients with Multiple Sclerosis: A Deep Learning Approach

    Artificial intelligence has tremendous potential in a range of clinical applications. Leveraging recent advances in deep learning, the work in this thesis has generated a range of technologies for patients with Multiple Sclerosis (MS) that facilitate precision monitoring using routine MRI and clinical assessments, and contribute to realising the goal of personalised disease management. MS is a chronic inflammatory demyelinating disease of the central nervous system (CNS), characterised by focal demyelinating plaques in the brain and spinal cord, and progressive neurodegeneration. Despite success in cohort studies and clinical trials, the measurement of disease activity using conventional imaging biomarkers in real-world clinical practice is limited to qualitative assessment of lesion activity, which is time-consuming and prone to human error. Quantitative measures, such as T2 lesion load and volumetric assessment of lesion activity and brain atrophy, are constrained by the challenges of handling real-world data variances. In this thesis, DeepBVC was developed for robust brain atrophy assessment through image synthesis, while a lesion segmentation model was developed using a novel federated learning framework, Fed-CoT, to leverage large data collaborations. Together with existing quantitative brain structural analyses, this work has developed an effective deep learning analysis pipeline, which delivers a fully automated suite of MS-specific clinical imaging biomarkers to facilitate the precision monitoring of patients with MS and their response to disease-modifying therapy. The framework for individualised MRI-guided management in this thesis is complemented by a disease prognosis model, based on a Large Language Model, providing insights into the risks of clinical worsening over the subsequent 3 years. The value and performance of the MS biomarkers in this thesis are underpinned by extensive validation on real-world, multi-centre data from more than 1030 patients.
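    The federated learning setting mentioned above (training a lesion segmentation model across data collaborations without pooling the data) typically relies on aggregating locally trained model parameters. A generic FedAvg-style sketch is shown below; the abstract does not describe Fed-CoT's actual aggregation scheme, so this is only the standard baseline for context.

    ```python
    import numpy as np

    def fed_avg(client_weights, client_sizes):
        # Size-weighted averaging of model parameters trained locally
        # at each site: clients with more data contribute more to the
        # aggregated model. client_weights: list of parameter arrays,
        # client_sizes: number of training samples per site.
        total = float(sum(client_sizes))
        return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
    ```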