209 research outputs found

    U-Net and its variants for medical image segmentation: theory and applications

    U-net is an image segmentation technique developed primarily for medical image analysis that can precisely segment images using a scarce amount of training data. These traits provide U-net with a very high utility within the medical imaging community and have resulted in extensive adoption of U-net as the primary tool for segmentation tasks in medical imaging. The success of U-net is evident in its widespread use in all major image modalities from CT scans and MRI to X-rays and microscopy. Furthermore, while U-net is largely a segmentation tool, there have been instances of the use of U-net in other applications. As the potential of U-net is still increasing, in this review we look at the various developments that have been made in the U-net architecture and provide observations on recent trends. We examine the various innovations that have been made in deep learning and discuss how these tools facilitate U-net. Furthermore, we look at image modalities and application areas where U-net has been applied. Comment: 42 pages, in IEEE Access

    From Fully-Supervised Single-Task to Semi-Supervised Multi-Task Deep Learning Architectures for Segmentation in Medical Imaging Applications

    Medical imaging is routinely performed in clinics worldwide for the diagnosis and treatment of numerous medical conditions in children and adults. With the advent of these medical imaging modalities, radiologists can visualize both the structure of the body as well as the tissues within the body. However, analyzing these high-dimensional (2D/3D/4D) images demands a significant amount of time and effort from radiologists. Hence, there is an ever-growing need for medical image computing tools to extract relevant information from the image data to help radiologists perform efficiently. Image analysis based on machine learning has pivotal potential to improve the entire medical imaging pipeline, providing support for clinical decision-making and computer-aided diagnosis. To be effective in addressing challenging image analysis tasks such as classification, detection, registration, and segmentation, specifically for medical imaging applications, deep learning approaches have shown significant improvement in performance. While deep learning has shown its potential in a variety of medical image analysis problems including segmentation, motion estimation, etc., generalizability is still an unsolved problem and many of these successes are achieved at the cost of a large pool of datasets. For most practical applications, getting access to a copious dataset can be very difficult, often impossible. Annotation is tedious and time-consuming. This cost is further amplified when annotation must be done by a clinical expert in medical imaging applications. Additionally, the applications of deep learning in the real-world clinical setting are still limited due to the lack of reliability caused by the limited prediction capabilities of some deep learning models. Moreover, while using a CNN in an automated image analysis pipeline, it’s critical to understand which segmentation results are problematic and require further manual examination. 
    To this extent, the estimation of uncertainty calibration in a semi-supervised setting for medical image segmentation is still rarely reported. This thesis focuses on developing and evaluating optimized machine learning models for a variety of medical imaging applications, ranging from fully-supervised, single-task learning to semi-supervised, multi-task learning that makes efficient use of annotated training data. The contributions of this dissertation are as follows: (1) developing fully-supervised, single-task transfer learning for surgical instrument segmentation from laparoscopic images; (2) utilizing supervised, single-task transfer learning for segmenting and digitally removing the surgical instruments from endoscopic/laparoscopic videos to allow the visualization of the anatomy being obscured by the tool. The tool removal algorithms use a tool segmentation mask and either instrument-free reference frames or previous instrument-containing frames to fill in (inpaint) the instrument segmentation mask; (3) developing fully-supervised, single-task learning via efficient weight pruning and learned group convolution for accurate left ventricle (LV), right ventricle (RV) blood pool and myocardium localization and segmentation from 4D cine cardiac MR images; (4) demonstrating the use of our fully-supervised memory-efficient model to generate dynamic patient-specific right ventricle (RV) models from a cine cardiac MRI dataset via an unsupervised learning-based deformable registration field; (5) integrating Monte Carlo dropout into our fully-supervised memory-efficient model with inherent uncertainty estimation, with the overall goal of estimating the uncertainty associated with the obtained segmentation and error, as a means to flag regions that feature less than optimal segmentation results; (6) developing semi-supervised, single-task learning via self-training (through meta pseudo-labeling) in concert with a Teacher network that instructs the Student network by generating pseudo-labels given unlabeled input data; (7) proposing largely-unsupervised, multi-task learning to demonstrate the power of a simple combination of a disentanglement block, variational autoencoder (VAE), generative adversarial network (GAN), and a conditioning layer-based reconstructor for performing two of the foremost critical tasks in medical imaging: segmentation of cardiac structures and reconstruction of the cine cardiac MR images; and (8) demonstrating the use of 3D semi-supervised, multi-task learning for jointly learning multiple tasks in a single backbone module: uncertainty estimation, geometric shape generation, and cardiac anatomical structure segmentation of the left atrial cavity from 3D Gadolinium-enhanced magnetic resonance (GE-MR) images. This dissertation summarizes the impact of the contributions of our work in terms of demonstrating the adaptation and use of deep learning architectures featuring different levels of supervision to build a variety of image segmentation tools and techniques that can be used across a wide spectrum of medical image computing applications centered on facilitating and promoting the widespread computer-integrated diagnosis and therapy data science.
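Contribution (5) above uses Monte Carlo dropout to flag regions with unreliable segmentation. The following is a minimal sketch of that general idea, not the thesis's implementation: it assumes the per-pixel foreground probabilities from several stochastic forward passes (dropout left active at inference) are already available as plain lists, and it uses the binary entropy of the mean prediction as the uncertainty measure. The function name `mc_dropout_uncertainty` and the 0.5-nat threshold are hypothetical choices for illustration.

```python
import math


def mc_dropout_uncertainty(prob_samples, threshold=0.5):
    """Summarize T stochastic forward passes over the same image.

    prob_samples: list of T lists; each inner list holds per-pixel
    foreground probabilities from one pass with dropout kept active.
    threshold: hypothetical entropy cut-off (in nats) for flagging pixels.
    """
    n_passes = len(prob_samples)
    n_pixels = len(prob_samples[0])

    # Mean prediction per pixel across the T passes.
    mean = [sum(s[i] for s in prob_samples) / n_passes for i in range(n_pixels)]

    # Binary entropy of the mean prediction; eps guards against log(0).
    eps = 1e-12
    entropy = [
        -(p * math.log(p + eps) + (1.0 - p) * math.log(1.0 - p + eps))
        for p in mean
    ]

    # Pixels whose entropy exceeds the threshold are flagged for review.
    flagged = [h > threshold for h in entropy]
    return mean, entropy, flagged


# Three hypothetical passes over a two-pixel image.
samples = [[0.90, 0.5], [0.95, 0.4], [0.85, 0.6]]
mean, entropy, flagged = mc_dropout_uncertainty(samples)
```

Binary entropy peaks at ln 2 (about 0.693 nats) when the mean prediction is 0.5, so any threshold must lie below that value to flag anything. Other uncertainty measures, such as per-pixel variance across passes, would fit the same structure.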

    Simulation and Synthesis for Cardiac Magnetic Resonance Image Analysis

    Automatic segmentation of multiple cardiovascular structures from cardiac computed tomography angiography images using deep learning.

    OBJECTIVES: To develop, demonstrate and evaluate an automated deep learning method for multiple cardiovascular structure segmentation. BACKGROUND: Segmentation of cardiovascular images is resource-intensive. We design an automated deep learning method for the segmentation of multiple structures from Coronary Computed Tomography Angiography (CCTA) images. METHODS: Images from a multicenter registry of patients that underwent clinically-indicated CCTA were used. The proximal ascending and descending aorta (PAA, DA), superior and inferior vena cavae (SVC, IVC), pulmonary artery (PA), coronary sinus (CS), right ventricular wall (RVW) and left atrial wall (LAW) were annotated as ground truth. The U-net-derived deep learning model was trained, validated and tested in a 70:20:10 split. RESULTS: The dataset comprised 206 patients, with 5.130 billion pixels. Mean age was 59.9 ± 9.4 years, and 42.7% of patients were female. An overall median Dice score of 0.820 (0.782, 0.843) was achieved. Median Dice scores for PAA, DA, SVC, IVC, PA, CS, RVW and LAW were 0.969 (0.979, 0.988), 0.953 (0.955, 0.983), 0.937 (0.934, 0.965), 0.903 (0.897, 0.948), 0.775 (0.724, 0.925), 0.720 (0.642, 0.809), 0.685 (0.631, 0.761) and 0.625 (0.596, 0.749) respectively. Apart from the CS, there were no significant differences in performance between sexes or age groups. CONCLUSIONS: An automated deep learning model demonstrated segmentation of multiple cardiovascular structures from CCTA images with reasonable overall accuracy when evaluated on a pixel level.
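The Dice scores reported in this abstract measure pixel-level overlap between a predicted mask and the annotated ground truth. A minimal sketch of the standard definition follows, assuming binary masks flattened to 0/1 sequences; the function name `dice_score` and the convention that two empty masks score 1.0 are illustrative assumptions, not details taken from the paper.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels in each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    # Assumption: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical six-pixel masks: 2 shared foreground pixels, 3 + 3 in total.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 1]
score = dice_score(pred, truth)
```

A per-structure median, as reported above, would then be taken over the scores of the individual test cases.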

    An accurate and time-efficient deep learning-based system for automated segmentation and reporting of cardiac magnetic resonance-detected ischemic scar

    Background and objectives: Myocardial infarction scar (MIS) assessment by cardiac magnetic resonance provides prognostic information and guides patients' clinical management. However, MIS segmentation is time-consuming and not performed routinely. This study presents a deep-learning-based computational workflow for the segmentation of left ventricular (LV) MIS, for the first time performed on state-of-the-art dark-blood late gadolinium enhancement (DB-LGE) images, and the computation of MIS transmurality and extent. Methods: DB-LGE short-axis images of consecutive patients with myocardial infarction were acquired at 1.5T in two centres between Jan 1, 2019, and June 1, 2021. Two convolutional neural network (CNN) models based on the U-Net architecture were trained to sequentially segment the LV and MIS, by processing an incoming series of DB-LGE images. A 5-fold cross-validation was performed to assess the performance of the models. Model outputs were compared respectively with manual (LV endo- and epicardial border) and semi-automated (MIS, 4-Standard Deviation technique) ground truth to assess the accuracy of the segmentation. An automated post-processing and reporting tool was developed, computing MIS extent (expressed as relative infarcted mass) and transmurality. Results: The dataset included 1355 DB-LGE short-axis images from 144 patients (MIS in 942 images). High performance (> 0.85) as measured by the Intersection over Union metric was obtained for both the LV and MIS segmentations on the training sets. The performance for both LV and MIS segmentations was 0.83 on the test sets. Compared to the 4-Standard Deviation segmentation technique, our system was five times quicker (<1 min versus 7 ± 3 min), and required minimal user interaction. Conclusions: Our solution successfully addresses different issues related to automatic MIS segmentation, including accuracy, time-effectiveness, and the automatic generation of a clinical report. © 2022 Elsevier B.V. All rights reserved.
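The Intersection over Union (IoU) metric used in this abstract can be sketched as follows, assuming binary masks flattened to 0/1 sequences. The function names and the empty-mask convention are illustrative assumptions. The second helper applies the identity Dice = 2·IoU / (1 + IoU), which holds for a single pair of masks but not for averages over many images.

```python
def iou_score(pred, truth):
    """IoU = |A ∩ B| / |A ∪ B| for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)

    # Assumption: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else intersection / union


def iou_to_dice(iou):
    """Convert the IoU of one mask pair to the Dice score of the same pair."""
    return 2.0 * iou / (1.0 + iou)


# Hypothetical six-pixel masks: 2 shared pixels, 4 pixels in the union.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 1]
iou = iou_score(pred, truth)
dice = iou_to_dice(iou)
```

For the same pair of masks, IoU is never larger than Dice, which is worth keeping in mind when comparing results reported with different metrics.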

    U-net and its variants for medical image segmentation: A review of theory and applications

    U-net is an image segmentation technique developed primarily for image segmentation tasks. These traits provide U-net with a high utility within the medical imaging community and have resulted in extensive adoption of U-net as the primary tool for segmentation tasks in medical imaging. The success of U-net is evident in its widespread use in nearly all major image modalities, from CT scans and MRI to X-rays and microscopy. Furthermore, while U-net is largely a segmentation tool, there have been instances of the use of U-net in other applications. Given that U-net’s potential is still increasing, this narrative literature review examines the numerous developments and breakthroughs in the U-net architecture and provides observations on recent trends. We also discuss the many innovations that have advanced deep learning and how these tools facilitate U-net. In addition, we review the different image modalities and application areas that have been enhanced by U-net.

    S-Net: a multiple cross aggregation convolutional architecture for automatic segmentation of small/thin structures for cardiovascular applications

    With the success of U-Net or its variants in automatic medical image segmentation, building a fully convolutional network (FCN) based on an encoder-decoder structure has become an effective end-to-end learning approach. However, the intrinsic property of FCNs is that as the encoder deepens, higher-level features are learned, and the receptive field size of the network increases, which results in unsatisfactory performance for detecting low-level small/thin structures such as atrial walls and small arteries. To address this issue, we propose to keep the different encoding layer features at their original sizes to constrain the receptive field from increasing as the network goes deeper. Accordingly, we develop a novel S-shaped multiple cross-aggregation segmentation architecture named S-Net, which has two branches in the encoding stage, i.e., a resampling branch to capture low-level fine-grained details and thin/small structures and a downsampling branch to learn high-level discriminative knowledge. In particular, these two branches learn complementary features by residual cross-aggregation; the fusion of the complementary features from different decoding layers can be effectively accomplished through lateral connections. Meanwhile, we perform supervised prediction at all decoding layers to incorporate coarse-level features with high semantic meaning and fine-level features with high localization capability, so as to fully detect multi-scale structures, especially small/thin volumes. To validate the effectiveness of our S-Net, we conducted extensive experiments on the segmentation of cardiac wall and intracranial aneurysm (IA) vasculature, and quantitative and qualitative evaluations demonstrated the superior performance of our method for predicting small/thin structures in medical images.
