
    Deep Learning of Unified Region, Edge, and Contour Models for Automated Image Segmentation

    Image segmentation is a fundamental and challenging problem in computer vision with applications spanning multiple areas, such as medical imaging, remote sensing, and autonomous vehicles. Recently, convolutional neural networks (CNNs) have gained traction in the design of automated segmentation pipelines. Although CNN-based models are adept at learning abstract features from raw image data, their performance is dependent on the availability and size of suitable training datasets. Additionally, these models are often unable to capture the details of object boundaries and generalize poorly to unseen classes. In this thesis, we devise novel methodologies that address these issues and establish robust representation learning frameworks for fully automatic semantic segmentation in medical imaging and mainstream computer vision. In particular, our contributions include (1) state-of-the-art 2D and 3D image segmentation networks for computer vision and medical image analysis, (2) an end-to-end trainable image segmentation framework that unifies CNNs and active contour models with learnable parameters for fast and robust object delineation, (3) a novel approach for disentangling edge and texture processing in segmentation networks, and (4) a novel few-shot learning model, in both supervised and semi-supervised settings, that leverages synergies between latent and image spaces to learn to segment images from limited training data. Comment: PhD dissertation, UCLA, 202
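
    One common way to couple a CNN with active-contour ideas, sketched below purely as an illustration and not necessarily the formulation used in this thesis, is to add a contour-energy term to the training loss: a length term that penalises ragged predicted boundaries and region terms that pull the predicted interior and exterior towards the target labels. The function name and the weighting parameter lam are placeholders.

    import torch

    def active_contour_loss(p, g, lam=1.0, eps=1e-8):
        """Active-contour-style loss on a CNN probability map (illustrative sketch).

        p: (B, 1, H, W) predicted foreground probabilities.
        g: (B, 1, H, W) binary ground-truth masks.
        """
        # Boundary length: total variation of the probability map.
        dx = p[:, :, 1:, :] - p[:, :, :-1, :]
        dy = p[:, :, :, 1:] - p[:, :, :, :-1]
        length = torch.sqrt(dx[:, :, :, :-1] ** 2 + dy[:, :, 1:, :] ** 2 + eps).sum()

        # Region terms: interior should match label 1, exterior label 0.
        region_in = (p * (g - 1.0) ** 2).sum()
        region_out = ((1.0 - p) * g ** 2).sum()

        return length + lam * (region_in + region_out)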

    Segmentation of Kidney and Renal Tumor in CT Scans Using Convolutional Networks

    Accurate segmentation of the kidney and renal tumor in CT images is a prerequisite step in surgery planning. However, this task remains a challenge. In this report, we use convolutional networks (ConvNets) to automatically segment the kidney and renal tumor. Specifically, we adopt a 2D ConvNet to select the range of slices to be segmented in the inference phase, which accelerates segmentation, while a 3D ConvNet is trained to segment regions of interest within this narrow range. In the localization phase, CT images from several publicly available datasets were used to learn the localizer. This localizer, fine-tuned from an AlexNet pre-trained on ImageNet, filters out slices that cannot contain the kidney or renal tumor. In the segmentation phase, a simple U-Net with a large patch size (160×160×80) was trained to delineate the contours of the kidney and renal tumor. In the 2019 MICCAI Kidney Tumor Segmentation (KiTS19) Challenge, 5-fold cross-validation was performed on the training set: 168 (80%) CT scans were used for training and the remaining 42 (20%) cases were used for validation. The resulting average Dice similarity coefficients are 0.9662 and 0.7905 for the kidney and renal tumor, respectively.
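
    The localize-then-segment idea described above can be sketched as a two-stage inference routine: a 2D classifier scores each axial slice for kidney/tumor presence, and the 3D network is run only on the narrowed slice range. The networks, the threshold, and the function name below are placeholders, not the authors' exact models; the sketch assumes at least one slice passes the threshold.

    import torch

    @torch.no_grad()
    def two_stage_segmentation(volume, slice_classifier, unet3d, threshold=0.5):
        """Hypothetical localize-then-segment pipeline for a (D, H, W) CT volume."""
        # Stage 1: score every axial slice and keep the range that may contain kidney/tumor.
        scores = torch.sigmoid(slice_classifier(volume.unsqueeze(1)))  # (D, 1)
        keep = (scores.squeeze(1) > threshold).nonzero().flatten()
        lo, hi = keep.min().item(), keep.max().item() + 1

        # Stage 2: run the 3D network only on the narrowed slice range.
        sub = volume[lo:hi].unsqueeze(0).unsqueeze(0)   # (1, 1, d, H, W)
        logits = unet3d(sub)                            # (1, C, d, H, W)
        labels = logits.argmax(dim=1).squeeze(0)        # (d, H, W)

        # Paste the result back into a full-size label volume.
        full = torch.zeros_like(volume, dtype=torch.long)
        full[lo:hi] = labels
        return full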

    KiTS challenge: VNet with attention gates and deep supervision

    This paper presents a 3D fully convolutional neural network extended with attention gates and deep supervision layers. The model automatically segments the kidney and kidney tumor from arterial-phase abdominal computed tomography (CT) scans. It was trained on the dataset proposed by the Kidney Tumor Segmentation Challenge 2019. The best solution reaches Dice scores of 96.43 ± 1.06 and 79.94 ± 5.33 for the kidney and kidney-tumor labels, respectively. The implementation of the proposed methodology using PyTorch is publicly available at github.com/tureckova/Abdomen-CT-Image-Segmentation.
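
    For readers unfamiliar with attention gates, the sketch below shows a generic additive attention gate in the style of Attention U-Net: skip-connection features are re-weighted by a sigmoid map computed from the features themselves and a coarser gating signal. It is a simplified re-implementation of the general idea, not the code released at the repository above; the class name and channel arguments are placeholders.

    import torch
    import torch.nn as nn

    class AttentionGate3D(nn.Module):
        """Generic additive attention gate for 3D skip connections (illustrative)."""

        def __init__(self, in_ch, gate_ch, inter_ch):
            super().__init__()
            self.theta_x = nn.Conv3d(in_ch, inter_ch, kernel_size=1)
            self.phi_g = nn.Conv3d(gate_ch, inter_ch, kernel_size=1)
            self.psi = nn.Conv3d(inter_ch, 1, kernel_size=1)

        def forward(self, x, g):
            # x: skip-connection features; g: coarser gating signal, resized to x's grid.
            g = nn.functional.interpolate(g, size=x.shape[2:], mode="trilinear",
                                          align_corners=False)
            att = torch.relu(self.theta_x(x) + self.phi_g(g))
            att = torch.sigmoid(self.psi(att))   # (B, 1, D, H, W) attention map
            return x * att                       # suppress irrelevant regions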

    Attention Mechanisms in Medical Image Segmentation: A Survey

    Medical image segmentation plays an important role in computer-aided diagnosis. Attention mechanisms, which distinguish important parts from irrelevant parts, have been widely used in medical image segmentation tasks. This paper systematically reviews the basic principles of attention mechanisms and their applications in medical image segmentation. First, we review the basic concepts and formulation of attention mechanisms. Second, we survey over 300 articles related to medical image segmentation and divide them into two groups by attention mechanism: non-Transformer attention and Transformer attention. In each group, we analyze the attention mechanisms from three aspects based on the current literature, i.e., the principle of the mechanism (what to use), implementation methods (how to use), and application tasks (where to use). We also thoroughly analyze the advantages and limitations of their application to different tasks. Finally, we summarize the current state of research and its shortcomings, and discuss potential future challenges, including task specificity, robustness, standard evaluation, etc. We hope that this review can showcase the overall research context of traditional and Transformer attention methods, provide a clear reference for subsequent research, and inspire more advanced attention research, not only in medical image segmentation but also in other image analysis scenarios. Comment: Submitted to Medical Image Analysis, survey paper, 34 pages, over 300 references
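
    As a concrete anchor for the "formulation" the survey reviews, the standard scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, can be written in a few lines. This is the textbook Transformer formulation, not a contribution of the survey itself.

    import torch

    def scaled_dot_product_attention(q, k, v):
        """Basic attention: each query forms a weighted mixture of the values.

        q, k, v: (B, N, d_k) query, key, and value sequences.
        """
        d_k = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # (B, N, N) query-key similarities
        weights = torch.softmax(scores, dim=-1)         # rows sum to 1: "where to look"
        return weights @ v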

    MSS U-Net: 3D segmentation of kidneys and tumors from CT images with a multi-scale supervised U-Net

    Accurate segmentation of kidneys and kidney tumors is an essential step for radiomic analysis as well as for developing advanced surgical planning techniques. In clinical practice, the segmentation is currently performed by clinicians through visual inspection of images gathered from a computed tomography (CT) scan. This process is laborious, and its success depends significantly on prior experience. We present a multi-scale supervised 3D U-Net, MSS U-Net, to segment kidneys and kidney tumors from CT images. Our architecture combines deep supervision with an exponential logarithmic loss to increase the training efficiency of the 3D U-Net. Furthermore, we introduce a connected-component-based post-processing method to enhance the performance of the overall process. This architecture shows superior performance compared to state-of-the-art works, with Dice coefficients of up to 0.969 for the kidney and 0.805 for the tumor. We tested MSS U-Net in the KiTS19 challenge with its corresponding dataset.
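
    Connected-component post-processing of the kind mentioned above is typically a matter of labeling the blobs in a predicted binary mask and keeping only the largest ones (e.g. the two kidneys), discarding spurious islands. The sketch below shows that generic idea with SciPy; the function name and the choice to keep two components are assumptions, not the authors' exact procedure.

    import numpy as np
    from scipy import ndimage

    def keep_largest_components(mask, n_keep=2):
        """Keep the n_keep largest 3D connected components of a binary mask."""
        labels, n = ndimage.label(mask)      # label connected blobs (6-connectivity in 3D by default)
        if n <= n_keep:
            return mask
        sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
        keep = np.argsort(sizes)[-n_keep:] + 1          # labels of the largest blobs
        return np.isin(labels, keep).astype(mask.dtype)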

    Improving CT image tumor segmentation through deep supervision and attentional gates

    Computed tomography (CT) is an imaging procedure that combines many X-ray measurements taken from different angles. The segmentation of areas in CT images provides valuable aid to physicians and radiologists in reaching a patient diagnosis. CT scans of a body torso usually include several neighboring internal organs. Deep learning has become the state of the art in medical image segmentation. For such techniques to segment successfully, it is of great importance that the network learns to focus on the organ of interest and its surrounding structures, and that it can detect target regions of different sizes. In this paper, we propose extending a popular deep learning methodology, convolutional neural networks (CNNs), with deep supervision and attention gates. Our experimental evaluation shows that the inclusion of attention and deep supervision results in a consistent improvement of tumor prediction accuracy across the different datasets and training sizes while adding minimal computational overhead. © 2020 Turečková, Tureček, Komínková Oplatková and Rodríguez-Sánchez. Funding: Internal Grant Agency of Tomas Bata University [IGA/CebiaTech/2020/001]; COST (European Cooperation in Science and Technology) [CA15140]; programme Projects of Large Research, Development, and Innovations Infrastructures [e-INFRA LM2018140].
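
    Deep supervision, as used here in spirit, attaches auxiliary prediction heads to intermediate decoder scales; each auxiliary output is upsampled to full resolution and receives its own down-weighted loss, so gradients reach early layers directly. The sketch below illustrates that generic scheme; the weights and function name are placeholders, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    def deep_supervision_loss(aux_logits, target,
                              base_loss=nn.functional.cross_entropy,
                              weights=(1.0, 0.5, 0.25)):
        """Weighted sum of losses over auxiliary outputs at several decoder scales.

        aux_logits: list of (B, C, d, h, w) logits, finest scale first.
        target:     (B, D, H, W) integer label volume at full resolution.
        """
        total = 0.0
        for logits, w in zip(aux_logits, weights):
            # Upsample each auxiliary prediction to the target resolution.
            logits = nn.functional.interpolate(logits, size=target.shape[1:],
                                               mode="trilinear", align_corners=False)
            total = total + w * base_loss(logits, target)
        return total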

    UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation

    Transformer-based models, capable of learning better global dependencies, have recently demonstrated exceptional representation learning capabilities in computer vision and medical image analysis. The Transformer reformats the image into separate patches and realizes global communication via the self-attention mechanism. However, positional information between patches is hard to preserve in such 1D sequences, and losing it can lead to sub-optimal performance when dealing with large amounts of heterogeneous tissues of various sizes in 3D medical image segmentation. Additionally, current methods are not robust and efficient for heavy-duty medical segmentation tasks such as predicting a large number of tissue classes or modeling globally interconnected tissue structures. To address these challenges, and inspired by the nested hierarchical structures in vision transformers, we propose a novel 3D medical image segmentation method (UNesT), employing a simplified and faster-converging transformer encoder design that achieves local communication among spatially adjacent patch sequences by aggregating them hierarchically. We extensively validate our method on multiple challenging datasets spanning multiple modalities, anatomies, and a wide range of tissue classes, including 133 structures in the brain, 14 organs in the abdomen, 4 hierarchical components in the kidneys, and interconnected kidney tumors and brain tumors. We show that UNesT consistently achieves state-of-the-art performance and evaluate its generalizability and data efficiency. In particular, the model performs whole-brain segmentation covering the complete ROI with 133 tissue classes in a single network, outperforming the prior state-of-the-art method SLANT27, which ensembles 27 networks. Comment: 19 pages, 17 figures. arXiv admin note: text overlap with arXiv:2203.0243
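
    "Local communication among spatially adjacent patch sequences" can be pictured as self-attention restricted to small 3D blocks of neighboring patch embeddings, with coarser stages merging neighboring blocks so the receptive field grows hierarchically. The module below is a simplified sketch of that idea only, not the UNesT code; the class name, block size, and head count are assumptions.

    import torch
    import torch.nn as nn

    class LocalBlockAttention3D(nn.Module):
        """Self-attention inside non-overlapping 3D blocks of patch embeddings."""

        def __init__(self, dim, block=4, heads=4):
            super().__init__()
            self.block = block
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, x):
            # x: (B, D, H, W, C) grid of patch embeddings; D, H, W divisible by block.
            B, D, H, W, C = x.shape
            b = self.block
            # Partition the grid into non-overlapping b x b x b blocks.
            x = x.view(B, D // b, b, H // b, b, W // b, b, C)
            x = x.permute(0, 1, 3, 5, 2, 4, 6, 7).reshape(-1, b * b * b, C)
            # Self-attention only inside each block (local communication).
            x, _ = self.attn(x, x, x)
            # Undo the partition back to the spatial grid.
            x = x.view(B, D // b, H // b, W // b, b, b, b, C)
            x = x.permute(0, 1, 4, 2, 5, 3, 6, 7).reshape(B, D, H, W, C)
            return x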