16 research outputs found

    A context based deep learning approach for unbalanced medical image segmentation

    Full text link
    Automated medical image segmentation is an important step in many medical procedures. Recently, deep learning networks have been widely used for various medical image segmentation tasks, with U-Net and generative adversarial nets (GANs) being some of the commonly used ones. Foreground-background class imbalance is a common occurrence in medical images, and U-Net has difficulty in handling class imbalance because of its cross entropy (CE) objective function. Similarly, GAN also suffers from class imbalance because the discriminator looks at the entire image to classify it as real or fake. Since the discriminator is essentially a deep learning classifier, it is incapable of correctly identifying minor changes in small structures. To address these issues, we propose a novel context based CE loss function for U-Net, and a novel architecture Seg-GLGAN. The context based CE is a linear combination of CE obtained over the entire image and its region of interest (ROI). In Seg-GLGAN, we introduce a novel context discriminator to which the entire image and its ROI are fed as input, thus enforcing local context. We conduct extensive experiments using two challenging unbalanced datasets: PROMISE12 and ACDC. We observe that segmentation results obtained from our methods give better segmentation metrics as compared to various baseline methods.Comment: Accepted in ISBI 202

    EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection

    Full text link
    Convolutional neural networks have been successfully applied to semantic segmentation problems. However, there are many problems that are inherently not pixel-wise classification problems but are nevertheless frequently formulated as semantic segmentation. This ill-posed formulation consequently necessitates hand-crafted scenario-specific and computationally expensive post-processing methods to convert the per pixel probability maps to final desired outputs. Generative adversarial networks (GANs) can be used to make the semantic segmentation network output to be more realistic or better structure-preserving, decreasing the dependency on potentially complex post-processing. In this work, we propose EL-GAN: a GAN framework to mitigate the discussed problem using an embedding loss. With EL-GAN, we discriminate based on learned embeddings of both the labels and the prediction at the same time. This results in more stable training due to having better discriminative information, benefiting from seeing both `fake' and `real' predictions at the same time. This substantially stabilizes the adversarial training process. We use the TuSimple lane marking challenge to demonstrate that with our proposed framework it is viable to overcome the inherent anomalies of posing it as a semantic segmentation problem. Not only is the output considerably more similar to the labels when compared to conventional methods, the subsequent post-processing is also simpler and crosses the competitive 96% accuracy threshold.Comment: 14 pages, 7 figure

    Explainable Semantic Medical Image Segmentation with Style

    Full text link
    Semantic medical image segmentation using deep learning has recently achieved high accuracy, making it appealing to clinical problems such as radiation therapy. However, the lack of high-quality semantically labelled data remains a challenge leading to model brittleness to small shifts to input data. Most works require extra data for semi-supervised learning and lack the interpretability of the boundaries of the training data distribution during training, which is essential for model deployment in clinical practice. We propose a fully supervised generative framework that can achieve generalisable segmentation with only limited labelled data by simultaneously constructing an explorable manifold during training. The proposed approach creates medical image style paired with a segmentation task driven discriminator incorporating end-to-end adversarial training. The discriminator is generalised to small domain shifts as much as permissible by the training data, and the generator automatically diversifies the training samples using a manifold of input features learnt during segmentation. All the while, the discriminator guides the manifold learning by supervising the semantic content and fine-grained features separately during the image diversification. After training, visualisation of the learnt manifold from the generator is available to interpret the model limits. Experiments on a fully semantic, publicly available pelvis dataset demonstrated that our method is more generalisable to shifts than other state-of-the-art methods while being more explainable using an explorable manifold

    I Bet You Are Wrong: Gambling Adversarial Networks for Structured Semantic Segmentation

    Full text link
    Adversarial training has been recently employed for realizing structured semantic segmentation, in which the aim is to preserve higher-level scene structural consistencies in dense predictions. However, as we show, value-based discrimination between the predictions from the segmentation network and ground-truth annotations can hinder the training process from learning to improve structural qualities as well as disabling the network from properly expressing uncertainties. In this paper, we rethink adversarial training for semantic segmentation and propose to formulate the fake/real discrimination framework with a correct/incorrect training objective. More specifically, we replace the discriminator with a "gambler" network that learns to spot and distribute its budget in areas where the predictions are clearly wrong, while the segmenter network tries to leave no clear clues for the gambler where to bet. Empirical evaluation on two road-scene semantic segmentation tasks shows that not only does the proposed method re-enable expressing uncertainties, it also improves pixel-wise and structure-based metrics.Comment: 13 pages, 8 figure

    Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN

    Full text link
    Generative Adversarial Networks (GAN) have many potential medical imaging applications, including data augmentation, domain adaptation, and model explanation. Due to the limited embedded memory of Graphical Processing Units (GPUs), most current 3D GAN models are trained on low-resolution medical images. In this work, we propose a novel end-to-end GAN architecture that can generate high-resolution 3D images. We achieve this goal by separating training and inference. During training, we adopt a hierarchical structure that simultaneously generates a low-resolution version of the image and a randomly selected sub-volume of the high-resolution image. The hierarchical design has two advantages: First, the memory demand for training on high-resolution images is amortized among subvolumes. Furthermore, anchoring the high-resolution subvolumes to a single low-resolution image ensures anatomical consistency between subvolumes. During inference, our model can directly generate full high-resolution images. We also incorporate an encoder with a similar hierarchical structure into the model to extract features from the images. Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms state of the art in image generation and clinical-relevant feature extraction.Comment: 12 pages, 9 figures. Under revie
    corecore