A context based deep learning approach for unbalanced medical image segmentation
Automated medical image segmentation is an important step in many medical
procedures. Recently, deep learning networks have been widely used for various
medical image segmentation tasks, with U-Net and generative adversarial
networks (GANs) among the most widely used. Foreground-background class
imbalance is a common occurrence in medical images, and U-Net has difficulty in
handling class imbalance because of its cross entropy (CE) objective function.
Similarly, GANs suffer from class imbalance because the discriminator
looks at the entire image to classify it as real or fake. Since the
discriminator is essentially a deep learning classifier, it is incapable of
correctly identifying minor changes in small structures. To address these
issues, we propose a novel context-based CE loss function for U-Net and a
novel architecture, Seg-GLGAN. The context-based CE loss is a linear combination of
CE obtained over the entire image and its region of interest (ROI). In
Seg-GLGAN, we introduce a novel context discriminator to which the entire image
and its ROI are fed as input, thus enforcing local context. We conduct
extensive experiments using two challenging unbalanced datasets: PROMISE12 and
ACDC. We observe that our methods yield better segmentation metrics than
various baseline methods.
Comment: Accepted in ISBI 202
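The abstract describes the context-based CE loss as a linear combination of the cross entropy over the entire image and over its region of interest (ROI). A minimal numpy sketch of that combination, assuming a binary segmentation task and a hypothetical weighting hyperparameter `lam` (the paper's exact weighting is not given here):

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Mean pixel-wise binary cross entropy. probs, labels: same shape."""
    probs = np.clip(probs, eps, 1 - eps)
    return -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

def context_ce(probs, labels, roi_mask, lam=0.5):
    """Context-based CE: linear combination of CE over the whole image and
    CE restricted to the region of interest (ROI). `lam` is a hypothetical
    balancing hyperparameter, not a value taken from the paper."""
    ce_full = cross_entropy(probs, labels)
    roi = roi_mask.astype(bool)
    ce_roi = cross_entropy(probs[roi], labels[roi])
    return lam * ce_full + (1 - lam) * ce_roi
```

Weighting the ROI term upward counteracts the foreground-background imbalance: errors inside the small foreground structure are no longer drowned out by the easy background pixels.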
EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection
Convolutional neural networks have been successfully applied to semantic
segmentation problems. However, there are many problems that are inherently not
pixel-wise classification problems but are nevertheless frequently formulated
as semantic segmentation. This ill-posed formulation consequently necessitates
hand-crafted scenario-specific and computationally expensive post-processing
methods to convert the per pixel probability maps to final desired outputs.
Generative adversarial networks (GANs) can be used to make the semantic
segmentation network's output more realistic or better
structure-preserving, decreasing the dependency on potentially complex
post-processing. In this work, we propose EL-GAN: a GAN framework to mitigate
the discussed problem using an embedding loss. With EL-GAN, we discriminate
based on learned embeddings of both the labels and the prediction at the same
time. This substantially stabilizes adversarial training, as the
discriminator has richer discriminative information from seeing both `fake'
and `real' predictions simultaneously. We
use the TuSimple lane marking challenge to demonstrate that with our proposed
framework it is viable to overcome the drawbacks inherent in posing lane
detection as a semantic segmentation problem. Not only is the output considerably more similar
to the labels than with conventional methods, but the subsequent
post-processing is also simpler, and the method crosses the competitive 96%
accuracy threshold.
Comment: 14 pages, 7 figures
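EL-GAN's key idea is to discriminate on learned embeddings of the label and the prediction rather than on raw real/fake scores. A toy numpy sketch of such an embedding loss, where `W` stands in for the discriminator's learned embedding network (all names and the L2 comparison are illustrative assumptions, not the paper's exact architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))  # toy stand-in for a learned embedding network

def embed(seg_map):
    """Map an 8x8 segmentation map to a 16-d embedding (hypothetical)."""
    return np.tanh(W @ seg_map.ravel())

def embedding_loss(label, prediction):
    """EL-GAN-style loss: compare embeddings of the ground-truth label and
    the generator's prediction (squared L2 distance here)."""
    return np.sum((embed(label) - embed(prediction)) ** 2)
```

Because both inputs pass through the same embedding, the generator receives a smoother, structure-aware training signal than a single real/fake bit.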
Explainable Semantic Medical Image Segmentation with Style
Semantic medical image segmentation using deep learning has recently achieved
high accuracy, making it appealing for clinical applications such as radiation
therapy. However, the lack of high-quality semantically labelled data remains a
challenge, leading to models that are brittle to small shifts in the input data. Most
works require extra data for semi-supervised learning and lack the
interpretability of the boundaries of the training data distribution during
training, which is essential for model deployment in clinical practice. We
propose a fully supervised generative framework that can achieve generalisable
segmentation with only limited labelled data by simultaneously constructing an
explorable manifold during training. The proposed approach creates medical
image style paired with a segmentation task driven discriminator incorporating
end-to-end adversarial training. The discriminator is generalised to small
domain shifts as much as permissible by the training data, and the generator
automatically diversifies the training samples using a manifold of input
features learnt during segmentation. All the while, the discriminator guides
the manifold learning by supervising the semantic content and fine-grained
features separately during the image diversification. After training,
visualisation of the learnt manifold from the generator is available to
interpret the model limits. Experiments on a fully semantic, publicly available
pelvis dataset demonstrated that our method is more generalisable to shifts
than other state-of-the-art methods while being more explainable using an
explorable manifold.
I Bet You Are Wrong: Gambling Adversarial Networks for Structured Semantic Segmentation
Adversarial training has been recently employed for realizing structured
semantic segmentation, in which the aim is to preserve higher-level scene
structural consistencies in dense predictions. However, as we show, value-based
discrimination between the predictions from the segmentation network and
ground-truth annotations can hinder the training process from improving
structural quality and can prevent the network from properly expressing
uncertainties. In this paper, we rethink adversarial training for
semantic segmentation and propose to formulate the fake/real discrimination
framework with a correct/incorrect training objective. More specifically, we
replace the discriminator with a "gambler" network that learns to spot and
distribute its budget in areas where the predictions are clearly wrong, while
the segmenter network tries to leave no clear clues for the gambler where to
bet. Empirical evaluation on two road-scene semantic segmentation tasks shows
that not only does the proposed method re-enable the expression of uncertainty,
it also improves pixel-wise and structure-based metrics.
Comment: 13 pages, 8 figures
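The gambler's mechanic can be sketched numerically: it spreads a fixed betting budget over pixels it believes are wrong, and its gain is the bet-weighted prediction error, which the segmenter then minimizes. A minimal numpy illustration, assuming a softmax betting map and per-pixel cross-entropy as the error signal (both are illustrative choices, not the paper's exact formulation):

```python
import numpy as np

def betting_map(scores, budget=1.0):
    """Gambler: turn raw per-pixel scores into a bet distribution via
    softmax, so a fixed total budget is spread over suspect pixels."""
    e = np.exp(scores - scores.max())
    return budget * e / e.sum()

def gambler_gain(bets, probs, labels, eps=1e-12):
    """Bet-weighted per-pixel error (binary cross entropy here).
    The gambler maximizes this gain; the segmenter minimizes it."""
    probs = np.clip(probs, eps, 1 - eps)
    err = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    return np.sum(bets * err)
```

Concentrating bets on clearly wrong pixels yields a higher gain than betting uniformly, which is exactly the clue the segmenter learns to remove.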
Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN
Generative Adversarial Networks (GAN) have many potential medical imaging
applications, including data augmentation, domain adaptation, and model
explanation. Due to the limited embedded memory of Graphical Processing Units
(GPUs), most current 3D GAN models are trained on low-resolution medical
images. In this work, we propose a novel end-to-end GAN architecture that can
generate high-resolution 3D images. We achieve this goal by separating training
and inference. During training, we adopt a hierarchical structure that
simultaneously generates a low-resolution version of the image and a randomly
selected sub-volume of the high-resolution image. The hierarchical design has
two advantages: First, the memory demand for training on high-resolution images
is amortized among sub-volumes. Second, anchoring the high-resolution
sub-volumes to a single low-resolution image ensures anatomical consistency
between sub-volumes. During inference, our model can directly generate full
high-resolution images. We also incorporate an encoder with a similar
hierarchical structure into the model to extract features from the images.
Experiments on 3D thorax CT and brain MRI demonstrate that our approach
outperforms the state of the art in image generation and clinically relevant
feature extraction.
Comment: 12 pages, 9 figures. Under review
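The memory-amortization trick described above pairs a low-resolution view of the whole volume with one randomly located high-resolution sub-volume per training step. A minimal numpy sketch of producing such a pair (downsampling by strided slicing and the sizes are illustrative; in the paper the generator emits both outputs directly rather than cropping a real volume):

```python
import numpy as np

rng = np.random.default_rng(0)

def training_pair(volume, factor=4, sub=16):
    """One hierarchical training sample: a low-resolution version of the
    whole volume plus a randomly located high-resolution sub-volume.
    Only the sub-volume is kept at full resolution, so per-step GPU memory
    stays small while full coverage is amortized over many steps."""
    low = volume[::factor, ::factor, ::factor]        # low-res anchor
    d, h, w = volume.shape
    z, y, x = (rng.integers(0, s - sub + 1) for s in (d, h, w))
    patch = volume[z:z + sub, y:y + sub, x:x + sub]   # high-res sub-volume
    return low, patch, (z, y, x)
```

Returning the crop position is what lets the model anchor each high-resolution sub-volume to the shared low-resolution image, giving the anatomical consistency the abstract mentions.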