62 research outputs found
SoftSeg: Advantages of soft versus binary training for image segmentation
Most image segmentation algorithms are trained on binary masks formulated as
a classification task per pixel. However, in applications such as medical
imaging, this "black-and-white" approach is too constraining because the
contrast between two tissues is often ill-defined, i.e., the voxels located on
objects' edges contain a mixture of tissues. Consequently, assigning a single
"hard" label can result in a detrimental approximation. Instead, a soft
prediction containing non-binary values would overcome that limitation. We
introduce SoftSeg, a deep learning training approach that takes advantage of
soft ground truth labels, and is not bound to binary predictions. SoftSeg aims
at solving a regression instead of a classification problem. This is achieved
by using (i) no binarization after preprocessing and data augmentation, (ii) a
normalized ReLU final activation layer (instead of sigmoid), and (iii) a
regression loss function (instead of the traditional Dice loss). We assess the
impact of these three features on three open-source MRI segmentation datasets
from the spinal cord gray matter, the multiple sclerosis brain lesion, and the
multimodal brain tumor segmentation challenges. Across multiple
cross-validation iterations, SoftSeg outperformed the conventional approach,
leading to an increase in Dice score of 2.0% on the gray matter dataset
(p=0.001), 3.3% for the MS lesions, and 6.5% for the brain tumors. SoftSeg
produces consistent soft predictions at tissues' interfaces and shows an
increased sensitivity for small objects. The richness of soft labels could
capture inter-expert variability and the partial volume effect, and complement
model uncertainty estimation. The developed training pipeline can easily be
incorporated into most existing deep learning architectures and is already
implemented in the freely available deep learning toolbox ivadomed
(https://ivadomed.org).
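The three ingredients listed in the abstract can be sketched in a few lines. The normalized-ReLU head follows the description above, while the mean-squared-error objective is only an illustrative stand-in for a regression loss (the abstract does not name one; the reference implementation lives in ivadomed):

```python
import numpy as np

def normalized_relu(x, eps=1e-8):
    """Normalized ReLU final activation (feature ii): clip negatives,
    then rescale each sample so its maximum is 1, yielding soft
    predictions in [0, 1] without a sigmoid."""
    r = np.maximum(x, 0.0)
    peak = r.max(axis=tuple(range(1, r.ndim)), keepdims=True)
    return r / (peak + eps)

def soft_regression_loss(pred, soft_target):
    """Illustrative regression objective (plain mean squared error, an
    assumption) computed against non-binarized soft ground-truth labels
    (features i and iii)."""
    return float(np.mean((pred - soft_target) ** 2))
```

Because the target is never binarized, the loss directly rewards matching partial-volume values at tissue interfaces instead of forcing a hard 0/1 decision per voxel.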
Self-supervised Semantic Segmentation: Consistency over Transformation
Accurate medical image segmentation is of utmost importance for enabling
automated clinical decision procedures. However, prevailing supervised deep
learning approaches for medical image segmentation encounter significant
challenges due to their heavy dependence on extensive labeled training data. To
tackle this issue, we propose a novel self-supervised algorithm, S-Net, which
integrates a robust framework based on the proposed
Inception Large Kernel Attention (I-LKA) modules. This architectural
enhancement makes it possible to comprehensively capture contextual information
while preserving local intricacies, thereby enabling precise semantic
segmentation. Furthermore, considering that lesions in medical images often
exhibit deformations, we leverage deformable convolution as an integral
component to effectively capture and delineate lesion deformations for superior
object boundary definition. Additionally, our self-supervised strategy
emphasizes the acquisition of invariance to affine transformations, which are
commonly encountered in medical scenarios, significantly enhancing the model's
robustness to such geometric distortions. To enforce spatial consistency
and promote the grouping of spatially connected image pixels with similar
feature representations, we introduce a spatial consistency loss term. This
aids the network in effectively capturing the relationships among neighboring
pixels and enhancing the overall segmentation quality. The S-Net approach
iteratively learns pixel-level feature representations for image content
clustering in an end-to-end manner. Our experimental results on skin lesion and
lung organ segmentation tasks show the superior performance of our method
compared to the SOTA approaches. Code: https://github.com/mindflow-institue/SSCT
Comment: Accepted in ICCV 2023 workshop CVAM
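A spatial-consistency term of the kind described in the abstract could look roughly like the following. The exact formulation is not given, so the feature-similarity weighting and all names here are assumptions, not the paper's code:

```python
import numpy as np

def spatial_consistency_loss(probs, feats, sigma=1.0):
    """Illustrative spatial-consistency term (formulation assumed):
    neighboring pixels with similar feature vectors are pushed toward
    similar soft cluster assignments.
    probs: (H, W, C) soft assignments; feats: (H, W, D) pixel features."""
    def pair_term(p_a, p_b, f_a, f_b):
        # Feature-similarity weight in (0, 1]: high when features match.
        w = np.exp(-np.sum((f_a - f_b) ** 2, axis=-1) / (2 * sigma ** 2))
        # Weighted squared difference of the soft assignments.
        return np.mean(w * np.sum((p_a - p_b) ** 2, axis=-1))
    right = pair_term(probs[:, :-1], probs[:, 1:], feats[:, :-1], feats[:, 1:])
    down = pair_term(probs[:-1, :], probs[1:, :], feats[:-1, :], feats[1:, :])
    return float(right + down)
```

The term is zero when spatially adjacent pixels already agree, and grows when similar-looking neighbors are assigned to different clusters, which is the grouping behavior the abstract describes.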
3D Matting: A Soft Segmentation Method Applied in Computed Tomography
Three-dimensional (3D) images, such as CT, MRI, and PET, are common in
medical imaging applications and important in clinical diagnosis. Semantic
ambiguity is a typical feature of many medical image labels. It can be caused
by many factors, such as imaging properties, pathological anatomy, and the weak
representation of binary masks, all of which bring challenges to accurate 3D
segmentation. In 2D medical images, using soft masks generated by image
matting, instead of binary masks, to characterize lesions can provide rich
semantic information, describe the structural characteristics of lesions more
comprehensively, and thus benefit subsequent diagnoses and analyses. In
this work, we introduce image matting into the 3D scenes to describe the
lesions in 3D medical images. The study of image matting in the 3D modality is
limited, and there is no high-quality annotated dataset related to 3D matting,
which slows the development of data-driven deep-learning-based methods. To
address this issue, we constructed the first 3D medical matting dataset and
verified its validity through quality control and downstream experiments in
lung nodule classification. We then adapt four selected state-of-the-art 2D
image matting algorithms to 3D scenes and further customize them for CT
images. We also propose the
first end-to-end deep 3D matting network and implement a solid 3D medical image
matting benchmark, which will be released to encourage further research.
Comment: 12 pages, 7 figures
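The sense in which a soft matte conveys more than a binary mask rests on the standard matting composition equation, applied here voxel-wise; this is a minimal 3D sketch with illustrative names, not the paper's code:

```python
import numpy as np

def composite_3d(alpha, foreground, background):
    """Voxel-wise matting composition I = alpha * F + (1 - alpha) * B,
    extended from 2D matting to a 3D volume. The soft matte alpha in
    [0, 1] encodes boundary ambiguity a binary mask cannot express."""
    return alpha * foreground + (1.0 - alpha) * background
```

A binary mask forces alpha to 0 or 1 at every voxel; a 3D matte lets boundary voxels take intermediate values, matching the partial mixing of lesion and background tissue.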
Unpaired multi-modal segmentation via knowledge distillation
Multi-modal learning is typically performed with network architectures containing modality-specific layers and shared layers, utilizing co-registered images of different modalities. We propose a novel learning scheme for unpaired cross-modality image segmentation, with a highly compact architecture achieving superior segmentation accuracy. In our method, we heavily reuse network parameters by sharing all convolutional kernels across CT and MRI, and only employ modality-specific internal normalization layers which compute respective statistics. To effectively train such a highly compact model, we introduce a novel loss term inspired by knowledge distillation, explicitly constraining the KL-divergence between the derived prediction distributions of the two modalities. We have extensively validated our approach on two multi-class segmentation problems: i) cardiac structure segmentation, and ii) abdominal organ segmentation. Different network settings, i.e., a 2D dilated network and a 3D U-Net, are utilized to investigate our method's general efficacy. Experimental results on both tasks demonstrate that our novel multi-modal learning scheme consistently outperforms single-modal training and previous multi-modal approaches.
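The distillation-inspired loss described above, constraining the KL-divergence between per-modality prediction distributions, might be sketched as follows. How the distributions are derived from the network outputs is an assumption here (the abstract does not specify it); this sketch averages per-pixel softmax outputs into one class distribution per modality:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for discrete distributions, smoothed for stability."""
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)))

def cross_modality_kd_loss(logits_ct, logits_mr):
    """Illustrative cross-modality distillation term (details assumed):
    softmax each modality's logits, average over pixels to get one class
    distribution per modality, and penalize their symmetric KL."""
    def mean_class_dist(logits):
        e = np.exp(logits - logits.max(axis=-1, keepdims=True))
        probs = e / e.sum(axis=-1, keepdims=True)  # per-pixel softmax
        d = probs.reshape(-1, probs.shape[-1]).mean(axis=0)
        return d / d.sum()
    p, q = mean_class_dist(logits_ct), mean_class_dist(logits_mr)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

Because the shared kernels see both CT and MRI batches, a term like this pulls the two modalities' prediction statistics together even though the images themselves are unpaired.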