ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation
Most recent scribble-supervised segmentation methods commonly adopt a CNN framework with an encoder-decoder architecture. Despite its multiple benefits, this framework generally captures only short-range feature dependencies, since convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited cues provided by scribble annotations. To address this issue, this paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer. The proposed ScribFormer model has a triple-branch structure, i.e., a hybrid of a CNN branch, a Transformer branch, and an attention-guided class activation map (ACAM) branch. Specifically, the CNN branch collaborates with the Transformer branch to fuse the local features learned by the CNN with the global representations obtained from the Transformer, which effectively overcomes the limitations of existing scribble-supervised segmentation methods. Furthermore, the ACAM branch helps unify the shallow and deep convolutional features to further improve the model's performance. Extensive experiments on two public datasets and one private dataset show that ScribFormer outperforms state-of-the-art scribble-supervised segmentation methods, and even achieves better results than fully-supervised segmentation methods. The code is released at https://github.com/HUANGLIZI/ScribFormer
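For readers who want a concrete picture of the dual-branch fusion described above, here is a minimal sketch in PyTorch. The module names, patch size, and the concatenate-then-1x1-conv fusion rule are illustrative assumptions, not the authors' released implementation (see their repository for that).

```python
# Minimal sketch of CNN-Transformer feature fusion in the spirit of a
# dual-branch design: local features from convolutions, global context
# from self-attention, fused for dense prediction. Shapes and the fusion
# rule are illustrative assumptions.
import torch
import torch.nn as nn


class DualBranchFusion(nn.Module):
    def __init__(self, channels: int = 64, num_heads: int = 4, num_classes: int = 4):
        super().__init__()
        # CNN branch: local features via stacked 3x3 convolutions.
        self.cnn_branch = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Transformer branch: global context via self-attention over 8x8 patches
        # (input height/width are assumed divisible by 8).
        self.embed = nn.Conv2d(3, channels, kernel_size=8, stride=8)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Fuse local and global features, then predict per-pixel classes.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.head = nn.Conv2d(channels, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        local_feats = self.cnn_branch(x)              # (B, C, H, W)
        tokens = self.embed(x)                        # (B, C, H/8, W/8)
        _, c, th, tw = tokens.shape
        tokens = tokens.flatten(2).transpose(1, 2)    # (B, N, C)
        global_feats = self.transformer(tokens)       # (B, N, C)
        global_feats = global_feats.transpose(1, 2).reshape(b, c, th, tw)
        global_feats = nn.functional.interpolate(
            global_feats, size=(h, w), mode="bilinear", align_corners=False
        )
        fused = self.fuse(torch.cat([local_feats, global_feats], dim=1))
        return self.head(fused)                       # (B, classes, H, W)
```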
Acquiring Weak Annotations for Tumor Localization in Temporal and Volumetric Data
Creating large-scale and well-annotated datasets to train AI algorithms is
crucial for automated tumor detection and localization. However, with limited
resources, it is challenging to determine the best type of annotations when
annotating massive amounts of unlabeled data. To address this issue, we focus
on polyps in colonoscopy videos and pancreatic tumors in abdominal CT scans;
both applications require significant effort and time for pixel-wise annotation
due to the high dimensional nature of the data, involving either temporary or
spatial dimensions. In this paper, we develop a new annotation strategy, termed
Drag&Drop, which simplifies the annotation process to drag and drop. This
annotation strategy is more efficient, particularly for temporal and volumetric
imaging, than other types of weak annotations, such as per-pixel, bounding
boxes, scribbles, ellipses, and points. Furthermore, to exploit our Drag&Drop
annotations, we develop a novel weakly supervised learning method based on the
watershed algorithm. Experimental results show that our method achieves better
detection and localization performance than alternative weak annotations and,
more importantly, achieves similar performance to that trained on detailed
per-pixel annotations. Interestingly, we find that, with limited resources,
allocating weak annotations from a diverse patient population can foster models
more robust to unseen images than allocating per-pixel annotations for a small
set of images. In summary, this research proposes an efficient annotation
strategy for tumor detection and localization that is less accurate than
per-pixel annotations but useful for creating large-scale datasets for
screening tumors in various medical modalities.
Comment: Published in Machine Intelligence Research
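As a rough illustration of how a drag-and-drop annotation (a click at the tumor centre dragged to its approximate boundary) could be expanded into a pseudo-mask with the watershed algorithm the paper builds on, consider the sketch below; the marker construction, thresholds, and library choice (scikit-image) are assumptions for illustration, not the paper's exact recipe.

```python
# Sketch: expand a Drag&Drop-style annotation (a centre point dragged to
# a boundary point, giving a rough radius) into a pseudo-mask with the
# watershed transform. Marker placement and thresholds are assumptions.
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed


def dragdrop_to_mask(image: np.ndarray, center: tuple, edge: tuple) -> np.ndarray:
    """image: 2D grayscale slice/frame; center, edge: (row, col) clicks."""
    radius = np.hypot(center[0] - edge[0], center[1] - edge[1])
    rr, cc = np.ogrid[: image.shape[0], : image.shape[1]]
    dist = np.hypot(rr - center[0], cc - center[1])

    # Markers: 1 = tumor seed near the click, 2 = certain background well
    # outside the dragged radius; 0 = undecided, to be flooded.
    markers = np.zeros(image.shape, dtype=np.int32)
    markers[dist < 0.25 * radius] = 1
    markers[dist > 1.5 * radius] = 2

    # Flood the image gradient from the markers; the watershed line
    # settles on strong edges between the seed and the background.
    gradient = sobel(image.astype(float))
    labels = watershed(gradient, markers)
    return labels == 1  # boolean pseudo-mask for the tumor region
```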
Scribble-based Domain Adaptation via Co-segmentation
Although deep convolutional networks have reached state-of-the-art
performance in many medical image segmentation tasks, they have typically
demonstrated poor generalisation capability. To be able to generalise from one
domain (e.g. one imaging modality) to another, domain adaptation has to be
performed. While supervised methods may lead to good performance, they require
additional data to be fully annotated, which may not be an option in practice. In
contrast, unsupervised methods do not need additional annotations but are
usually unstable and hard to train. In this work, we propose a novel
weakly-supervised method. Instead of requiring detailed but time-consuming
annotations, scribbles on the target domain are used to perform domain
adaptation. This paper introduces a new formulation of domain adaptation based
on structured learning and co-segmentation. Our method is easy to train, thanks
to the introduction of a regularised loss. The framework is validated on
Vestibular Schwannoma segmentation (T1 to T2 scans). Our proposed method
outperforms unsupervised approaches and achieves comparable performance to a
fully-supervised approach.
Comment: Accepted at MICCAI 2020
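The abstract attributes the method's ease of training to a regularised loss. As a hedged sketch of what a scribble-driven loss of that general shape can look like, the snippet below combines partial cross-entropy on annotated pixels with a generic total-variation regulariser; the weighting and the specific regulariser are illustrative, and the paper's actual formulation may differ.

```python
# Sketch of a regularised scribble-supervised loss: partial cross-entropy
# on scribbled pixels plus a smoothness (total-variation) term on the
# softmax output. Regulariser choice and weight are assumptions.
import torch
import torch.nn.functional as F


def scribble_loss(logits, scribbles, ignore_index=255, tv_weight=0.1):
    """logits: (B, C, H, W); scribbles: (B, H, W) long tensor with
    ignore_index on unannotated pixels."""
    # Partial cross-entropy: unannotated pixels are simply skipped.
    pce = F.cross_entropy(logits, scribbles, ignore_index=ignore_index)

    # Total-variation term on probabilities encourages spatially coherent
    # predictions in the regions between sparse scribbles.
    probs = logits.softmax(dim=1)
    tv = (
        (probs[..., 1:, :] - probs[..., :-1, :]).abs().mean()
        + (probs[..., :, 1:] - probs[..., :, :-1]).abs().mean()
    )
    return pce + tv_weight * tv
```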
Data efficient deep learning for medical image analysis: A survey
The rapid evolution of deep learning has significantly advanced the field of
medical image analysis. However, despite these achievements, the further
enhancement of deep learning models for medical image analysis faces a
significant challenge due to the scarcity of large, well-annotated datasets. To
address this issue, recent years have witnessed a growing emphasis on the
development of data-efficient deep learning methods. This paper conducts a
thorough review of data-efficient deep learning methods for medical image
analysis. To this end, we categorize these methods based on the level of
supervision they rely on, encompassing categories such as no supervision,
inexact supervision, incomplete supervision, inaccurate supervision, and only
limited supervision. We further divide these categories into finer
subcategories. For example, we categorize inexact supervision into multiple
instance learning and learning with weak annotations. Similarly, we categorize
incomplete supervision into semi-supervised learning, active learning, and
domain-adaptive learning, among others. Furthermore, we systematically summarize
commonly used datasets for data-efficient deep learning in medical image
analysis and investigate future research directions to conclude this survey.
Comment: Under Review
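For reference, the supervision-level taxonomy described above can be laid out as a small data structure; only the subcategories explicitly named in the abstract are included, since the survey lists more.

```python
# The survey's supervision-level taxonomy as named in the abstract;
# subcategories beyond those mentioned ("and so on") are omitted here.
TAXONOMY = {
    "no supervision": [],
    "inexact supervision": [
        "multiple instance learning",
        "learning with weak annotations",
    ],
    "incomplete supervision": [
        "semi-supervised learning",
        "active learning",
        "domain-adaptive learning",
    ],
    "inaccurate supervision": [],
    "only limited supervision": [],
}
```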
Scribble-Supervised LiDAR Semantic Segmentation
Densely annotating LiDAR point clouds remains too expensive and
time-consuming to keep up with the ever-growing volume of data. While the current
literature focuses on fully-supervised performance, efficient
methods that take advantage of realistic weak supervision have yet to be
explored. In this paper, we propose using scribbles to annotate LiDAR point
clouds and release ScribbleKITTI, the first scribble-annotated dataset for
LiDAR semantic segmentation. Furthermore, we present a pipeline to reduce the
performance gap that arises when using such weak annotations. Our pipeline
comprises three stand-alone contributions that can be combined with any
LiDAR semantic segmentation model to achieve up to 95.7% of the
fully-supervised performance while using only 8% labeled points. Our scribble
annotations and code are available at github.com/ouenal/scribblekitti.
Comment: Accepted at CVPR 2022 (Oral)
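Since only about 8% of points carry labels under scribble annotation, any such pipeline starts from a loss restricted to the annotated points. A minimal sketch follows; the tensor shapes and the ignore value are illustrative assumptions, not the paper's code.

```python
# Sketch of training a per-point classifier on scribble-annotated LiDAR:
# only the scribbled points contribute to the loss. Shapes and the
# ignore value are illustrative assumptions.
import torch
import torch.nn.functional as F

IGNORE = -1  # label assigned to points not covered by any scribble


def partial_point_loss(point_logits: torch.Tensor, point_labels: torch.Tensor):
    """point_logits: (N, C) per-point class scores; point_labels: (N,)
    long tensor with IGNORE on unannotated points."""
    keep = point_labels != IGNORE
    if keep.sum() == 0:
        return point_logits.sum() * 0.0  # batch with no scribbled points
    return F.cross_entropy(point_logits[keep], point_labels[keep])
```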
YoloCurvSeg: You Only Label One Noisy Skeleton for Vessel-style Curvilinear Structure Segmentation
Weakly-supervised learning (WSL) has been proposed to alleviate the conflict
between data annotation cost and model performance through employing
sparsely-grained (i.e., point-, box-, scribble-wise) supervision and has shown
promising performance, particularly in the image segmentation field. However,
it is still a very challenging problem due to the limited supervision,
especially when only a small number of labeled samples are available.
Additionally, almost all existing WSL segmentation methods are designed for
star-convex structures which are very different from curvilinear structures
such as vessels and nerves. In this paper, we propose a novel sparsely
annotated segmentation framework for curvilinear structures, named YoloCurvSeg,
based on image synthesis. A background generator delivers image backgrounds
that closely match real distributions by inpainting dilated skeletons. The
extracted backgrounds are then combined with randomly emulated curves generated
by a Space Colonization Algorithm-based foreground generator, using a
multilayer patch-wise contrastive learning synthesizer. In this way, a
synthetic dataset with both images and curve segmentation labels is obtained,
at the cost of only one or a few noisy skeleton annotations. Finally, a
segmenter is trained with the generated dataset and possibly an unlabeled
dataset. The proposed YoloCurvSeg is evaluated on four publicly available
datasets (OCTA500, CORN, DRIVE and CHASEDB1) and the results show that
YoloCurvSeg outperforms state-of-the-art WSL segmentation methods by large
margins. With only one noisy skeleton annotation (respectively 0.14%, 0.03%,
1.40%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of
the fully-supervised performance on each dataset. Code and datasets will be
released at https://github.com/llmir/YoloCurvSeg.
Comment: 11 pages, 10 figures, submitted to IEEE Transactions on Medical Imaging (TMI)
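To make the synthesis idea concrete: pairing a generated background with an emulated curve yields a segmentation label for free. The sketch below uses a smoothed random walk and a plain intensity blend as simplified stand-ins for the paper's Space Colonization Algorithm and contrastive synthesizer; all parameters are illustrative assumptions.

```python
# Sketch of the synthesis idea: overlay an emulated curve on a background
# image and keep the curve's footprint as the segmentation label. The
# random-walk curve and intensity blend are simplified stand-ins.
import numpy as np


def synth_pair(background: np.ndarray, n_steps: int = 400, seed: int = 0):
    """background: 2D float array in [0, 1]; returns (image, mask)."""
    rng = np.random.default_rng(seed)
    h, w = background.shape
    mask = np.zeros((h, w), dtype=bool)

    # Emulate a vessel-like curve with a smoothed random walk.
    r, c = h / 2.0, w / 2.0
    angle = rng.uniform(0, 2 * np.pi)
    for _ in range(n_steps):
        angle += rng.normal(0.0, 0.15)      # gentle direction changes
        r = np.clip(r + np.sin(angle), 1, h - 2)
        c = np.clip(c + np.cos(angle), 1, w - 2)
        mask[int(r) - 1 : int(r) + 2, int(c) - 1 : int(c) + 2] = True  # ~3 px wide

    # Darken the curve against the background (vessels often appear dark).
    image = background.copy()
    image[mask] = np.clip(image[mask] - 0.3, 0.0, 1.0)
    return image, mask
```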