SegViz: A federated-learning based framework for multi-organ segmentation on heterogeneous data sets with partial annotations
Segmentation is one of the primary tasks in deep learning for medical
imaging, owing to its many downstream clinical applications. However,
generating manual annotations for medical images is time-consuming, requires
specialized expertise, and is expensive, especially for 3D images. One potential
solution is to aggregate knowledge from partially annotated datasets from
multiple groups to collaboratively train global models using Federated
Learning. To this end, we propose SegViz, a federated learning-based framework
to train a segmentation model from distributed non-i.i.d datasets with partial
annotations. The performance of SegViz was compared against training individual
models separately on each dataset as well as centrally aggregating all the
datasets in one place and training a single model. The SegViz framework using
FedBN as the aggregation strategy demonstrated excellent performance on the
external BTCV set with dice scores of 0.93, 0.83, 0.55, and 0.75 for
segmentation of liver, spleen, pancreas, and kidneys, respectively,
significantly better (except for the spleen) than the dice scores of 0.87,
0.83, 0.42, and 0.48 for the baseline models. In contrast, the central
aggregation model performed significantly worse on the test dataset
with dice scores of 0.65, 0, 0.55, and 0.68. Our results demonstrate the
potential of the SegViz framework to train multi-task models from distributed
datasets with partial labels. All our implementations are open-source and
available at https://anonymous.4open.science/r/SegViz-B74
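As an illustration of the FedBN aggregation strategy named above, the sketch below averages all client parameters except batch-norm entries, which stay local to each client. This is a generic FedBN recipe in PyTorch, not the SegViz code; the name-matching heuristic and function names are assumptions.

```python
# Minimal FedBN-style aggregation sketch (illustrative, not the SegViz code).
import torch

def fedbn_aggregate(client_states):
    """Average a list of PyTorch state dicts, skipping batch-norm entries.

    The substring check is a simple heuristic for spotting batch-norm
    parameters and running statistics; real code would match module types.
    """
    global_state = {}
    for name in client_states[0]:
        if any(key in name for key in ("bn", "running_mean", "running_var",
                                       "num_batches_tracked")):
            continue  # batch-norm layers stay client-local under FedBN
        global_state[name] = torch.stack(
            [state[name].float() for state in client_states]
        ).mean(dim=0)
    return global_state

# Each client reloads the averaged weights with strict=False so that its
# local batch-norm parameters remain untouched:
#   model.load_state_dict(fedbn_aggregate(states), strict=False)
```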
Incremental Learning for Multi-organ Segmentation with Partially Labeled Datasets
Many datasets for organ segmentation exist that are partially annotated and
sequentially constructed. A typical dataset is
constructed at a certain time by curating medical images and annotating the
organs of interest. In other words, new datasets with annotations of new organ
categories are built over time. To unleash the potential behind these partially
labeled, sequentially-constructed datasets, we propose to learn a multi-organ
segmentation model through incremental learning (IL). In each IL stage, we
lose access to the previous annotations, whose knowledge is assumed to be
captured by the current model, and gain access to a new dataset with
annotations of new organ categories, from which we update the organ
segmentation model to include the new organs. We make a first attempt to
conjecture that the shift in data distribution across stages is the key cause
of the 'catastrophic forgetting' commonly observed in IL methods, and verify
that IL adapts naturally to medical imaging scenarios. Extensive experiments
on five open-source datasets demonstrate the effectiveness of our method and
support the conjecture above.
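The abstract leaves the update mechanism open; a common way to realize such an IL stage is to distill old-organ knowledge from the frozen previous-stage model while supervising the new organs with the current dataset's labels. The PyTorch sketch below illustrates that generic recipe under this assumption; it is not necessarily the paper's exact loss.

```python
# Generic incremental-learning stage loss (an assumption, not the paper's
# exact method): distillation on old organ channels + supervision on new ones.
import torch.nn.functional as F

def il_stage_loss(student_logits, teacher_logits, new_gt, n_old, temp=2.0):
    # Distill the previous-stage (teacher) model's old-organ predictions.
    old_s = student_logits[:, :n_old] / temp
    old_t = teacher_logits[:, :n_old] / temp
    distill = F.kl_div(F.log_softmax(old_s, dim=1),
                       F.softmax(old_t, dim=1), reduction="batchmean")
    # Supervise all channels with the new dataset's ground truth, which
    # labels only the newly added organ categories (plus background).
    seg = F.cross_entropy(student_logits, new_gt)
    return seg + distill
```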
COSST: Multi-organ Segmentation with Partially Labeled Datasets Using Comprehensive Supervisions and Self-training
Deep learning models have demonstrated remarkable success in multi-organ
segmentation but typically require large-scale datasets with all organs of
interest annotated. However, medical image datasets are often low in sample
size and only partially labeled, i.e., only a subset of organs are annotated.
Therefore, it is crucial to investigate how to learn a unified model on the
available partially labeled datasets to leverage their synergistic potential.
In this paper, we systematically investigate the partial-label segmentation
problem with theoretical and empirical analyses of prior techniques. We
revisit the problem from the perspective of partial-label supervision signals
and identify two signals derived from ground truth and one from pseudo labels. We
propose a novel two-stage framework termed COSST, which effectively and
efficiently integrates comprehensive supervision signals with self-training.
Concretely, we first train an initial unified model using two ground
truth-based signals and then iteratively incorporate the pseudo-label signal
into the initial model using self-training. To mitigate performance degradation
caused by unreliable pseudo labels, we assess the reliability of pseudo labels
via outlier detection in latent space and exclude the most unreliable pseudo
labels from each self-training iteration. Extensive experiments are conducted
on one public and three private partial-label segmentation tasks over 12 CT
datasets. Experimental results show that our proposed COSST achieves
significant improvement over the baseline method, i.e., individual networks
trained on each partially labeled dataset. Compared to the state-of-the-art
partial-label segmentation methods, COSST demonstrates consistently superior
performance across various segmentation tasks and training data sizes.
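As described, COSST scores pseudo labels by outlier detection in a latent space and drops the least reliable ones each round. The numpy sketch below illustrates one plausible variant of that check, distance to the class centroid; the threshold and function names are assumptions, not the released code.

```python
# Plausible pseudo-label reliability filter in the spirit of COSST's
# latent-space outlier detection (names and threshold are assumptions).
import numpy as np

def filter_pseudo_labels(features, keep_fraction=0.9):
    """features: (n_cases, d) latent embeddings of pseudo-labeled cases."""
    centroid = features.mean(axis=0)
    dist = np.linalg.norm(features - centroid, axis=1)
    cutoff = np.quantile(dist, keep_fraction)
    return dist <= cutoff  # boolean mask: cases kept for the next round

# Schematic self-training loop:
# 1. train an initial model on the two ground-truth-based signals
# 2. predict pseudo labels for unannotated organs
# 3. keep = filter_pseudo_labels(embed(pseudo_cases))
# 4. retrain on ground truth + kept pseudo labels; repeat steps 2-4
```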
Towards Robust Partially Supervised Multi-Structure Medical Image Segmentation on Small-Scale Data
The data-driven nature of deep learning (DL) models for semantic segmentation
requires a large number of pixel-level annotations. However, large-scale and
fully labeled medical datasets are often unavailable for practical tasks.
Recently, partially supervised methods have been proposed to utilize images
with incomplete labels in the medical domain. To bridge the methodological gaps
in partially supervised learning (PSL) under data scarcity, we propose Vicinal
Labels Under Uncertainty (VLUU), a simple yet efficient framework that
exploits the structural similarity of human anatomy for partially supervised
medical image segmentation.
Motivated by multi-task learning and vicinal risk minimization, VLUU transforms
the partially supervised problem into a fully supervised problem by generating
vicinal labels. We systematically evaluate VLUU under the challenges of
small-scale data, dataset shift, and class imbalance on two commonly used
segmentation datasets for the tasks of chest organ segmentation and optic
disc-and-cup segmentation. The experimental results show that VLUU can
consistently outperform previous partially supervised models in these settings.
Our research suggests a new research direction in label-efficient deep learning
with partial supervision.
Comment: Accepted by Applied Soft Computing.
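The abstract does not detail how vicinal labels are formed; as a rough illustration only, vicinal risk minimization is often realized mixup-style, and mixing two samples whose partial label sets are complementary yields a soft label covering all structures. The numpy sketch below shows that idea under those assumptions; it is not the VLUU algorithm itself.

```python
# Mixup-style vicinal sample construction (an illustration of vicinal risk
# minimization, NOT the VLUU algorithm): mix two images whose partial
# one-hot labels cover disjoint, complementary structure subsets.
import numpy as np

def vicinal_sample(img_a, onehot_a, img_b, onehot_b, alpha=0.4):
    lam = np.random.beta(alpha, alpha)
    img = lam * img_a + (1.0 - lam) * img_b
    # Each one-hot map is zero outside its annotated structures, so the
    # mixed label softly covers the union of both annotated subsets.
    label = lam * onehot_a + (1.0 - lam) * onehot_b
    return img, label
```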
ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data
Developing a generalized segmentation model capable of simultaneously
delineating multiple organs and diseases is highly desirable. Federated
learning (FL) is a key technology enabling the collaborative development of a
model without exchanging training data. However, the limited access to fully
annotated training data poses a major challenge to training generalizable
models. We propose "ConDistFL", a framework to solve this problem by combining
FL with knowledge distillation. Local models can extract knowledge of
unlabeled organs and tumors from the global model on partially annotated data
via an adequately designed conditional probability representation. We validate
our framework on four distinct partially annotated abdominal CT datasets from
the MSD and KiTS19 challenges. The experimental results show that the proposed
framework significantly outperforms FedAvg and FedOpt baselines. Moreover, the
performance on an external test dataset demonstrates superior generalizability
compared to models trained on each dataset separately. Our ablation study
suggests that ConDistFL can perform well without frequent aggregation, reducing
the communication cost of FL. Our implementation will be available at
https://github.com/NVIDIA/NVFlare/tree/dev/research/condist-fl
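A sketch of the conditional distillation idea as the abstract frames it: for organ channels a client has no labels for, the local model matches the global model's predictive distribution restricted (and renormalized) to those channels. The exact conditional-probability construction in ConDistFL may differ; the PyTorch code below is an assumption-laden illustration.

```python
# Illustrative conditional distillation loss (the real ConDistFL
# construction may differ): distill only over locally unlabeled channels.
import torch.nn.functional as F

def condist_loss(local_logits, global_logits, unlabeled_ch, temp=1.0):
    # Restrict both models to the client's unlabeled organ channels; the
    # softmax over this subset acts as a conditional distribution.
    l = local_logits[:, unlabeled_ch] / temp
    g = global_logits[:, unlabeled_ch] / temp
    return F.kl_div(F.log_softmax(l, dim=1), F.softmax(g, dim=1),
                    reduction="batchmean")
```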
AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation
Despite the considerable progress in automatic abdominal multi-organ
segmentation from CT/MRI scans in recent years, a comprehensive evaluation of
the models' capabilities is hampered by the lack of a large-scale benchmark
from diverse clinical scenarios. Constrained by the high cost of collecting and
labeling 3D medical data, most of the deep learning models to date are driven
by datasets with a limited number of organs of interest or samples, which still
limits the power of modern deep models and makes it difficult to provide a
fully comprehensive and fair estimate of various methods. To mitigate the
limitations, we present AMOS, a large-scale, diverse, clinical dataset for
abdominal organ segmentation. AMOS provides 500 CT and 100 MRI scans collected
from multiple centers, vendors, modalities, phases, and disease states, each
with voxel-level annotations of 15 abdominal organs, providing challenging
examples and a test-bed for studying robust segmentation algorithms under
diverse targets and scenarios. We further benchmark several
state-of-the-art medical segmentation models to evaluate the status of the
existing methods on this new challenging dataset. We have made our datasets,
benchmark servers, and baselines publicly available, and hope to inspire future
research. Information can be found at https://amos22.grand-challenge.org
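For readers wanting to use AMOS as a test-bed, per-organ Dice is the standard metric; a minimal numpy sketch follows. The 1..15 label layout is an assumption here; consult the challenge site for the actual label map.

```python
# Minimal per-organ Dice evaluation for a 15-organ benchmark such as AMOS
# (label ids 1..15 are assumed; see the challenge site for the real map).
import numpy as np

def dice_per_organ(pred, gt, num_organs=15):
    """pred, gt: integer label volumes of identical shape."""
    scores = {}
    for organ in range(1, num_organs + 1):
        p, g = pred == organ, gt == organ
        denom = p.sum() + g.sum()
        scores[organ] = 2.0 * np.logical_and(p, g).sum() / denom if denom else np.nan
    return scores
```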
Focused Decoding Enables 3D Anatomical Detection by Transformers
Detection Transformers represent end-to-end object detection approaches based
on a Transformer encoder-decoder architecture, exploiting the attention
mechanism for global relation modeling. Although Detection Transformers deliver
results on par with or even superior to their highly optimized CNN-based
counterparts operating on 2D natural images, their success is closely coupled
to access to a vast amount of training data. This, however, restricts the
feasibility of employing Detection Transformers in the medical domain, as
access to annotated data is typically limited. To tackle this issue and
facilitate the advent of medical Detection Transformers, we propose a novel
Detection Transformer for 3D anatomical structure detection, dubbed Focused
Decoder. Focused Decoder leverages information from an anatomical region atlas
to simultaneously deploy query anchors and restrict the cross-attention's field
of view to regions of interest, which allows for a precise focus on relevant
anatomical structures. We evaluate our proposed approach on two publicly
available CT datasets and demonstrate that Focused Decoder not only provides
strong detection results and thus alleviates the need for a vast amount of
annotated data but also exhibits exceptional and highly intuitive
explainability of results via attention weights. Our code is available at
https://github.com/bwittmann/transoar.
Comment: Accepted for publication at the Journal of Machine Learning for
Biomedical Imaging (MELBA), https://melba-journal.org/2023:00
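Mechanically, restricting a decoder's cross-attention field of view can be expressed as an attention mask that blocks memory tokens outside each query anchor's atlas region. The PyTorch sketch below shows that masking pattern; how Focused Decoder actually derives and applies its regions of interest may differ.

```python
# Masked cross-attention sketch: queries may only attend to memory tokens
# inside their atlas-derived region of interest (layout is illustrative).
import torch
import torch.nn as nn

def focused_cross_attention(attn, queries, memory, roi_mask):
    """attn: nn.MultiheadAttention(d_model, n_heads, batch_first=True).
    queries: (batch, n_queries, d_model) query anchors.
    memory:  (batch, n_tokens, d_model) encoder features.
    roi_mask: (n_queries, n_tokens) bool, True where a token lies OUTSIDE
    the query's region of interest, blocking attention to it."""
    out, _ = attn(queries, memory, memory, attn_mask=roi_mask)
    return out
```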
Towards holistic scene understanding: Semantic segmentation and beyond
This dissertation addresses visual scene understanding and enhances
segmentation performance and generalization, training efficiency of networks,
and holistic understanding. First, we investigate semantic segmentation in the
context of street scenes and train semantic segmentation networks on
combinations of various datasets. In Chapter 2 we design a framework of
hierarchical classifiers over a single convolutional backbone, and train it
end-to-end on a combination of pixel-labeled datasets, improving
generalizability and expanding the set of recognizable semantic concepts.
Chapter 3
focuses on enriching semantic segmentation with weak supervision and proposes a
weakly-supervised algorithm for training with bounding box-level and
image-level supervision instead of only with per-pixel supervision. The memory
and computational load challenges that arise from simultaneous training on
multiple datasets are addressed in Chapter 4. We propose two methodologies for
selecting informative and diverse samples from datasets with weak supervision
to reduce our networks' ecological footprint without sacrificing performance.
Motivated by memory and computation efficiency requirements, in Chapter 5, we
rethink simultaneous training on heterogeneous datasets and propose a universal
semantic segmentation framework. This framework achieves consistent increases
in performance metrics and semantic knowledgeability by exploiting various
scene understanding datasets. Chapter 6 introduces the novel task of part-aware
panoptic segmentation, which extends our reasoning towards holistic scene
understanding. This task combines scene and parts-level semantics with
instance-level object detection. In conclusion, our contributions span
convolutional network architectures, weakly-supervised learning, and part and
panoptic segmentation, paving the way towards holistic, rich, and sustainable
visual scene understanding.
Comment: PhD Thesis, Eindhoven University of Technology, October 202
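As a rough illustration of Chapter 2's design, hierarchical classifiers over a single convolutional backbone can be sketched as one shared feature extractor feeding one 1x1-conv head per hierarchy level, where each sample is supervised only at the levels its source dataset labels. The PyTorch sketch below is an assumption-based illustration, not the dissertation's code.

```python
# Shared-backbone, multi-head segmentation sketch (illustrative only).
import torch.nn as nn

class HierarchicalSegNet(nn.Module):
    def __init__(self, backbone, feat_ch, classes_per_level):
        super().__init__()
        self.backbone = backbone  # any fully convolutional feature extractor
        # one 1x1-conv classifier per level of the label hierarchy
        self.heads = nn.ModuleList(
            nn.Conv2d(feat_ch, c, kernel_size=1) for c in classes_per_level
        )

    def forward(self, x):
        feats = self.backbone(x)
        # each head predicts at its own granularity; during training a
        # sample contributes loss only at the levels its dataset labels
        return [head(feats) for head in self.heads]
```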