LifeLonger: A Benchmark for Continual Disease Classification
Deep learning models have shown great effectiveness in recognizing findings
in medical images. However, they cannot handle the ever-changing clinical
environment, which brings newly annotated medical data from different
sources. To exploit these incoming streams of data, models would benefit
greatly from learning sequentially from new samples without forgetting
previously obtained knowledge. In this paper we introduce LifeLonger, a
benchmark for continual disease classification on the MedMNIST collection, by
applying existing state-of-the-art continual learning methods. In particular,
we consider three continual learning scenarios, namely, task and class
incremental learning and the newly defined cross-domain incremental learning.
Task and class incremental learning of diseases address the issue of
classifying new samples without re-training the models from scratch, while
cross-domain incremental learning addresses the issue of dealing with datasets
originating from different institutions while retaining the previously obtained
knowledge. We perform a thorough analysis of the performance and examine how
the well-known challenges of continual learning, such as catastrophic
forgetting, manifest themselves in this setting. The encouraging results
demonstrate that continual learning has major potential to advance disease
classification and to produce a more robust and efficient learning framework
for clinical settings. The code repository, data partitions and baseline
results for the complete benchmark will be made publicly available.
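The class-incremental scenario described above can be illustrated with a minimal sketch: a nearest-class-mean classifier that absorbs new classes from each task without revisiting old data. The synthetic features and class ids below are placeholders, not the LifeLonger benchmark or MedMNIST data.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(class_ids, n=50, dim=8):
    """Generate toy features clustered around a per-class center (stand-in
    for image features; MedMNIST would supply real ones)."""
    X, y = [], []
    for c in class_ids:
        center = np.zeros(dim)
        center[c % dim] = 5.0
        X.append(center + rng.normal(size=(n, dim)))
        y.extend([c] * n)
    return np.vstack(X), np.array(y)

class NearestMean:
    def __init__(self):
        self.means = {}  # class id -> feature mean

    def partial_fit(self, X, y):
        # Add new classes without retraining from scratch on old data.
        for c in np.unique(y):
            self.means[c] = X[y == c].mean(axis=0)

    def predict(self, X):
        ids = sorted(self.means)
        M = np.stack([self.means[c] for c in ids])
        d = ((X[:, None, :] - M[None]) ** 2).sum(-1)
        return np.array(ids)[d.argmin(1)]

model = NearestMean()
model.partial_fit(*make_task([0, 1]))  # task 1 introduces classes 0-1
model.partial_fit(*make_task([2, 3]))  # task 2 introduces classes 2-3

X_old, y_old = make_task([0, 1])       # first-task classes, seen again only at test time
acc_old = (model.predict(X_old) == y_old).mean()
print(f"accuracy on first-task classes after task 2: {acc_old:.2f}")
```

Prototype-based classifiers like this one sidestep catastrophic forgetting by construction; gradient-trained deep networks, as benchmarked in the paper, do not, which is what makes the continual-learning methods under comparison necessary.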
A Survey on Continual Semantic Segmentation: Theory, Challenge, Method and Application
Continual learning, also known as incremental learning or life-long learning,
stands at the forefront of deep learning and AI systems. It breaks through the
obstacle of one-way training on close sets and enables continuous adaptive
learning on open-set conditions. In the recent decade, continual learning has
been explored and applied in multiple fields especially in computer vision
covering classification, detection and segmentation tasks. Continual semantic
segmentation (CSS) is a challenging, intricate and burgeoning task owing to
its dense-prediction nature. In this paper, we present a review
of CSS, committing to building a comprehensive survey on problem formulations,
primary challenges, universal datasets, neoteric theories and multifarious
applications. Concretely, we begin by elucidating the problem definitions and
primary challenges. Based on an in-depth investigation of relevant approaches,
we sort out and categorize current CSS models into two main branches including
\textit{data-replay} and \textit{data-free} sets. In each branch, the
corresponding approaches are clustered by similarity and thoroughly analyzed,
followed by qualitative comparisons and quantitative reproductions on
relevant datasets. Besides, we also introduce four CSS specialities with
diverse application scenarios and development tendencies. Furthermore, we
develop a benchmark for CSS encompassing representative references, evaluation
results and reproductions, which is available
at~\url{https://github.com/YBIO/SurveyCSS}. We hope this survey can serve as a
reference-worthy and stimulating contribution to the advancement of the
life-long learning field, while also providing valuable perspectives for
related fields.
Comment: 20 pages, 12 figures. Undergoing Review
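The \textit{data-replay} branch of the taxonomy can be sketched with its most common ingredient: a bounded buffer of past training examples, refreshed by reservoir sampling, whose contents are mixed into each new training batch. The class and names below are illustrative, not taken from any specific CSS method.

```python
import random

class ReservoirBuffer:
    """Keeps a uniform random sample of everything seen so far,
    within a fixed memory budget."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Reservoir sampling: each of the `seen` items is retained
            # with equal probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k):
        return self.rng.sample(self.items, min(k, len(self.items)))

buf = ReservoirBuffer(capacity=100)
for image_id in range(1000):       # stream of training images across steps
    buf.add(image_id)
    # replay = buf.sample(8)       # would be mixed into the current batch

replayed = buf.sample(8)
print(len(buf.items), buf.seen, len(replayed))
```

The \textit{data-free} branch, by contrast, avoids storing raw examples (e.g. via knowledge distillation from the previous model), which matters when old data cannot be retained for privacy or storage reasons.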
A multifidelity approach to continual learning for physical systems
We introduce a novel continual learning method based on multifidelity deep
neural networks. This method learns the correlation between the output of
previously trained models and the desired output of the model on the current
training dataset, limiting catastrophic forgetting. On its own the
multifidelity continual learning method shows robust results that limit
forgetting across several datasets. Additionally, we show that the
multifidelity method can be combined with existing continual learning methods,
including replay and memory aware synapses, to further limit catastrophic
forgetting. The proposed continual learning method is especially suited for
physical problems where the data satisfy the same physical laws on each domain,
or for physics-informed neural networks, because in these cases we expect there
to be a strong correlation between the output of the previous model and the
model on the current training domain.
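Under strong simplifying assumptions, the multifidelity idea can be sketched in a few lines: freeze the previously trained model, and fit a small correction model that maps its output (the "low-fidelity" signal) to the new domain's target. Both models here are linear least-squares fits standing in for the paper's deep networks.

```python
import numpy as np

rng = np.random.default_rng(1)

# Domain 1: y = 2x, learned by the "previous" model.
x1 = rng.uniform(0, 1, 200)
y1 = 2 * x1 + 0.01 * rng.normal(size=200)
w_prev = np.polyfit(x1, y1, 1)
f_prev = lambda x: np.polyval(w_prev, x)

# Domain 2: same underlying physics, shifted response y = 2x + 1.
x2 = rng.uniform(1, 2, 200)
y2 = 2 * x2 + 1 + 0.01 * rng.normal(size=200)

# Correction model: regress the new target on the frozen previous model's
# output. Because the domains are strongly correlated, the fit is easy and
# the previous model is untouched, limiting catastrophic forgetting.
w_corr = np.polyfit(f_prev(x2), y2, 1)
f_new = lambda x: np.polyval(w_corr, f_prev(x))

err_new = np.abs(f_new(x2) - y2).mean()
err_old = np.abs(f_prev(x1) - y1).mean()   # old domain is unaffected
print(f"new-domain error: {err_new:.3f}, old-domain error: {err_old:.3f}")
```

This is exactly the regime the abstract highlights: when data on each domain obey the same physical laws, the previous model's output is highly informative for the new one, so the correction can be learned from little data.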
Domain Generalization in Computational Pathology: Survey and Guidelines
Deep learning models have exhibited exceptional effectiveness in
Computational Pathology (CPath) by tackling intricate tasks across an array of
histology image analysis applications. Nevertheless, the presence of
out-of-distribution data (stemming from a multitude of sources such as
disparate imaging devices and diverse tissue preparation methods) can cause
\emph{domain shift} (DS). DS decreases the generalization of trained models to
unseen datasets with slightly different data distributions, prompting the need
for innovative \emph{domain generalization} (DG) solutions. Recognizing the
potential of DG methods to significantly influence diagnostic and prognostic
models in cancer studies and clinical practice, we present this survey along
with guidelines on achieving DG in CPath. We rigorously define various DS
types, systematically review and categorize existing DG approaches and
resources in CPath, and provide insights into their advantages, limitations,
and applicability. We also conduct thorough benchmarking experiments with 28
cutting-edge DG algorithms to address a complex DG problem. Our findings
suggest that careful experimental design and CPath-specific stain augmentation
techniques can be very effective. However, there is no one-size-fits-all
solution for DG in CPath. Therefore, we establish clear guidelines for
detecting and managing DS depending on different scenarios. While most of the
concepts, guidelines, and recommendations are given for applications in CPath,
we believe that they are applicable to most medical image analysis tasks as
well.
Comment: Extended Version
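As a rough illustration of the kind of stain augmentation the survey finds effective, the sketch below perturbs an image in optical-density (OD) space, where stain contributions are approximately linear by the Beer-Lambert law. Real CPath pipelines (e.g. HED jitter) first unmix the image into stain channels; this minimal version, with assumed parameter names, jitters the OD of each RGB channel directly.

```python
import numpy as np

def stain_jitter(rgb, alpha=0.05, beta=0.01, rng=None):
    """rgb: float array in [0, 1], shape (H, W, 3).
    alpha/beta control multiplicative and additive OD jitter."""
    rng = rng or np.random.default_rng()
    od = -np.log(np.clip(rgb, 1e-6, 1.0))           # to optical density
    scale = 1 + rng.uniform(-alpha, alpha, size=3)  # per-channel gain
    shift = rng.uniform(-beta, beta, size=3)        # per-channel offset
    od = od * scale + shift
    return np.clip(np.exp(-od), 0.0, 1.0)           # back to RGB

rng = np.random.default_rng(0)
img = rng.uniform(0.2, 0.9, size=(16, 16, 3))       # toy stand-in for a patch
aug = stain_jitter(img, rng=rng)
print(aug.shape, float(np.abs(aug - img).mean()))
```

Training on such perturbed copies exposes the model to the stain variability it will meet across scanners and labs, which is one concrete way to mitigate the domain shift discussed above.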
Synthetic Data as Validation
This study leverages synthetic data as a validation set to reduce overfitting
and ease the selection of the best model in AI development. While synthetic
data have been used for augmenting the training set, we find that synthetic
data can also significantly diversify the validation set, offering marked
advantages in domains like healthcare, where data are typically limited,
sensitive, and from out-domain sources (i.e., hospitals). In this study, we
illustrate the effectiveness of synthetic data for early cancer detection in
computed tomography (CT) volumes, where synthetic tumors are generated and
superimposed onto healthy organs, thereby creating an extensive dataset for
rigorous validation. Using synthetic data as validation can improve AI
robustness in both in-domain and out-domain test sets. Furthermore, we
establish a new continual learning framework that continuously trains AI models
on a stream of out-domain data with synthetic tumors. The AI model trained and
validated in dynamically expanding synthetic data can consistently outperform
models trained and validated exclusively on real-world data. Specifically, the
DSC score for liver tumor segmentation improves from 26.7% (95% CI:
22.6%-30.9%) to 34.5% (30.8%-38.2%) when evaluated on an in-domain dataset and
from 31.1% (26.0%-36.2%) to 35.4% (32.1%-38.7%) on an out-domain dataset.
Importantly, the performance gain is particularly significant in identifying
very tiny liver tumors (radius < 5mm) in CT volumes, with Sensitivity improving
from 33.1% to 55.4% on an in-domain dataset and 33.9% to 52.3% on an out-domain
dataset, demonstrating its efficacy for early cancer detection. The application
of synthetic data, from both training and validation perspectives, underlines a
promising avenue to enhance AI robustness when dealing with data from varying
domains.
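The DSC figures quoted above are Dice similarity coefficients. For reference, a minimal implementation on binary segmentation masks (the toy masks below are illustrative, not the paper's data):

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 2:6] = True        # 16-pixel "tumor"
pred = np.zeros_like(gt)
pred[3:7, 3:7] = True      # prediction shifted by one pixel

# overlap is 9 px, so DSC = 2*9 / (16+16) = 0.5625
print(round(dice(pred, gt), 4))
```

For the very small tumors the study emphasizes (radius < 5 mm), DSC is harsh: a fixed localization error of a pixel or two removes a large fraction of the overlap, which is why sensitivity is reported alongside it.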
Test-Time Training for Semantic Segmentation with Output Contrastive Loss
Although deep learning-based segmentation models have achieved impressive
performance on public benchmarks, generalizing well to unseen environments
remains a major challenge. To improve a model's ability to generalize to new
domains during evaluation, test-time training (TTT) is a promising paradigm
that adapts the source-pretrained model in an online fashion. Early
efforts on TTT mainly focused on the image classification task. Directly
extending these methods to semantic segmentation easily leads to unstable
adaptation due to segmentation's inherent characteristics, such as extreme class
imbalance and complex decision spaces. To stabilize the adaptation process, we
introduce contrastive loss (CL), known for its capability to learn robust and
generalized representations. Nevertheless, the traditional CL operates in the
representation space and cannot directly enhance predictions. In this paper, we
resolve this limitation by adapting the CL to the output space, employing a
high temperature, and simplifying the formulation, resulting in a
straightforward yet effective loss function called Output Contrastive Loss
(OCL). Our comprehensive experiments validate the efficacy of our approach
across diverse evaluation scenarios. Notably, our method excels even when
applied to models initially pre-trained using domain adaptation methods on test
domain data, showcasing its resilience and adaptability.\footnote{Code and more
information can be found at~\url{https://github.com/dazhangyu123/OCL}.}
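A rough numpy sketch of an output-space contrastive loss in the spirit of OCL: similarities are computed between softmax prediction vectors (the output space) rather than hidden features, with a high temperature that softens the objective. This is an illustrative approximation under assumed names and shapes, not the authors' exact formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def output_contrastive_loss(logits_a, logits_b, tau=2.0):
    """Logits from two views of the same inputs, shape (N, C);
    row i of each view forms the positive pair."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    sim = (p_a @ p_b.T) / tau  # (N, N) similarities in the output space
    # InfoNCE: each row's diagonal entry is the positive.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.diag(log_prob).mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 5))
# Matched views (small perturbation) should cost less than mismatched ones.
loss_matched = output_contrastive_loss(z, z + 0.01 * rng.normal(size=z.shape))
loss_shuffled = output_contrastive_loss(z, z[::-1])
print(loss_matched < loss_shuffled)
```

Operating on predictions rather than features is what lets the loss directly shape the segmenter's outputs during online adaptation, which is the limitation of conventional representation-space contrastive learning that the paper targets.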