Energy-Based Models for Continual Learning
We motivate Energy-Based Models (EBMs) as a promising model class for
continual learning problems. Instead of tackling continual learning via the use
of external memory, growing models, or regularization, EBMs naturally support
a dynamically growing number of tasks or classes while causing less
interference with previously learned information. Our proposed version of EBMs
for continual learning is simple and efficient, and it outperforms baseline
methods by a large margin on several benchmarks. Moreover, our proposed
contrastive-divergence-based training objective can be applied to other continual learning
methods, resulting in substantial boosts in their performance. We also show
that EBMs are adaptable to a more general continual learning setting where the
data distribution changes without the notion of explicitly delineated tasks.
These observations point towards EBMs as a class of models naturally inclined
towards the continual learning regime.
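The contrastive objective the abstract alludes to can be sketched minimally: push down the energy of the true (input, class) pair and push up the energy of a negative class sampled only from the classes seen so far, so that energies of unseen classes are never disturbed. The linear energy model and all names below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(W, x, y):
    """Energy of an (input, class) pair under a toy linear energy model
    (lower energy = more compatible)."""
    return -float(W[y] @ x)

def cd_style_loss(W, x, y_pos, seen_classes):
    """Contrastive-divergence-style objective: lower the energy of the true
    class and raise the energy of a negative sampled only from classes seen
    so far, so unseen classes are left untouched."""
    negatives = [c for c in seen_classes if c != y_pos]
    y_neg = negatives[rng.integers(len(negatives))]
    return energy(W, x, y_pos) - energy(W, x, y_neg)

# Toy usage: 4 possible classes, 8-dim features, only classes {0, 1} seen so far.
W = rng.normal(size=(4, 8))
x = rng.normal(size=8)
loss = cd_style_loss(W, x, y_pos=0, seen_classes=[0, 1])
```

Minimizing this loss by gradient descent on `W` only ever touches the rows for the positive and sampled negative classes, which is the sense in which the objective sidesteps interference with classes that have not yet appeared.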
IF2Net: Innately Forgetting-Free Networks for Continual Learning
Continual learning enables a model to absorb new concepts incrementally without interfering
with previously learned knowledge. Motivated by the characteristics of neural
networks, in which information is stored in weights on connections, we
investigated how to design an Innately Forgetting-Free Network (IF2Net) for
the continual learning context. This study proposed a straightforward yet effective
learning paradigm by ingeniously keeping the weights relative to each seen task
untouched before and after learning a new task. We first presented a novel
representation-level learning scheme for task sequences with random weights.
This technique adjusts the representations drifted by randomization back to
their separate task-optimal working states while the involved weights stay
frozen and reused (in contrast to the usual layer-wise weight updates).
Then, sequential decision-making without forgetting can be
achieved by projecting the output weight updates into the parsimonious
orthogonal space, making the adaptations not disturb old knowledge while
maintaining model plasticity. IF2Net allows a single network to inherently
learn unlimited mapping rules without being told task identities at test time by
integrating the respective strengths of randomization and orthogonalization. We
validated the effectiveness of our approach through extensive theoretical
analysis and empirical study. Comment: 16 pages, 8 figures. Under review.
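The orthogonal-projection idea in this abstract can be illustrated generically: project each candidate weight update onto the orthogonal complement of the subspace spanned by feature vectors from earlier tasks, so the update cannot change the network's responses to those old inputs. This is a standard gradient-projection sketch under assumed names, not IF2Net's actual procedure.

```python
import numpy as np

def project_update(grad, old_features):
    """Project a weight-update vector onto the orthogonal complement of the
    subspace spanned by earlier tasks' feature vectors, so the update cannot
    alter the layer's response to those old inputs."""
    A, _ = np.linalg.qr(old_features.T)   # orthonormal basis of the old-task subspace
    return grad - A @ (A.T @ grad)        # strip the components inside that subspace

rng = np.random.default_rng(1)
old = rng.normal(size=(3, 10))            # 3 stored feature directions, 10-dim weights
g = rng.normal(size=10)
g_proj = project_update(g, old)           # orthogonal to every row of `old`
```

Because `g_proj` is orthogonal to every stored feature vector, adding it to the weights leaves `W @ x_old` unchanged for those inputs, which is the mechanism that preserves old knowledge while the remaining directions keep the model plastic.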
Cross-Class Feature Augmentation for Class Incremental Learning
We propose a novel class incremental learning approach by incorporating a
feature augmentation technique motivated by adversarial attacks. We employ a
classifier learned in the past to complement training examples rather than
simply playing the role of a teacher for knowledge distillation towards subsequent
models. The proposed approach takes a unique perspective on utilizing previous
knowledge in class incremental learning, since it augments features of arbitrary
target classes using examples in other classes via adversarial attacks on a
previously learned classifier. By allowing the cross-class feature
augmentations, each class in the old tasks conveniently populates samples in
the feature space, which alleviates the collapse of the decision boundaries
caused by sample deficiency for the previous tasks, especially when the number
of stored exemplars is small. This idea can be easily incorporated into
existing class incremental learning algorithms without any architecture
modification. Extensive experiments on the standard benchmarks show that our
method consistently outperforms existing class incremental learning methods by
significant margins in various scenarios, especially under an environment with
an extremely limited memory budget.
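The cross-class augmentation described above can be sketched as an adversarial attack on a frozen classifier from an earlier task: a feature of some other class is perturbed by gradient descent until the old classifier assigns it to a target old class, yielding a synthetic exemplar in feature space. The linear classifier, step sizes, and names below are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_class_augment(W_old, feat, target, steps=2000, lr=0.02):
    """Perturb a feature taken from *another* class so that the frozen old
    classifier W_old assigns it to `target`, producing a synthetic exemplar
    for that old class in feature space. Only the feature is updated; the
    classifier stays fixed."""
    f = feat.copy()
    for _ in range(steps):
        p = softmax(W_old @ f)
        p[target] -= 1.0                  # gradient of cross-entropy w.r.t. the logits
        f -= lr * (W_old.T @ p)           # gradient step on the feature only
    return f

rng = np.random.default_rng(2)
W_old = rng.normal(size=(5, 16))          # frozen classifier from a previous task
feat = rng.normal(size=16)                # feature of a sample from some other class
aug = cross_class_augment(W_old, feat, target=3)
```

The augmented features populate the old classes' regions of feature space, which is how the method counteracts the decision-boundary collapse caused by having few stored exemplars.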
Privacy-preserving continual learning methods for medical image classification: a comparative analysis
Background: The implementation of deep learning models for medical image classification poses significant challenges, including gradual performance degradation and limited adaptability to new diseases. However, frequent retraining of models is infeasible and raises healthcare privacy concerns due to the retention of prior patient data. To address these issues, this study investigated privacy-preserving continual learning methods as an alternative solution.

Methods: We evaluated twelve privacy-preserving, non-storage continual learning algorithm-based deep learning models for classifying retinal diseases from public optical coherence tomography (OCT) images in a class-incremental learning scenario. The OCT dataset comprises 108,309 OCT images. Its classes include normal (47.21%), drusen (7.96%), choroidal neovascularization (CNV) (34.35%), and diabetic macular edema (DME) (10.48%). Each class contained 250 testing images. For continual training, the first task involved the CNV and normal classes, the second task focused on the DME class, and the third task included the drusen class. All selected algorithms were further experimented with under different training sequence combinations, and the final model's average class accuracy was measured. The performance of the joint model obtained through retraining and of the original finetune model without continual learning algorithms was compared. Additionally, a publicly available medical dataset for colon cancer detection based on histology slides was selected as a proof of concept, while the CIFAR10 dataset was included as a continual learning benchmark.

Results: Among the continual learning algorithms, Brain-Inspired Replay (BIR) outperformed the others in the continual learning-based classification of retinal diseases from OCT images, achieving an accuracy of 62.00% (95% confidence interval: 59.36-64.64%), with consistent top performance observed across different training sequences.
For colon cancer histology classification, Efficient Feature Transformations (EFT) attained the highest accuracy of 66.82% (95% confidence interval: 64.23-69.42%). In comparison, the joint model achieved accuracies of 90.76% and 89.28%, respectively. The finetune model demonstrated catastrophic forgetting on both datasets.

Conclusion: Although the joint retraining model exhibited superior performance, continual learning holds promise for mitigating catastrophic forgetting and facilitating continual model updates while preserving privacy in healthcare deep learning models. Thus, it presents a highly promising solution for the long-term clinical deployment of such models.
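The final metric named in Methods, average class accuracy, weights each class equally regardless of how common it is, which matters here because the OCT classes are heavily imbalanced (normal 47.21% vs. drusen 7.96%). A minimal sketch follows; the integer class encoding is illustrative.

```python
import numpy as np

def average_class_accuracy(y_true, y_pred, classes):
    """Mean of per-class accuracies: every class counts equally, regardless
    of how many test samples it has."""
    per_class = [(y_pred[y_true == c] == c).mean() for c in classes]
    return float(np.mean(per_class))

# Toy usage with four classes (0=normal, 1=drusen, 2=CNV, 3=DME; encoding is illustrative).
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 0, 1, 0, 2, 2, 3, 2])
acc = average_class_accuracy(y_true, y_pred, classes=[0, 1, 2, 3])
# Per-class accuracies are 1.0, 0.5, 1.0, 0.5, so the average is 0.75.
```

With 250 test images per class, as described above, this metric coincides with plain accuracy on the balanced test set, but it would diverge from plain accuracy if the test set mirrored the skewed class prevalences.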