1,640 research outputs found
Continual Contrastive Self-supervised Learning for Image Classification
For artificial learning systems, continual learning over time from a stream of data is essential. The burgeoning studies on supervised continual learning have achieved great progress, while the study of catastrophic forgetting in unsupervised learning remains largely unexplored. Among unsupervised learning methods, self-supervised learning shows tremendous potential for visual representation learning at scale without any labeled data. To improve the visual representations learned by self-supervised methods, larger and more varied data are needed. In the real world, unlabeled data is generated at all times. This circumstance provides a huge advantage for self-supervised learning. However, in the current paradigm, packing previous and current data together and retraining on the combined set wastes time and resources. Thus, a continual self-supervised learning method is urgently needed. In this paper, we make the first attempt at continual contrastive self-supervised learning by proposing a rehearsal method that keeps a few exemplars from the previous data. Instead of directly combining saved exemplars with the current data set for training, we leverage self-supervised knowledge distillation to transfer contrastive information from previous data to the current network by mimicking the similarity score distribution inferred by the old network over a set of saved exemplars. Moreover, we build an extra sample queue that helps the network distinguish between previous and current data and prevents mutual interference while each learns its own feature representation. Experimental results show that our method performs well on CIFAR100 and ImageNet-Sub. Compared with the baselines, which learn the sequence of tasks without any additional technique, we improve image classification top-1 accuracy by 1.60% on CIFAR100, 2.86% on ImageNet-Sub, and 1.29% on ImageNet-Full under the 10-incremental-step setting.
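To make the distillation step concrete, here is a minimal PyTorch-style sketch of mimicking a similarity score distribution over saved exemplars, assuming cosine similarities, a softmax temperature, and a KL objective; the names and temperature are illustrative rather than taken from the paper.

```python
import torch.nn.functional as F

def contrastive_distillation_loss(new_feats, old_feats, ex_feats_new, ex_feats_old, tau=0.1):
    """Distill by matching similarity distributions over saved exemplars.

    new_feats / old_feats: current-batch features from the new and the frozen
    old encoder, shape (N, D); ex_feats_new / ex_feats_old: saved-exemplar
    features from the same two encoders, shape (M, D).
    """
    new_feats = F.normalize(new_feats, dim=1)
    old_feats = F.normalize(old_feats, dim=1)
    ex_new = F.normalize(ex_feats_new, dim=1)
    ex_old = F.normalize(ex_feats_old, dim=1)

    # Similarity score distributions over the exemplar set.
    p_old = F.softmax(old_feats @ ex_old.t() / tau, dim=1)          # teacher (frozen)
    log_p_new = F.log_softmax(new_feats @ ex_new.t() / tau, dim=1)  # student

    # Train the student to mimic the teacher's distribution.
    return F.kl_div(log_p_new, p_old, reduction="batchmean")
```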
SCALE: Online Self-Supervised Lifelong Learning without Prior Knowledge
Unsupervised lifelong learning refers to the ability to learn over time while
memorizing previous patterns without supervision. Previous works assumed strong
prior knowledge about the incoming data (e.g., knowing the class boundaries), which can be impossible to obtain in complex and unpredictable environments. In
this paper, motivated by real-world scenarios, we formally define the online
unsupervised lifelong learning problem with class-incremental streaming data,
which is non-iid and single-pass. The problem is more challenging than existing
lifelong learning problems due to the absence of labels and prior knowledge. To
address the issue, we propose Self-Supervised ContrAstive Lifelong LEarning
(SCALE) which extracts and memorizes knowledge on-the-fly. SCALE is designed
around three major components: a pseudo-supervised contrastive loss, a
self-supervised forgetting loss, and an online memory update for uniform subset
selection. All three components are designed to work collaboratively to
maximize learning performance. Our loss functions leverage pairwise similarity and thus remove the dependency on supervision or prior knowledge. We perform comprehensive experiments with SCALE under iid and four non-iid data streams. SCALE outperforms the best state-of-the-art algorithm in all settings, with improvements of up to 3.83%, 2.77%, and 5.86% kNN accuracy on the CIFAR-10, CIFAR-100, and SubImageNet datasets.
Comment: Submitted for review
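As an illustration of a loss that relies on pairwise similarity instead of labels, the sketch below treats highly similar pairs within a batch as pseudo-positives for a contrastive objective. The threshold-based pairing and weighting are assumptions for illustration, not SCALE's exact formulation.

```python
import torch
import torch.nn.functional as F

def pseudo_supervised_contrastive_loss(feats, tau=0.1, sim_threshold=0.8):
    """Contrastive loss driven by pairwise similarity rather than labels:
    pairs whose cosine similarity exceeds a threshold act as pseudo-positives."""
    z = F.normalize(feats, dim=1)                      # (N, D)
    sim = z @ z.t()                                    # pairwise cosine similarity
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)

    pos_mask = (sim > sim_threshold) & ~eye            # pseudo-positive pairs

    logits = (sim / tau).masked_fill(eye, float("-inf"))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)

    pos_counts = pos_mask.sum(dim=1)
    has_pos = pos_counts > 0
    if not has_pos.any():                              # no pseudo-positives found
        return sim.new_zeros(())
    loss = -(log_prob * pos_mask).sum(dim=1)[has_pos] / pos_counts[has_pos]
    return loss.mean()
```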
Domain-Aware Augmentations for Unsupervised Online General Continual Learning
Continual Learning has been challenging, especially when dealing with
unsupervised scenarios such as Unsupervised Online General Continual Learning
(UOGCL), where the learning agent has no prior knowledge of class boundaries or
task change information. While previous research has focused on reducing
forgetting in supervised setups, recent studies have shown that self-supervised
learners are more resilient to forgetting. This paper proposes a novel approach
that enhances memory usage for contrastive learning in UOGCL by defining and
using stream-dependent data augmentations together with some implementation
tricks. Our proposed method is simple yet effective; it achieves state-of-the-art results compared to other unsupervised approaches in all considered setups and reduces the gap between supervised and unsupervised continual learning. Our domain-aware augmentation procedure can be adapted to other replay-based methods, making it a promising strategy for continual learning.
Comment: Accepted to BMVC'2
Contrastive Learning for Online Semi-Supervised General Continual Learning
We study Online Continual Learning with missing labels and propose SemiCon, a
new contrastive loss designed for partly labeled data. We demonstrate its
efficiency by devising a memory-based method trained on an unlabeled data stream, where every sample added to memory is labeled by an oracle. Our approach outperforms existing semi-supervised methods when few labels are available, and obtains results similar to state-of-the-art supervised methods while using only 2.6% of the labels on Split-CIFAR10 and 10% of the labels on Split-CIFAR100.
Comment: Accepted at ICIP'2
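A rough sketch of a contrastive objective for partly labeled data, in the spirit described above: an InfoNCE-style term on augmented views of the unlabeled stream plus a supervised contrastive term on the labeled memory samples. The pairing and weighting are assumptions, not the exact SemiCon loss.

```python
import torch
import torch.nn.functional as F

def semi_supervised_contrastive_loss(z_stream_a, z_stream_b, z_mem, y_mem,
                                     tau=0.1, alpha=0.5):
    """Combine an unsupervised term on the unlabeled stream with a supervised
    term on labeled memory samples.

    z_stream_a, z_stream_b: two augmented views of the stream batch, (N, D)
    z_mem, y_mem: features and labels of memory samples, (M, D) and (M,)
    """
    # Unsupervised InfoNCE-style term: matching views are positives.
    za, zb = F.normalize(z_stream_a, dim=1), F.normalize(z_stream_b, dim=1)
    logits = za @ zb.t() / tau
    targets = torch.arange(za.size(0), device=za.device)
    unsup = F.cross_entropy(logits, targets)

    # Supervised contrastive term: memory samples of the same class attract.
    zm = F.normalize(z_mem, dim=1)
    sim = zm @ zm.t() / tau
    eye = torch.eye(zm.size(0), dtype=torch.bool, device=zm.device)
    sim = sim.masked_fill(eye, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (y_mem.unsqueeze(0) == y_mem.unsqueeze(1)) & ~eye
    counts = pos.sum(dim=1).clamp(min=1)
    sup = -((log_prob * pos).sum(dim=1) / counts).mean()

    return alpha * sup + (1 - alpha) * unsup
```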
Uncertainty Estimation, Explanation and Reduction with Insufficient Data
Human beings constantly juggle making smart decisions under uncertainty, trading off swift action against collecting sufficient evidence. It is naturally expected that a generalized artificial intelligence (GAI) can navigate uncertainty while predicting precisely. In this thesis, we aim to propose strategies that underpin machine learning under uncertainty from three perspectives: uncertainty estimation, explanation, and reduction. Estimation quantifies the variability in the model inputs and outputs and allows us to evaluate the model's predictive confidence. Explanation provides a tool to interpret the mechanism behind uncertainties and to pinpoint opportunities for uncertainty reduction, which focuses on stabilizing model training, especially when data is insufficient. We hope that this thesis can motivate related studies on quantifying predictive uncertainties in deep learning. It also aims to raise awareness among stakeholders in the fields of smart transportation and automated medical diagnosis, where data insufficiency induces high uncertainty.
The thesis is organized into the following sections. Introduction: we justify the necessity of investigating AI uncertainties and clarify the challenges in existing studies, followed by our research objectives. Literature review: we break down the review of state-of-the-art methods into uncertainty estimation, explanation, and reduction, and compare with related fields including meta-learning, anomaly detection, and continual learning. Uncertainty estimation: we introduce a variational framework, the neural process, which approximates Gaussian processes to handle uncertainty estimation; two variants from the neural process family are proposed to endow neural processes with scalability and continual learning. Uncertainty explanation: we inspect the functional distribution of neural processes to discover the global and local factors that affect the degree of predictive uncertainty. Uncertainty reduction: we validate the proposed uncertainty framework in two scenarios, urban irregular behaviour detection and neurological disorder diagnosis, where intrinsic data insufficiency undermines the performance of existing deep learning models. Conclusion: we provide promising directions for future work and conclude the thesis.
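For readers unfamiliar with neural processes, the following is a minimal conditional-neural-process-style sketch of how such a model outputs a predictive mean and variance, and hence a per-point uncertainty estimate. The architecture and sizes are illustrative and do not correspond to the thesis's specific variants.

```python
import torch
import torch.nn as nn

class ConditionalNeuralProcess(nn.Module):
    """Context points are encoded and mean-aggregated into a representation;
    the decoder predicts a Gaussian (mean, variance) at each target input."""

    def __init__(self, x_dim=1, y_dim=1, r_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(x_dim + y_dim, r_dim), nn.ReLU(),
            nn.Linear(r_dim, r_dim))
        self.decoder = nn.Sequential(
            nn.Linear(x_dim + r_dim, r_dim), nn.ReLU(),
            nn.Linear(r_dim, 2 * y_dim))   # predicts mean and log-variance

    def forward(self, x_ctx, y_ctx, x_tgt):
        # Aggregate the context set into a single representation vector.
        r = self.encoder(torch.cat([x_ctx, y_ctx], dim=-1)).mean(dim=0)
        r = r.expand(x_tgt.size(0), -1)
        mean, log_var = self.decoder(torch.cat([x_tgt, r], dim=-1)).chunk(2, dim=-1)
        return mean, log_var.exp()         # predictive mean and variance
```

Training such a sketch would typically minimize the Gaussian negative log-likelihood of target outputs under the predicted mean and variance.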
Towards Robust Feature Learning with t-vFM Similarity for Continual Learning
Continual learning has been developed using a standard supervised contrastive loss from the perspective of feature learning. Due to data imbalance during training, there are still challenges in learning better representations. In this work, we suggest using a different similarity metric in place of cosine similarity in the supervised contrastive loss in order to learn more robust representations. We validate our method on the image classification dataset Seq-CIFAR-10, and the results outperform recent continual learning baselines.
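The abstract does not state the similarity metric's formula; the sketch below assumes the t-vMF-style similarity commonly used as a heavy-tailed replacement for cosine similarity, sim = (1 + cos) / (1 + kappa * (1 - cos)) - 1, and shows how it would slot into the logits of a supervised contrastive loss. Treat the exact form as an assumption, not the paper's definition.

```python
import torch.nn.functional as F

def t_vmf_similarity(z1, z2, kappa=16.0):
    """Assumed t-vMF-style similarity: maps cosine similarity through
    (1 + cos) / (1 + kappa * (1 - cos)) - 1, concentrating mass near cos = 1."""
    cos = F.normalize(z1, dim=-1) @ F.normalize(z2, dim=-1).t()
    return (1.0 + cos) / (1.0 + kappa * (1.0 - cos)) - 1.0

def supcon_logits(features, kappa=16.0, tau=0.1):
    """Drop-in replacement for cosine-similarity logits in a supervised
    contrastive loss; the softmax and positive-mask machinery stay unchanged."""
    return t_vmf_similarity(features, features, kappa) / tau
```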
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes
Lifelong audio feature extraction involves learning new sound classes
incrementally, which is essential for adapting to new data distributions over
time. However, optimizing the model only on new data can lead to catastrophic
forgetting of previously learned tasks, which undermines the model's ability to
perform well over the long term. This paper introduces a new approach to
continual audio representation learning called DeCoR. Unlike other methods that
store previous data, features, or models, DeCoR indirectly distills knowledge
from an earlier model to the latest one by predicting quantization indices from a
delayed codebook. We demonstrate that DeCoR improves acoustic scene
classification accuracy and integrates well with continual self-supervised
representation learning. Our approach introduces minimal storage and
computation overhead, making it a lightweight and efficient solution for
continual learning.
Comment: INTERSPEECH 202
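A minimal sketch of the kind of distillation the abstract describes: features from the earlier (frozen) model are quantized against a delayed codebook, and the current model is trained to predict the resulting indices. The exact feature/codebook pairing and predictor head are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def delayed_codebook_distillation(cur_feats, old_feats, delayed_codebook, predictor):
    """Distill from an earlier model by predicting quantization indices.

    cur_feats:        (N, D) features from the current model
    old_feats:        (N, D) features from the frozen earlier model
    delayed_codebook: (K, D) codewords kept from an earlier training stage
    predictor:        head mapping current features to K-way logits
    """
    with torch.no_grad():
        # Target index: nearest delayed codeword for each earlier-model feature.
        target_idx = torch.cdist(old_feats, delayed_codebook).argmin(dim=1)

    logits = predictor(cur_feats)              # (N, K)
    return F.cross_entropy(logits, target_idx)
```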
Sy-CON: Symmetric Contrastive Loss for Continual Self-Supervised Representation Learning
We introduce a novel and general loss function, called Symmetric Contrastive
(Sy-CON) loss, for effective continual self-supervised learning (CSSL). We
first argue that the conventional loss form of continual learning, which consists of a single task-specific loss (for plasticity) and a regularizer (for stability), may not be ideal for contrastive-loss-based CSSL that focuses on representation learning. Our reasoning is that, in contrastive-learning-based methods, the task-specific loss would suffer from the decreasing diversity of negative samples, and the regularizer may hinder learning new distinctive representations. To that end, we propose Sy-CON, which consists of two losses (one for plasticity and the other for stability) with symmetric dependence on the negative sample embeddings of the current and past models. We argue that our model can naturally find a good trade-off between plasticity and stability without any explicit hyperparameter tuning. We validate the effectiveness of our approach through extensive experiments, demonstrating that a MoCo-based implementation of the Sy-CON loss achieves superior performance compared to other state-of-the-art CSSL methods.
Comment: Preprint
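One plausible reading of the symmetric dependence described above is a pair of InfoNCE terms in which the current and past models swap roles as sources of negatives; the sketch below illustrates that idea and is not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE with an explicit negative set: the positive sits at index 0."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    negatives = F.normalize(negatives, dim=1)
    pos = (anchor * positive).sum(dim=1, keepdim=True) / tau   # (N, 1)
    neg = anchor @ negatives.t() / tau                         # (N, M)
    logits = torch.cat([pos, neg], dim=1)
    targets = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, targets)

def symmetric_contrastive_loss(z_cur_a, z_cur_b, z_old_a, neg_cur, neg_old, tau=0.1):
    """Plasticity term: current-model views as positives, past-model embeddings
    as negatives. Stability term: current and past embeddings of the same
    sample as positives, current-model embeddings as negatives."""
    plasticity = info_nce(z_cur_a, z_cur_b, neg_old, tau)
    stability = info_nce(z_cur_a, z_old_a, neg_cur, tau)
    return plasticity + stability
```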
OpenIncrement: A Unified Framework for Open Set Recognition and Deep Class-Incremental Learning
In most work on deep incremental learning, it is assumed that novel
samples are pre-identified for neural network retraining. However, practical
deep classifiers often misidentify these samples, leading to erroneous
predictions. Such misclassifications can degrade model performance. Techniques
like open set recognition offer a means to detect these novel samples,
representing a significant area in the machine learning domain.
In this paper, we introduce a deep class-incremental learning framework
integrated with open set recognition. Our approach refines class-incrementally
learned features to adapt them for distance-based open set recognition.
Experimental results validate that our method outperforms state-of-the-art
incremental learning techniques and exhibits superior performance in open set
recognition compared to baseline methods.
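As a generic illustration of distance-based open set recognition on incrementally learned features (not the paper's specific procedure), the sketch below assigns each sample to its nearest class prototype and rejects it as novel when that distance exceeds a threshold.

```python
import torch

def open_set_predict(feats, prototypes, threshold):
    """Nearest-prototype classification with distance-based rejection.

    feats:      (N, D) features from the incrementally trained encoder
    prototypes: (C, D) one prototype (e.g., class-mean feature) per known class
    threshold:  distance beyond which a sample is treated as unknown
    """
    dists = torch.cdist(feats, prototypes)     # (N, C) pairwise distances
    min_dist, pred = dists.min(dim=1)
    pred = pred.clone()
    pred[min_dist > threshold] = -1            # -1 marks an open-set (novel) sample
    return pred, min_dist
```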