Continual Unsupervised Representation Learning
Continual learning aims to improve the ability of modern learning systems to
deal with non-stationary distributions, typically by attempting to learn a
series of tasks sequentially. Prior art in the field has largely considered
supervised or reinforcement learning tasks, and often assumes full knowledge of
task labels and boundaries. In this work, we propose an approach (CURL) to
tackle a more general problem that we will refer to as unsupervised continual
learning. The focus is on learning representations without any knowledge about
task identity, and we explore scenarios when there are abrupt changes between
tasks, smooth transitions from one task to another, or even when the data is
shuffled. The proposed approach performs task inference directly within the
model, is able to dynamically expand to capture new concepts over its lifetime,
and incorporates additional rehearsal-based techniques to deal with
catastrophic forgetting. We demonstrate the efficacy of CURL in an unsupervised
learning setting with MNIST and Omniglot, where the lack of labels ensures no
information is leaked about the task. Further, we demonstrate strong
performance compared to prior art in an i.i.d. setting, or when adapting the
technique to supervised tasks such as incremental class learning.
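A minimal sketch of the dynamic-expansion idea described above: the model infers, without labels, which existing component best explains a sample and spawns a new component when none does. This is an illustration under simplifying assumptions, not CURL's actual mixture-of-Gaussians latent model; the distance-based novelty test, threshold, and feature centroids are introduced here purely for illustration.

```python
import torch

class ExpandingMixture:
    """Illustrative dynamic-expansion module (not CURL's architecture)."""

    def __init__(self, feature_dim, novelty_threshold=5.0):
        self.centers = [torch.zeros(feature_dim)]  # start with one component
        self.threshold = novelty_threshold         # assumed novelty threshold

    def infer_component(self, x):
        """Unsupervised task inference: index and distance of the closest component."""
        dists = torch.stack([(x - c).pow(2).sum().sqrt() for c in self.centers])
        idx = int(torch.argmin(dists))
        return idx, dists[idx]

    def maybe_expand(self, x):
        """Spawn a new component when x is poorly explained by all existing ones."""
        idx, dist = self.infer_component(x)
        if dist > self.threshold:
            self.centers.append(x.detach().clone())
            return len(self.centers) - 1
        return idx

# Example: a feature far from every existing center triggers expansion.
model = ExpandingMixture(feature_dim=8)
new_idx = model.maybe_expand(torch.full((8,), 10.0))  # distance >> threshold
```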
Continual Contrastive Self-supervised Learning for Image Classification
For artificial learning systems, continual learning over time from a stream
of data is essential. The burgeoning studies on supervised continual learning
have achieved great progress, while catastrophic forgetting in unsupervised
learning remains largely unexplored. Among unsupervised learning methods,
self-supervised learning shows tremendous potential for learning visual
representations without any labeled data at scale. Improving these visual
representations requires larger and more varied data. In the real world,
unlabeled data is generated at all times, which is a major advantage for
self-supervised learning. However, in the current paradigm, packing previous
and current data together and retraining on the combined set wastes time and
resources. Thus, a continual self-supervised learning method is badly needed.
In this paper, we make the first attempt to implement continual contrastive
self-supervised learning by proposing a rehearsal-based method that keeps a
few exemplars from the previous data. Instead of directly combining saved
exemplars with the current
data set for training, we leverage self-supervised knowledge distillation to
transfer contrastive information about the previous data to the current network
by mimicking the similarity score distribution inferred by the old network over
a set of saved exemplars. Moreover, we build an extra sample queue to help the
network distinguish between previous and current data and to prevent mutual
interference while each learns its own feature representation. Experimental
results show that our method performs well on CIFAR100 and ImageNet-Sub.
Compared with baselines that learn the tasks without applying any such
technique, we improve image classification top-1 accuracy by 1.60% on CIFAR100,
2.86% on ImageNet-Sub, and 1.29% on ImageNet-Full under a 10-incremental-step
setting.
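A minimal sketch of the distillation step described above: the new network is trained to mimic the similarity score distribution that the frozen old network produces over saved exemplars. This is not the authors' released code; the cosine similarity, temperature, and KL formulation are assumptions used for illustration.

```python
import torch
import torch.nn.functional as F

def contrastive_distillation_loss(new_feats, old_feats,
                                  exemplar_feats_new, exemplar_feats_old,
                                  temperature=0.1):
    """KL between similarity distributions over saved exemplars.

    new_feats:          (B, D) current-batch features from the new network
    old_feats:          (B, D) current-batch features from the frozen old network
    exemplar_feats_new: (M, D) exemplar features from the new network
    exemplar_feats_old: (M, D) exemplar features from the frozen old network
    """
    # Cosine similarity of each batch sample to every saved exemplar.
    sim_new = F.normalize(new_feats, dim=1) @ F.normalize(exemplar_feats_new, dim=1).T
    sim_old = F.normalize(old_feats, dim=1) @ F.normalize(exemplar_feats_old, dim=1).T

    # Turn similarity scores into distributions and match them with KL divergence.
    p_old = F.softmax(sim_old / temperature, dim=1)          # target: old network
    log_p_new = F.log_softmax(sim_new / temperature, dim=1)  # student: new network
    return F.kl_div(log_p_new, p_old, reduction="batchmean")
```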
Lifelong Generative Modeling
Lifelong learning is the problem of learning multiple consecutive tasks in a
sequential manner, where knowledge gained from previous tasks is retained and
used to aid future learning over the lifetime of the learner. It is essential
towards the development of intelligent machines that can adapt to their
surroundings. In this work we focus on a lifelong learning approach to
unsupervised generative modeling, where we continuously incorporate newly
observed distributions into a learned model. We do so through a student-teacher
Variational Autoencoder architecture which allows us to learn and preserve all
the distributions seen so far, without the need to retain the past data nor the
past models. Through the introduction of a novel cross-model regularizer,
inspired by a Bayesian update rule, the student model leverages the information
learned by the teacher, which acts as a probabilistic knowledge store. The
regularizer reduces the effect of catastrophic interference that appears when
we learn over sequences of distributions. We validate our model's performance
on sequential variants of MNIST, FashionMNIST, PermutedMNIST, SVHN and Celeb-A
and demonstrate that our model mitigates the effects of catastrophic
interference faced by neural networks in sequential learning scenarios.
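A minimal sketch of the student-teacher consistency idea described above, assuming a standard diagonal-Gaussian VAE posterior: the frozen teacher replays samples of previously seen distributions from its prior, and a cross-model regularizer pulls the student's posterior on those samples toward the teacher's. The encoder/decoder interfaces and the closed-form Gaussian KL below are assumptions for illustration, not the paper's exact regularizer.

```python
import torch

def cross_model_regularizer(student_enc, teacher_enc, teacher_dec,
                            n_replay=64, latent_dim=32):
    """Consistency term between student and teacher posteriors on replayed data.

    Assumed interfaces: encoders return (mu, logvar); the decoder maps latents
    to data space.
    """
    with torch.no_grad():
        # The teacher acts as a probabilistic knowledge store: decode samples
        # from its prior to replay the distributions seen so far.
        z = torch.randn(n_replay, latent_dim)
        x_replay = teacher_dec(z)
        mu_t, logvar_t = teacher_enc(x_replay)

    mu_s, logvar_s = student_enc(x_replay)

    # KL( q_student(z|x) || q_teacher(z|x) ) for diagonal Gaussians.
    var_t, var_s = logvar_t.exp(), logvar_s.exp()
    kl = 0.5 * (logvar_t - logvar_s + (var_s + (mu_s - mu_t) ** 2) / var_t - 1.0)
    return kl.sum(dim=1).mean()
```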