Conditional Channel Gated Networks for Task-Aware Continual Learning
Convolutional Neural Networks experience catastrophic forgetting when
optimized on a sequence of learning problems: as they meet the objective of the
current training examples, their performance on previous tasks drops
drastically. In this work, we introduce a novel framework to tackle this
problem with conditional computation. We equip each convolutional layer with
task-specific gating modules, selecting which filters to apply on the given
input. This way, we achieve two appealing properties. Firstly, the execution
patterns of the gates allow us to identify and protect important filters, ensuring
no loss in the performance of the model for previously learned tasks. Secondly,
by using a sparsity objective, we can promote the selection of a limited set of
kernels, allowing the model to retain sufficient capacity to digest new
tasks. Existing solutions require, at test time, awareness of the task to which
each example belongs. This knowledge, however, may not be available in many
practical scenarios. Therefore, we additionally introduce a task classifier
that predicts the task label of each example, to deal with settings in which a
task oracle is not available. We validate our proposal on four continual
learning datasets. Results show that our model consistently outperforms
existing methods both in the presence and the absence of a task oracle.
Notably, on Split SVHN and Imagenet-50 datasets, our model yields up to 23.98%
and 17.42% improvement in accuracy w.r.t. competing methods. Comment: CVPR 2020 (oral).
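The gating mechanism is concrete enough to sketch. Below is a minimal, hypothetical PyTorch illustration of per-task channel gating with a sparsity penalty; the gating-head architecture, the soft sigmoid relaxation (the paper uses a discrete relaxation), and all names are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """Conv layer whose output channels are masked by a task-specific gate.

    Hypothetical sketch: each task owns a tiny gating head that maps
    globally pooled input features to one gate per output channel. Binary
    decisions are relaxed with a sigmoid (a simplification of the paper's
    discrete relaxation).
    """

    def __init__(self, in_ch, out_ch, num_tasks):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        # One lightweight gating head per task (illustrative design).
        self.gates = nn.ModuleList([
            nn.Sequential(nn.Linear(in_ch, 16), nn.ReLU(), nn.Linear(16, out_ch))
            for _ in range(num_tasks)
        ])

    def forward(self, x, task_id):
        logits = self.gates[task_id](x.mean(dim=(2, 3)))  # global average pool
        g = torch.sigmoid(logits)                          # soft gate in [0, 1]
        y = self.conv(x) * g.unsqueeze(-1).unsqueeze(-1)   # mask output channels
        # Sparsity penalty encourages selecting few filters per task.
        sparsity = g.mean()
        return y, sparsity

# Usage sketch: total loss = task_loss + lambda_s * sparsity
block = GatedConvBlock(in_ch=3, out_ch=8, num_tasks=4)
y, sparsity = block(torch.randn(2, 3, 32, 32), task_id=0)
```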
ScrollNet: Dynamic Weight Importance for Continual Learning
The principle underlying most existing continual learning (CL) methods is to
prioritize stability by penalizing changes in parameters crucial to old tasks,
while allowing for plasticity in other parameters. The importance of weights
for each task can be determined either explicitly through learning a
task-specific mask during training (e.g., parameter isolation-based approaches)
or implicitly by introducing a regularization term (e.g., regularization-based
approaches). However, all these methods assume that the importance of weights
for each task is unknown prior to data exposure. In this paper, we propose
ScrollNet as a scrolling neural network for continual learning. ScrollNet can
be seen as a dynamic network that assigns the ranking of weight importance for
each task before data exposure, thus achieving a more favorable
stability-plasticity tradeoff during sequential task learning by reassigning
this ranking for different tasks. Additionally, we demonstrate that ScrollNet
can be combined with various CL methods, including regularization-based and
replay-based approaches. Experimental results on CIFAR100 and TinyImagenet
datasets show the effectiveness of our proposed method. We release our code at
https://github.com/FireFYF/ScrollNet.git. Comment: Accepted at the Visual Continual Learning workshop (ICCV 2023).
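To make the "ranking before data exposure" idea concrete, here is a minimal sketch, assuming a simple scheme in which groups of output units rotate through importance ranks per task and rank is turned into a gradient scale; this illustrates the concept only and is not the released ScrollNet implementation.

```python
import torch
import torch.nn as nn

class ScrollLinear(nn.Module):
    """Linear layer whose output units are split into groups whose
    importance ranking 'scrolls' (rotates) with the task index.

    Assumed scheme: group g has rank (g + task_id) % num_groups for the
    current task, and rank becomes a gradient scale so low-rank groups
    stay plastic while high-rank groups are nearly frozen.
    """

    def __init__(self, in_features, out_features, num_groups=4):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.num_groups = num_groups
        self.group_size = out_features // num_groups  # assume divisibility

    def grad_scales(self, task_id):
        # Ranks are assigned before seeing any data and rotate per task.
        ranks = [(g + task_id) % self.num_groups for g in range(self.num_groups)]
        # Higher rank -> more important for old tasks -> smaller updates.
        return [1.0 / (1 + r) for r in ranks]

    def forward(self, x, task_id):
        scales = self.grad_scales(task_id)
        outs = []
        for g in range(self.num_groups):
            lo, hi = g * self.group_size, (g + 1) * self.group_size
            w, b = self.fc.weight[lo:hi], self.fc.bias[lo:hi]
            s = scales[g]
            # Straight-through trick: unchanged forward value, scaled gradient.
            w = w * s + w.detach() * (1 - s)
            b = b * s + b.detach() * (1 - s)
            outs.append(x @ w.t() + b)
        return torch.cat(outs, dim=-1)

layer = ScrollLinear(32, 64, num_groups=4)
y = layer(torch.randn(8, 32), task_id=1)
```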
A Diffusion-based Method for Multi-turn Compositional Image Generation
Multi-turn compositional image generation (M-CIG) is a challenging task that
aims to iteratively manipulate a reference image given a modification text.
While most of the existing methods for M-CIG are based on generative
adversarial networks (GANs), recent advances in image generation have
demonstrated the superiority of diffusion models over GANs. In this paper, we
propose a diffusion-based method for M-CIG named conditional denoising
diffusion with image compositional matching (CDD-ICM). We leverage CLIP as the
backbone of image and text encoders, and incorporate a gated fusion mechanism,
originally proposed for question answering, to compositionally fuse the
reference image and the modification text at each turn of M-CIG. We introduce a
conditioning scheme to generate the target image based on the fusion results.
To prioritize the semantic quality of the generated target image, we optimize
an auxiliary image compositional matching (ICM) objective, along with the conditional
denoising diffusion (CDD) objective in a multi-task learning framework.
Additionally, we perform ICM guidance and classifier-free guidance to
improve performance. Experimental results show that CDD-ICM achieves
state-of-the-art results on two benchmark datasets for M-CIG, i.e., CoDraw and
i-CLEVR.
Transfer without Forgetting
This work investigates the entanglement between Continual Learning (CL) and
Transfer Learning (TL). In particular, we shed light on the widespread
application of network pretraining, highlighting that it is itself subject to
catastrophic forgetting. Unfortunately, this issue leads to the
under-exploitation of knowledge transfer during later tasks. On this ground, we
propose Transfer without Forgetting (TwF), a hybrid approach building upon a
fixed pretrained sibling network, which continuously propagates the knowledge
inherent in the source domain through a layer-wise loss term. Our experiments
indicate that TwF steadily outperforms other CL methods across a variety of
settings, averaging a 4.81% gain in Class-Incremental accuracy across different
datasets and buffer sizes. Comment: 22 pages, 3 figures. Accepted at the 17th European Conference on Computer
Vision (ECCV 2022), Tel Aviv, Israel.
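The layer-wise transfer term can be sketched simply. The function below is a minimal illustration, assuming a plain per-layer MSE between the learner's activations and those of the frozen pretrained sibling; TwF's actual loss term is more elaborate.

```python
import torch.nn.functional as F

def layerwise_transfer_loss(student_feats, sibling_feats):
    """Layer-wise feature distillation from a frozen pretrained sibling.

    Minimal sketch of the idea above (an assumption: plain MSE instead of
    TwF's full formulation): match the continual learner's intermediate
    activations to those of the fixed pretrained copy.
    """
    return sum(
        F.mse_loss(s, t.detach()) for s, t in zip(student_feats, sibling_feats)
    ) / len(student_feats)

# Hypothetical usage with networks that expose per-layer features:
# sibling = copy.deepcopy(pretrained_net).eval()   # frozen transfer source
# for p in sibling.parameters():
#     p.requires_grad_(False)
# loss = ce_loss + beta * layerwise_transfer_loss(net.features(x),
#                                                 sibling.features(x))
```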
Cross-Class Feature Augmentation for Class Incremental Learning
We propose a novel class incremental learning approach by incorporating a
feature augmentation technique motivated by adversarial attacks. We employ a
classifier learned in the past to complement training examples, rather than
merely serving as a teacher for knowledge distillation to subsequent
models. The proposed approach has a unique perspective to utilize the previous
knowledge in class incremental learning since it augments features of arbitrary
target classes using examples in other classes via adversarial attacks on a
previously learned classifier. By allowing the cross-class feature
augmentations, each class in the old tasks conveniently populates samples in
the feature space, which alleviates the collapse of the decision boundaries
caused by sample deficiency for the previous tasks, especially when the number
of stored exemplars is small. This idea can be easily incorporated into
existing class incremental learning algorithms without any architecture
modification. Extensive experiments on the standard benchmarks show that our
method consistently outperforms existing class incremental learning methods by
significant margins in various scenarios, especially in environments with an
extremely limited memory budget.
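The augmentation step admits a compact sketch. Below is a hypothetical PGD-style routine that perturbs features of other-class examples until the previously learned classifier assigns them to a target old class; the step count and step size are assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def cross_class_augment(old_classifier, feats, target_class, steps=5, lr=0.1):
    """Perturb features of examples from other classes so the *previous*
    classifier assigns them to `target_class`.

    Hypothetical PGD-style sketch of the adversarial feature augmentation
    idea above; hyperparameters are illustrative.
    """
    x = feats.clone().detach().requires_grad_(True)
    labels = torch.full((x.size(0),), target_class, dtype=torch.long)
    for _ in range(steps):
        loss = F.cross_entropy(old_classifier(x), labels)
        (grad,) = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x -= lr * grad.sign()  # move features toward the old class
    return x.detach()  # synthetic exemplar features for the old class
```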
Incremental Task Learning with Incremental Rank Updates
Incremental Task Learning (ITL) is a category of continual learning that
seeks to train a single network for multiple tasks (one after another), where
training data for each task is only available during the training of that task.
Neural networks tend to forget older tasks when they are trained for the newer
tasks; this property is often known as catastrophic forgetting. To address this
issue, ITL methods use episodic memory, parameter regularization, masking and
pruning, or extensible network structures. In this paper, we propose a new
incremental task learning framework based on low-rank factorization. In
particular, we represent the network weights for each layer as a linear
combination of several rank-1 matrices. To update the network for a new task,
we learn a rank-1 (or low-rank) matrix and add that to the weights of every
layer. We also introduce an additional selector vector that assigns different
weights to the low-rank matrices learned for the previous tasks. We show that
our approach performs better than the current state-of-the-art methods in terms
of accuracy and forgetting. Our method also offers better memory efficiency
compared to episodic memory- and mask-based approaches. Our code will be
available at https://github.com/CSIPlab/task-increment-rank-update.git. Comment: Code will be available at
https://github.com/CSIPlab/task-increment-rank-update.git.
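The factorization is easy to illustrate. The layer below is a minimal sketch, assuming weights built as a selector-weighted sum of rank-1 factors with one new factor added per task and older factors frozen; the paper's exact parameterization and training protocol may differ.

```python
import torch
import torch.nn as nn

class LowRankIncrementalLinear(nn.Module):
    """Linear layer whose weight is a selector-weighted sum of rank-1
    factors, one factor added per task (illustrative sketch)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.us = nn.ParameterList()         # left rank-1 factors
        self.vs = nn.ParameterList()         # right rank-1 factors
        self.selectors = nn.ParameterList()  # per-task mixing weights

    def add_task(self):
        # New rank-1 factor for the new task, plus a selector over all
        # factors learned so far; older factors are frozen.
        self.us.append(nn.Parameter(torch.randn(self.out_features) * 0.01))
        self.vs.append(nn.Parameter(torch.randn(self.in_features) * 0.01))
        self.selectors.append(nn.Parameter(torch.ones(len(self.us))))
        for u, v in zip(list(self.us)[:-1], list(self.vs)[:-1]):
            u.requires_grad_(False)
            v.requires_grad_(False)

    def forward(self, x, task_id):
        alpha = self.selectors[task_id]
        # Weight = sum of selector-scaled rank-1 outer products.
        W = sum(a * torch.outer(u, v)
                for a, u, v in zip(alpha, self.us[: task_id + 1],
                                   self.vs[: task_id + 1]))
        return x @ W.t()

layer = LowRankIncrementalLinear(16, 8)
layer.add_task()                            # task 0: rank-1 weight
y = layer(torch.randn(4, 16), task_id=0)
```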