Collaborative Group Learning
Collaborative learning has successfully applied knowledge transfer to guide a
pool of small student networks towards robust local minima. However, previous
approaches typically struggle with drastically aggravated student
homogenization when the number of students rises. In this paper, we propose
Collaborative Group Learning, an efficient framework that aims to diversify the
feature representation and provide effective regularization. Intuitively,
similar to the human group study mechanism, we induce students to learn and
exchange different parts of course knowledge as collaborative groups. First,
each student is instantiated by random routing on a modular neural network,
which facilitates flexible knowledge communication between students through
random levels of representation sharing and branching. Second, to resist
student homogenization, students first compose diverse feature sets by
exploiting the inductive bias from sub-sets of training data, and then
aggregate and distill different complementary knowledge by imitating a random
sub-group of students at each time step. Overall, these mechanisms make it
possible to scale up the student population and further improve model
generalization without sacrificing computational efficiency. Empirical
evaluations on both image and text tasks indicate that our method significantly
outperforms various state-of-the-art collaborative approaches whilst enhancing
computational efficiency.
Comment: Accepted by AAAI 2021; camera-ready version
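To make the sub-group imitation concrete, here is a minimal sketch of how the idea could look in code: each student minimizes its own supervised loss plus a distillation term toward the averaged soft predictions of a randomly sampled sub-group of peers. The student architecture, group_size, temperature, and alpha weighting are illustrative assumptions, not the paper's configuration.

# A minimal sketch of sub-group distillation in a pool of jointly trained students.
# All names and hyperparameters here are illustrative assumptions.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class Student(nn.Module):
    def __init__(self, in_dim=32, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, num_classes))

    def forward(self, x):
        return self.net(x)

def collaborative_step(students, optimizers, x, y,
                       group_size=2, temperature=2.0, alpha=0.5):
    """One training step: supervised loss plus imitation of a random peer sub-group."""
    with torch.no_grad():
        peer_logits = [s(x) for s in students]   # peers' predictions used as fixed targets
    for i, (student, opt) in enumerate(zip(students, optimizers)):
        logits = student(x)
        ce = F.cross_entropy(logits, y)
        # Draw a random sub-group of other students and average their soft targets.
        group = random.sample([j for j in range(len(students)) if j != i], k=group_size)
        target = torch.stack([peer_logits[j] for j in group]).mean(dim=0)
        kl = F.kl_div(F.log_softmax(logits / temperature, dim=1),
                      F.softmax(target / temperature, dim=1),
                      reduction="batchmean") * temperature ** 2
        loss = ce + alpha * kl
        opt.zero_grad()
        loss.backward()
        opt.step()

In the paper each student also draws on a different sub-set of the training data to diversify its features; for brevity the sketch feeds every student the same batch.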
OvA-INN: Continual Learning with Invertible Neural Networks
In the field of Continual Learning, the objective is to learn several tasks
one after the other without access to the data from previous tasks. Several
solutions have been proposed to tackle this problem but they usually assume
that the user knows which task to perform at test time on a particular
sample, or they rely on small samples of previous data; most of them suffer from
a substantial drop in accuracy when updated with batches of only one class at a
time. In this article, we propose a new method, OvA-INN, which is able to learn
one class at a time and without storing any of the previous data. To achieve
this, for each class, we train a specific Invertible Neural Network to extract
the relevant features for computing the likelihood of that class. At test time, we
can predict the class of a sample by identifying the network which predicted
the highest likelihood. With this method, we show that we can take advantage of
pretrained models by stacking an Invertible Network on top of a feature
extractor. This way, we are able to outperform state-of-the-art approaches that
rely on feature learning for continual learning on the MNIST and CIFAR-100
datasets. In our experiments, we reach 72% accuracy on CIFAR-100 after training
our model one class at a time.
Comment: to be published in IJCNN 2020
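As an illustration of the one-vs-all likelihood scheme, the sketch below uses one small invertible network per class, built from RealNVP-style affine coupling layers: each network maps features to a standard Gaussian base distribution, and prediction picks the class whose network assigns the highest log-likelihood. The coupling architecture, depth, and hidden width are assumptions for illustration, not the exact OvA-INN configuration.

# Sketch of per-class likelihood scoring with a small invertible network.
# Architecture details are illustrative assumptions.
import math
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style coupling layer: invertible, with a tractable log-determinant."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.scale_shift = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)))

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.scale_shift(x1).chunk(2, dim=1)
        s = torch.tanh(s)                        # bounded scales keep training stable
        y2 = x2 * torch.exp(s) + t
        return torch.cat([x1, y2], dim=1), s.sum(dim=1)

class ClassINN(nn.Module):
    """One invertible network per class; log-likelihood under a standard Gaussian base."""
    def __init__(self, dim, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([AffineCoupling(dim) for _ in range(n_layers)])

    def log_likelihood(self, x):
        log_det = torch.zeros(x.size(0), device=x.device)
        for layer in self.layers:
            x, ld = layer(x)
            log_det = log_det + ld
            x = x.flip(dims=(1,))                # flip halves so every dimension gets transformed
        log_base = -0.5 * (x ** 2 + math.log(2 * math.pi)).sum(dim=1)
        return log_base + log_det

def predict(class_models, features):
    """Predict the class whose network assigns the features the highest likelihood."""
    scores = torch.stack([m.log_likelihood(features) for m in class_models], dim=1)
    return scores.argmax(dim=1)

Training would fit each ClassINN by maximizing log_likelihood on that class's features only, so adding a new class never touches the previously trained networks.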
iTAML: An Incremental Task-Agnostic Meta-learning Approach
Humans can continuously learn new knowledge as their experience grows. In
contrast, previously learned knowledge in deep neural networks can quickly fade
when they are trained on a new task. In this paper, we hypothesize that this
problem can be avoided by learning a set of generalized parameters that are neither
specific to old nor new tasks. In this pursuit, we introduce a novel
meta-learning approach that seeks to maintain an equilibrium between all the
encountered tasks. This is ensured by a new meta-update rule which avoids
catastrophic forgetting. In comparison to previous meta-learning techniques,
our approach is task-agnostic. When presented with a continuum of data, our
model automatically identifies the task and quickly adapts to it with just a
single update. We perform extensive experiments on five datasets in a
class-incremental setting, leading to significant improvements over
state-of-the-art methods (e.g., a 21.3% boost on CIFAR100 with 10 incremental tasks).
Specifically, on large-scale datasets that generally prove difficult cases for
incremental learning, our approach delivers absolute gains as high as 19.1% and
7.4% on the ImageNet and MS-Celeb datasets, respectively.
Comment: Accepted to CVPR 2020
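The abstract does not spell out the meta-update rule itself; as a rough illustration of keeping an equilibrium between tasks, the sketch below uses a Reptile-style outer step that adapts a copy of the model to each stored task and then moves the shared parameters toward the average of the adapted ones, so no single task dominates. The learning rates, step counts, and data handling are placeholder assumptions, not the paper's exact rule.

# Illustrative task-balanced meta-update (Reptile-style outer step).
# task_batches is assumed to hold one (inputs, labels) batch per encountered task.
import copy
import torch
import torch.nn.functional as F

def task_balanced_meta_update(model, task_batches, inner_lr=0.01, inner_steps=3, meta_lr=0.1):
    """One outer update that keeps an equilibrium across all encountered tasks."""
    base_params = [p.detach().clone() for p in model.parameters()]
    adapted = []
    for x, y in task_batches:
        # Inner loop: adapt a copy of the model to this task alone.
        task_model = copy.deepcopy(model)
        opt = torch.optim.SGD(task_model.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            loss = F.cross_entropy(task_model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        adapted.append([p.detach() for p in task_model.parameters()])
    # Outer loop: move the shared parameters toward the mean of the task-adapted ones,
    # averaging the per-task updates instead of letting the latest task win.
    with torch.no_grad():
        for i, p in enumerate(model.parameters()):
            mean_adapted = torch.stack([a[i] for a in adapted]).mean(dim=0)
            p.add_(meta_lr * (mean_adapted - base_params[i]))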
IF2Net: Innately Forgetting-Free Networks for Continual Learning
Continual learning can incrementally absorb new concepts without interfering
with previously learned knowledge. Motivated by the characteristics of neural
networks, in which information is stored in weights on connections, we
investigated how to design an Innately Forgetting-Free Network (IF2Net) for
the continual learning setting. This study proposed a straightforward yet effective
learning paradigm that keeps the weights relevant to each seen task
untouched before and after learning a new task. We first presented a novel
representation-level learning scheme on task sequences with random weights. This
technique tweaks the representations that drift because of randomization back to
their separate task-optimal working states, while the involved weights remain
frozen and reused (in contrast to the well-known layer-wise weight updates).
Then, sequential decision-making without forgetting can be
achieved by projecting the output weight updates into the parsimonious
orthogonal space, making the adaptations not disturb old knowledge while
maintaining model plasticity. IF2Net allows a single network to inherently
learn unlimited mapping rules without being told task identities at test time by
integrating the respective strengths of randomization and orthogonalization. We
validated the effectiveness of our approach through extensive theoretical
analysis and empirical studies.
Comment: 16 pages, 8 figures. Under review
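The orthogonal-projection step can be sketched as follows, assuming a simple linear output layer: weight updates are projected onto the subspace orthogonal to the representations seen on earlier tasks, so an update changes (approximately) nothing for old inputs while the layer stays plastic for new ones. This follows a generic OWM-style recipe rather than IF2Net's exact formulation, and the class name, learning rate, and regularizer alpha are illustrative assumptions.

# Sketch: project output-weight updates into the null space of earlier tasks' inputs.
import torch

class OrthogonalOutputLayer:
    """Linear read-out whose updates are projected to avoid disturbing old tasks."""
    def __init__(self, in_dim, out_dim, alpha=1e-3):
        self.W = torch.zeros(out_dim, in_dim)
        self.P = torch.eye(in_dim)        # projector onto the subspace orthogonal to past inputs
        self.past_inputs = []
        self.alpha = alpha
        self.in_dim = in_dim

    def update(self, x, target, lr=0.1):
        """One squared-error gradient step; the projected update leaves old outputs intact."""
        pred = x @ self.W.t()                              # (batch, out_dim)
        grad = (pred - target).t() @ x / x.size(0)         # dL/dW for 0.5 * ||pred - target||^2
        self.W -= lr * grad @ self.P                       # update rows lie in the null space of old inputs

    def finish_task(self, task_inputs):
        """After a task, shrink the projector to block directions spanned by its inputs."""
        self.past_inputs.append(task_inputs)
        M = torch.cat(self.past_inputs, dim=0)             # (n_old, in_dim)
        gram = M @ M.t() + self.alpha * torch.eye(M.size(0))
        self.P = torch.eye(self.in_dim) - M.t() @ torch.linalg.inv(gram) @ M

Because the projected update applied to any input from a finished task is approximately zero, old input-output mappings are preserved without replaying any past data.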