Facilitating Bayesian Continual Learning by Natural Gradients and Stein Gradients
Continual learning aims to enable machine learning models to learn a general
solution space for past and future tasks in a sequential manner. Conventional
models tend to forget the knowledge of previous tasks while learning a new
task, a phenomenon known as catastrophic forgetting. When using Bayesian models
in continual learning, knowledge from previous tasks can be retained in two
ways: (1) posterior distributions over the parameters, which contain the
knowledge gained from inference on previous tasks and then serve as the priors
for the following task; and (2) coresets, which contain knowledge of the data
distributions of previous tasks. Here, we show that Bayesian continual learning
can be facilitated through these two means by using natural gradients and Stein
gradients, respectively.
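The first retention mechanism the abstract describes can be illustrated with a minimal sketch: the posterior obtained after one task becomes the prior for the next. This toy example uses a conjugate Gaussian model with known noise variance rather than the paper's Bayesian neural networks; all names and numbers here are illustrative assumptions.

```python
import numpy as np

def update(prior_mu, prior_var, data, noise_var=1.0):
    """Conjugate Gaussian update for an unknown mean with known noise
    variance: returns the posterior mean and variance."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mu = post_var * (prior_mu / prior_var + np.sum(data) / noise_var)
    return post_mu, post_var

rng = np.random.default_rng(0)
mu, var = 0.0, 10.0          # broad initial prior
for task in range(3):        # three sequential "tasks" from the same source
    data = rng.normal(2.0, 1.0, size=50)
    # The posterior after task t is reused as the prior for task t+1,
    # so knowledge accumulates instead of being overwritten.
    mu, var = update(mu, var, data)
    print(f"task {task}: posterior mean={mu:.3f}, var={var:.5f}")
```

With each task the posterior variance shrinks and the mean concentrates near the true value, which is the sense in which the prior "retains" earlier knowledge; the paper's contribution concerns making such updates efficient for neural networks via natural gradients, and summarizing past data via Stein-gradient coresets.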
Class-incremental learning: survey and performance evaluation
For future learning systems, incremental learning is desirable because it
allows for: efficient resource usage by eliminating the need to retrain from
scratch at the arrival of new data; reduced memory usage by preventing or
limiting the amount of data required to be stored -- also important when
privacy limitations are imposed; and learning that more closely resembles human
learning. The main challenge for incremental learning is catastrophic
forgetting, which refers to the precipitous drop in performance on previously
learned tasks after learning a new one. Incremental learning of deep neural
networks has seen explosive growth in recent years. Initial work focused on
task incremental learning, where a task-ID is provided at inference time.
Recently we have seen a shift towards class-incremental learning where the
learner must classify at inference time between all classes seen in previous
tasks without recourse to a task-ID. In this paper, we provide a complete
survey of existing methods for incremental learning, and in particular we
perform an extensive experimental evaluation on twelve class-incremental
methods. We consider several new experimental scenarios, including a comparison
of class-incremental methods on multiple large-scale datasets, investigation
into small and large domain shifts, and comparison on various network
architectures.
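The inference-time distinction the survey draws between task-incremental and class-incremental learning can be sketched as follows. A model outputs one logit per class seen so far, and tasks contribute disjoint groups of classes; the logit values and class groupings here are illustrative assumptions, not from the paper.

```python
import numpy as np

# Logits for 6 classes learned across 3 tasks of 2 classes each.
logits = np.array([2.0, 0.5, 1.0, 3.5, 0.2, 1.8])
task_classes = [[0, 1], [2, 3], [4, 5]]

def task_incremental_predict(logits, task_id):
    # Task-ID is given at inference time: the model only has to
    # discriminate among that task's own classes.
    cls = task_classes[task_id]
    return cls[int(np.argmax(logits[cls]))]

def class_incremental_predict(logits):
    # No task-ID: the model must pick among ALL classes seen so far,
    # which is the harder setting the survey focuses on.
    return int(np.argmax(logits))

print(task_incremental_predict(logits, 0))  # predicts within classes {0, 1}
print(class_incremental_predict(logits))    # predicts over all 6 classes
```

Note that with these logits the task-incremental prediction for task 0 (class 0) differs from the class-incremental prediction (class 3): restricting the decision to a known task's classes hides cross-task confusion, which is why class-incremental evaluation is the more demanding benchmark.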