Orthogonal Gradient Descent for Continual Learning
Neural networks are achieving state-of-the-art and sometimes super-human performance on learning tasks across a variety of domains. Whenever these problems require learning in a continual or sequential manner, however, neural networks suffer from catastrophic forgetting: they forget how to solve previous tasks after being trained on a new task, despite having the capacity to solve both if trained on them simultaneously. In this paper, we address this issue from a parameter-space perspective and study an approach that restricts the direction of gradient updates to avoid forgetting previously learned data. We present the Orthogonal Gradient Descent (OGD) method, which accomplishes this goal by projecting the gradients from new tasks onto a subspace in which the neural network's outputs on previous tasks do not change, while the projected gradient remains a useful direction for learning the new task. Our approach uses the high capacity of a neural network more efficiently and does not require storing previously learned data, which might raise privacy concerns. Experiments on common benchmarks demonstrate the effectiveness of the proposed OGD method.
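The projection at the heart of OGD can be sketched in a few lines. Below is a minimal NumPy illustration, not the authors' implementation: basis holds an orthonormal set spanning the output-gradient directions collected on previous tasks, and each new-task gradient is projected onto the orthogonal complement of that subspace before the optimizer step. The function names and the Gram-Schmidt bookkeeping are assumptions for illustration.

```python
import numpy as np

def ogd_project(grad, basis):
    """Project a new-task gradient onto the orthogonal complement of the
    protected subspace, so the update leaves previous-task outputs
    unchanged to first order."""
    g = grad.copy()
    for b in basis:              # basis vectors are assumed orthonormal
        g -= np.dot(g, b) * b    # remove the component along b
    return g

def extend_basis(basis, output_grad, eps=1e-10):
    """Gram-Schmidt step: after finishing a task, add the gradient of a
    model output on one of its samples to the orthonormal basis of
    directions to protect."""
    v = output_grad.copy()
    for b in basis:
        v -= np.dot(v, b) * b    # orthogonalize against the current basis
    norm = np.linalg.norm(v)
    if norm > eps:               # skip directions already in the span
        basis.append(v / norm)
    return basis
```

In use, each SGD gradient on the new task would pass through ogd_project before the parameter update, and extend_basis would be called on output gradients sampled from a task once training on it finishes.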
Embrace concept drift: a novel solution for online continual learning
Continual learning is a critical area of machine learning research that aims to enable models to learn new information without forgetting old knowledge. Online continual learning, in particular, addresses the challenge of learning from a stream of data in real-world environments where data can be unbounded and heterogeneous. Two main problems must be addressed in online continual learning: the first is catastrophic forgetting, a phenomenon in which the model forgets previously learned knowledge while learning new tasks; the second is concept drift, a situation in which the distribution of the data changes over time. These issues further complicate the learning process compared to traditional machine learning.
In this thesis, we propose a general framework for online continual learning that leverages both regularization-based and memory-based methods to mitigate catastrophic forgetting and handle concept drift. Specifically, we introduce a novel concept drift detection algorithm based on the confidence values of the samples. We present a novel online continual learning paradigm that uses concept drift as a rehearsal signal to improve performance by consolidating or expanding the memory center. We also apply data condensation approaches to online continual learning in order to perform memory-efficient rehearsal.
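The abstract does not spell out the detector, but the idea of flagging drift from sample confidence values can be illustrated as follows. This is a minimal sketch under assumed design choices (a sliding window of max-softmax confidences compared against a slowly tracked baseline); the class name, window size, and threshold are illustrative, not the thesis's actual algorithm.

```python
from collections import deque
import numpy as np

class ConfidenceDriftDetector:
    """Illustrative sketch: flag concept drift when mean prediction
    confidence over a sliding window drops well below a running baseline."""

    def __init__(self, window=200, drop_ratio=0.8):
        self.window = deque(maxlen=window)  # recent confidence values
        self.baseline = None                # running reference confidence
        self.drop_ratio = drop_ratio        # assumed alarm threshold

    def update(self, probs):
        """probs: softmax output for one sample; returns True on drift."""
        self.window.append(float(np.max(probs)))  # confidence = top prob
        if len(self.window) < self.window.maxlen:
            return False                    # wait until the window fills
        mean_conf = float(np.mean(self.window))
        if self.baseline is None:
            self.baseline = mean_conf       # initialize the reference
            return False
        drift = mean_conf < self.drop_ratio * self.baseline
        # track the baseline slowly so gradual change is absorbed
        self.baseline = 0.99 * self.baseline + 0.01 * mean_conf
        return drift
```

In the paradigm described above, a True return would serve as the rehearsal signal, triggering consolidation or expansion of the memory center.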
Furthermore, we evaluate accuracy on both old and new tasks, comparing against many benchmark models. We present a novel evaluation metric, Stability and Plasticity Balance, to measure the balance between old-task and new-task accuracy.
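The abstract does not give the formula for Stability and Plasticity Balance, so the sketch below is only one plausible instantiation: a harmonic mean of old-task accuracy (stability) and new-task accuracy (plasticity), which is high only when neither side is sacrificed. The exact definition in the thesis may differ.

```python
def stability_plasticity_balance(old_acc, new_acc):
    """Assumed form: harmonic mean of old-task and new-task accuracy.
    Rewards models that keep both high rather than trading one for
    the other."""
    if old_acc + new_acc == 0:
        return 0.0
    return 2 * old_acc * new_acc / (old_acc + new_acc)

# A model strong on new tasks but forgetful scores lower than a balanced one:
print(stability_plasticity_balance(0.30, 0.90))  # ~0.45
print(stability_plasticity_balance(0.60, 0.60))  # 0.60
```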
We evaluate our proposed approach on a new benchmark dataset framework, Continual Online Learning (COnL), which covers two scenarios of online continual learning: class-incremental learning and instance-incremental learning. In this thesis, the benchmark framework randomly selects a number of incremental classes from three different datasets: TinyImageNet, German Traffic Sign, and Landmarks. Our primary results demonstrate that concept drift can be a useful tool for memory rehearsal in the online continual learning setting. Our proposed approaches provide a promising direction for future research in online continual learning and have the potential to enable models to learn continuously from unbounded and heterogeneous data streams in real-world environments.
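As a concrete illustration of the class-incremental scenario just described, the sketch below randomly partitions a class pool into disjoint per-task subsets, in the spirit of COnL's random selection of incremental classes. The function name and parameters are assumptions for illustration, not COnL's actual interface.

```python
import random

def make_class_incremental_tasks(all_classes, num_tasks, classes_per_task, seed=0):
    """Randomly draw disjoint class subsets, one per incremental task."""
    rng = random.Random(seed)        # seeded for reproducible splits
    pool = list(all_classes)
    rng.shuffle(pool)
    needed = num_tasks * classes_per_task
    assert len(pool) >= needed, "not enough classes for the requested split"
    return [pool[i * classes_per_task:(i + 1) * classes_per_task]
            for i in range(num_tasks)]

# Example: 10 tasks of 5 classes each from a 200-class pool (TinyImageNet-sized)
tasks = make_class_incremental_tasks(range(200), num_tasks=10, classes_per_task=5)
```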