Frosting Weights for Better Continual Training
Training a neural network model can be a lifelong learning process and is a
computationally intensive one. A severe adverse effect in deep neural
network models is catastrophic forgetting during retraining on new data.
To avoid such disruptions in continual learning, one appealing property is
the additive nature of ensemble models. In this paper, we propose two
generic ensemble approaches, gradient boosting and meta-learning, to solve
the catastrophic forgetting problem when tuning pre-trained neural network
models.
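The additive idea above can be sketched in a few lines: keep the pre-trained base model frozen so its old-task behaviour cannot degrade, and fit a small correction model on the residuals of the new data, in the spirit of gradient boosting with squared loss. This is an illustrative sketch, not the paper's implementation; all variable names and the linear stand-in models are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen base model: a fixed linear map standing in for a pre-trained network.
W_base = rng.normal(size=(3, 2))

def base_predict(x):
    return x @ W_base  # never retrained, so old-task behaviour is preserved

# Data for the new task.
X_new = rng.normal(size=(50, 3))
y_new = rng.normal(size=(50, 2))

# Boosting step: fit an additive correction to the base model's residuals
# (with squared loss, gradient boosting reduces to residual fitting).
residual = y_new - base_predict(X_new)
W_frost, *_ = np.linalg.lstsq(X_new, residual, rcond=None)

def ensemble_predict(x):
    return base_predict(x) + x @ W_frost  # additive ensemble

# The ensemble fits the new data at least as well as the frozen base alone.
err_base = np.mean((base_predict(X_new) - y_new) ** 2)
err_ens = np.mean((ensemble_predict(X_new) - y_new) ** 2)
```

Because only the correction is trained, the base model's outputs on old data are untouched by construction, which is the appeal of the additive ensemble noted in the abstract.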
Overcoming Catastrophic Forgetting by Neuron-level Plasticity Control
To address the issue of catastrophic forgetting in neural networks, we
propose a novel, simple, and effective solution called neuron-level plasticity
control (NPC). While learning a new task, the proposed method preserves the
knowledge for the previous tasks by controlling the plasticity of the network
at the neuron level. NPC estimates the importance value of each neuron and
consolidates important neurons by applying lower learning rates,
rather than restricting individual connection weights to stay close to certain
values. The experimental results on the incremental MNIST (iMNIST) and
incremental CIFAR100 (iCIFAR100) datasets show that neuron-level consolidation
is substantially more effective compared to the connection-level consolidation
approaches.
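The mechanism described — estimate a per-neuron importance value, then lower the learning rate of whole neurons rather than clamping individual weights — can be sketched as follows. The importance proxy (mean absolute activation on old-task data) and the learning-rate scaling rule are illustrative assumptions, not the paper's exact formulas.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.normal(size=(5, 4))                    # one layer: 5 neurons, 4 inputs
X_old = rng.normal(size=(100, 4))              # data from previous tasks

# 1) Estimate each neuron's importance, here as its mean absolute
#    ReLU activation on old-task data (an illustrative proxy).
activations = np.maximum(X_old @ W.T, 0.0)     # shape (100, 5)
importance = activations.mean(axis=0)          # one value per neuron

# 2) Consolidate important neurons with lower per-neuron learning rates.
base_lr = 0.1
lr_per_neuron = base_lr / (1.0 + 10.0 * importance)

# 3) A gradient step on a new task: the entire incoming weight row of an
#    important neuron moves less, instead of restricting single weights
#    to stay near reference values.
grad = rng.normal(size=W.shape)                # stand-in new-task gradient
W_new = W - lr_per_neuron[:, None] * grad
```

The key contrast with connection-level consolidation (e.g. quadratic penalties on individual weights) is that the plasticity control acts on the unit of a neuron: all of its incoming weights share one effective learning rate.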
Lifelong Learning of Spatiotemporal Representations with Dual-Memory Recurrent Self-Organization
Artificial autonomous agents and robots interacting in complex environments
are required to continually acquire and fine-tune knowledge over sustained
periods of time. The ability to learn from continuous streams of information is
referred to as lifelong learning and represents a long-standing challenge for
neural network models due to catastrophic forgetting. Computational models of
lifelong learning typically alleviate catastrophic forgetting in experimental
scenarios with given datasets of static images and limited complexity, thereby
differing significantly from the conditions artificial agents are exposed to.
In more natural settings, sequential information may become progressively
available over time and access to previous experience may be restricted. In
this paper, we propose a dual-memory self-organizing architecture for lifelong
learning scenarios. The architecture comprises two growing recurrent networks
with the complementary tasks of learning object instances (episodic memory) and
categories (semantic memory). Both growing networks can expand in response to
novel sensory experience: the episodic memory learns fine-grained
spatiotemporal representations of object instances in an unsupervised fashion
while the semantic memory uses task-relevant signals to regulate structural
plasticity levels and develop more compact representations from episodic
experience. For the consolidation of knowledge in the absence of external
sensory input, the episodic memory periodically replays trajectories of neural
reactivations. We evaluate the proposed model on the CORe50 benchmark dataset
for continuous object recognition, showing that we significantly outperform
current methods of lifelong learning in three different incremental learning
scenarios.
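The consolidation step above — periodic replay of stored trajectories in the absence of external input — can be sketched with a bounded episodic buffer whose samples are interleaved with incoming data. Class and method names are illustrative, not the paper's architecture; the growing recurrent networks themselves are omitted.

```python
import random
from collections import deque

class EpisodicMemory:
    """Bounded store of trajectories; oldest entries are evicted first."""

    def __init__(self, capacity=100, seed=0):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, trajectory):
        self.buffer.append(trajectory)          # trajectory = list of samples

    def replay(self, n):
        # Replay stored trajectories without any external sensory input.
        k = min(n, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

def training_stream(memory, new_trajectories, replay_per_step=2):
    # Interleave each new trajectory with replayed old ones, so the
    # downstream (semantic) learner keeps revisiting past experience.
    for traj in new_trajectories:
        yield traj, memory.replay(replay_per_step)
        memory.store(traj)

memory = EpisodicMemory(capacity=3)
stream = list(training_stream(memory, [["a1"], ["b1"], ["c1"], ["d1"]]))
```

Here the buffer's fixed capacity plays the role of restricted access to previous experience: once full, adding a new trajectory evicts the oldest one, and replay draws only from what remains.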