Sample Re-weighting for Replay-based Continual Learning in Neural Networks
Artificial intelligence empowered by neural networks (NNs) has dramatically improved state-of-the-art results across many application domains in recent years. Inspired by human intelligence, there has been growing interest in real-world learning settings that pose multiple challenges to learning agents. Continual learning (CL) is a learning paradigm in which an agent learns new tasks continually from a never-ending, non-stationary stream of data. A CL agent should be plastic enough to learn new tasks, yet stable enough not to forget knowledge acquired from previous tasks; in severe cases, such forgetting is called catastrophic forgetting. Replay-based CL methods try to retain performance by rehearsing stored raw inputs or generated samples from previous tasks. Another challenging situation for learning agents is achieving acceptable overall performance on the skewed data distributions of imbalanced datasets. In this thesis, for the first time in the literature, we analyze the performance of replay-based CL methods on imbalanced datasets in the class-incremental scenario. Moreover, we suggest a new application of an adaptive sample re-weighting strategy called Meta-Weight Net for replay-based CL methods. Meta-Weight Net trains a NN to estimate sample weights from their respective loss values, which defines an adaptive weighted loss function. Our experiments show that this strategy not only improves performance on imbalanced datasets but also helps on more complex datasets where the data is balanced.
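The adaptive weighting mechanism described above can be sketched in a few lines. This is a minimal NumPy illustration, not the thesis's implementation: the weight-net parameters (`w1`, `b1`, `w2`, `b2`) are hypothetical and randomly initialized, and the bilevel meta-update that Meta-Weight Net uses to train them on a clean meta set is omitted for brevity.

```python
import numpy as np

def weight_net(losses, w1, b1, w2, b2):
    """Tiny MLP (hypothetical parameters) mapping each sample's loss to a weight."""
    h = np.maximum(0.0, np.outer(losses, w1) + b1)   # hidden layer, ReLU
    logits = h @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))             # sigmoid bounds weights in (0, 1)

def weighted_loss(losses, weights):
    # Adaptive weighted loss: re-weight each sample's loss before averaging.
    return float(np.mean(weights * losses))

rng = np.random.default_rng(0)
losses = rng.uniform(0.1, 2.0, size=8)               # per-sample losses (replay + new data)
w1, b1 = rng.normal(size=4), np.zeros(4)
w2, b2 = rng.normal(size=4), 0.0
w = weight_net(losses, w1, b1, w2, b2)
print(round(weighted_loss(losses, w), 4))
```

Because the sigmoid keeps every weight in (0, 1), the weighted loss never exceeds the largest raw sample loss; the meta-learned weight net is what decides which samples (e.g. minority-class ones) get emphasized.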
A Neural Network Model of Continual Learning with Cognitive Control
Neural networks struggle in continual learning settings due to catastrophic
forgetting: when trials are blocked, new learning can overwrite the learning
from previous blocks. Humans learn effectively in these settings, in some cases
even showing an advantage of blocking, suggesting the brain contains mechanisms
to overcome this problem. Here, we build on previous work and show that neural
networks equipped with a mechanism for cognitive control do not exhibit
catastrophic forgetting when trials are blocked. We further show an advantage
of blocking over interleaving when there is a bias for active maintenance in
the control signal, implying a tradeoff between maintenance and the strength of
control. Analyses of map-like representations learned by the networks provided
additional insights into these mechanisms. Our work highlights the potential of
cognitive control to aid continual learning in neural networks, and offers an
explanation for the advantage of blocking that has been observed in humans.
Comment: 7 pages, 5 figures; accepted as a talk at CogSci 2022
(https://escholarship.org/uc/item/3gn3w58z)
Organizing recurrent network dynamics by task-computation to enable continual learning
Biological systems face dynamic environments that require continual learning. It is not well understood how these systems balance the tension between flexibility for learning and robustness for memory of previous behaviors. Continual learning without catastrophic interference also remains a challenging problem in machine learning. Here, we develop a novel learning rule designed to minimize interference between sequentially learned tasks in recurrent networks. Our learning rule preserves network dynamics within activity-defined subspaces used for previously learned tasks. It encourages dynamics associated with new tasks that might otherwise interfere to instead explore orthogonal subspaces, and it allows for reuse of previously established dynamical motifs where possible. Employing a set of tasks used in neuroscience, we demonstrate that our approach successfully eliminates catastrophic interference and offers a substantial improvement over previous continual learning algorithms. Using dynamical systems analysis, we show that networks trained using our approach can reuse similar dynamical structures across similar tasks. This possibility for shared computation allows for faster learning during sequential training. Finally, we identify organizational differences that emerge when training tasks sequentially versus simultaneously.
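The abstract does not give the learning rule itself, so the following is a generic sketch of one plausible ingredient of such subspace-preserving rules: projecting parameter updates out of the subspace spanned by previous-task activity, so new learning explores orthogonal directions. The function names and the SVD-based choice of basis are assumptions, not the paper's method.

```python
import numpy as np

def orthogonal_basis(activity, k):
    """Top-k principal directions of previous-task activity (via SVD)."""
    u, _, _ = np.linalg.svd(activity, full_matrices=False)
    return u[:, :k]

def project_out(grad, basis):
    # Remove the component of the update lying in the protected subspace,
    # so new learning cannot disturb dynamics used by earlier tasks.
    return grad - basis @ (basis.T @ grad)

rng = np.random.default_rng(1)
activity = rng.normal(size=(10, 50))   # hypothetical recurrent activity: 10 units x 50 timesteps
U = orthogonal_basis(activity, k=3)    # activity-defined subspace of a previous task
g = rng.normal(size=10)                # raw gradient for a new task
g_safe = project_out(g, U)             # update confined to the orthogonal complement
```

By construction `g_safe` has zero component inside the protected subspace, which is the sense in which dynamics for old tasks are preserved while new tasks are pushed into orthogonal directions.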
Grounding for Artificial Intelligence
A core function of intelligence is grounding: the process of connecting
natural language and abstract knowledge to the internal representation of
the real world in an intelligent being, e.g., a human. Human
cognition is grounded in our sensorimotor experiences in the external world and
subjective feelings in our internal world. We use languages to communicate with
each other, and these languages are grounded in our shared sensorimotor
experiences and feelings. Without this shared grounding, it is impossible for us
to understand each other because all natural languages are highly abstract and
are only able to describe a tiny portion of what has happened or is happening
in the real world. Although grounding at high or abstract levels has been
studied in different fields and applications, to our knowledge, limited
systematic work at fine-grained levels has been done. With the rapid progress
of large language models (LLMs), it is imperative that we have a sound
understanding of grounding in order to move to the next level of intelligence.
It is also believed that grounding is necessary for Artificial General
Intelligence (AGI). This paper attempts to study this problem systematically.
CLVOS23: A Long Video Object Segmentation Dataset for Continual Learning
Continual learning in real-world scenarios is a major challenge. A general
continual learning model should have a constant memory size and no predefined
task boundaries, as is the case in semi-supervised Video Object Segmentation
(VOS), where continual learning challenges are particularly pronounced when
working on long video sequences. In this article, we first formulate the
problem of semi-supervised VOS, specifically online VOS, as a continual
learning problem; second, we provide a public VOS dataset, CLVOS23,
focusing on continual learning. Finally, we propose and implement a
regularization-based continual learning approach on LWL, an existing online VOS
baseline, to demonstrate the efficacy of continual learning when applied to
online VOS and to establish a CLVOS23 baseline. We apply the proposed baseline
to the Long Videos dataset as well as to two short video VOS datasets, DAVIS16
and DAVIS17. To the best of our knowledge, this is the first time that VOS has
been defined and addressed as a continual learning problem.
Latent Replay for Real-Time Continual Learning
Training deep neural networks at the edge, on lightweight computational devices,
embedded systems, and robotic platforms, remains very challenging. Continual
learning techniques, where complex models are incrementally trained on small
batches of new data, can make the learning problem tractable even for CPU-only
embedded devices, enabling remarkable levels of adaptiveness and autonomy.
However, a number of practical problems need to be solved, catastrophic
forgetting first among them. In this paper we introduce an original
technique named "Latent Replay" where, instead of storing a portion of past
data in the input space, we store activation volumes at some intermediate
layer. This can significantly reduce the computation and storage required by
native rehearsal. To keep the representation stable and the stored activations
valid, we propose to slow down learning at all the layers below the latent
replay one, leaving the layers above free to learn at full pace. In our
experiments we show that Latent Replay, combined with existing continual
learning techniques, achieves state-of-the-art performance on complex video
benchmarks such as CORe50 NICv2 (with nearly 400 small and highly non-i.i.d.
batches) and OpenLORIS. Finally, we demonstrate the feasibility of nearly
real-time continual learning on the edge through the deployment of the proposed
technique on a smartphone device.
Comment: Pre-print v3: 13 pages, 9 figures, 10 tables, 1 algorithm
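The core idea of latent replay, storing intermediate-layer activations rather than raw inputs and replaying them through only the upper layers, can be sketched as follows. This is a toy NumPy illustration with hypothetical layer shapes, not the paper's implementation; the described slow-down of the lower layers is approximated here by simply freezing them.

```python
import numpy as np

def forward_lower(x, w_low):
    # Lower layers, up to the latent replay layer (frozen/slowed in training).
    return np.maximum(0.0, x @ w_low)

def forward_upper(h, w_up):
    # Upper layers keep learning at full pace on both new and replayed latents.
    return h @ w_up

rng = np.random.default_rng(2)
w_low = rng.normal(size=(16, 8))
w_up = rng.normal(size=(8, 3))

# Store latent activations (not raw inputs) from past minibatches:
# an (5, 8) buffer instead of (5, 16) raw samples.
past_inputs = rng.normal(size=(5, 16))
latent_buffer = forward_lower(past_inputs, w_low)

# Training step on new data: fresh latents are concatenated with replayed
# ones, and only the upper layers process the combined batch.
new_latents = forward_lower(rng.normal(size=(4, 16)), w_low)
batch = np.concatenate([new_latents, latent_buffer])
logits = forward_upper(batch, w_up)
print(logits.shape)  # (9, 3)
```

Because replayed samples skip the lower layers entirely, each rehearsal step costs only the upper-layer forward/backward pass, which is what makes near-real-time continual learning feasible on CPU-only edge devices.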