3 research outputs found
Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting
In lifelong learning systems, especially those based on artificial neural
networks, one of the biggest obstacles is the severe inability to retain old
knowledge as new information is encountered. This phenomenon is known as
catastrophic forgetting. In this article, we propose a new kind of
connectionist architecture, the Sequential Neural Coding Network, that is
robust to forgetting when learning from streams of data points and, unlike
networks of today, does not learn via the immensely popular back-propagation of
errors. Grounded in the neurocognitive theory of predictive processing, our
model adapts its synapses in a biologically-plausible fashion, while another,
complementary neural system rapidly learns to direct and control this
cortex-like structure by mimicking the task-executive control functionality of
the basal ganglia. In our experiments, we demonstrate that our self-organizing
system experiences significantly less forgetting than standard neural
models and outperforms a wide range of previously proposed methods even though
it is trained across task datasets in a stream-like fashion. The promising
performance of our complementary system on benchmarks, e.g., SplitMNIST, Split
Fashion MNIST, and Split NotMNIST, offers evidence that by incorporating
mechanisms prominent in real neuronal systems, such as competition, sparse
activation patterns, and iterative input processing, a new possibility for
tackling the grand challenge of lifelong machine learning opens up.
Comment: Key updates include results on standard benchmarks, e.g., Split MNIST, Split Fashion MNIST, and Split NotMNIST. The task-selection/basal ganglia model has been integrated.
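The backprop-free, error-driven learning this abstract describes can be illustrated with a minimal predictive-coding sketch. The single-layer setup, dimensions, and rates below are illustrative assumptions, not the paper's actual Sequential Neural Coding Network.

```python
import numpy as np

# Minimal predictive-coding sketch with local (non-backprop) updates.
# All sizes and rates here are illustrative assumptions.

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(10, 4))   # generative weights: latent -> input
x = rng.normal(size=10)                   # observed input pattern
z = np.zeros(4)                           # latent state

# Iterative inference: settle the latent state to reduce prediction error.
for _ in range(50):
    err = x - W @ z                       # prediction error at the input layer
    z += 0.1 * (W.T @ err)                # error-driven state update

# Local synaptic update: outer product of the residual error and the
# latent activity -- both quantities are available at the synapse.
W += 0.01 * np.outer(x - W @ z, z)
```

Note that inference settles the latent state before any weights change, and the weight update uses only locally available quantities (error and latent activity), which is the sense in which such learning rules are called biologically plausible.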
Class-incremental lifelong object learning for domestic robots
Traditionally, robots have been confined to settings where they operate in isolation and in highly
controlled and structured environments to execute well-defined non-varying tasks. As a result,
they usually operate without the need to perceive their surroundings or to adapt to changing
stimuli. However, as robots start to move towards human-centred environments and share the
physical space with people, there is an urgent need to endow them with the flexibility to learn
and adapt given the changing nature of the stimuli they receive and the evolving requirements
of their users. Standard machine learning is not suitable for these types of applications because
it operates under the assumption that data samples are independent and identically distributed,
and requires access to all the data in advance. If any of these assumptions is broken, the model
fails catastrophically, i.e., either it does not learn or it forgets all that was previously learned.
Therefore, different strategies are required to address this problem.
The focus of this thesis is on lifelong object learning, whereby a model is able to learn
from data that becomes available over time. In particular, we address the problem of class-incremental learning, with an emphasis on algorithms that can enable interactive learning with
a user. In class-incremental learning, models learn from sequential data batches, where each
batch ideally contains samples from a single class. The emphasis on interactive
learning capabilities poses additional requirements in terms of the speed with which model
updates are performed as well as how the interaction is handled.
The work presented in this thesis can be divided into two main lines of work. First,
we propose two versions of a lifelong learning algorithm composed of a feature extractor
based on pre-trained residual networks, an array of growing self-organising networks and a
classifier. Self-organising networks are able to adapt their structure based on the input data
distribution, and learn representative prototypes of the data. These prototypes can then be
used to train a classifier. The proposed approaches are evaluated on various benchmarks under
several conditions and the results show that they outperform competing approaches in each
case. Second, we propose a robot architecture to address lifelong object learning through
interactions with a human partner using natural language. The architecture consists of an
object segmentation, tracking and preprocessing pipeline, a dialogue system, and a learning
module based on the algorithm developed in the first part of the thesis. Finally, the thesis also
includes an exploration into the contributions that different preprocessing operations have on
performance when learning from both RGB and depth images.
James Watt Scholarship
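The first line of work, prototypes learned by growing self-organising networks feeding a classifier, can be illustrated with a hedged sketch. The growth threshold, update rate, and nearest-prototype classification rule below are simplifying assumptions; the thesis' algorithm (with a pre-trained residual feature extractor) is more elaborate.

```python
import numpy as np

# Hypothetical sketch of prototype-based class-incremental learning:
# per-class prototype sets grow when no existing prototype is close
# enough, and a nearest-prototype rule classifies new samples.

class PrototypeLearner:
    def __init__(self, grow_threshold=1.0, lr=0.1):
        self.protos = {}                  # class label -> list of prototypes
        self.grow_threshold = grow_threshold
        self.lr = lr

    def learn(self, x, label):
        protos = self.protos.setdefault(label, [])
        if not protos:
            protos.append(np.array(x, dtype=float))
            return
        dists = [np.linalg.norm(x - p) for p in protos]
        i = int(np.argmin(dists))
        if dists[i] > self.grow_threshold:
            protos.append(np.array(x, dtype=float))   # grow: add a prototype
        else:
            protos[i] += self.lr * (x - protos[i])    # adapt nearest prototype

    def predict(self, x):
        best_label, best_dist = None, np.inf
        for label, protos in self.protos.items():
            for p in protos:
                d = np.linalg.norm(x - p)
                if d < best_dist:
                    best_label, best_dist = label, d
        return best_label

model = PrototypeLearner()
# Samples arrive sequentially, class by class (class-incremental regime).
for x, y in [([0.0, 0.0], "a"), ([0.1, 0.0], "a"), ([3.0, 3.0], "b")]:
    model.learn(np.array(x), y)
```

Because each update touches only one prototype (or adds a new one), learning a new class never overwrites the prototypes of earlier classes, which is the basic mechanism by which such models resist catastrophic forgetting.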
Neurocomputational model for learning, memory consolidation and schemas
This thesis investigates how, through experience, the brain acquires and stores memories
and uses these to extract and modify knowledge. This question is studied
by both computational and experimental neuroscientists, as it is relevant not only for neuroscience
but also for artificial systems that need to develop knowledge about the world
from limited, sequential data. It is widely assumed that new memories are initially
stored in the hippocampus, and later are slowly reorganised into distributed cortical
networks that represent knowledge. This memory reorganisation is called systems consolidation.
In recent years, experimental studies have revealed complex hippocampal-neocortical
interactions that have blurred the lines between the two memory systems,
challenging the traditional understanding of memory processes. In particular, the prior
existence of cortical knowledge frameworks (also known as schemas) was found to
speed up learning and consolidation, which seemingly is at odds with previous models
of systems consolidation. However, the underlying mechanisms of this effect are not
known.
In this work, we present a computational framework to explore potential interactions
between the hippocampus, the prefrontal cortex, and associative cortical areas
during learning as well as during sleep. To model the associative cortical areas, where
the memories are gradually consolidated, we implemented an artificial neural network
(a Restricted Boltzmann Machine) to gain insight into potential neural mechanisms
of memory acquisition, recall, and consolidation.
We analyse the network’s properties using two tasks inspired by neuroscience experiments.
The network gradually builds a semantic schema in the associative cortical
areas through the consolidation of multiple related memories, a process promoted by
hippocampal-driven replay during sleep. To explain the experimental data we suggest
that, as the neocortical schema develops, the prefrontal cortex extracts characteristics
shared across multiple memories. We call this information meta-schema. In our model,
the semantic schema and meta-schema in the neocortex are used to compute consistency,
conflict and novelty signals. We propose that the prefrontal cortex uses these
signals to modulate memory formation in the hippocampus during learning, which in
turn influences consolidation during sleep replay.
Together, these results provide a theoretical framework to explain experimental findings
and produce predictions for hippocampal-neocortical interactions during learning
and systems consolidation.
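The cortical module the thesis describes, a Restricted Boltzmann Machine trained on replayed patterns, can be sketched with one-step contrastive divergence. The network sizes, learning rate, bias-free energy function, and the deterministic mean-field updates below (classic CD uses stochastic sampling) are all simplifying assumptions.

```python
import numpy as np

# Minimal Restricted Boltzmann Machine trained with one-step contrastive
# divergence (CD-1) on two overlapping "replayed" patterns, standing in
# for consolidation of related memories. Biases are omitted and hidden
# states are mean-field probabilities, both simplifications.

rng = np.random.default_rng(1)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

W = rng.normal(scale=0.1, size=(6, 3))    # visible-to-hidden weights

# Two related memories sharing most of their components.
patterns = np.array([[1, 0, 1, 1, 0, 1],
                     [1, 0, 1, 0, 1, 1]], dtype=float)

for step in range(200):
    v0 = patterns[step % len(patterns)]   # one replayed pattern per step
    h0 = sigmoid(v0 @ W)                  # hidden activation (positive phase)
    v1 = sigmoid(W @ h0)                  # reconstruction of the input
    h1 = sigmoid(v1 @ W)                  # hidden activation (negative phase)
    W += 0.1 * (np.outer(v0, h0) - np.outer(v1, h1))   # CD-1 update
```

Interleaving the two patterns during training plays the role of replay in the model: the weights come to encode the structure shared across memories rather than any single episode.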