
    Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

    In lifelong learning systems, especially those based on artificial neural networks, one of the biggest obstacles is the severe inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this article, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, which is robust to forgetting when learning from streams of data points and, unlike today's networks, does not learn via the immensely popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts its synapses in a biologically plausible fashion, while another, complementary neural system rapidly learns to direct and control this cortex-like structure by mimicking the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting than standard neural models and outperforms a wide swath of previously proposed methods, even though it is trained across task datasets in a stream-like fashion. The promising performance of our complementary system on benchmarks, e.g., Split MNIST, Split Fashion MNIST, and Split NotMNIST, offers evidence that incorporating mechanisms prominent in real neuronal systems, such as competition, sparse activation patterns, and iterative input processing, opens up a new possibility for tackling the grand challenge of lifelong machine learning.

    Comment: Key updates, including results on standard benchmarks, e.g., Split MNIST/Fashion MNIST/NotMNIST. The task-selection/basal ganglia model has been integrated.
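
    The abstract does not spell out the model's update equations, so the following is only a minimal sketch of the generic predictive-coding loop such work builds on: a latent state is iteratively settled to reduce prediction error, and the weights then receive a local, error-driven update. The function names, layer sizes, and learning rates are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-layer predictive-coding loop (assumed sizes and rates,
# not the paper's model): a latent state z predicts the input x through
# weights W; inference descends the prediction error, then W receives a
# local, Hebbian-style update driven by that same error.
n_in, n_latent = 784, 128
W = rng.normal(0.0, 0.05, size=(n_in, n_latent))

def settle(x, W, steps=20, lr_z=0.1):
    """Iteratively infer the latent state that best predicts x."""
    z = np.zeros(n_latent)
    for _ in range(steps):
        err = x - W @ z          # prediction error at the input layer
        z += lr_z * (W.T @ err)  # move z so the prediction improves
    return z, x - W @ z

def local_update(W, z, err, lr_w=0.01):
    """Local weight change: outer product of error and latent activity."""
    return W + lr_w * np.outer(err, z)

x = rng.random(n_in)  # stand-in for one input sample
z, err = settle(x, W)
W = local_update(W, z, err)
```

    The paper's anti-forgetting machinery (competition, sparse activations, basal ganglia-style task routing) would sit on top of such local updates; this sketch does not attempt any of it.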

    Class-incremental lifelong object learning for domestic robots

    Traditionally, robots have been confined to settings where they operate in isolation, in highly controlled and structured environments, to execute well-defined, non-varying tasks. As a result, they usually operate without the need to perceive their surroundings or to adapt to changing stimuli. However, as robots move into human-centred environments and share physical space with people, there is an urgent need to endow them with the flexibility to learn and adapt, given the changing nature of the stimuli they receive and the evolving requirements of their users. Standard machine learning is not suitable for these applications because it assumes that data samples are independent and identically distributed, and it requires access to all the data in advance. If either of these assumptions is broken, the model fails catastrophically: either it does not learn, or it forgets all that was previously learned. Therefore, different strategies are required to address this problem. The focus of this thesis is lifelong object learning, whereby a model learns from data that becomes available over time. In particular, we address the problem of class-incremental learning, with an emphasis on algorithms that can enable interactive learning with a user. In class-incremental learning, models learn from sequential data batches, where each batch ideally contains samples from a single class. The emphasis on interactive learning poses additional requirements on the speed with which model updates are performed, as well as on how the interaction is handled. The work presented in this thesis divides into two main lines. First, we propose two versions of a lifelong learning algorithm composed of a feature extractor based on pre-trained residual networks, an array of growing self-organising networks, and a classifier. Self-organising networks adapt their structure to the input data distribution and learn representative prototypes of the data; these prototypes can then be used to train a classifier. The proposed approaches are evaluated on various benchmarks under several conditions, and the results show that they outperform competing approaches in each case. Second, we propose a robot architecture that addresses lifelong object learning through interactions with a human partner using natural language. The architecture consists of an object segmentation, tracking, and preprocessing pipeline, a dialogue system, and a learning module based on the algorithm developed in the first part of the thesis. Finally, the thesis includes an exploration of how different preprocessing operations affect performance when learning from both RGB and depth images.
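
    The abstract names the building blocks (a frozen pre-trained feature extractor, growing self-organising networks, a prototype-based classifier) without giving their algorithms. The sketch below is therefore a hypothetical, much-simplified prototype learner in that spirit: a same-class prototype within a fixed radius is adapted towards the sample; otherwise the model grows a new prototype. The class name, growth rule, and all thresholds are assumptions, not the thesis's method.

```python
import numpy as np

class PrototypeLearner:
    """Toy class-incremental prototype learner (illustrative only)."""

    def __init__(self, radius=0.5, lr=0.1):
        self.radius, self.lr = radius, lr
        self.protos, self.labels = [], []

    def partial_fit(self, feat, label):
        # Distance to every existing prototype of the same class.
        dists = [np.linalg.norm(feat - p) if l == label else np.inf
                 for p, l in zip(self.protos, self.labels)]
        if dists and min(dists) < self.radius:
            i = int(np.argmin(dists))
            self.protos[i] += self.lr * (feat - self.protos[i])  # adapt winner
        else:
            self.protos.append(feat.copy())  # grow: no close same-class node
            self.labels.append(label)

    def predict(self, feat):
        d = [np.linalg.norm(feat - p) for p in self.protos]
        return self.labels[int(np.argmin(d))]

# Features would come from a frozen pre-trained network, as in the thesis;
# random vectors stand in for them here.
rng = np.random.default_rng(1)
learner = PrototypeLearner()
for f, y in zip(rng.random((6, 32)), [0, 0, 1, 1, 2, 2]):
    learner.partial_fit(f, y)
print(learner.predict(rng.random(32)))
```

    Because each update touches a single prototype (or adds one), a step is cheap enough for the interactive, user-in-the-loop setting the thesis targets.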

    Neurocomputational model for learning, memory consolidation and schemas

    This thesis investigates how, through experience, the brain acquires and stores memories and uses these to extract and modify knowledge. This question is being studied by both computational and experimental neuroscientists, as it is of relevance not only for neuroscience but also for artificial systems that need to develop knowledge about the world from limited, sequential data. It is widely assumed that new memories are initially stored in the hippocampus and are later slowly reorganised into distributed cortical networks that represent knowledge; this memory reorganisation is called systems consolidation. In recent years, experimental studies have revealed complex hippocampal-neocortical interactions that have blurred the lines between the two memory systems, challenging the traditional understanding of memory processes. In particular, the prior existence of cortical knowledge frameworks (also known as schemas) was found to speed up learning and consolidation, which is seemingly at odds with previous models of systems consolidation; the underlying mechanisms of this effect are not known. In this work, we present a computational framework to explore potential interactions between the hippocampus, the prefrontal cortex, and associative cortical areas during learning as well as during sleep. To model the associative cortical areas, where memories are gradually consolidated, we implemented an artificial neural network (a Restricted Boltzmann Machine) to gain insight into potential neural mechanisms of memory acquisition, recall, and consolidation. We analyse the network’s properties using two tasks inspired by neuroscience experiments. The network gradually built a semantic schema in the associative cortical areas through the consolidation of multiple related memories, a process promoted by hippocampal-driven replay during sleep. To explain the experimental data, we suggest that, as the neocortical schema develops, the prefrontal cortex extracts characteristics shared across multiple memories; we call this information a meta-schema. In our model, the semantic schema and meta-schema in the neocortex are used to compute consistency, conflict, and novelty signals. We propose that the prefrontal cortex uses these signals to modulate memory formation in the hippocampus during learning, which in turn influences consolidation during sleep replay. Together, these results provide a theoretical framework to explain experimental findings and produce predictions for hippocampal-neocortical interactions during learning and systems consolidation.
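
    The abstract identifies the cortical module as a Restricted Boltzmann Machine but gives no training details. Below is a minimal sketch of one CD-1 (contrastive divergence) update for a Bernoulli RBM, the standard way such networks are trained. Layer sizes, the learning rate, and the stand-in binary pattern are assumptions, and the thesis's replay and consolidation schedule is not modelled here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# One CD-1 training step for a Bernoulli RBM. Sizes are assumed.
n_vis, n_hid = 100, 64
W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

def cd1_step(v0, lr=0.05):
    global W, b_v, b_h
    ph0 = sigmoid(v0 @ W + b_h)           # hidden probabilities given data
    h0 = (rng.random(n_hid) < ph0) * 1.0  # sample a hidden state
    pv1 = sigmoid(h0 @ W.T + b_v)         # reconstruct the visible layer
    ph1 = sigmoid(pv1 @ W + b_h)          # hidden probs given reconstruction
    # Positive (data) minus negative (reconstruction) statistics.
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    b_v += lr * (v0 - pv1)
    b_h += lr * (ph0 - ph1)

v = (rng.random(n_vis) < 0.5) * 1.0  # stand-in binary "memory" pattern
cd1_step(v)
```

    In the thesis's framework, repeated presentations of related patterns, interleaved via hippocampal-driven replay during sleep, would gradually shape such a network's weights into a semantic schema.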