
    A Complementary Learning Systems approach to Temporal Difference Learning

    Complementary Learning Systems (CLS) theory suggests that the brain uses a 'neocortical' and a 'hippocampal' learning system to achieve complex behavior. These two systems are complementary in that the 'neocortical' system relies on slow learning of distributed representations while the 'hippocampal' system relies on fast learning of pattern-separated representations. Both of these systems project to the striatum, a key neural structure in the brain's implementation of Reinforcement Learning (RL). Current deep RL approaches share similarities with a 'neocortical' system because they slowly learn distributed representations through backpropagation in Deep Neural Networks (DNNs). An ongoing criticism of such approaches is that they are data inefficient and lack flexibility. CLS theory suggests that the addition of a 'hippocampal' system could address these criticisms. In the present study we propose a novel algorithm, Complementary Temporal Difference Learning (CTDL), which combines a DNN with a Self-Organising Map (SOM) to obtain the benefits of both a 'neocortical' and a 'hippocampal' system. Key features of CTDL include the use of Temporal Difference (TD) error to update the SOM and the combination of the SOM and DNN to calculate action values. We evaluate CTDL on Grid World, Cart-Pole and Continuous Mountain Car tasks and show several benefits over the classic Deep Q-Network (DQN) approach. These results demonstrate (1) the utility of complementary learning systems for the evaluation of actions, (2) that the TD error signal is a useful form of communication between the two systems, and (3) that our approach extends to both discrete and continuous state and action spaces.
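
    The abstract names CTDL's ingredients but not its exact equations, so the sketch below is only a minimal illustration of the stated ideas: a SOM whose prototype updates are gated by TD error, and action values formed by mixing a value stored in the SOM with a DNN's estimate. The grid size, learning rates, gating rule and mixing weight are all illustrative assumptions, not the published algorithm.

```python
import numpy as np

class SOMValueMemory:
    """Minimal 'hippocampal' memory: a Self-Organising Map whose units store
    state prototypes plus a value estimate. All hyperparameters and update
    rules here are assumptions for illustration only."""

    def __init__(self, n_units, state_dim, lr=0.1, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.prototypes = rng.normal(scale=0.1, size=(n_units, state_dim))
        self.values = np.zeros(n_units)
        self.lr = lr
        self.sigma = sigma

    def best_matching_unit(self, state):
        # Distance from the state to every stored prototype.
        dists = np.linalg.norm(self.prototypes - state, axis=1)
        return int(np.argmin(dists)), dists

    def update(self, state, value_target, td_error):
        """Assumed rule: the magnitude of the TD error gates how strongly the
        map moves toward a surprising state and stores its value target."""
        bmu, dists = self.best_matching_unit(state)
        influence = np.exp(-dists ** 2 / (2 * self.sigma ** 2))
        gate = min(abs(td_error), 1.0)  # larger surprises move the map more
        self.prototypes += self.lr * gate * influence[:, None] * (state - self.prototypes)
        self.values[bmu] += self.lr * (value_target - self.values[bmu])


def combined_value(som, dnn_value, state):
    """Assumed mixing scheme: trust the SOM's stored value near a matching
    prototype, and fall back to the DNN's estimate elsewhere."""
    bmu, dists = som.best_matching_unit(state)
    w = np.exp(-dists[bmu] ** 2)  # confidence that a stored memory applies
    return w * som.values[bmu] + (1.0 - w) * dnn_value
```

    Under this assumed scheme the memory dominates near states it has already captured (fast, instance-based learning), while the DNN provides the default estimate elsewhere (slow, distributed learning), mirroring the CLS division of labour described in the abstract.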

    Understanding efficient reinforcement learning in humans and machines

    One of the primary mechanisms thought to underlie action selection in the brain is Reinforcement Learning (RL). Recently, the use of Deep Neural Networks in models of RL (Deep RL) has led to human-level performance on complex reward-driven perceptual-motor tasks. However, Deep RL is persistently criticised for being data inefficient compared to human learning because it lacks the ability to: (1) rapidly learn from new information and (2) transfer knowledge from past experiences. The purpose of this thesis is to form an analogy between the brain and Deep RL in order to understand how the brain performs these two processes. To investigate the internal computations supporting rapid learning and transfer we use Complementary Learning Systems (CLS) theory, which allows us to focus on the computational properties of key learning systems in the brain and their interactions. We review recent advances in Deep RL and how they relate to the CLS framework. This results in the presentation of two novel Deep RL algorithms, which highlight key properties of the brain that support rapid learning and transfer: the fast learning of pattern-separated representations in the hippocampus, and the selective attention mechanisms of the prefrontal cortex. External factors in the environment can also affect rapid learning and transfer in the brain. We therefore conduct behavioural experiments that investigate how the degree of perceptual similarity between consecutive experiences affects people's ability to transfer knowledge. To do this we use naturalistic 2D video games that vary in perceptual features but rely on the same underlying rules. We discuss the results of these experiments with respect to Deep RL, analogical reasoning and category learning. We hope that the analogy formed over the course of this thesis between the brain and Deep RL can inform future research into efficient RL in humans and machines.
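
    As a rough illustration of the rapid-learning property attributed to the hippocampus, the sketch below shows one-shot episodic storage with nearest-neighbour value lookup, in contrast to a DNN's slow gradient-based learning. The nearest-neighbour scheme, the choice of k and the inverse-distance weighting are assumptions for illustration; the thesis's actual algorithms are not specified in this abstract.

```python
import numpy as np

class EpisodicValueStore:
    """Illustrative 'hippocampal' store: each experience is written once and
    is usable immediately, with no gradient steps. Details are assumptions."""

    def __init__(self, k=5):
        self.keys, self.returns = [], []
        self.k = k

    def write(self, state, discounted_return):
        # One-shot storage: the new experience influences estimates right away.
        self.keys.append(np.asarray(state, dtype=float))
        self.returns.append(float(discounted_return))

    def estimate(self, state):
        # Value estimate from the k most similar stored experiences.
        if not self.keys:
            return 0.0
        keys = np.stack(self.keys)
        dists = np.linalg.norm(keys - state, axis=1)
        nearest = np.argsort(dists)[: self.k]
        weights = 1.0 / (dists[nearest] + 1e-3)  # closer memories count more
        return float(np.average(np.array(self.returns)[nearest], weights=weights))
```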