
    [Re] How Attention Can Create Synaptic Tags for the Learning of Working Memories in Sequential Tasks

    The reference paper introduces a new reinforcement learning model called Attention-Gated MEmory Tagging (AuGMEnT). The results presented suggest new approaches in understanding the acquisition of tasks requiring working memory and attentional feedback, as well as biologically plausible learning mechanisms. The model also improves on previous reinforcement learning schemes by allowing tasks to be expressed more naturally as a sequence of inputs and outputs. A Python implementation of the model is available on the author’s GitHub page, which helped to verify the correctness of the computations. The script written for this replication also uses Python along with NumPy.
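    For readers who want the gist of the learning rule in code, here is a minimal NumPy sketch: a SARSA reward-prediction error gates plasticity at synapses that were recently tagged by feedback from the chosen action. The layer sizes, constants, and function name are illustrative assumptions; the full AuGMEnT model also has memory units and a hidden layer that this sketch omits.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    n_in, n_out = 4, 3                        # input features, actions (assumed sizes)
    W = rng.normal(0.0, 0.1, (n_out, n_in))   # linear Q-value weights
    tags = np.zeros_like(W)                   # synaptic tags (eligibility traces)

    beta, gamma, lam = 0.10, 0.90, 0.30       # learning rate, discount, tag decay

    def sarsa_tag_update(x, a, r, x_next, a_next, terminal=False):
        """One SARSA step where the prediction error gates tagged synapses (sketch)."""
        global W, tags
        q_sa = W[a] @ x
        q_next = 0.0 if terminal else W[a_next] @ x_next
        delta = r + gamma * q_next - q_sa     # reward-prediction error
        tags *= lam * gamma                   # all tags decay each step
        tags[a] += x                          # feedback from the chosen
                                              # action tags its synapses
        W += beta * delta * tags              # plasticity only where
                                              # tags and error coincide
    ```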

    Continuous-time on-policy neural reinforcement learning of working memory tasks

    As living organisms, one of our primary characteristics is the ability to rapidly process and react to unknown and unexpected events. To this end, we are able to recognize an event or a sequence of events and learn to respond properly. Despite advances in machine learning, current cognitive robotic systems are not able to respond rapidly and efficiently in the real world: the challenge is to learn to recognize both what is important and when to act. Reinforcement Learning (RL) is typically used to solve complex tasks: to learn the “how”. To respond quickly, that is, to learn the “when”, the environment has to be sampled often enough. For “enough”, a programmer has to decide on the step size as a representation of time, choosing between a fine-grained representation of time (many state transitions; difficult to learn with RL) and a coarse temporal resolution (easier to learn with RL, but lacking precise timing). Here, we derive a continuous-time version of on-policy SARSA learning in a working-memory neural network model, AuGMEnT. While a neural working-memory network resolves the “what” problem, our “when” solution is built on the notion that, in the real world, instantaneous actions of duration dt are actually impossible. We demonstrate how we can decouple action duration from the internal time steps of the neural RL model using an action-selection system. The resulting CT-AuGMEnT successfully learns to react to the events of a continuous-time task, without any pre-imposed specification of the duration of the events or of the delays between them.
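    The central claim, decoupling action duration from the internal time step, can be illustrated with a short sketch: the network is evaluated on a fine grid of size dt, but the executed action is held until the action-selection system commits to a competitor. The interfaces (q_values, observe, act), the hysteresis margin, and all constants below are assumptions for illustration, not the paper's actual API.

    ```python
    import numpy as np

    dt = 0.01        # internal integration step in seconds (assumed)
    margin = 0.05    # hysteresis: a competitor must win by this much

    def run_episode(q_values, observe, act, horizon=10.0):
        """Hold one action across many internal dt steps (sketch).

        q_values(x) -> array of per-action values
        observe()   -> current sensory input
        act(a, dt)  -> (reward rate over this slice, done flag)
        """
        a, t, ret = None, 0.0, 0.0
        while t < horizon:
            q = q_values(observe())
            best = int(np.argmax(q))
            # Action-selection system: switch only when a competitor
            # clearly beats the currently executing action.
            if a is None or q[best] > q[a] + margin:
                a = best
            r, done = act(a, dt)   # action persists for this dt slice
            ret += r * dt          # reward accrues as a rate times dt
            t += dt
            if done:
                break
        return ret
    ```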