
    Generating Long-term Trajectories Using Deep Hierarchical Networks

    We study the problem of modeling spatiotemporal trajectories over long time horizons using expert demonstrations. For instance, in sports, agents often choose action sequences with long-term goals in mind, such as achieving a certain strategic position. Conventional policy learning approaches, such as those based on Markov decision processes, generally fail at learning cohesive long-term behavior in such high-dimensional state spaces, and are only effective when myopic modeling leads to the desired behavior. The key difficulty is that conventional approaches are "shallow" models that only learn a single state-action policy. We instead propose a hierarchical policy class that automatically reasons about both long-term and short-term goals, which we instantiate as a hierarchical neural network. We showcase our approach in a case study on learning to imitate demonstrated basketball trajectories, and show that it generates significantly more realistic trajectories compared to non-hierarchical baselines as judged by professional sports analysts.
    Comment: Published in NIPS 2016
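
    To make the "hierarchical policy class" idea concrete, below is a minimal sketch of a two-level policy: a macro network proposes a long-term goal from the current state, and a micro network conditions the short-term action on that goal. The class names, layer sizes, and PyTorch framing are illustrative assumptions, not the authors' published code.

        import torch
        import torch.nn as nn

        class HierarchicalPolicy(nn.Module):
            def __init__(self, state_dim, goal_dim, action_dim, hidden=64):
                super().__init__()
                # Macro network: maps the current state to a long-term goal.
                self.macro = nn.Sequential(
                    nn.Linear(state_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, goal_dim),
                )
                # Micro network: maps state + goal to the next short-term action.
                self.micro = nn.Sequential(
                    nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, action_dim),
                )

            def forward(self, state):
                goal = self.macro(state)  # long-term intent
                action = self.micro(torch.cat([state, goal], dim=-1))
                return action, goal

        # Usage: predict the next 2-D displacement for a batch of positions.
        policy = HierarchicalPolicy(state_dim=2, goal_dim=2, action_dim=2)
        action, goal = policy(torch.randn(8, 2))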

    Continuous-time spike-based reinforcement learning for working memory tasks

    As the brain purportedly employs on-policy reinforcement learning compatible with SARSA learning, and most interesting cognitive tasks require some form of memory while taking place in continuous time, recent work has developed plausible reinforcement learning schemes that are compatible with these requirements. What is still lacking is a formulation of both computation and learning in terms of spiking neurons. Such a formulation creates a closer mapping to biology, and also expresses such learning in terms of asynchronous and sparse neural computation. We present a spiking neural network with memory that learns cognitive tasks in continuous time. Learning is implemented in a biologically plausible manner using the AuGMEnT framework, and we show how separate spiking feedforward and feedback networks suffice for learning the tasks just as fast as the analog CT-AuGMEnT counterpart, while computing efficiently using very few spikes: 1–20 Hz on average.
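
    For reference, the SARSA rule this abstract builds on is the standard on-policy temporal-difference update. The sketch below shows the textbook discrete-time, tabular version only; the paper's contribution is a spiking, continuous-time counterpart. The environment transition here is a placeholder assumption.

        import numpy as np

        rng = np.random.default_rng(0)
        n_states, n_actions = 5, 2
        Q = np.zeros((n_states, n_actions))
        alpha, gamma, epsilon = 0.1, 0.9, 0.1

        def eps_greedy(s):
            # Epsilon-greedy selection; using the same policy to pick the
            # *next* action is what makes SARSA on-policy.
            if rng.random() < epsilon:
                return int(rng.integers(n_actions))
            return int(np.argmax(Q[s]))

        # One SARSA step on a placeholder transition.
        s = 0
        a = eps_greedy(s)
        s_next, reward = 1, 1.0   # a real task would supply these
        a_next = eps_greedy(s_next)
        Q[s, a] += alpha * (reward + gamma * Q[s_next, a_next] - Q[s, a])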

    Dimensions of Timescales in Neuromorphic Computing Systems

    This article is a public deliverable of the EU project "Memory technologies with multi-scale time constants for neuromorphic architectures" (MeMScales, https://memscales.eu, Call ICT-06-2019 Unconventional Nanoelectronics, project number 871371). This arXiv version is a verbatim copy of the deliverable report, with administrative information stripped. It collects a wide and varied assortment of phenomena, models, research themes and algorithmic techniques connected with timescale phenomena in the fields of computational neuroscience, mathematics, machine learning and computer science, with a bias toward aspects that are relevant for neuromorphic engineering. It turns out that this theme is very rich indeed and spreads out in many directions that defy a unified treatment. We collected several dozen sub-themes, each of which has been investigated in specialized settings (in the neurosciences, mathematics, computer science and machine learning) and has been documented in its own body of literature. The more we dived into this diversity, the clearer it became that our first effort to compose a survey must remain sketchy and partial. We conclude with a list of insights distilled from this survey, which give general guidelines for the design of future neuromorphic systems.

    Flexible Working Memory Through Selective Gating and Attentional Tagging

    Working memory is essential: it serves to guide intelligent behavior of humans and nonhuman primates when task-relevant stimuli are no longer present to the senses. Moreover, complex tasks often require that multiple working memory representations can be flexibly and independently maintained, prioritized, and updated according to changing task demands. Thus far, neural network models of working memory have been unable to offer an integrative account of how such control mechanisms can be acquired in a biologically plausible manner. Here, we present WorkMATe, a neural network architecture that models cognitive control over working memory content and learns the appropriate control operations needed to solve complex working memory tasks. Key components of the model include a gated memory circuit that is controlled by internal actions, encoding of sensory information through untrained connections, and a neural circuit that matches sensory inputs to memory content. The network is trained by means of a biologically plausible reinforcement learning rule that relies on attentional feedback and reward prediction errors to guide synaptic updates. We demonstrate that the model successfully acquires policies to solve classical working memory tasks, such as delayed recognition and delayed pro-saccade/anti-saccade tasks. In addition, the model solves much more complex tasks, including the hierarchical 12-AX task and the ABAB ordered recognition task, both of which require an agent to independently store and update multiple items in memory. Furthermore, the control strategies that the model acquires for these tasks subsequently generalize to new task contexts with novel stimuli, thus bringing symbolic production rule qualities to a neural network architecture. As such, WorkMATe provides a new solution for the neural implementation of flexible memory control.
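
    To make the gating idea concrete, here is a minimal sketch of a memory store whose contents are overwritten only when an internal gating action selects a block, together with a simple match signal comparing the current input against stored content. The block count, cosine-similarity match, and all names are illustrative assumptions, not the published WorkMATe circuitry.

        import numpy as np

        class GatedMemory:
            def __init__(self, n_blocks=2, dim=8):
                self.blocks = np.zeros((n_blocks, dim))

            def step(self, x, gate):
                # 'gate' is an internal action: the index of the block to
                # overwrite, or None to maintain all contents unchanged.
                if gate is not None:
                    self.blocks[gate] = x
                # Match circuit: cosine similarity of the input to each
                # stored item (1e-9 guards against empty blocks).
                norms = (np.linalg.norm(self.blocks, axis=1)
                         * np.linalg.norm(x) + 1e-9)
                return self.blocks @ x / norms

        # Usage: store two items, then compare the second input to memory.
        mem = GatedMemory()
        x1, x2 = np.random.randn(8), np.random.randn(8)
        mem.step(x1, gate=0)           # internal action: write block 0
        match = mem.step(x2, gate=1)   # write block 1, get match signal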