91,871 research outputs found

    Embodying a Computational Model of Hippocampal Replay for Robotic Reinforcement Learning

    Hippocampal reverse replay has been speculated to play an important role in biological reinforcement learning since its discovery over a decade ago. Whilst a number of computational models have recently emerged in an attempt to understand the dynamics of hippocampal replay, there has been little progress in testing and implementing these models in real-world robotics settings. The first part of this work presents a bio-inspired hippocampal CA3 network model. It runs in real time to produce reverse replays of recent spatio-temporal sequences, represented as place cell activities, in a robotic spatial navigation task. The model is based on two very recent computational models of hippocampal reverse replay. An analysis of these models shows that, in their original forms, they are each insufficient for effective performance when applied to a robot. As such, choosing particular elements from each allows for a computational model that is sufficient for application in a robotic task. Having a model of reverse replay applied successfully in a robot provides the groundwork necessary for testing the ways in which reverse replay contributes to reinforcement learning. The second portion of the work presented here builds on a previous reinforcement learning neural network model of a basic hippocampal-striatal circuit using a three-factor learning rule. By integrating reverse replays into this reinforcement learning model, results show that reverse replay, with its ability to replay the recent trajectory both in the hippocampal circuit and the striatal circuit, can speed up the learning process. In addition, for situations where the original reinforcement learning model performs poorly, such as when its time dynamics do not store enough of the robot's behavioural history for effective learning, the reverse replay model can compensate for this by replaying the recent history.
These results are in line with experimental findings showing that disruption of awake hippocampal replay events severely diminishes, but does not entirely eliminate, reinforcement learning. This work provides possible insights into the important role that reverse replays could play in mnemonic function, and reinforcement learning in particular; insights that could benefit the robotics, AI, and neuroscience communities. However, there is still much to be done. How reverse replays are initiated is still an ongoing research problem, for instance. Furthermore, the model presented here generates place cells heuristically, but there are computational models tackling the problem of how hippocampal cells such as place cells, but also grid cells and head direction cells, emerge. This raises the pertinent question of how these models, which make assumptions about their network architectures and dynamics, could integrate with the computational models of hippocampal replay, which make their own assumptions about network architectures and dynamics.

    Modeling Avoidance in Mood and Anxiety Disorders Using Reinforcement Learning.

    BACKGROUND: Serious and debilitating symptoms of anxiety are the most common mental health problem worldwide, accounting for around 5% of all adult years lived with disability in the developed world. Avoidance behavior (avoiding social situations for fear of embarrassment, for instance) is a core feature of such anxiety. However, as for many other psychiatric symptoms, the biological mechanisms underlying avoidance remain unclear. METHODS: Reinforcement learning models provide formal and testable characterizations of the mechanisms of decision making; here, we examine avoidance in these terms. A total of 101 healthy participants and individuals with mood and anxiety disorders completed an approach-avoidance go/no-go task under stress induced by threat of unpredictable shock. RESULTS: We show an increased reliance in the mood and anxiety group on a parameter of our reinforcement learning model that characterizes a prepotent (Pavlovian) bias to withhold responding in the face of negative outcomes. This was particularly the case when the mood and anxiety group was under stress. CONCLUSIONS: This formal description of avoidance within the reinforcement learning framework provides a new means of linking clinical symptoms with biophysically plausible models of neural circuitry and, as such, takes us closer to a mechanistic understanding of mood and anxiety disorders.

    Learning

    Learning and evolution are adaptive or “backward-looking” models of social and biological systems. Learning changes the probability distribution of traits within an individual through direct and vicarious reinforcement, while evolution changes the probability distribution of traits within a population through reproduction and selection. Compared to forward-looking models of rational calculation that identify equilibrium outcomes, adaptive models pose fewer cognitive requirements and reveal both equilibrium and out-of-equilibrium dynamics. However, they are also less general than analytical models and require relatively stable environments. In this chapter, we review the conceptual and practical foundations of several approaches to models of learning that offer powerful tools for modeling social processes. These include the Bush-Mosteller stochastic learning model, the Roth-Erev matching model, feed-forward and attractor neural networks, and belief learning. Evolutionary approaches include replicator dynamics and genetic algorithms. A unifying theme is showing how complex patterns can arise from relatively simple adaptive rules.
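
As a rough illustration of the Bush-Mosteller stochastic learning model named in the abstract above, the sketch below uses the common two-action linear-operator form. The function name `bush_mosteller_update`, the `rate` parameter, and the binary reward are assumptions made for this example and are not taken from the chapter.

```python
def bush_mosteller_update(p, chose_action, rewarded, rate=0.1):
    """Return the updated probability of choosing a given action.

    Assumed two-action linear-operator form: the probability moves a
    fraction `rate` of the way toward 1 when the outcome favours the
    action, and toward 0 otherwise.
    """
    # The action is favoured if it was chosen and rewarded, or if the
    # alternative was chosen and punished.
    favoured = (chose_action == rewarded)
    target = 1.0 if favoured else 0.0
    return p + rate * (target - p)


# Example: a rewarded choice raises the probability, a punished one lowers it.
p_after_reward = bush_mosteller_update(0.5, chose_action=True, rewarded=True)
p_after_punish = bush_mosteller_update(0.5, chose_action=True, rewarded=False)
```

Because each update is a convex combination of the old probability and 0 or 1, the result stays in the unit interval for any `rate` between 0 and 1.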

    Robust learning algorithms for spiking and rate-based neural networks

    Inspired by the remarkable properties of the human brain, the fields of machine learning, computational neuroscience and neuromorphic engineering have achieved significant synergistic progress in the last decade. Powerful neural network models rooted in machine learning have been proposed as models for neuroscience and for applications in neuromorphic engineering. However, the aspect of robustness is often neglected in these models. Both biological and engineered substrates show diverse imperfections that deteriorate the performance of computation models or even prohibit their implementation. This thesis describes three projects aiming at implementing robust learning with local plasticity rules in neural networks. First, we demonstrate the advantages of neuromorphic computations in a pilot study on a prototype chip. Thereby, we quantify the speed and energy consumption of the system compared to a software simulation and show how on-chip learning contributes to the robustness of learning. Second, we present an implementation of spike-based Bayesian inference on accelerated neuromorphic hardware. The model copes, via learning, with the disruptive effects of the imperfect substrate and benefits from the acceleration. Finally, we present a robust model of deep reinforcement learning using local learning rules. It shows how backpropagation combined with neuromodulation could be implemented in a biologically plausible framework. The results contribute to the pursuit of robust and powerful learning networks for biological and neuromorphic substrates.

    Predator-prey survival pressure is sufficient to evolve swarming behaviors

    Understanding how local interactions give rise to global collective behavior is of utmost importance in both biological and physical research. Traditional agent-based models often rely on static rules that fail to capture the dynamic strategies of the biological world. Reinforcement learning has been proposed as a solution, but most previous methods adopt handcrafted reward functions that implicitly or explicitly encourage the emergence of swarming behaviors. In this study, we propose a minimal predator-prey coevolution framework based on mixed cooperative-competitive multiagent reinforcement learning, and adopt a reward function that is solely based on the fundamental survival pressure, that is, prey receive a reward of -1 if caught by predators while predators receive a reward of +1. Surprisingly, our analysis of this approach reveals an unexpectedly rich diversity of emergent behaviors for both prey and predators, including flocking and swirling behaviors for prey, as well as dispersion tactics, confusion, and marginal predation phenomena for predators. Overall, our study provides novel insights into the collective behavior of organisms and highlights the potential applications in swarm robotics.
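
The survival-pressure reward described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `survival_rewards`, the agent identifiers, the zero reward on steps without a catch, and the choice to credit +1 to the predator that makes the catch are all hypothetical and not confirmed by the abstract.

```python
def survival_rewards(prey_ids, predator_ids, catches):
    """Compute per-agent rewards for one time step.

    `catches` is a list of (predator_id, prey_id) pairs describing which
    predator caught which prey during this step (assumed representation).
    """
    # Assumption: every agent receives 0 unless a catch involves it.
    rewards = {agent: 0.0 for agent in list(prey_ids) + list(predator_ids)}
    for predator, prey in catches:
        rewards[prey] = -1.0       # prey is penalised for being caught
        rewards[predator] += 1.0   # catching predator is rewarded
    return rewards


# Example: one predator catches one of two prey.
step_rewards = survival_rewards(
    prey_ids=["prey0", "prey1"],
    predator_ids=["pred0"],
    catches=[("pred0", "prey1")],
)
```

Nothing in this reward refers to distances between agents or group cohesion, which is the sense in which the abstract calls it free of handcrafted swarming incentives.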

    Pseudorehearsal in value function approximation

    Catastrophic forgetting is of special importance in reinforcement learning, as the data distribution is generally non-stationary over time. We study and compare several pseudorehearsal approaches for Q-learning with function approximation in a pole balancing task. We have found that pseudorehearsal seems to assist learning even in such very simple problems, given proper initialization of the rehearsal parameters.
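
The general idea of pseudorehearsal can be sketched as follows: before learning a new item, random inputs are passed through the current approximator to record what it already predicts, and those recorded pairs are then rehearsed alongside the new item. The sketch assumes a linear approximator and a plain regression target in place of the Q-learning target. All function names and parameters are hypothetical, and this is not the setup used in the paper.

```python
import random


def predict(weights, x):
    """Linear function approximator: dot product of weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))


def sgd_step(weights, x, target, lr):
    """One gradient step on the squared error for a single item."""
    error = target - predict(weights, x)
    return [w + lr * error * xi for w, xi in zip(weights, x)]


def make_pseudo_items(weights, n_items, dim, rng):
    """Sample random inputs and label them with the current predictions."""
    items = []
    for _ in range(n_items):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        items.append((x, predict(weights, x)))
    return items


def learn_with_pseudorehearsal(weights, x_new, target_new, pseudo_items,
                               lr=0.1, epochs=50):
    """Interleave updates on the new item with rehearsal of pseudo-items."""
    for _ in range(epochs):
        weights = sgd_step(weights, x_new, target_new, lr)
        for x, y in pseudo_items:
            weights = sgd_step(weights, x, y, lr)
    return weights


rng = random.Random(0)
old_weights = [1.0, 2.0, -0.5]
pseudo = make_pseudo_items(old_weights, n_items=2, dim=3, rng=rng)
new_weights = learn_with_pseudorehearsal(old_weights, [1.0, 0.0, 0.0], 3.0, pseudo)
```

The number of pseudo-items and the range they are sampled from are the kind of rehearsal parameters whose initialization the abstract reports as important.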

    Phenomenological models of synaptic plasticity based on spike timing

    Synaptic plasticity is considered to be the biological substrate of learning and memory. In this document we review phenomenological models of short-term and long-term synaptic plasticity, in particular spike-timing dependent plasticity (STDP). The aim of the document is to provide a framework for classifying and evaluating different models of plasticity. We focus on phenomenological synaptic models that are compatible with integrate-and-fire type neuron models where each neuron is described by a small number of variables. This implies that synaptic update rules for short-term or long-term plasticity can only depend on spike timing and, potentially, on membrane potential, as well as on the value of the synaptic weight, or on low-pass filtered (temporally averaged) versions of the above variables. We examine the ability of the models to account for experimental data and to fulfill expectations derived from theoretical considerations. We further discuss their relations to teacher-based rules (supervised learning) and reward-based rules (reinforcement learning). All models discussed in this paper are suitable for large-scale network simulation.
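
One widely used phenomenological form is the pair-based STDP rule with exponential windows, sketched below as an example of an update that depends only on spike timing and the current weight. The amplitudes, time constants, and hard weight bounds are illustrative assumptions rather than values from the review.

```python
import math


def stdp_weight_change(dt, a_plus=0.01, a_minus=0.012,
                       tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair.

    `dt` is t_post - t_pre in milliseconds. Pre-before-post (dt > 0)
    potentiates; post-before-pre (dt < 0) depresses.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0  # simultaneous spikes: no change in this sketch


def apply_stdp(weight, dt, w_min=0.0, w_max=1.0):
    """Apply the pair-based update and clip the weight to hard bounds."""
    return min(w_max, max(w_min, weight + stdp_weight_change(dt)))


# Example: a presynaptic spike 10 ms before a postsynaptic spike.
w_new = apply_stdp(0.5, dt=10.0)
```

The hard bounds are one simple way of letting the update depend on the weight value; soft, weight-dependent amplitudes are another option.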