
    Correlates of reward-predictive value in learning-related hippocampal neural activity

    Temporal difference (TD) learning is a popular algorithm in machine learning. Two learning signals derived from this algorithm, the predictive value and the prediction error, have been shown to explain changes in neural activity and behavior during learning across species. Here, the predictive value signal is used to explain the time course of learning-related changes in the activity of hippocampal neurons in monkeys performing an associative learning task. The TD algorithm serves as the centerpiece of a joint probability model for the learning-related neural activity and the behavioral responses recorded during the task. The neural component of the model consists of spiking neurons that compete and learn the reward-predictive value of task-relevant input signals. The predictive value signaled by these neurons influences the behavioral response generated by a stochastic decision stage, which constitutes the behavioral component of the model. It is shown that the time course of the changes in neural activity and behavioral performance generated by the model exhibits key features of the experimental data. The results suggest that information about correct associations may be expressed in the hippocampus before it is detected in the behavior of a subject. In this way, the hippocampus may be among the earliest brain areas to express learning and drive the behavioral changes associated with learning. Correlates of reward-predictive value may be expressed in the hippocampus through rate remapping within spatial memory representations; they may represent reward-related aspects of a declarative or explicit relational memory representation of task contingencies; or they may correspond to reward-related components of episodic memory representations. These potential functions are discussed in connection with hippocampal cell assembly sequences and their reverse reactivation during the awake state. The results provide further support for the proposal that neural processes underlying learning may implement a temporal difference-like algorithm.
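    To make the two TD signals concrete, the following is a minimal sketch of how a predictive value V(s) and a prediction error delta might be computed and updated over trials, with a stochastic readout standing in for the decision stage. The learning rate, discount factor, and toy task are illustrative assumptions, not parameters of the authors' model.

    ```python
    import numpy as np

    alpha, gamma = 0.1, 0.9        # assumed learning rate and temporal discount
    n_states = 5                   # toy task: a chain of states ending in reward
    V = np.zeros(n_states + 1)     # predictive value per state (last entry is terminal)

    for trial in range(200):
        for s in range(n_states):
            r = 1.0 if s == n_states - 1 else 0.0   # reward delivered at the final state
            # TD prediction error: reward plus discounted future value,
            # minus the current prediction
            delta = r + gamma * V[s + 1] - V[s]
            V[s] += alpha * delta                   # predictive value update

    # A stochastic decision stage could read out the learned value, e.g.
    # responding correctly with probability increasing in V (hypothetical rule):
    p_correct = V[0] / (1.0 + V[0])
    print(V, p_correct)
    ```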

    Toward the biological model of the hippocampus as the successor representation agent

    The hippocampus is an essential brain region for spatial memory and learning. Recently, a theoretical model of the hippocampus based on temporal difference (TD) learning was published. Inspired by successor representation (SR) learning algorithms, which decompose the value function of TD learning into reward and state-transition components, its authors argued that the firing rate of CA1 place cells in the hippocampus represents the probability of state transition. This theory, called predictive map theory, claims that the hippocampus, in representing space, learns the probability of transitioning from the current state to future states, with the firing rates of CA1 place cells serving as the neural correlates of this expectation. This explanation is consistent with results recorded in behavioral experiments, but it lacks neurobiological grounding. Here, modifying the SR learning algorithm adds biological plausibility to the predictive map theory. Mirroring the SR learning algorithm's simultaneous need for information about the current and future states, CA1 place cells receive two inputs, one from CA3 and one from the entorhinal cortex. A mathematical transformation shows that the SR learning algorithm is equivalent to a heterosynaptic plasticity rule. Heterosynaptic plasticity phenomena in CA1 are discussed and compared with the modified SR update rule. This study attempts to interpret the TD algorithm as a neurobiological mechanism at work in place learning, and to integrate the neuroscience and artificial intelligence approaches in the field.
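    The decomposition this abstract refers to can be sketched as follows: a successor matrix M stores expected discounted future state occupancy, the value function is recovered as V = M r, and M itself is learned by a TD-like update. The environment (a 1D random walk) and all parameter values below are assumptions for illustration, not the modified rule derived in the paper.

    ```python
    import numpy as np

    alpha, gamma, n_states = 0.1, 0.95, 8
    M = np.eye(n_states)             # successor matrix, one row per state
    r = np.zeros(n_states)
    r[-1] = 1.0                      # assumed reward at the last state

    rng = np.random.default_rng(0)
    s = 0
    for step in range(10000):
        s_next = int(np.clip(s + rng.choice([-1, 1]), 0, n_states - 1))
        # SR-TD update: prediction error on discounted future state occupancy
        onehot = np.eye(n_states)[s]
        M[s] += alpha * (onehot + gamma * M[s_next] - M[s])
        s = s_next

    V = M @ r                        # value recovered from the reward/transition split
    ```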

    Rapid learning of predictive maps with STDP and theta phase precession

    Get PDF
    The predictive map hypothesis is a promising candidate principle for hippocampal function. A favoured formalisation of this hypothesis, called the successor representation, proposes that each place cell encodes the expected state occupancy of its target location in the near future. This predictive framework is supported by behavioural as well as electrophysiological evidence and has desirable consequences for both the generalisability and efficiency of reinforcement learning algorithms. However, it is unclear how the successor representation might be learnt in the brain. Error-driven temporal difference learning, commonly used to learn successor representations in artificial agents, is not known to be implemented in hippocampal networks. Instead, we demonstrate that spike-timing-dependent plasticity (STDP), a form of Hebbian learning, acting on temporally compressed trajectories known as 'theta sweeps', is sufficient to rapidly learn a close approximation to the successor representation. The model is biologically plausible: it uses spiking neurons modulated by theta-band oscillations, diffuse and overlapping place-cell-like state representations, and experimentally matched parameters. We show how this model maps onto known aspects of hippocampal circuitry and explains substantial variance in the temporal difference successor matrix, consequently giving rise to place cells that demonstrate experimentally observed successor representation-related phenomena, including backwards expansion on a 1D track and elongation near walls in 2D. Finally, our model provides insight into the observed topographical ordering of place field sizes along the dorsal-ventral axis by showing that this is necessary to prevent the detrimental mixing of larger place fields, which encode longer-timescale successor representations, with more fine-grained predictions of spatial location.
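    As a rough illustration of the mechanism this abstract proposes, the sketch below applies an asymmetric, Hebbian STDP-like rule to temporally compressed forward sequences of states (standing in for theta sweeps); the accumulated weight matrix comes to resemble a successor matrix. The trajectory statistics, time constant, and learning rate are hypothetical stand-ins, not the paper's experimentally matched spiking-neuron parameters.

    ```python
    import numpy as np

    eta, tau, n_states = 0.05, 2.0, 10   # assumed learning rate, STDP time constant
    W = np.zeros((n_states, n_states))   # pre x post weight matrix
    rng = np.random.default_rng(0)

    for sweep in range(2000):
        start = rng.integers(n_states)
        # a compressed forward "theta sweep" of four states from a random start
        states = [min(start + k, n_states - 1) for k in range(4)]
        # Asymmetric STDP: potentiate pre -> post when pre fires before post,
        # with strength decaying in the spike-time lag
        for i, pre in enumerate(states):
            for j in range(i + 1, len(states)):
                W[pre, states[j]] += eta * np.exp(-(j - i) / tau)

    W /= W.max()   # normalise; rows of W approximate rows of a successor matrix
    ```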