1,931 research outputs found

    A Local Circuit Model of Learned Striatal and Dopamine Cell Responses under Probabilistic Schedules of Reward

    Full text link
    Before choosing, it helps to know both the expected value signaled by a predictive cue and the associated uncertainty that the reward will be forthcoming. Recently, Fiorillo et al. (2003) found the dopamine (DA) neurons of the SNc exhibit sustained responses related to the uncertainty that a cure will be followed by reward, in addition to phasic responses related to reward prediction errors (RPEs). This suggests that cue-dependent anticipations of the timing, magnitude, and uncertainty of rewards are learned and reflected in components of the DA signals broadcast by SNc neurons. What is the minimal local circuit model that can explain such multifaceted reward-related learning? A new computational model shows how learned uncertainty responses emerge robustly on single trial along with phasic RPE responses, such that both types of DA responses exhibit the empirically observed dependence on conditional probability, expected value of reward, and time since onset of the reward-predicting cue. The model includes three major pathways for computing: immediate expected values of cures, timed predictions of reward magnitudes (and RPEs), and the uncertainty associated with these predictions. The first two model pathways refine those previously modeled by Brown et al. (1999). A third, newly modeled, pathway is formed by medium spiny projection neurons (MSPNs) of the matrix compartment of the striatum, whose axons co-release GABA and a neuropeptide, substance P, both at synapses with GABAergic neurons in the SNr and with the dendrites (in SNr) of DA neurons whose somas are in ventral SNc. Co-release enables efficient computation of sustained DA uncertainty responses that are a non-monotonic function of the conditonal probability that a reward will follow the cue. The new model's incorporation of a striatal microcircuit allowed it to reveals that variability in striatal cholinergic transmission can explain observed difference, between monkeys, in the amplitutude of the non-monotonic uncertainty function. Involvement of matriceal MSPNs and striatal cholinergic transmission implpies a relation between uncertainty in the cue-reward contigency and action-selection functions of the basal ganglia. The model synthesizes anatomical, electrophysiological and behavioral data regarding the midbrain DA system in a novel way, by relating the ability to compute uncertainty, in parallel with other aspects of reward contingencies, to the unique distribution of SP inputs in ventral SN.National Science Foundation (SBE-354378); Higher Educational Council of Turkey; Canakkale Onsekiz Mart University of Turke

    Dopaminergic and Non-Dopaminergic Value Systems in Conditioning and Outcome-Specific Revaluation

    Full text link
    Animals are motivated to choose environmental options that can best satisfy current needs. To explain such choices, this paper introduces the MOTIVATOR (Matching Objects To Internal Values Triggers Option Revaluations) neural model. MOTIVATOR describes cognitiveemotional interactions between higher-order sensory cortices and an evaluative neuraxis composed of the hypothalamus, amygdala, and orbitofrontal cortex. Given a conditioned stimulus (CS), the model amygdala and lateral hypothalamus interact to calculate the expected current value of the subjective outcome that the CS predicts, constrained by the current state of deprivation or satiation. The amygdala relays the expected value information to orbitofrontal cells that receive inputs from anterior inferotemporal cells, and medial orbitofrontal cells that receive inputs from rhinal cortex. The activations of these orbitofrontal cells code the subjective values of objects. These values guide behavioral choices. The model basal ganglia detect errors in CS-specific predictions of the value and timing of rewards. Excitatory inputs from the pedunculopontine nucleus interact with timed inhibitory inputs from model striosomes in the ventral striatum to regulate dopamine burst and dip responses from cells in the substantia nigra pars compacta and ventral tegmental area. Learning in cortical and striatal regions is strongly modulated by dopamine. The model is used to address tasks that examine food-specific satiety, Pavlovian conditioning, reinforcer devaluation, and simultaneous visual discrimination. Model simulations successfully reproduce discharge dynamics of known cell types, including signals that predict saccadic reaction times and CS-dependent changes in systolic blood pressure.Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Institutes of Health (R29-DC02952, R01-DC007683); National Science Foundation (IIS-97-20333, SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Which way do I go? Neural activation in response to feedback and spatial processing in a virtual T-maze

    No full text
    In 2 human event-related brain potential (ERP) experiments, we examined the feedback error-related negativity (fERN), an ERP component associated with reward processing by the midbrain dopamine system, and the N170, an ERP component thought to be generated by the medial temporal lobe (MTL), to investigate the contributions of these neural systems toward learning to find rewards in a "virtual T-maze" environment. We found that feedback indicating the absence versus presence of a reward differentially modulated fERN amplitude, but only when the outcome was not predicted by an earlier stimulus. By contrast, when a cue predicted the reward outcome, then the predictive cue (and not the feedback) differentially modulated fERN amplitude. We further found that the spatial location of the feedback stimuli elicited a large N170 at electrode sites sensitive to right MTL activation and that the latency of this component was sensitive to the spatial location of the reward, occurring slightly earlier for rewards following a right versus left turn in the maze. Taken together, these results confirm a fundamental prediction of a dopamine theory of the fERN and suggest that the dopamine and MTL systems may interact in navigational learning tasks

    Time representation in reinforcement learning models of the basal ganglia

    Get PDF
    Reinforcement learning (RL) models have been influential in understanding many aspects of basal ganglia function, from reward prediction to action selection. Time plays an important role in these models, but there is still no theoretical consensus about what kind of time representation is used by the basal ganglia. We review several theoretical accounts and their supporting evidence. We then discuss the relationship between RL models and the timing mechanisms that have been attributed to the basal ganglia. We hypothesize that a single computational system may underlie both RL and interval timing—the perception of duration in the range of seconds to hours. This hypothesis, which extends earlier models by incorporating a time-sensitive action selection mechanism, may have important implications for understanding disorders like Parkinson's disease in which both decision making and timing are impaired

    Neuropeptide Co-Release with Gaba May Explain Functional Non-Monotonic Uncertainty Responses in Dopamine Neurons

    Full text link
    Co-release of the inhibitory neurotransmitter GABA and the neuropeptide substance-P (SP) from single axons is a conspicuous feature of the basal ganglia, yet its computational role, if any, has not been resolved. In a new learning model, co-release of GABA and SP from axons of striatal projection neurons emerges as a highly efficient way to compute the uncertainty responses that are exhibited by dopamine (DA) neurons when animals adapt to probabilistic contingencies between rewards and the stimuli that predict their delivery. Such uncertainty-related dopamine release appears to be an adaptive phenotype, because it promotes behavioral switching at opportune times. Understanding the computational linkages between SP and DA in the basal ganglia is important, because Huntington's disease is characterized by massive SP depletion, whereas Parkinson's disease is characterized by massive DA depletion.National Science Foundation (SBE-354378); Higher Educational Educational Council of Turkey; Canakkale Onsekiz Mart University of Turke

    Reward prediction error and declarative memory

    Get PDF
    Learning based on reward prediction error (RPE) was originally proposed in the context of nondeclarative memory. We postulate that RPE may support declarative memory as well. Indeed, recent years have witnessed a number of independent empirical studies reporting effects of RPE on declarative memory. We provide a brief overview of these studies, identify emerging patterns, and discuss open issues such as the role of signed versus unsigned RPEs in declarative learning
    corecore