
    Time representation in reinforcement learning models of the basal ganglia

    Reinforcement learning (RL) models have been influential in understanding many aspects of basal ganglia function, from reward prediction to action selection. Time plays an important role in these models, but there is still no theoretical consensus about what kind of time representation is used by the basal ganglia. We review several theoretical accounts and their supporting evidence. We then discuss the relationship between RL models and the timing mechanisms that have been attributed to the basal ganglia. We hypothesize that a single computational system may underlie both RL and interval timing—the perception of duration in the range of seconds to hours. This hypothesis, which extends earlier models by incorporating a time-sensitive action selection mechanism, may have important implications for understanding disorders like Parkinson's disease in which both decision making and timing are impaired.
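One standard candidate time representation discussed in this literature is the "complete serial compound" (tapped delay line), in which each moment after cue onset gets its own feature, so a temporal-difference learner can assign a value to every time step of the trial. A minimal sketch of that scheme follows; the trial length, reward time, and learning parameters are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Complete-serial-compound sketch: one weight per post-cue time step,
# trained by TD(0). All parameters below are illustrative assumptions.
n_steps = 10          # time steps per trial (assumed)
reward_step = 8       # reward arrives 8 steps after cue onset (assumed)
gamma, alpha = 0.9, 0.1

w = np.zeros(n_steps)  # learned value of each post-cue time step

for trial in range(500):
    for t in range(n_steps):
        r = 1.0 if t == reward_step else 0.0
        v_t = w[t]
        v_next = w[t + 1] if t + 1 < n_steps else 0.0
        delta = r + gamma * v_next - v_t   # TD prediction error
        w[t] += alpha * delta

# After learning, value ramps up toward the moment of reward and the
# TD error at the reward step shrinks toward zero.
print(np.round(w, 2))
```

Because every time step has its own feature, the learned value function can express arbitrary temporal profiles; coarser time representations (e.g. a single cue feature) cannot, which is one reason the choice of representation matters for these models.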

    Value and prediction error in medial frontal cortex: integrating the single-unit and systems levels of analysis

    The role of the anterior cingulate cortex (ACC) in cognition has been extensively investigated with several techniques, including single-unit recordings in rodents and monkeys and EEG and fMRI in humans. This has generated a rich set of data and points of view. Important theoretical functions proposed for ACC are value estimation, error detection, error-likelihood estimation, conflict monitoring, and estimation of reward volatility. A unified view is lacking at this time, however. Here we propose that online value estimation could be the key function underlying these diverse data. This is instantiated in the reward value and prediction model (RVPM). The model contains units coding for the value of cues (stimuli or actions) and units coding for the differences between such values and the actual reward (prediction errors). We exposed the model to typical experimental paradigms from single-unit, EEG, and fMRI research to compare its overall behavior with the data from these studies. The model reproduced the ACC behavior of previous single-unit, EEG, and fMRI studies on reward processing, error processing, conflict monitoring, error-likelihood estimation, and volatility estimation, unifying the interpretations of the role performed by the ACC in some aspects of cognition.
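The core interaction described above, value units updated by prediction-error units, can be sketched as a simple delta rule. This is a minimal illustration, not the published RVPM implementation; the learning rate and reward probability are assumptions for the example:

```python
import random

# Minimal sketch (not the published RVPM): one unit stores the learned
# value of a cue, and a second unit signals the prediction error
# (actual reward minus that value), which drives the value update.
random.seed(0)
alpha = 0.1        # learning rate (assumed)
p_reward = 0.7     # cue rewarded on 70% of trials (assumed)
value = 0.0
history = []

for trial in range(2000):
    reward = 1.0 if random.random() < p_reward else 0.0
    prediction_error = reward - value    # prediction-error unit
    value += alpha * prediction_error    # value unit tracks expected reward
    history.append(value)

# Averaged over later trials, the value unit settles near the cue's true
# reward probability, so the prediction-error unit averages to zero.
avg_value = sum(history[-1000:]) / 1000
print(round(avg_value, 2))
```

The same two quantities, value and its mismatch with the actual reward, are what the abstract proposes ACC computes across the reward-processing, error-processing, and conflict paradigms it lists.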

    A Local Circuit Model of Learned Striatal and Dopamine Cell Responses under Probabilistic Schedules of Reward

    Before choosing, it helps to know both the expected value signaled by a predictive cue and the associated uncertainty that the reward will be forthcoming. Recently, Fiorillo et al. (2003) found that the dopamine (DA) neurons of the SNc exhibit sustained responses related to the uncertainty that a cue will be followed by reward, in addition to phasic responses related to reward prediction errors (RPEs). This suggests that cue-dependent anticipations of the timing, magnitude, and uncertainty of rewards are learned and reflected in components of the DA signals broadcast by SNc neurons. What is the minimal local circuit model that can explain such multifaceted reward-related learning? A new computational model shows how learned uncertainty responses emerge robustly on single trials along with phasic RPE responses, such that both types of DA responses exhibit the empirically observed dependence on conditional probability, expected value of reward, and time since onset of the reward-predicting cue. The model includes three major pathways for computing: immediate expected values of cues, timed predictions of reward magnitudes (and RPEs), and the uncertainty associated with these predictions. The first two model pathways refine those previously modeled by Brown et al. (1999). A third, newly modeled, pathway is formed by medium spiny projection neurons (MSPNs) of the matrix compartment of the striatum, whose axons co-release GABA and a neuropeptide, substance P, both at synapses with GABAergic neurons in the SNr and with the dendrites (in SNr) of DA neurons whose somas are in ventral SNc. Co-release enables efficient computation of sustained DA uncertainty responses that are a non-monotonic function of the conditional probability that a reward will follow the cue.
    The new model's incorporation of a striatal microcircuit revealed that variability in striatal cholinergic transmission can explain observed differences between monkeys in the amplitude of the non-monotonic uncertainty function. Involvement of matrix MSPNs and striatal cholinergic transmission implies a relation between uncertainty in the cue-reward contingency and the action-selection functions of the basal ganglia. The model synthesizes anatomical, electrophysiological, and behavioral data regarding the midbrain DA system in a novel way, by relating the ability to compute uncertainty, in parallel with other aspects of reward contingencies, to the unique distribution of SP inputs in ventral SN.
    National Science Foundation (SBE-354378); Higher Educational Council of Turkey; Canakkale Onsekiz Mart University of Turkey
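The non-monotonic uncertainty function at the heart of this abstract has a simple statistical reading: for a fixed-size reward delivered with probability p, the trial-to-trial reward variance is p(1 − p), which vanishes at p = 0 and p = 1 and peaks at p = 0.5, matching the shape of the sustained DA responses reported by Fiorillo et al. (2003). A short illustration:

```python
# Bernoulli-reward variance as an illustration of the non-monotonic
# uncertainty function: largest when reward is maximally unpredictable.
def reward_uncertainty(p: float, magnitude: float = 1.0) -> float:
    """Variance of a reward of the given magnitude delivered with probability p."""
    return (magnitude ** 2) * p * (1.0 - p)

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, reward_uncertainty(p))   # 0 at the extremes, maximal at p = 0.5
```

The model's contribution is a circuit-level account of how such a quantity could be computed and broadcast by DA neurons, alongside the phasic RPE signal, rather than the formula itself.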

    Dopaminergic and Non-Dopaminergic Value Systems in Conditioning and Outcome-Specific Revaluation

    Animals are motivated to choose environmental options that can best satisfy current needs. To explain such choices, this paper introduces the MOTIVATOR (Matching Objects To Internal Values Triggers Option Revaluations) neural model. MOTIVATOR describes cognitive-emotional interactions between higher-order sensory cortices and an evaluative neuraxis composed of the hypothalamus, amygdala, and orbitofrontal cortex. Given a conditioned stimulus (CS), the model amygdala and lateral hypothalamus interact to calculate the expected current value of the subjective outcome that the CS predicts, constrained by the current state of deprivation or satiation. The amygdala relays the expected value information to orbitofrontal cells that receive inputs from anterior inferotemporal cells, and medial orbitofrontal cells that receive inputs from rhinal cortex. The activations of these orbitofrontal cells code the subjective values of objects. These values guide behavioral choices. The model basal ganglia detect errors in CS-specific predictions of the value and timing of rewards. Excitatory inputs from the pedunculopontine nucleus interact with timed inhibitory inputs from model striosomes in the ventral striatum to regulate dopamine burst and dip responses from cells in the substantia nigra pars compacta and ventral tegmental area. Learning in cortical and striatal regions is strongly modulated by dopamine. The model is used to address tasks that examine food-specific satiety, Pavlovian conditioning, reinforcer devaluation, and simultaneous visual discrimination.
    Model simulations successfully reproduce discharge dynamics of known cell types, including signals that predict saccadic reaction times and CS-dependent changes in systolic blood pressure.
    Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Institutes of Health (R29-DC02952, R01-DC007683); National Science Foundation (IIS-97-20333, SBE-0354378); Office of Naval Research (N00014-01-1-0624)

    Dopamine restores reward prediction errors in old age.

    Senescence affects the ability to utilize information about the likelihood of rewards for optimal decision-making. Using functional magnetic resonance imaging in humans, we found that healthy older adults had an abnormal signature of expected value, resulting in an incomplete reward prediction error (RPE) signal in the nucleus accumbens, a brain region that receives rich input projections from substantia nigra/ventral tegmental area (SN/VTA) dopaminergic neurons. Structural connectivity between SN/VTA and striatum, measured by diffusion tensor imaging, was tightly coupled to inter-individual differences in the expression of this expected reward value signal. The dopamine precursor levodopa (L-DOPA) increased the task-based learning rate and task performance in some older adults to the level of young adults. This drug effect was linked to restoration of a canonical neural RPE. Our results identify a neurochemical signature underlying abnormal reward processing in older adults and indicate that this can be modulated by L-DOPA.
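The behavioral consequence of the learning-rate effect reported above can be illustrated with a delta-rule learner: the same update rule tracks a reward contingency in fewer trials when its learning rate is higher (the quantity L-DOPA was found to increase). The learning rates and threshold below are hypothetical, chosen only for the illustration:

```python
# Hypothetical illustration: a higher learning rate (the quantity L-DOPA
# increased in the study) means fewer trials to learn a reward contingency.
def trials_to_learn(alpha: float, target: float = 0.9) -> int:
    """Trials of consistently rewarded experience before the learned
    value of the cue first reaches `target`."""
    value, trials = 0.0, 0
    while value < target:
        value += alpha * (1.0 - value)   # delta-rule update toward reward
        trials += 1
    return trials

slow = trials_to_learn(alpha=0.05)   # low learning rate
fast = trials_to_learn(alpha=0.20)   # high learning rate
print(slow, fast)                    # the faster learner needs fewer trials
```

This is only the abstract's "increased learning rate" claim made concrete; the study's actual measure was a model-based fit to choice behavior alongside the fMRI RPE signal.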

    An interoceptive predictive coding model of conscious presence

    We describe a theoretical model of the neurocognitive mechanisms underlying conscious presence and its disturbances. The model is based on interoceptive prediction error and is informed by predictive models of agency, general models of hierarchical predictive coding and dopaminergic signaling in cortex, the role of the anterior insular cortex (AIC) in interoception and emotion, and cognitive neuroscience evidence from studies of virtual reality and of psychiatric disorders of presence, specifically depersonalization/derealization disorder. The model associates presence with successful suppression by top-down predictions of informative interoceptive signals evoked by autonomic control signals and, indirectly, by visceral responses to afferent sensory signals. The model connects presence to agency by allowing that predicted interoceptive signals will depend on whether afferent sensory signals are determined, by a parallel predictive-coding mechanism, to be self-generated or externally caused. Anatomically, we identify the AIC as the likely locus of key neural comparator mechanisms. Our model integrates a broad range of previously disparate evidence, makes predictions for conjoint manipulations of agency and presence, offers a new view of emotion as interoceptive inference, and represents a step toward a mechanistic account of a fundamental phenomenological property of consciousness.

    Neural Dynamics Underlying Impaired Autonomic and Conditioned Responses Following Amygdala and Orbitofrontal Lesions

    A neural model is presented that explains how outcome-specific learning modulates affect, decision-making and Pavlovian conditioned approach responses. The model addresses how brain regions responsible for affective learning and habit learning interact, and answers a central question: What are the relative contributions of the amygdala and orbitofrontal cortex to emotion and behavior? In the model, the amygdala calculates outcome value while the orbitofrontal cortex influences attention and conditioned responding by assigning value information to stimuli. Model simulations replicate autonomic, electrophysiological, and behavioral data associated with three tasks commonly used to assay these phenomena: food consumption, Pavlovian conditioning, and visual discrimination. Interactions of the basal ganglia and amygdala with sensory and orbitofrontal cortices enable the model to replicate the complex pattern of spared and impaired behavioral and emotional capacities seen following lesions of the amygdala and orbitofrontal cortex.
    National Science Foundation (SBE-0354378; IIS-97-20333); Office of Naval Research (N00014-01-1-0624); Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Institutes of Health (R29-DC02952)