
    Human Substantia Nigra Neurons Encode Unexpected Financial Rewards

    The brain's sensitivity to unexpected outcomes plays a fundamental role in an organism's ability to adapt and learn new behaviors. Emerging research suggests that midbrain dopaminergic neurons encode these unexpected outcomes. We used microelectrode recordings during deep brain stimulation surgery to study neuronal activity in the human substantia nigra (SN) while patients with Parkinson's disease engaged in a probabilistic learning task motivated by virtual financial rewards. Based on a model of the participants' expected reward, we divided trial outcomes into expected and unexpected gains and losses. SN neurons exhibited significantly higher firing rates after unexpected gains than unexpected losses. No such differences were observed after expected gains and losses. This result provides critical support for the hypothesized role of the SN in human reinforcement learning.
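    The outcome-splitting step described above (dividing trials into expected and unexpected gains and losses against a model's expected reward) can be sketched as follows. This is a hypothetical illustration only: the function name, the deviation threshold `tol`, and the labeling criterion are assumptions, not the authors' actual analysis.

```python
# Hypothetical sketch of classifying trial outcomes against a model's
# expected reward. The threshold `tol` and the criterion (absolute
# deviation from expectation) are illustrative assumptions.

def classify_trial(outcome, expected, tol=0.5):
    """Label a trial by the sign of its outcome (gain/loss) and by
    whether it deviates from the model's expectation by more than tol
    (unexpected/expected)."""
    kind = "gain" if outcome > 0 else "loss"
    surprise = "unexpected" if abs(outcome - expected) > tol else "expected"
    return surprise + " " + kind

classify_trial(outcome=1.0, expected=0.1)    # a surprising win
classify_trial(outcome=-1.0, expected=-0.9)  # an anticipated loss
```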

    The causal role between phasic midbrain dopamine signals and learning

    The article discusses how phasic dopamine (DA) may relate to action selection, goal-directed behavior, and behavioral flexibility in mice. It states that optogenetic targeting of midbrain DA cells and their striatal projections revealed a role in reward prediction and behavioral flexibility. It notes that DA activity regulates aspects of appetitive reward learning. It mentions that DA is causally involved in the flexible behavioral adaptations that occur when stimulus-reward contingencies change.

    Dopamine, reward learning, and active inference

    Temporal difference learning models propose that phasic dopamine signaling encodes reward prediction errors that drive learning. This is supported by studies in which optogenetic stimulation of dopamine neurons can substitute for actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We offer a resolution to this paradox based on the hypothesis that dopamine encodes the precision of beliefs about alternative actions, and thus controls the outcome-sensitivity of behavior. We extend an active inference scheme for solving Markov decision processes to include learning, and show that simulated dopamine dynamics strongly resemble those actually observed during instrumental conditioning. Furthermore, simulated dopamine depletion impairs performance but spares learning, while simulated excitation of dopamine neurons drives reward learning through aberrant inference about outcome states. Our formal approach provides a novel and parsimonious reconciliation of apparently divergent experimental findings.
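    The temporal-difference account this abstract argues against and around can be sketched minimally: a prediction error δ = r + γV(s') − V(s) updates state values, and δ shrinks as rewards become predicted. This is a toy TD(0) sketch on a fixed chain of states, not the paper's active inference scheme; all parameter values are illustrative assumptions.

```python
# Minimal TD(0) sketch of reward-prediction-error learning on a linear
# chain of states. Illustrative only; the paper's own model is an
# active inference scheme, not plain TD.

def td_learn(rewards, n_states, alpha=0.1, gamma=0.9, episodes=200):
    """Learn state values for a chain visited in order each episode.

    rewards[s] is the reward received on leaving state s. Returns the
    learned value table and the prediction errors (delta) from the
    final episode."""
    V = [0.0] * (n_states + 1)  # extra terminal state with value 0
    deltas = []
    for _ in range(episodes):
        deltas = []
        for s in range(n_states):
            delta = rewards[s] + gamma * V[s + 1] - V[s]  # prediction error
            V[s] += alpha * delta
            deltas.append(delta)
    return V, deltas

# Only the last transition is rewarded. After training, value backs up
# along the chain and the prediction error at the reward itself fades,
# mirroring the "reward becomes expected" signature of TD accounts.
V, deltas = td_learn(rewards=[0, 0, 0, 1], n_states=4)
```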

    Temporal-Difference Reinforcement Learning with Distributed Representations

    Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting “micro-Agents”, each of which has a separate discounting factor (γ). Each µAgent maintains an independent hypothesis about the state of the world, and a separate value-estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (δ) signal within the model matches dopamine signals recorded from animals in standard conditioning reward-paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each µAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments.

    Convergent Processing of Both Positive and Negative Motivational Signals by the VTA Dopamine Neuronal Populations

    Dopamine neurons in the ventral tegmental area (VTA) have traditionally been studied for their roles in reward-related motivation and drug addiction. Here we study how the VTA dopamine neuron population may process fearful and negative experiences as well as reward information in freely behaving mice. Using multi-tetrode recording, we find that up to 89% of the putative dopamine neurons in the VTA exhibit significant activation in response to a conditioned tone that predicts food reward, while the same dopamine neuron population also responds to fearful experiences such as free-fall and shake events. The majority of these VTA putative dopamine neurons exhibit suppression and offset-rebound excitation, whereas ∼25% of the recorded putative dopamine neurons are excited by the fearful events. Importantly, VTA putative dopamine neurons exhibit parametric encoding properties: the durations of their firing changes are proportional to the durations of the fearful events. In addition, we demonstrate that contextual information is crucial for the same conditioned tone to elicit either positive or negative motivational responses in these neurons. Taken together, our findings suggest that VTA dopamine neurons may employ a convergent encoding strategy for processing both positive and negative experiences, integrating them closely with cues and environmental context.

    Sensory regulation of dopaminergic cell activity: Phenomenology, circuitry and function

    Dopaminergic neurons in a range of species are responsive to sensory stimuli. In the anesthetized preparation, responses to non-noxious and noxious sensory stimuli are usually tonic in nature, although long-duration changes in activity have been reported in the awake preparation as well. However, in the awake preparation, short-latency, phasic changes in activity are most common. These phasic responses can occur to unconditioned aversive and non-aversive stimuli, as well as to the stimuli which predict them. In both the anesthetized and awake preparations, not all dopaminergic neurons are responsive to sensory stimuli; however, responsive neurons tend to respond to more than a single stimulus modality. Evidence suggests that short-latency sensory information is provided to dopaminergic neurons by relatively primitive subcortical structures – including the midbrain superior colliculus for vision and the mesopontine parabrachial nucleus for pain and possibly gustation. Although short-latency visual information is provided to dopaminergic neurons by the relatively primitive colliculus, dopaminergic neurons can discriminate between complex visual stimuli, an apparent paradox which can be resolved by the recently discovered route of information flow to dopaminergic neurons from the cerebral cortex, via a relay in the colliculus. Given that projections from the cortex to the colliculus are extensive, such a relay potentially allows the activity of dopaminergic neurons to report the results of complex stimulus processing from widespread areas of the cortex. Furthermore, dopaminergic neurons could acquire their ability to reflect stimulus value by virtue of reward-related modification of sensory processing in the cortex. At the forebrain level, sensory-related changes in the tonic activity of dopaminergic neurons may regulate the impact of the cortex on forebrain structures such as the nucleus accumbens.
In contrast, the short latency of the phasic responses to sensory stimuli in dopaminergic neurons, coupled with the activation of these neurons by non-rewarding stimuli, suggests that phasic responses of dopaminergic neurons may provide a signal to the forebrain indicating that a salient event has occurred (and possibly an estimate of how salient that event is). A stimulus-related salience signal could be used by downstream systems to reinforce behavioral choices.