
    Human Substantia Nigra Neurons Encode Unexpected Financial Rewards

    The brain's sensitivity to unexpected outcomes plays a fundamental role in an organism's ability to adapt and learn new behaviors. Emerging research suggests that midbrain dopaminergic neurons encode these unexpected outcomes. We used microelectrode recordings during deep brain stimulation surgery to study neuronal activity in the human substantia nigra (SN) while patients with Parkinson's disease engaged in a probabilistic learning task motivated by virtual financial rewards. Based on a model of the participants' expected reward, we divided trial outcomes into expected and unexpected gains and losses. SN neurons exhibited significantly higher firing rates after unexpected gains than unexpected losses. No such differences were observed after expected gains and losses. This result provides critical support for the hypothesized role of the SN in human reinforcement learning.
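
    As a rough illustration of the outcome-splitting step described above, the sketch below labels each trial as an expected or unexpected gain or loss against a simple running estimate of expected reward. This is a minimal sketch under our own assumptions (a Rescorla-Wagner-style expectation with an arbitrary learning rate and surprise threshold), not the authors' task model; the function and parameter names are hypothetical.

        # Hypothetical sketch, not the authors' pipeline: split signed trial
        # outcomes (+1 gain / -1 loss) into expected and unexpected gains/losses.

        def classify_outcomes(rewards, alpha=0.1, threshold=0.5):
            """Label each outcome relative to a running reward expectation.

            alpha:     learning rate of the expectation model (assumed value).
            threshold: |prediction error| above which an outcome counts as
                       unexpected (assumed value).
            """
            expectation = 0.0
            labels = []
            for r in rewards:
                prediction_error = r - expectation
                kind = "gain" if r > 0 else "loss"
                surprise = "unexpected" if abs(prediction_error) > threshold else "expected"
                labels.append((surprise, kind, round(prediction_error, 2)))
                expectation += alpha * prediction_error  # update expected reward
            return labels

        # A repeated gain becomes expected; a surprising loss resets the expectation.
        print(classify_outcomes([1, 1, 1, -1, 1, 1], alpha=0.5))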

    The causal relationship between phasic midbrain dopamine signals and learning

    The article discusses how phasic dopamine (DA) may relate to action selection, goal-directed behavior, and behavioral flexibility in mice. It states that optogenetic targeting of midbrain DA cells and their striatal projections revealed a role in reward prediction and behavioral flexibility. It notes that DA activity regulates aspects of appetitive reward learning, and that DA is causally involved in the flexible behavioral adaptations that follow changes in stimulus-reward contingencies.

    Role of Dopamine D2 Receptors in Human Reinforcement Learning

    Influential neurocomputational models emphasize dopamine (DA) as an electrophysiological and neurochemical correlate of reinforcement learning. However, evidence of a specific causal role of DA receptors in learning has been less forthcoming, especially in humans. Here we combine, in a between-subjects design, administration of a high dose of the selective DA D2/3-receptor antagonist sulpiride with genetic analysis of the DA D2 receptor in a behavioral study of reinforcement learning in a sample of 78 healthy male volunteers. In contrast to predictions of prevailing models emphasizing DA's pivotal role in learning via prediction errors, we found that sulpiride did not disrupt learning, but rather induced profound impairments in choice performance. The disruption was selective for stimuli indicating reward, while loss avoidance performance was unaffected. Effects were driven by volunteers with higher serum levels of the drug, and in those with genetically-determined lower density of striatal DA D2 receptors. This is the clearest demonstration to date of a causal modulatory role of the DA D2 receptor in choice performance that might be distinct from learning. Our findings challenge current reward prediction error models of reinforcement learning, and suggest that classical animal models emphasizing a role of postsynaptic DA D2 receptors in motivational aspects of reinforcement learning may apply to humans as well. Neuropsychopharmacology accepted article preview online, 09 April 2014; doi:10.1038/npp.2014.84
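
    To make the learning-versus-choice dissociation concrete, here is a toy model of our own (not the authors' analysis): in a standard Q-learning agent with softmax action selection, the learning rule and the choice rule are separate components. Lowering only the softmax inverse temperature (a hypothetical stand-in for D2-receptor blockade) degrades choice accuracy while the learning update itself is untouched.

        # Illustrative sketch under our own assumptions: same learning rate in
        # both agents; only the softmax precision ("choice performance") differs.

        import math
        import random

        def run_agent(beta, alpha=0.3, n_trials=300, p_reward=(0.8, 0.2)):
            """Two-armed bandit; returns the fraction of choices of the better arm."""
            q = [0.0, 0.0]
            better_choices = 0
            for _ in range(n_trials):
                # softmax choice: beta controls choice precision, not learning
                p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
                a = 0 if random.random() < p0 else 1
                r = 1.0 if random.random() < p_reward[a] else 0.0
                q[a] += alpha * (r - q[a])  # learning rule, identical in both agents
                better_choices += (a == 0)
            return better_choices / n_trials

        random.seed(0)
        print("intact choice precision:  ", run_agent(beta=5.0))
        print("degraded choice precision:", run_agent(beta=0.5))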

    Dopamine, reward learning, and active inference

    Temporal difference learning models propose that phasic dopamine signaling encodes reward prediction errors that drive learning. This is supported by studies in which optogenetic stimulation of dopamine neurons can substitute for actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We offer a resolution to this paradox based on the hypothesis that dopamine encodes the precision of beliefs about alternative actions, and thus controls the outcome-sensitivity of behavior. We extend an active inference scheme for solving Markov decision processes to include learning, and show that simulated dopamine dynamics strongly resemble those actually observed during instrumental conditioning. Furthermore, simulated dopamine depletion impairs performance but spares learning, while simulated excitation of dopamine neurons drives reward learning, through aberrant inference about outcome states. Our formal approach provides a novel and parsimonious reconciliation of apparently divergent experimental findings.
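
    For reference, the reward-prediction-error account that this paper reinterprets fits in a few lines of temporal-difference learning. The sketch below is our own illustration (all names and parameter values are assumptions, not the authors' model); it reproduces the classic signature in which the prediction error migrates from reward delivery to cue onset over training.

        # Minimal TD sketch of the prediction-error account; parameter values
        # are illustrative assumptions.

        def td_conditioning(n_trials=100, alpha=0.2, gamma=0.95, reward=1.0):
            """Pavlovian cue->reward trials under temporal-difference learning.

            Returns (cue_errors, reward_errors): over training the TD error
            migrates from reward delivery to cue onset.
            """
            v_cue = 0.0  # learned value of the cue state
            cue_errors, reward_errors = [], []
            for _ in range(n_trials):
                # cue onset: transition from an unpredicted, zero-value baseline
                delta_cue = gamma * v_cue - 0.0
                # reward delivery: terminal transition out of the cue state
                delta_reward = reward - v_cue
                v_cue += alpha * delta_reward
                cue_errors.append(delta_cue)
                reward_errors.append(delta_reward)
            return cue_errors, reward_errors

        cue, rew = td_conditioning()
        print("first trial:", round(cue[0], 2), round(rew[0], 2))   # ~0.0 and 1.0
        print("last trial: ", round(cue[-1], 2), round(rew[-1], 2)) # ~0.95 and ~0.0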

    Temporal-Difference Reinforcement Learning with Distributed Representations

    Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting “micro-Agents”, each of which has a separate discounting factor (γ). Each µAgent maintains an independent hypothesis about the state of the world, and a separate value-estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (δ) signal within the model matches dopamine signals recorded from animals in standard conditioning reward-paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each µAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments.
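
    The hyperbolic-from-exponentials claim is easy to check numerically. The sketch below (assumed γ values and a hand-picked hyperbolic constant k, purely illustrative, not the authors' code) averages exponential discount curves across a population of micro-agents and compares the mixture with 1/(1+kt).

        # Illustrative check, not the authors' code: a uniform mixture of
        # exponential discounters is approximately hyperbolic.

        def mixture_discount(t, gammas):
            """Mean of gamma**t across micro-agents at delay t."""
            return sum(g ** t for g in gammas) / len(gammas)

        # A spread of per-agent exponential discount factors (assumed values).
        gammas = [0.5 + 0.049 * i for i in range(11)]  # 0.50 .. 0.99

        for t in [0, 1, 2, 5, 10, 20]:
            mix = mixture_discount(t, gammas)
            hyper = 1.0 / (1.0 + 0.35 * t)  # hyperbolic 1/(1+kt), k chosen by eye
            print(f"t={t:2d}  mixture={mix:.3f}  hyperbolic~{hyper:.3f}")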

    Convergent Processing of Both Positive and Negative Motivational Signals by the VTA Dopamine Neuronal Populations

    Dopamine neurons in the ventral tegmental area (VTA) have traditionally been studied for their roles in reward-related motivation or drug addiction. Here we study how the VTA dopamine neuron population may process fearful and negative experiences as well as reward information in freely behaving mice. Using multi-tetrode recording, we find that up to 89% of the putative dopamine neurons in the VTA exhibit significant activation in response to a conditioned tone that predicts food reward, while the same dopamine neuron population also responds to fearful experiences such as free-fall and shake events. The majority of these VTA putative dopamine neurons exhibit suppression and offset-rebound excitation, whereas ∼25% of the recorded putative dopamine neurons show excitation in response to the fearful events. Importantly, VTA putative dopamine neurons exhibit parametric encoding properties: the durations of their firing changes are proportional to the durations of the fearful events. In addition, we demonstrate that contextual information is crucial in determining whether the same conditioned tone elicits positive or negative motivational responses. Taken together, our findings suggest that VTA dopamine neurons may employ a convergent encoding strategy for processing both positive and negative experiences, integrating intimately with cues and environmental context.