    Dopamine increases risky choice while D2 blockade shortens decision time

    Dopamine is crucially involved in decision-making, and overstimulation within dopaminergic pathways can lead to impulsive behaviour, including a desire to take risks and reduced deliberation before acting. These behavioural changes are side effects of treatment with dopaminergic drugs in Parkinson's disease, but their likelihood of occurrence is difficult to predict, may be influenced by the individual's baseline endogenous dopamine state, and indeed correlates with sensation-seeking personality traits. Here we collected data on a standard gambling task in healthy volunteers given either placebo, 2.5 mg of the dopamine antagonist haloperidol, or 100/25 mg of the dopamine precursor levodopa in a within-subject design. We found an increase in risky choices on levodopa. Choices were, however, made faster on haloperidol, with no effect of levodopa on deliberation time. Shortened deliberation times on haloperidol occurred in low sensation-seekers only, suggesting a correlation between the sensation-seeking personality trait and baseline dopamine levels. We hypothesise that levodopa increases risk-taking behaviour via overstimulation at both the D1 and D2 receptor levels, while a single low dose of haloperidol, as previously reported (Frank and O'Reilly 2006), may block D2 receptors pre- and post-synaptically and may paradoxically lead to higher striatal dopamine acting on the remaining striatal D1 receptors, causing speedier decisions without influencing risk tolerance. These effects could also fit with a recently proposed computational model of the basal ganglia (Moeller and Bogacz 2019; Moeller et al. 2021). Furthermore, our data suggest that the actual dopaminergic drug effect may depend on the individual's baseline dopamine state, which may influence our therapeutic decisions as clinicians in the future.

    Foraging for foundations in decision neuroscience: insights from ethology

    Modern decision neuroscience offers a powerful and broad account of human behaviour, using computational techniques that link psychological and neuroscientific approaches to the ways that individuals can generate near-optimal choices in complex controlled environments. However, until recently, relatively little attention has been paid to the extent to which the structure of experimental environments relates to natural scenarios and to the survival problems that individuals have evolved to solve. This situation not only risks leaving decision-theoretic accounts ungrounded, but also makes various aspects of the solutions, such as hard-wired or Pavlovian policies, difficult to interpret in the natural world. Here, we suggest importing concepts, paradigms and approaches from the fields of ethology and behavioural ecology, which concentrate on the contextual and functional correlates of decisions made about foraging and escape, to address these lacunae.

    Uncertainty-guided learning with scaled prediction errors in the basal ganglia

    To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when observations are noisy, individual rewards should have less influence on the tracking of average reward, and the estimate of the mean reward should be updated to a smaller extent after each observation. However, it is not known how the magnitude of the observation noise might be tracked and used to control prediction updates in the brain's reward system. Here, we introduce a new model that uses simple, tractable learning rules to track the mean and standard deviation of reward, and leverages prediction errors scaled by uncertainty as the central feedback signal. We show that the new model has an advantage over conventional reinforcement learning models in a value-tracking task and approaches the theoretical limit of performance provided by the Kalman filter. Further, we propose a possible biological implementation of the model in the basal ganglia circuit. In the proposed network, dopaminergic neurons encode reward prediction errors scaled by the standard deviation of rewards. We show that such scaling may arise if the striatal neurons learn the standard deviation of rewards and modulate the activity of dopaminergic neurons. The model is consistent with experimental findings concerning the scaling of dopamine prediction errors relative to reward magnitude, and with many features of striatal plasticity. Our results span the levels of implementation, algorithm and computation, and might have important implications for understanding the dopaminergic prediction error signal and its relation to adaptive and effective learning.
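
    A minimal sketch of this style of learning rule, assuming illustrative learning rates and a Gaussian reward stream (neither is taken from the paper): the mean estimate is nudged by a prediction error divided by the current spread estimate, so noisy rewards move it less, while the spread estimate itself is learned from the absolute prediction error.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Illustrative parameters, not values from the paper.
alpha_v = 0.2   # learning rate for the mean-reward estimate
alpha_s = 0.2   # learning rate for the reward-spread estimate

v = 0.0         # estimate of mean reward
sigma = 1.0     # estimate of reward spread

for trial in range(5000):
    reward = rng.normal(loc=3.0, scale=2.0)  # noisy observation
    delta = reward - v                       # reward prediction error
    # Scaling by the spread estimate means a single observation moves the
    # mean estimate less when rewards are more variable.
    v += alpha_v * delta / sigma
    # The spread estimate settles where |delta| matches it on average, i.e.
    # at the mean absolute deviation, proportional to the true SD for
    # Gaussian rewards.
    sigma += alpha_s * (abs(delta) - sigma)

print(f"mean estimate: {v:.2f} (true mean 3.0)")
print(f"spread estimate: {sigma:.2f} (true SD 2.0; its MAD is about 1.6)")
```

    Note how the effective learning rate for the mean, alpha_v / sigma, automatically shrinks as estimated noise grows, which is the behaviour the abstract attributes to uncertainty-scaled prediction errors.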

    Dopamine encoding of novelty facilitates efficient uncertainty-driven exploration

    When facing an unfamiliar environment, animals need to explore to gain new knowledge about which actions provide reward, but they also need to put the newly acquired knowledge to use as quickly as possible. Optimal reinforcement learning strategies should therefore assess the uncertainties of these action-reward associations and use them to inform decision making. We propose a novel model whereby the direct and indirect striatal pathways act together to estimate both the mean and variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, facilitating effective uncertainty-driven exploration. We used electrophysiological recording data to verify our model of the basal ganglia, and we fitted exploration strategies derived from the neural model to data from behavioural experiments. We also compared, in simulation, the performance of directed exploration strategies inspired by our basal ganglia model with other exploration algorithms, including classic variants of the upper confidence bound (UCB) strategy. The exploration strategies inspired by the basal ganglia model achieve superior overall performance in simulation, and we found qualitatively similar results when fitting the model to behavioural data, compared with fitting more idealised normative models with less implementation-level detail. Overall, our results suggest that transient dopamine levels in the basal ganglia that encode novelty could contribute to an uncertainty representation that efficiently drives exploration in reinforcement learning.
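
    A toy illustration of this kind of directed exploration, under assumptions of our own: a two-armed Gaussian bandit, per-arm mean and spread estimates standing in for the pathway-encoded quantities, and a high initial spread standing in for the transient novelty signal. None of this is the authors' fitted model.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

true_means = np.array([1.0, 1.5])  # hypothetical two-armed bandit
alpha = 0.1                        # learning rate (illustrative)
beta = 1.0                         # weight on the uncertainty bonus
mean_est = np.zeros(2)
# High initial spread plays the role of a novelty bonus: unfamiliar arms look
# attractive until sampling shrinks their estimated uncertainty.
spread_est = np.full(2, 5.0)

choices = []
for trial in range(300):
    # UCB-style directed exploration: estimated value plus uncertainty bonus.
    scores = mean_est + beta * spread_est
    a = int(np.argmax(scores))
    choices.append(a)
    r = rng.normal(true_means[a], 1.0)
    delta = r - mean_est[a]
    mean_est[a] += alpha * delta
    spread_est[a] += alpha * (abs(delta) - spread_est[a])

print("late-trial preference for the better arm:",
      np.mean(np.array(choices[-100:]) == 1))
```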

    Behavioral Paradigms to Probe Individual Mouse Differences in Value-Based Decision Making

    Value-based decision making relies on distributed neural systems that weigh the benefits of actions against the cost required to obtain a given outcome. Perturbations of these systems are thought to underlie the abnormalities in action selection seen across many neuropsychiatric disorders. Genetic tools in mice provide a promising opportunity to explore the cellular components of these systems and their molecular foundations. However, few tasks have been designed that robustly characterize how individual mice integrate differential reward benefits and costs in their selection of actions. Here we present a forced-choice, two-alternative task in which each option is associated with a specific reward outcome and a unique operant contingency. We employed global and individual-trial measures to assess the choice patterns and behavioral flexibility of mice in response to differing “choice benefits” (modeled as varying reward magnitude ratios) and different modalities of “choice cost” (modeled as either increasing repetitive motor output to obtain reward or increased delay to reward delivery). We demonstrate that (1) mouse choice is highly sensitive to the relative benefit of outcomes; (2) choice costs are heavily discounted in environments with large discrepancies in relative reward; (3) divergent cost modalities are differentially integrated into action selection; and (4) individual mouse sensitivity to reward benefit is correlated with sensitivity to reward costs. These paradigms reveal stable individual differences in value-based action selection, thereby providing a foundation for interrogating the neural circuit and molecular pathophysiology of goal-directed dysfunction.
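
    To make the benefit/cost trade-off concrete, here is a hypothetical softmax choice model of the kind often fitted to such tasks; the parameterization (per-animal sensitivities beta_r and beta_c) is our own illustration, not the authors' analysis.

```python
import numpy as np

def p_choose_first(rewards, costs, beta_r, beta_c):
    """Softmax probability of picking option 1 in a two-alternative task.

    rewards, costs: benefit and cost of each option (arbitrary units).
    beta_r, beta_c: per-animal sensitivities to benefit and cost
    (hypothetical parameters, not from the paper).
    """
    values = beta_r * np.asarray(rewards, float) - beta_c * np.asarray(costs, float)
    exp_v = np.exp(values - values.max())  # subtract max for numerical stability
    return float(exp_v[0] / exp_v.sum())

# With a large reward ratio (4:1), even a sizeable cost on the rich option
# barely dents preference, echoing finding (2) above.
print(p_choose_first(rewards=[4, 1], costs=[3, 1], beta_r=1.5, beta_c=0.3))  # ~0.98
print(p_choose_first(rewards=[2, 1], costs=[3, 1], beta_r=1.5, beta_c=0.3))  # ~0.71
```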

    Learning Reward Uncertainty in the Basal Ganglia

    Learning the reliability of different sources of rewards is critical for making optimal choices. However, despite the existence of detailed theory describing how expected reward is learned in the basal ganglia, it is not known how reward uncertainty is estimated in these circuits. This paper presents a class of models that encode both the mean reward and the spread of rewards: the former in the difference between the synaptic weights of D1 and D2 neurons, and the latter in their sum. In the models, the tendency to seek (or avoid) options with variable reward can be controlled by increasing (or decreasing) the tonic level of dopamine. The models are consistent with the physiology of, and synaptic plasticity in, the basal ganglia; they explain the effects of dopaminergic manipulations on choices involving risks; and they make multiple experimental predictions.
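
    A minimal sketch of this class of models, with simplifications and parameter values of our own choosing (the paper's actual learning rules differ in detail): D1 ("Go") weights G grow with positive prediction errors, D2 ("No-Go") weights N grow with negative ones, and both decay, so that at equilibrium G - N is proportional to the mean reward and G + N to its spread.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

alpha = 0.02   # learning rate (illustrative)
decay = 0.002  # weight decay (illustrative); alpha/decay sets the overall scale
G, N = 0.0, 0.0

for trial in range(50000):
    r = rng.normal(1.0, 2.0)   # risky option: mean 1, SD 2
    delta = r - (G - N)        # prediction error against the mean estimate
    G += alpha * max(delta, 0.0) - decay * G   # D1 weights track positive errors
    N += alpha * max(-delta, 0.0) - decay * N  # D2 weights track negative errors

print(f"mean estimate (G - N): {G - N:.2f}")    # proportional to mean reward
print(f"spread estimate (G + N): {G + N:.2f}")  # proportional to reward spread

# Tonic dopamine can tilt choice toward or away from variable options by
# re-weighting the two pathways, e.g. value = D * G - N with D the tonic level:
for D in (0.8, 1.0, 1.2):
    print(f"tonic dopamine {D}: action value {D * G - N:.2f}")
```

    In this sketch, raising the tonic dopamine level D amplifies the G pathway, so options with larger spread (and hence larger G + N) gain value, reproducing the risk-seeking tendency the abstract describes.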