120 research outputs found

    Uncertainty-guided learning with scaled prediction errors in the basal ganglia

    To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when the observations are noisy, the individual rewards should have less influence on tracking of average reward, and the estimate of the mean reward should be updated to a smaller extent after each observation. However, it is not known how the magnitude of the observation noise might be tracked and used to control prediction updates in the brain reward system. Here, we introduce a new model that uses simple, tractable learning rules that track the mean and standard deviation of reward, and leverages prediction errors scaled by uncertainty as the central feedback signal. We show that the new model has an advantage over conventional reinforcement learning models in a value tracking task, and approaches a theoretic limit of performance provided by the Kalman filter. Further, we propose a possible biological implementation of the model in the basal ganglia circuit. In the proposed network, dopaminergic neurons encode reward prediction errors scaled by standard deviation of rewards. We show that such scaling may arise if the striatal neurons learn the standard deviation of rewards and modulate the activity of dopaminergic neurons. The model is consistent with experimental findings concerning dopamine prediction error scaling relative to reward magnitude, and with many features of striatal plasticity. Our results span across the levels of implementation, algorithm, and computation, and might have important implications for understanding the dopaminergic prediction error signal and its relation to adaptive and effective learning
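The idea in this abstract, a prediction error divided by the current spread estimate and then used to update both the mean and the spread, can be illustrated with a minimal sketch. The update rules below are an assumption chosen for illustration, not the rules published by the authors; the function name `track_reward`, the learning rates `alpha_m` and `alpha_s`, and the floor `s_min` are all hypothetical.

```python
import random


def track_reward(rewards, alpha_m=0.05, alpha_s=0.01, m0=0.0, s0=1.0, s_min=1e-3):
    """Track a running mean m and spread s of a reward stream.

    Illustrative rules (an assumption, not the paper's exact equations):
      delta = (r - m) / s                prediction error scaled by the spread
      m    += alpha_m * delta            noisier rewards move the mean less
      s    += alpha_s * (delta**2 - 1)   settles where E[delta**2] = 1
    """
    m, s = m0, s0
    for r in rewards:
        delta = (r - m) / s
        m += alpha_m * delta
        # Keep s strictly positive so the division above stays defined.
        s = max(s_min, s + alpha_s * (delta * delta - 1.0))
    return m, s


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic rewards with true mean 5 and true standard deviation 2.
    rewards = [rng.gauss(5.0, 2.0) for _ in range(20000)]
    m, s = track_reward(rewards)
    print(f"mean estimate {m:.2f}, spread estimate {s:.2f}")
```

Because the mean update uses the scaled error, its effective step in reward units is `alpha_m / s`, so a larger spread estimate automatically shrinks the update, which is the qualitative behaviour the abstract describes.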

    Dopamine encoding of novelty facilitates efficient uncertainty-driven exploration

    When facing an unfamiliar environment, animals need to explore to gain new knowledge about which actions provide reward, but also put the newly acquired knowledge to use as quickly as possible. Optimal reinforcement learning strategies should therefore assess the uncertainties of these action-reward associations and utilise them to inform decision making. We propose a novel model whereby direct and indirect striatal pathways act together to estimate both the mean and variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, facilitating effective uncertainty-driven exploration. We utilised electrophysiological recording data to verify our model of the basal ganglia, and we fitted exploration strategies derived from the neural model to data from behavioural experiments. We also compared, in simulation, the performance of directed exploration strategies inspired by our basal ganglia model with other exploration algorithms, including classic variants of the upper confidence bound (UCB) strategy. The exploration strategies inspired by the basal ganglia model can achieve overall superior performance in simulation, and fitting them to behavioural data gave results qualitatively similar to fitting more idealised normative models with less implementation-level detail. Overall, our results suggest that transient dopamine levels in the basal ganglia that encode novelty could contribute to an uncertainty representation which efficiently drives exploration in reinforcement learning
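For readers unfamiliar with the upper confidence bound baseline named in this abstract, here is a minimal sketch of the classic UCB1 rule. It is the textbook algorithm, not the basal-ganglia-inspired strategy proposed in the paper; the function names `ucb1_choose` and `update` and the exploration constant `c` are hypothetical choices for this example.

```python
import math


def ucb1_choose(counts, means, t, c=2.0):
    """Pick an arm by the classic UCB1 rule.

    counts[i] is how often arm i was tried, means[i] its average reward,
    and t the total number of pulls so far (t >= 1).
    """
    # Try every arm once before trusting any estimate.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Score = estimated mean + uncertainty bonus that shrinks with more pulls.
    scores = [mu + math.sqrt(c * math.log(t) / n) for mu, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def update(counts, means, arm, reward):
    """Incrementally update the running average reward of one arm, in place."""
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]
```

The bonus term makes rarely tried arms attractive even when their average reward is lower, which is the sense in which exploration is "directed" by uncertainty.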

    Hunger improves reinforcement-driven but not planned action

    Human decisions can be reflexive or planned, being governed respectively by model-free and model-based learning systems. These two systems might differ in their responsiveness to our needs. Hunger drives us to specifically seek food rewards, but here we ask whether it might have more general effects on these two decision systems. On one hand, the model-based system is often considered flexible and context-sensitive, and might therefore be modulated by metabolic needs. On the other hand, the model-free system's primitive reinforcement mechanisms may have closer ties to biological drives. Here, we tested participants on a well-established two-stage sequential decision-making task that dissociates the contribution of model-based and model-free control. Hunger enhanced overall performance by increasing model-free control, without affecting model-based control. These results demonstrate a generalized effect of hunger on decision-making that enhances reliance on primitive reinforcement learning, which in some situations translates into adaptive benefits

    Computed tomography guided laser ablation of osteoid osteoma: a study of 30 cases

    Background: Osteoid osteoma (OO) is a benign but painful bone lesion that primarily occurs in children and young adults. The male:female ratio is 3:1. The aim of the study was to present our experience of CT-guided laser ablation of radiologically proven osteoid osteomas in various bones. Methods: Over a period of 5 years, 30 cases of osteoid osteoma in various bones, diagnosed on various modalities, were treated by CT-guided laser ablation. The bone-wise distribution of cases was spine (3), upper end of femur (11), lower end of femur (6), upper end of tibia (4), upper end of humerus (3), lower end of radius (2) and calcaneum (1). Twenty-two patients were treated under spinal and regional anesthesia and 8 patients under short general anesthesia. All patients were treated on a day-care basis. The laser fiber was inserted into the nidus under CT guidance through a bone biopsy needle, and 1800 joules of energy were delivered to the lesion in continuous mode. Results: 29 (96%) patients had complete relief of pain within twenty-four hours of laser ablation. One week after treatment, all 30 patients were pain free. No neurologic complication was observed in any of our patients with spinal osteoid osteomas. Conclusions: CT-guided laser ablation is a safe, simple and effective method of treatment for osteoid osteoma

    Dopamine increases risky choice while D2 blockade shortens decision time

    Dopamine is crucially involved in decision-making and overstimulation within dopaminergic pathways can lead to impulsive behaviour, including a desire to take risks and reduced deliberation before acting. These behavioural changes are side effects of treatment with dopaminergic drugs in Parkinson disease, but their likelihood of occurrence is difficult to predict and may be influenced by the individual’s baseline endogenous dopamine state, and indeed correlate with sensation-seeking personality traits. We here collected data on a standard gambling task in healthy volunteers given either placebo, 2.5 mg of the dopamine antagonist haloperidol or 100/25 mg of the dopamine precursor levodopa in a within-subject design. We found an increase in risky choices on levodopa. Choices were, however, made faster on haloperidol with no effect of levodopa on deliberation time. Shortened deliberation times on haloperidol occurred in low sensation-seekers only, suggesting a correlation between sensation-seeking personality trait and baseline dopamine levels. We hypothesise that levodopa increases risk-taking behaviour via overstimulation at both D1 and D2 receptor level, while a single low dose of haloperidol, as previously reported (Frank and O’Reilly 2006), may block D2 receptors pre- and post-synaptically and may paradoxically lead to higher striatal dopamine acting on remaining striatal D1 receptors, causing speedier decision without influencing risk tolerance. These effects could also fit with a recently proposed computational model of the basal ganglia (Moeller and Bogacz 2019; Moeller et al. 2021). Furthermore, our data suggest that the actual dopaminergic drug effect may be dependent on the individual’s baseline dopamine state, which may influence our therapeutic decision as clinicians in the future

    Dopamine promotes instrumental motivation, but reduces reward-related vigour.

    We can be motivated when reward depends on performance, or merely by the prospect of a guaranteed reward. Performance-dependent (contingent) reward is instrumental, relying on an internal action-outcome model, whereas motivation by guaranteed reward may minimise opportunity cost in reward-rich environments. Competing theories propose that each type of motivation should be dependent on dopaminergic activity. We contrasted these two types of motivation with a rewarded saccade task, in patients with Parkinson's disease (PD). When PD patients were ON dopamine, they had greater response vigour (peak saccadic velocity residuals) for contingent rewards, whereas when PD patients were OFF medication, they had greater vigour for guaranteed rewards. These results support the view that reward expectation and contingency drive distinct motivational processes, and can be dissociated by manipulating dopaminergic activity. We posit that dopamine promotes goal-directed motivation, but dampens reward-driven vigour, contradictory to the prediction that increased tonic dopamine amplifies reward expectation

    Dopamine D2 receptor stimulation modulates the balance between ignoring and updating according to baseline working memory ability

    BACKGROUND: Working memory (WM) deficits in neuropsychiatric disorders have often been attributed to altered dopaminergic signalling. Specifically, D2 receptor stimulation is thought to affect the ease with which items can be gated into and out of WM. In addition, this effect has been hypothesised to vary according to baseline WM ability, a putative index of dopamine synthesis levels. Moreover, whether D2 stimulation affects WM vicariously through modulating relatively WM-free cognitive control processes has not been explored. AIMS: We examined the effect of administering a dopamine agonist on the ability to ignore or update information in WM. METHOD: A single dose of cabergoline (1 mg) was administered to healthy older adult humans in a within-subject, double-blind, placebo-controlled study. In addition, we obtained measures of baseline WM ability and relatively WM-free cognitive control (overcoming response conflict). RESULTS: Consistent with predictions, baseline WM ability significantly modulated the effect that drug administration had on the proficiency of ignoring and updating. High-WM individuals were relatively better at ignoring than at updating after drug administration, whereas the opposite occurred in low-WM individuals. Although the ability to overcome response conflict was not affected by cabergoline, a negative relationship was observed between the effect the drug had on response conflict performance and its effect on ignoring. Thus, both response conflict and ignoring are coupled to dopaminergic stimulation levels. CONCLUSIONS: Cumulatively, these results provide evidence that dopamine affects subcomponents of cognitive control in a diverse, antagonistic fashion and that the direction of these effects is dependent upon baseline WM

    N-methyl-D-aspartate receptor- antibody encephalitis impairs maintenance of attention to items in working memory

    NMDA receptors (NMDAR) may be crucial to working memory (WM). Computational models predict that they sustain neural firing and produce associative memory, which may underpin maintaining and binding information respectively. We test this in patients with antibodies to NMDAR (n=10, female) and compare them with healthy control participants (n=55, 20 male, 35 female). Patients were tested after recovery with a task that separates two aspects of WM: sustaining attention and feature binding. Participants had to remember two colored arrows. Then attention was directed to one of them. After a variable delay, they reported the direction of either the same arrow (congruent cue), or of the other arrow (incongruent cue). We asked how congruency affected recall precision and measured types of error. Patients had difficulty in both sustaining attention to an item over time and feature binding. Controls were less precise after longer delays and incongruent cues. In contrast, patients did not benefit from congruent cues at longer delays (Group x Congruency [long condition], p=0.041), indicating they could not sustain attention. Additionally, patients reported the wrong item (misbinding errors) more than controls after congruent cues (Group x Delay [congruent condition], main effect of group, p=
    Significance Statement: Computational theories suggest NMDA receptors (NMDARs) are critical for actively maintaining information, while other theories propose they allow us to associate or "bind" object features together. This is the first causal test in humans of the role of NMDARs in actively maintaining attention in working memory and feature binding. We find patients have difficulty with both these processes, in support of computational models. Notably, we demonstrate that patients with NMDA receptor-antibody encephalitis are an ideal model condition to study roles of receptors in human cognition. Secondly, few studies follow these patients long after treatment. Our findings demonstrate a specific long-term neuropsychological deficit, previously unreported to our knowledge, that highlights the need for greater focus on neurocognitive rehabilitation with these patients

    A new toolbox to distinguish the sources of spatial memory error
