Dopaminergic Balance between Reward Maximization and Policy Complexity
Previous reinforcement-learning models of the basal ganglia network have highlighted the role of dopamine in encoding the mismatch between prediction and reality. Far less attention has been paid to the computational goals and algorithms of the main-axis (actor). Here, we construct a top-down model of the basal ganglia with emphasis on the role of dopamine as both a reinforcement learning signal and as a pseudo-temperature signal controlling the general level of basal ganglia excitability and motor vigilance of the acting agent. We argue that the basal ganglia endow the thalamic-cortical networks with the optimal dynamic tradeoff between two constraints: minimizing the policy complexity (cost) and maximizing the expected future reward (gain). We show that this multi-dimensional optimization process results in an experience-modulated version of the softmax behavioral policy. Thus, as in classical softmax behavioral policies, the probabilities of actions depend on their estimated values and the pseudo-temperature, but in addition they vary according to the frequency of previous choices of these actions. We conclude that the computational goal of the basal ganglia is not to maximize cumulative (positive and negative) reward. Rather, the basal ganglia aim at optimization of independent gain and cost functions. Unlike previously suggested single-variable maximization processes, this multi-dimensional optimization process leads naturally to a softmax-like behavioral policy. We suggest that beyond its role in the modulation of the efficacy of the cortico-striatal synapses, dopamine directly affects striatal excitability and thus provides a pseudo-temperature signal that modulates the tradeoff between gain and cost. The resulting experience- and dopamine-modulated softmax policy can then serve as a theoretical framework to account for the broad range of behaviors and clinical states governed by the basal ganglia and dopamine systems.
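The abstract describes the policy only in words, so the following is a minimal sketch rather than the authors' formulation. It assumes one common form, in which each action's probability is proportional to its previous choice frequency multiplied by exp(value / pseudo-temperature). The function name `experience_modulated_softmax` and its parameters `values`, `prior` and `temperature` are hypothetical names chosen for illustration.

```python
import math


def experience_modulated_softmax(values, prior, temperature):
    """Action probabilities proportional to prior[a] * exp(values[a] / temperature).

    ASSUMPTION: this functional form is an illustrative reading of the abstract,
    not the paper's stated equation.
    values      -- estimated value of each action
    prior       -- relative frequency with which each action was chosen before
    temperature -- pseudo-temperature (> 0); higher means value matters less
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if len(values) != len(prior):
        raise ValueError("values and prior must have the same length")
    # Subtract the maximum value before exponentiating for numerical stability;
    # the shift cancels out in the normalization below.
    shift = max(values)
    weights = [p * math.exp((v - shift) / temperature)
               for v, p in zip(values, prior)]
    total = sum(weights)
    if total == 0:
        raise ValueError("all actions have zero weight")
    return [w / total for w in weights]


# Same estimated values, but the second action was chosen more often before.
print(experience_modulated_softmax([1.0, 1.0], [0.2, 0.8], temperature=0.5))
```

Under this assumed form, a uniform `prior` reduces the rule to the classical softmax, and equal `values` leave the choice frequencies unchanged. Raising `temperature` pulls the policy toward the previous choice frequencies, while lowering it favors the highest-valued action.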
Remembering Forward: Neural Correlates of Memory and Prediction in Human Motor Adaptation
We used functional MR imaging (fMRI), a robotic manipulandum and systems identification techniques to examine neural correlates of predictive compensation for spring-like loads during goal-directed wrist movements in neurologically-intact humans. Although load changed unpredictably from one trial to the next, subjects nevertheless used sensorimotor memories from recent movements to predict and compensate upcoming loads. Prediction enabled subjects to adapt performance so that the task was accomplished with minimum effort. Population analyses of functional images revealed a distributed, bilateral network of cortical and subcortical activity supporting predictive load compensation during visual target capture. Cortical regions – including prefrontal, parietal and hippocampal cortices – exhibited trial-by-trial fluctuations in BOLD signal consistent with the storage and recall of sensorimotor memories or “states” important for spatial working memory. Bilateral activations in associative regions of the striatum demonstrated temporal correlation with the magnitude of kinematic performance error (a signal that could drive reward-optimizing reinforcement learning and the prospective scaling of previously learned motor programs). BOLD signal correlations with load prediction were observed in the cerebellar cortex and red nuclei (consistent with the idea that these structures generate adaptive fusimotor signals facilitating cancelation of expected proprioceptive feedback, as required for conditional feedback adjustments to ongoing motor commands and feedback error learning). Analysis of single subject images revealed that predictive activity was at least as likely to be observed in more than one of these neural systems as in just one. We conclude therefore that motor adaptation is mediated by predictive compensations supported by multiple, distributed, cortical and subcortical structures.
Value encoding in the globus pallidus: fMRI reveals an interaction effect between reward and dopamine drive
The external part of the globus pallidus (GPe) is a core nucleus of the basal ganglia (BG) whose activity is disrupted under conditions of low dopamine release, as in Parkinson's disease. Current models assume that decreased dopamine release in the dorsal striatum results in deactivation of dorsal GPe, which in turn affects motor expression via a regulatory effect on other nuclei of the BG. However, recent studies in healthy and pathological animal models have reported neural dynamics that do not match this view of the GPe as a relay in the BG circuit. Thus, the computational role of the GPe in the BG is still to be determined. We previously proposed a neural model that revisits the functions of the nuclei of the BG, and this model predicts that GPe encodes values which are amplified under a condition of low striatal dopaminergic drive. To test this prediction, we used an fMRI paradigm involving a within-subject placebo-controlled design, using the dopamine antagonist risperidone, wherein healthy volunteers performed a motor selection and maintenance task under low and high reward conditions. ROI-based fMRI analysis revealed an interaction between reward and dopamine drive manipulations, with increased BOLD activity in GPe in a high compared to low reward condition, and under risperidone compared to placebo. These results confirm the core prediction of our computational model, and provide a new perspective on neural dynamics in the BG and their effects on motor selection and cognitive disorders.
Neurosystems: brain rhythms and cognitive processing
Neuronal rhythms are ubiquitous features of brain dynamics, and are highly correlated with cognitive processing. However, the relationship between the physiological mechanisms producing these rhythms and the functions associated with the rhythms remains mysterious. This article investigates the contributions of rhythms to basic cognitive computations (such as filtering signals by coherence and/or frequency) and to major cognitive functions (such as attention and multi-modal coordination). We offer support to the premise that the physiology underlying brain rhythms plays an essential role in how these rhythms facilitate some cognitive operations. Funding: 098352 - Wellcome Trust; 5R01NS067199 - NINDS NIH HH
Overcoming status quo bias in the human brain
Humans often accept the status quo when faced with conflicting choice alternatives. However, it is unknown how neural pathways connecting cognition with action modulate this status quo acceptance. Here we developed a visual detection task in which subjects tended to favor the default when making difficult, but not easy, decisions. This bias was suboptimal in that more errors were made when the default was accepted. A selective increase in subthalamic nucleus (STN) activity was found when the status quo was rejected in the face of heightened decision difficulty. Analysis of effective connectivity showed that inferior frontal cortex, a region more active for difficult decisions, exerted an enhanced modulatory influence on the STN during switches away from the status quo. These data suggest that the neural circuits required to initiate controlled, nondefault actions are similar to those previously shown to mediate outright response suppression. We conclude that specific prefrontal-basal ganglia dynamics are involved in rejecting the default, a mechanism that may be important in a range of difficult choice scenarios.
Basal ganglia role in learning rewarded actions and executing previously learned choices: Healthy and diseased states
The basal ganglia (BG) is a collection of nuclei located deep beneath the cerebral cortex that is involved in learning and selection of rewarded actions. Here, we analyzed BG mechanisms that enable these functions. We implemented a rate model of a BG-thalamo-cortical loop and simulated its performance in a standard action selection task. We have shown that potentiation of corticostriatal synapses enables learning of a rewarded option. However, these synapses became redundant later as direct connections between prefrontal and premotor cortices (PFC-PMC) were potentiated by Hebbian learning. After we switched the reward to the previously unrewarded option (reversal), the BG was again responsible for switching to the new option. Due to the potentiated direct cortical connections, the system was biased to the previously rewarded choice, and establishing the new choice required a greater number of trials. Guided by physiological research, we then modified our model to reproduce pathological states of mild Parkinson's and Huntington's diseases. We found that in the Parkinsonian state PMC activity levels become extremely variable, which is caused by oscillations arising in the BG-thalamo-cortical loop. The model reproduced severe impairment of learning and predicted that this is caused by these oscillations as well as a reduced reward prediction signal. In the Huntington state, the potentiation of the PFC-PMC connections produced better learning, but altered BG output disrupted expression of the rewarded choices. This resulted in random switching between rewarded and unrewarded choices resembling an exploratory phase that never ended. Along with other computational studies, our results further reconcile the apparent contradiction between the critical involvement of the BG in execution of previously learned actions and yet no impairment of these actions after BG output is ablated by lesions or deep brain stimulation. 
We predict that the cortico-BG-thalamo-cortical loop conforms to previously learned choices in healthy conditions, but impedes those choices in disease states.