
    Basal ganglia role in learning rewarded actions and executing previously learned choices: Healthy and diseased states

    The basal ganglia (BG) are a collection of nuclei located deep beneath the cerebral cortex, involved in the learning and selection of rewarded actions. Here, we analyzed BG mechanisms that enable these functions. We implemented a rate model of a BG-thalamo-cortical loop and simulated its performance in a standard action selection task. We showed that potentiation of corticostriatal synapses enables learning of a rewarded option. However, these synapses became redundant later, as direct connections between prefrontal and premotor cortices (PFC-PMC) were potentiated by Hebbian learning. After we switched the reward to the previously unrewarded option (reversal), the BG was again responsible for switching to the new option. Because of the potentiated direct cortical connections, the system was biased toward the previously rewarded choice, and establishing the new choice required more trials. Guided by physiological research, we then modified our model to reproduce the pathological states of mild Parkinson's and Huntington's diseases. We found that in the Parkinsonian state PMC activity levels became highly variable, caused by oscillations arising in the BG-thalamo-cortical loop. The model reproduced the severe impairment of learning and predicted that it is caused by these oscillations as well as by a reduced reward prediction signal. In the Huntington's disease state, the potentiation of the PFC-PMC connections produced better learning, but altered BG output disrupted the expression of the rewarded choices. This resulted in random switching between rewarded and unrewarded choices, resembling an exploratory phase that never ended. Along with other computational studies, our results further reconcile the apparent contradiction between the critical involvement of the BG in the execution of previously learned actions and the absence of impairment of these actions after BG output is ablated by lesions or deep brain stimulation. We predict that the cortico-BG-thalamo-cortical loop conforms to previously learned choices in healthy conditions, but impedes those choices in disease states.
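
    The two-pathway dynamics described above (reward-driven corticostriatal learning that is later supplanted by a Hebbian cortico-cortical shortcut, which then biases reversal) can be caricatured in a few lines. This is a minimal sketch, not the authors' rate model; all weights, learning rates, and trial counts are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 2
w_str = np.zeros(n_actions)  # corticostriatal weights, trained by reward prediction error
w_ctx = np.zeros(n_actions)  # direct PFC-PMC weights, trained by Hebbian co-activity
alpha_rpe, alpha_hebb, beta = 0.2, 0.05, 3.0

def trial(rewarded):
    v = w_str + w_ctx                        # both pathways bias premotor activity
    p = np.exp(beta * v); p /= p.sum()       # softmax action selection
    a = rng.choice(n_actions, p=p)
    r = 1.0 if a == rewarded else 0.0
    w_str[a] += alpha_rpe * (r - v[a])       # dopamine-like RPE update
    w_ctx[a] += alpha_hebb * (1 - w_ctx[a])  # Hebbian potentiation of the executed route
    return a

for _ in range(200):                         # acquisition: option 0 rewarded
    trial(rewarded=0)
acq_pref = (w_str + w_ctx).argmax()

for _ in range(400):                         # reversal: option 1 now rewarded
    trial(rewarded=1)
rev_pref = (w_str + w_ctx).argmax()
```

    Note that as the Hebbian shortcut `w_ctx[0]` saturates, the RPE shrinks and the striatal weight stops growing (the "redundancy" above), and the same shortcut keeps the total drive near the old choice early in reversal, so the switch takes extra trials.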

    A hypothesis on improving foreign accents by optimizing variability in vocal learning brain circuits

    Rapid vocal motor learning is observed when acquiring a language in early childhood, or when learning to speak another language later in life. Accurate pronunciation is one of the hardest things for late learners to master, and they are almost always left with a non-native accent. Here I propose a novel hypothesis that this accent could be improved by optimizing variability in vocal learning brain circuits during learning. Much of the neurobiology of human vocal motor learning has been inferred from studies on songbirds. Jarvis (2004) proposed the hypothesis that, as in songbirds, there are two pathways in humans: one for learning speech (the striatal vocal learning pathway), and one for production of previously learnt speech (the motor pathway). Learning the new motor sequences necessary for accurate non-native pronunciation is challenging, and I argue that in late learners of a foreign language the vocal learning pathway becomes inactive prematurely. The motor pathway is engaged once again, and learners maintain their original native motor patterns for producing speech, resulting in speaking with a foreign accent. Further, I argue that variability in neural activity within vocal motor circuitry generates the vocal variability that supports accurate non-native pronunciation. Recent theoretical and experimental work on motor learning suggests that variability in the motor movement is necessary for the development of expertise. I propose that there is little trial-by-trial variability when using the motor pathway. When using the vocal learning pathway, variability gradually increases, reflecting an exploratory phase in which learners try out different ways of pronouncing words, before decreasing and stabilizing once the ‘best’ performance has been identified. The hypothesis proposed here could be tested using behavioral interventions that optimize variability and engage the vocal learning pathway for longer, with the prediction that this would allow learners to develop new motor patterns that result in more native-like pronunciation.

    Time representation in reinforcement learning models of the basal ganglia

    Reinforcement learning (RL) models have been influential in understanding many aspects of basal ganglia function, from reward prediction to action selection. Time plays an important role in these models, but there is still no theoretical consensus about what kind of time representation is used by the basal ganglia. We review several theoretical accounts and their supporting evidence. We then discuss the relationship between RL models and the timing mechanisms that have been attributed to the basal ganglia. We hypothesize that a single computational system may underlie both RL and interval timing, the perception of duration in the range of seconds to hours. This hypothesis, which extends earlier models by incorporating a time-sensitive action selection mechanism, may have important implications for understanding disorders like Parkinson's disease in which both decision making and timing are impaired.
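
    As one concrete example of the kind of time representation at issue, a tapped-delay-line ("complete serial compound") state space gives a standard TD(0) learner one feature per elapsed time step since the cue, and lets its reward-prediction error migrate from reward delivery to cue onset. This is a generic textbook sketch with invented times and rates, not any specific model from the review:

```python
import numpy as np

T, cue_t, reward_t = 20, 5, 15   # trial length, cue onset, reward time (arbitrary)
alpha, gamma = 0.1, 0.95
w = np.zeros(T)                  # one value weight per "time since cue" state

def V(t):
    return w[t - cue_t] if t >= cue_t else 0.0   # no timing features before the cue

def run_trial():
    deltas = np.zeros(T)
    for t in range(T - 1):
        r = 1.0 if t == reward_t else 0.0
        delta = r + gamma * V(t + 1) - V(t)      # TD error, the dopamine-like RPE
        if t >= cue_t:
            w[t - cue_t] += alpha * delta
        deltas[t] = delta
    return deltas

first = run_trial()          # naive learner: surprised by the reward
for _ in range(500):
    last = run_trial()       # trained learner: surprised by the cue instead
```
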

    A biologically inspired meta-control navigation system for the Psikharpax rat robot

    A biologically inspired navigation system for the mobile rat-like robot named Psikharpax is presented, allowing for self-localization and autonomous navigation in an initially unknown environment. The ability of parts of the model (e.g., the strategy selection mechanism) to reproduce rat behavioral data in various maze tasks had been validated before in simulations, but the capacity of the model to work on a real robot platform had not been tested. This paper presents our work on the implementation, on the Psikharpax robot, of two independent navigation strategies (a place-based planning strategy and a cue-guided taxon strategy) and a strategy selection meta-controller. We show how our robot can memorize which strategy was optimal in each situation by means of a reinforcement learning algorithm. Moreover, a context detector enables the controller to quickly adapt to changes in the environment, recognized as new contexts, and to restore previously acquired strategy preferences when a previously experienced context is recognized. This produces adaptivity closer to rat behavioral performance and constitutes a computational account of the role of the rat prefrontal cortex in strategy shifting. Such a brain-inspired meta-controller may also provide an advance for learning architectures in robotics.
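
    A bare-bones version of such a strategy-selection meta-controller with context-indexed values might look like the following. The class, strategy names, and contexts are invented; the actual Psikharpax controller is considerably richer:

```python
import random

random.seed(0)

class MetaController:
    """Learns, per context, which navigation strategy pays off."""
    def __init__(self, strategies, alpha=0.3, eps=0.2):
        self.strategies = strategies
        self.q = {}                        # (context, strategy) -> value estimate
        self.alpha, self.eps = alpha, eps
    def select(self, context):
        if random.random() < self.eps:     # occasional exploration
            return random.choice(self.strategies)
        return max(self.strategies,
                   key=lambda s: self.q.get((context, s), 0.0))
    def update(self, context, strategy, reward):
        k = (context, strategy)
        old = self.q.get(k, 0.0)
        self.q[k] = old + self.alpha * (reward - old)

mc = MetaController(["taxon", "planning"])
best_for = {"lit_maze": "taxon", "dark_maze": "planning"}   # invented contexts
for _ in range(200):
    for ctx, best in best_for.items():
        s = mc.select(ctx)
        mc.update(ctx, s, 1.0 if s == best else 0.0)

mc.eps = 0.0   # evaluate greedily: stored preferences are restored per context
```

    Because values are keyed by context, returning to a previously seen context immediately restores its learned strategy preference, mirroring the context-detector behavior described above.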

    Saccade learning with concurrent cortical and subcortical basal ganglia loops

    The basal ganglia are a central structure involved in multiple cortical and subcortical loops. Some of these loops are believed to be responsible for saccade target selection. We study here how the very specific structural relationships of these saccadic loops can affect the ability to learn spatial and feature-based tasks. We propose a model of saccade generation with reinforcement learning capabilities, based on our previous basal ganglia and superior colliculus models. It is structured around the interactions of two parallel cortico-basal loops and one tecto-basal loop. The two cortical loops separately deal with spatial and non-spatial information to select targets in a concurrent way. The subcortical loop is used to make the final target selection leading to the production of the saccade. These different loops may work in concert or disturb each other with regard to reward maximization. Interactions between these loops and their learning capabilities are tested on different saccade tasks. The results show the ability of this model to correctly learn basic target selection based on different criteria (spatial or not). Moreover, the model reproduces and explains training-dependent express saccades toward targets selected on a spatial criterion. Finally, the model predicts that in the absence of prefrontal control, the spatial loop should dominate.
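
    The division of labor sketched above, two cortical loops scoring targets on different criteria with a subcortical stage making the final choice, can be caricatured as follows. The target encodings and weights are invented, not taken from the model:

```python
import numpy as np

# Two candidate targets, each with an invented "spatial salience" and
# "feature match" score standing in for the two cortical loops' criteria.
targets = [{"spatial": 0.9, "feature": 0.1},
           {"spatial": 0.2, "feature": 0.9}]

def collicular_select(w_spatial=1.0, w_feature=1.0):
    """Final tecto-basal stage: pick the target with the best combined drive."""
    scores = [w_spatial * t["spatial"] + w_feature * t["feature"]
              for t in targets]
    return int(np.argmax(scores))

with_pfc = collicular_select()                  # both loops contribute
without_pfc = collicular_select(w_feature=0.0)  # no prefrontal control: spatial loop dominates
```

    Zeroing the feature loop's weight flips the winning target, a crude analogue of the model's prediction that the spatial loop dominates absent prefrontal control.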

    Dopamine encoding of novelty facilitates efficient uncertainty-driven exploration

    When facing an unfamiliar environment, animals need to explore to gain new knowledge about which actions provide reward, but must also put the newly acquired knowledge to use as quickly as possible. Optimal reinforcement learning strategies should therefore assess the uncertainties of these action-reward associations and utilise them to inform decision making. We propose a novel model whereby the direct and indirect striatal pathways act together to estimate both the mean and variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, facilitating effective uncertainty-driven exploration. We utilised electrophysiological recording data to verify our model of the basal ganglia, and we fitted exploration strategies derived from the neural model to data from behavioural experiments. We also compared, in simulation, the performance of directed exploration strategies inspired by our basal ganglia model with other exploration algorithms, including classic variants of the upper confidence bound (UCB) strategy. The exploration strategies inspired by the basal ganglia model achieve overall superior performance in simulation, and fitting the model to behavioural data gave qualitatively similar results to fits of more idealised normative models with less implementation-level detail. Overall, our results suggest that transient dopamine levels in the basal ganglia that encode novelty could contribute to an uncertainty representation that efficiently drives exploration in reinforcement learning.
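
    For reference, the classic upper-confidence-bound baseline the study compares against is, in its simplest UCB1 form, the following. This is a generic textbook implementation on an invented two-armed Bernoulli bandit, not the authors' basal-ganglia-derived strategy:

```python
import math, random

random.seed(0)

p_true = [0.3, 0.7]        # hidden Bernoulli reward rates (invented)
counts = [0, 0]            # times each arm was pulled
means = [0.0, 0.0]         # running mean reward per arm

def ucb1_choose(t):
    for a in range(2):     # initialise: pull every arm once
        if counts[a] == 0:
            return a
    # exploit the mean plus an exploration bonus that shrinks as
    # an arm's uncertainty (1/counts) falls
    return max(range(2),
               key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))

for t in range(1, 2001):
    a = ucb1_choose(t)
    r = 1.0 if random.random() < p_true[a] else 0.0
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]   # incremental mean update
```

    The bonus term plays the role the abstract assigns to an explicit variance estimate: arms whose value is still uncertain keep getting sampled until the evidence rules them out.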

    Hierarchical control over effortful behavior by rodent medial frontal cortex: a computational model

    The anterior cingulate cortex (ACC) has been the focus of intense research interest in recent years. Although separate theories relate ACC function variously to conflict monitoring, reward processing, action selection, decision making, and more, damage to the ACC mostly spares performance on tasks that exercise these functions, indicating that they are not in fact unique to the ACC. Further, most theories do not address the most salient consequence of ACC damage: impoverished action generation in the presence of normal motor ability. In this study we develop a computational model of the rodent medial prefrontal cortex that accounts for the behavioral sequelae of ACC damage, unifies many of the cognitive functions attributed to it, and provides a solution to an outstanding question in cognitive control research: how the control system determines and motivates which tasks to perform. The theory derives from recent developments in the formal study of hierarchical control and learning that highlight the computational efficiencies afforded when collections of actions are represented based on their conjoint goals. According to this position, the ACC utilizes reward information to select tasks that are then accomplished through top-down control over action selection by the striatum. Computational simulations capture animal lesion data that implicate the medial prefrontal cortex in regulating physical and cognitive effort. Overall, this theory provides a unifying framework for understanding the ACC in terms of the pivotal role it plays in the hierarchical organization of effortful behavior.
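
    The core claim, that the ACC weighs reward against effort to decide which task to energize, reduces to a comparison expressible in a few lines. The task names, numbers, and the lesion flag below are invented illustrations of the kind of effort-based lesion finding the model captures, not the model itself:

```python
# Invented reward/effort values echoing effort-based choice experiments:
# the high-effort option pays more overall.
tasks = {"climb_barrier": {"reward": 4.0, "effort": 2.0},
         "easy_arm":      {"reward": 1.0, "effort": 0.2}}

def select_task(acc_intact=True):
    if acc_intact:   # ACC analogue: best reward net of effort cost
        return max(tasks, key=lambda k: tasks[k]["reward"] - tasks[k]["effort"])
    # lesion analogue: without top-down reward weighting, avoid effort
    return min(tasks, key=lambda k: tasks[k]["effort"])
```

    An intact controller takes the effortful high-payoff task; removing the task-level reward signal leaves only low-effort default behavior, the pattern seen after medial frontal lesions.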