214 research outputs found
Contributions of the ventromedial prefrontal cortex to goal-directed action selection
This article argues that a key contribution of the ventromedial prefrontal cortex (vmPFC) to goal-directed action selection lies both in retrieving the value of the goals that are the putative outcomes of the decision process and in establishing a relative preference ranking among those goals, taking into account the value of each goal under consideration in a given decision-making scenario. These goal-value signals are then suggested to serve as input to the on-line computation of action values mediated by brain regions outside the vmPFC, such as parts of the parietal cortex, supplementary motor cortex, and dorsal striatum. Collectively, these areas can be considered constituent elements of a multistage decision process in which the values of different goals must first be represented and ranked before the value of the different courses of action available for pursuing those goals can be computed.
Overlapping Prediction Errors in Dorsal Striatum During Instrumental Learning With Juice and Money Reward in the Human Brain
Prediction error signals have been reported in human imaging studies in target areas of dopamine neurons such as ventral and dorsal striatum during learning with many different types of reinforcers. However, a key question that has yet to be addressed is whether prediction error signals recruit distinct or overlapping regions of striatum and elsewhere during learning with different types of reward. To address this, we scanned 17 healthy subjects with functional magnetic resonance imaging while they chose actions to obtain either a pleasant juice reward (1 ml apple juice) or a monetary gain (5 cents), and applied a computational reinforcement learning model to subjects' behavioral and imaging data. Evidence for an overlapping prediction error signal during learning with juice and money rewards was found in a region of dorsal striatum (caudate nucleus), while prediction error signals in a subregion of ventral striatum were significantly stronger during learning with money but not juice reward. These results provide evidence for partially overlapping reward prediction signals for different types of appetitive reinforcers within the striatum, a finding with important implications for understanding the nature of associative encoding in the striatum as a function of reinforcer type.
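The computational reinforcement learning model referred to in this abstract is, in its simplest form, an error-driven value update. A minimal sketch (the function name, learning rate, and parameter values are illustrative assumptions, not the study's exact specification):

```python
def rescorla_wagner_update(value, reward, alpha=0.1):
    """One trial of a simple error-driven learning model.

    The prediction error (delta) is the quantity typically regressed
    against striatal BOLD responses in studies of this kind; alpha is
    an illustrative learning rate, not a fitted value from the paper.
    """
    delta = reward - value          # reward prediction error
    new_value = value + alpha * delta
    return new_value, delta
```

On each trial, the trial-by-trial sequence of delta values produced by such a model (fit to each subject's choices) serves as the parametric regressor for identifying prediction error signals in the imaging data.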
Human Dorsal Striatal Activity during Choice Discriminates Reinforcement Learning Behavior from the Gambler’s Fallacy
Reinforcement learning theory has generated substantial interest in neurobiology, particularly because of the resemblance between phasic dopamine and reward prediction errors. Actor–critic theories have been adapted to account for the functions of the striatum, with parts of the dorsal striatum equated to the actor. Here, we specifically test whether the human dorsal striatum—as predicted by an actor–critic instantiation—is used on a trial-to-trial basis at the time of choice to choose in accordance with reinforcement learning theory, as opposed to a competing strategy: the gambler's fallacy. Using a partial-brain functional magnetic resonance imaging scanning protocol focused on the striatum and other ventral brain areas, we found that the dorsal striatum is more active when choosing consistent with reinforcement learning compared with the competing strategy. Moreover, an overlapping area of dorsal striatum, along with the ventral striatum, was found to be correlated with reward prediction errors at the time of outcome, as predicted by the actor–critic framework. These findings suggest that the same region of dorsal striatum involved in learning stimulus–response associations may contribute to the control of behavior during choice, thereby using those learned associations. Intriguingly, neither reinforcement learning nor the gambler's fallacy conformed to the optimal choice strategy on the specific decision-making task we used. Thus, the dorsal striatum may contribute to the control of behavior according to reinforcement learning even when the prescriptions of such an algorithm are suboptimal in terms of maximizing future rewards.
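The actor–critic framework invoked here has a compact computational core: a single prediction error, generated by the critic, trains both the critic's value estimate and the actor's action preferences. A toy sketch for a two-armed bandit (class and parameter names are assumptions for illustration, not the paper's implementation):

```python
import math
import random

class ActorCritic:
    """Toy actor-critic for a two-armed bandit task.

    One shared reward prediction error updates both the critic's value
    estimate (mapped in this literature onto ventral striatum) and the
    actor's action preferences (mapped onto dorsal striatum).
    """

    def __init__(self, n_actions=2, alpha_critic=0.1, alpha_actor=0.1, beta=3.0):
        self.v = 0.0                    # critic's value estimate
        self.pref = [0.0] * n_actions   # actor's action preferences
        self.alpha_critic = alpha_critic
        self.alpha_actor = alpha_actor
        self.beta = beta                # softmax inverse temperature

    def choose(self):
        """Sample an action from a softmax over the actor's preferences."""
        exps = [math.exp(self.beta * p) for p in self.pref]
        threshold = random.random() * sum(exps)
        cumulative = 0.0
        for action, e in enumerate(exps):
            cumulative += e
            if threshold <= cumulative:
                return action
        return len(exps) - 1

    def learn(self, action, reward):
        """One shared prediction error updates critic and actor alike."""
        delta = reward - self.v
        self.v += self.alpha_critic * delta
        self.pref[action] += self.alpha_actor * delta
        return delta
```

The key contrast with the gambler's fallacy is visible in `choose`: the actor's preferences push toward previously rewarded actions, whereas a gambler's-fallacy strategy would do the opposite after a run of identical outcomes.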
Appetitive and Aversive Goal Values Are Encoded in the Medial Orbitofrontal Cortex at the Time of Decision Making
An essential feature of choice is the assignment of goal values (GVs) to the different options under consideration at the time of decision making. This computation is done when choosing among appetitive and aversive items. Several groups have studied the location of GV computations for appetitive stimuli, but the problem of valuation in aversive contexts at the time of decision making has been ignored. Thus, although dissociations between appetitive and aversive components of value signals have been shown in other domains such as anticipatory and outcome values, it is not known whether appetitive and aversive GVs are computed in similar brain regions or in separate ones. We investigated this question using two different functional magnetic resonance imaging studies while human subjects placed real bids in an economic auction for the right to eat/avoid eating liked/disliked foods. We found that activity in a common area of the medial orbitofrontal cortex and the dorsolateral prefrontal cortex correlated with both appetitive and aversive GVs. These findings suggest that these regions might form part of a common network.
The problem with value
Neural correlates of value have been extensively reported in a diverse set of brain regions. However, in many cases it is difficult to determine whether a particular neural response pattern corresponds to a value signal per se, as opposed to an array of alternative non-value-related processes, such as outcome-identity coding, informational coding, or encoding of autonomic and skeletomotor consequences, alongside previously described "salience" or "attentional" effects. Here, I review a number of experimental manipulations that can be used to test for value, and I identify the challenges in ascertaining whether a particular neural response is or is not a value signal. Finally, I emphasize that some non-value-related signals may be especially informative as a means of providing insight into the nature of the decision-making-related computations that are being implemented in a particular brain region.
A Neuro-computational Account of Arbitration between Choice Imitation and Goal Emulation during Human Observational Learning
When individuals learn from observing the behavior of others, they deploy at least two distinct strategies. Choice imitation involves repeating other agents' previous actions, whereas emulation proceeds from inferring their goals and intentions. Despite the prevalence of observational learning in humans and other social animals, a fundamental question remains unaddressed: how does the brain decide which strategy to use in a given situation? In two fMRI studies (the second a pre-registered replication of the first), we identify a neuro-computational mechanism underlying arbitration between choice imitation and goal emulation. Computational modeling, combined with a behavioral task that dissociated the two strategies, revealed that control over behavior was adaptively and dynamically weighted toward the most reliable strategy. Emulation reliability, the model's arbitration signal, was represented in the ventrolateral prefrontal cortex, temporoparietal junction, and rostral cingulate cortex. Our replicated findings illuminate the computations by which the brain decides to imitate or emulate others.
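The arbitration scheme described above—weighting control toward the more reliable strategy—can be sketched in a few lines. This is a simplified illustration of the general idea, not the paper's fitted model; the function names and the simple ratio weighting are assumptions:

```python
def arbitration_weight(reliability_emulation, reliability_imitation):
    """Weight assigned to emulation, growing with its relative reliability.

    A simple normalized-reliability rule; the actual arbitration model in
    such studies may use a different functional form.
    """
    return reliability_emulation / (reliability_emulation + reliability_imitation)

def mixed_action_values(values_emulation, values_imitation, w_emulation):
    """Reliability-weighted mixture of the two strategies' action values."""
    return [w_emulation * ve + (1.0 - w_emulation) * vi
            for ve, vi in zip(values_emulation, values_imitation)]
```

When emulation reliability is high, choice is driven mostly by the inferred-goal values; when it drops, behavior shifts toward simply repeating the observed agent's actions.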
The Decision Value Computations in the vmPFC and Striatum Use a Relative Value Code That is Guided by Visual Attention
There is a growing consensus in behavioral neuroscience that the brain makes simple choices by first assigning a value to the options under consideration and then comparing them. Two important open questions are whether the brain encodes absolute or relative value signals, and what role attention might play in these computations. We investigated these questions using a human fMRI experiment with a binary choice task in which the fixations to both stimuli were exogenously manipulated to control for the role of visual attention in the valuation computation. We found that the ventromedial prefrontal cortex and the ventral striatum encoded fixation-dependent relative value signals: activity in these areas correlated with the difference in value between the attended and the unattended items. These attention-modulated relative value signals might serve as the input to a comparator system that is used to make a choice.
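The fixation-dependent relative value signal described here is straightforward to express as a per-trial regressor: the value of the currently attended item minus the value of the unattended one. A minimal sketch for a binary choice task (the tuple layout and function name are illustrative assumptions):

```python
def relative_value_regressors(trials):
    """Build the per-trial relative-value regressor for a binary choice task.

    Each trial is a tuple (value_left, value_right, fixated), where
    fixated is 'left' or 'right'. The regressor on each trial is the
    value of the attended item minus the value of the unattended one.
    """
    regressor = []
    for value_left, value_right, fixated in trials:
        if fixated == 'left':
            regressor.append(value_left - value_right)
        else:
            regressor.append(value_right - value_left)
    return regressor
```

Note that the sign of the regressor flips with fixation even when the two item values are unchanged, which is what distinguishes an attention-modulated relative value code from an absolute one.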
Contributions of the striatum to learning, motivation, and performance: an associative account
It has long been recognized that the striatum is composed of distinct functional sub-units that are part of multiple cortico-striatal-thalamic circuits. Contemporary research has focused on the contribution of striatal sub-regions to three main phenomena: learning of associations between stimuli, actions, and rewards; selection between competing response alternatives; and motivational modulation of motor behavior. Recent proposals have argued for a functional division of the striatum along these lines, attributing, for example, learning to one region and performance to another. Here, we consider empirical data from human and animal studies, as well as theoretical notions from both the psychological and computational literatures, and conclude that striatal sub-regions instead differ most clearly in terms of the associations being encoded in each region.
Insights from the application of computational neuroimaging to social neuroscience
A recent approach in social neuroscience has been the application of formal computational models for a particular social-cognitive process to neuroimaging data. Here we review preliminary findings from this nascent subfield, focusing on observational learning and strategic interactions. We present evidence consistent with the existence of three distinct learning systems that may contribute to social cognition: an observational reward-learning system involved in updating expectations of future reward based on observing rewards obtained by others, an action-observational learning system involved in learning about the action tendencies of others, and a third system engaged when it is necessary to learn about the hidden mental states or traits of another. These three systems appear to map onto distinct neuroanatomical substrates and depend on unique computational signals.