51 research outputs found
A Reinforcement Learning Model of Precommitment in Decision Making
Addiction and many other disorders are linked to impulsivity, where a suboptimal choice is preferred when it is immediately available. One solution to impulsivity is precommitment: constraining one's future to avoid being offered a suboptimal choice. A form of impulsivity can be measured experimentally by offering a choice between a smaller reward delivered sooner and a larger reward delivered later. Impulsive subjects are more likely to select the smaller-sooner choice; however, when offered an option to precommit, even impulsive subjects can precommit to the larger-later choice. To precommit or not is a decision between two conditions: (A) the original choice (smaller-sooner vs. larger-later), and (B) a new condition with only larger-later available. It has been observed that precommitment appears as a consequence of the preference reversal inherent in non-exponential delay-discounting. Here we show that most models of hyperbolic discounting cannot precommit, but a distributed model of hyperbolic discounting does precommit. Using this model, we find that (1) faster discounters may be more or less likely than slow discounters to precommit, depending on the precommitment delay, (2) for a constant smaller-sooner vs. larger-later preference, a higher ratio of larger reward to smaller reward increases the probability of precommitment, and (3) precommitment is highly sensitive to the shape of the discount curve. These predictions imply that manipulations that alter the discount curve, such as diet or context, may qualitatively affect precommitment.
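The preference reversal mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook hyperbolic form V = r / (1 + k·d) and an exponential form V = r·exp(−k·d); the reward sizes, delays, and rate k below are illustrative values, not taken from the paper, and the sketch only compares discounted values. It does not implement the paper's reinforcement learning model or its distributed discounting.

```python
import math

def hyperbolic(reward, delay, k=1.0):
    """Textbook hyperbolic discounting: V = r / (1 + k * d)."""
    return reward / (1.0 + k * delay)

def exponential(reward, delay, k=1.0):
    """Exponential discounting: V = r * exp(-k * d)."""
    return reward * math.exp(-k * delay)

# Illustrative (assumed) options: (reward, delay from the choice point).
SMALLER_SOONER = (4.0, 1.0)
LARGER_LATER = (10.0, 5.0)

def prefers_larger_later(discount, added_delay):
    """True if larger-later has the higher discounted value when both
    options are pushed further into the future by added_delay."""
    r_s, d_s = SMALLER_SOONER
    r_l, d_l = LARGER_LATER
    return discount(r_l, d_l + added_delay) > discount(r_s, d_s + added_delay)

for name, fn in (("hyperbolic", hyperbolic), ("exponential", exponential)):
    for added_delay in (0.0, 10.0):
        choice = "larger-later" if prefers_larger_later(fn, added_delay) else "smaller-sooner"
        print(f"{name:11s} added delay {added_delay:4.1f}: prefers {choice}")
```

Under these assumed values, the hyperbolic valuation switches preference once both options are far enough away, while the exponential valuation keeps the same ordering at every added delay, because adding a common delay multiplies both exponential values by the same factor.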
Fast Sequences of Non-spatial State Representations in Humans
Fast internally generated sequences of neural representations are suggested to support learning and online planning. However, these sequences have only been studied in the context of spatial tasks and never in humans. Here, we recorded magnetoencephalography (MEG) while human subjects performed a novel non-spatial reasoning task. The task required selecting paths through a set of six visual objects. We trained pattern classifiers on the MEG activity elicited by direct presentation of the visual objects alone and tested these classifiers on activity recorded during periods when no object was presented. During these object-free periods, the brain spontaneously visited representations of approximately four objects in fast sequences lasting on the order of 120 ms. These sequences followed backward trajectories along the permissible paths in the task. Thus, spontaneous fast sequential representation of states can be measured non-invasively in humans, and these sequences may be a fundamental feature of neural computation across tasks.
Generative replay underlies compositional inference in the hippocampal-prefrontal circuit
Human reasoning depends on reusing pieces of information by putting them together in new ways. However, very little is known about how compositional computation is implemented in the brain. Here, we ask participants to solve a series of problems that each require constructing a whole from a set of elements. With fMRI, we find that representations of novel constructed objects in the frontal cortex and hippocampus are relational and compositional. With MEG, we find that replay assembles elements into compounds, with each replay sequence constituting a hypothesis about a possible configuration of elements. The content of sequences evolves as participants solve each puzzle, progressing from predictable to uncertain elements and gradually converging on the correct configuration. Together, these results suggest a computational bridge between apparently distinct functions of hippocampal-prefrontal circuitry and a role for generative replay in compositional inference and hypothesis testing.
Distributional reinforcement learning in prefrontal cortex
The prefrontal cortex is crucial for learning and decision-making. Classic reinforcement learning (RL) theories center on learning the expectation of potential rewarding outcomes and explain a wealth of neural data in the prefrontal cortex. Distributional RL, on the other hand, learns the full distribution of rewarding outcomes and better explains dopamine responses. In the present study, we show that distributional RL also better explains macaque anterior cingulate cortex neuronal responses, suggesting that it is a common mechanism for reward-guided learning.
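The contrast between learning an expectation and learning a distribution can be sketched as follows. This is a minimal illustration of one common formulation from the distributional RL literature, in which value units scale positive and negative prediction errors differently and so settle near different expectiles of the reward distribution. The abstract does not state which formulation the study used, and the function name, asymmetry values `taus`, reward set, and learning rate here are all assumptions made for illustration.

```python
import random

def train_value_units(taus, rewards, alpha=0.01, steps=20000, seed=0):
    """Train one value estimate per asymmetry parameter tau.

    Each unit scales positive prediction errors by tau and negative ones
    by (1 - tau). With tau = 0.5 the unit tracks the mean reward (classic
    RL); other values of tau settle above or below the mean.
    """
    rng = random.Random(seed)
    values = [0.0] * len(taus)
    for _ in range(steps):
        reward = rng.choice(rewards)  # sample an outcome uniformly
        for i, tau in enumerate(taus):
            delta = reward - values[i]              # prediction error
            scale = tau if delta > 0 else 1.0 - tau  # asymmetric learning rate
            values[i] += alpha * scale * delta
    return values

# Assumed bimodal reward distribution: small or large reward, equally likely.
REWARDS = [1.0, 10.0]
TAUS = [0.1, 0.5, 0.9]  # pessimistic, neutral, optimistic units

learned = train_value_units(TAUS, REWARDS)
for tau, value in zip(TAUS, learned):
    print(f"tau={tau:.1f} -> learned value {value:.2f}")
```

Taken together, the spread of learned values across units carries information about the shape of the reward distribution, whereas a single unit with tau = 0.5 only recovers its mean.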
Hippocampal-midbrain circuit enhances the pleasure of anticipation in the prefrontal cortex
Having something to look forward to is a keystone of well-being. Anticipation of a future reward, like an upcoming vacation, can be more gratifying than the experience of reward itself. Theories of anticipation have described how it causes behaviors ranging from beneficial information-seeking to harmful addiction. Here, we investigated how the brain generates and enhances anticipatory pleasure by analyzing brain activity of human participants who received information predictive of future pleasant outcomes in a decision-making task. Using a computational model of anticipation, we show that three regions orchestrate anticipatory pleasure: the ventromedial prefrontal cortex (vmPFC) tracks the value of anticipation, the dopaminergic midbrain responds to information that enhances anticipation, and sustained activity in the hippocampus provides functional coupling between these regions. This coordinating role for the hippocampus is consistent with its known role in the vivid imagination of future outcomes. Our findings throw new light on the neural underpinnings of how anticipation influences decision-making, while also unifying a range of phenomena associated with risk and time-delay preference.
Dreading the pain of others? Altruistic responses to others' pain underestimate dread
A dislike of waiting for pain, aptly termed ‘dread’, is so great that people will increase pain to avoid delaying it. However, despite many accounts of altruistic responses to pain in others, no previous studies have tested whether people take delay into account when attempting to ameliorate others' pain. We examined the impact of delay in two experiments where participants (total N = 130) specified the intensity and delay of pain either for themselves or for another person. Participants were willing to increase the experimental pain of another participant to avoid delaying it, indicative of dread, though they did so to a lesser extent than for their own pain. We observed a similar attenuation in dread when participants chose the timing of a hypothetical painful medical treatment for a close friend or relative, but no such attenuation when participants chose for a more distant acquaintance. A model in which altruism is biased to privilege pain intensity over the dread of pain parsimoniously accounts for these findings. We refer to this underestimation of others' dread as a ‘Dread Empathy Gap’.