
    A Reinforcement Learning Model of Precommitment in Decision Making

    Addiction and many other disorders are linked to impulsivity, where a suboptimal choice is preferred when it is immediately available. One solution to impulsivity is precommitment: constraining one's future to avoid being offered a suboptimal choice. A form of impulsivity can be measured experimentally by offering a choice between a smaller reward delivered sooner and a larger reward delivered later. Impulsive subjects are more likely to select the smaller-sooner choice; however, when offered an option to precommit, even impulsive subjects can precommit to the larger-later choice. To precommit or not is a decision between two conditions: (A) the original choice (smaller-sooner vs. larger-later), and (B) a new condition with only larger-later available. It has been observed that precommitment appears as a consequence of the preference reversal inherent in non-exponential delay-discounting. Here we show that most models of hyperbolic discounting cannot precommit, but a distributed model of hyperbolic discounting does precommit. Using this model, we find (1) faster discounters may be more or less likely than slow discounters to precommit, depending on the precommitment delay, (2) for a constant smaller-sooner vs. larger-later preference, a higher ratio of larger reward to smaller reward increases the probability of precommitment, and (3) precommitment is highly sensitive to the shape of the discount curve. These predictions imply that manipulations that alter the discount curve, such as diet or context, may qualitatively affect precommitment.
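
    A minimal worked example of the preference reversal that makes precommitment worthwhile under hyperbolic discounting (the amounts, delays, and discount rate below are illustrative choices, not parameters from the paper):

        # Hyperbolic discounting: value of amount A delayed by D is A / (1 + k*D).
        # Illustrative numbers only; not taken from the paper.
        def hyperbolic_value(amount, delay, k=0.1):
            """Subjective value of `amount` received after `delay` time units."""
            return amount / (1.0 + k * delay)

        ss_amount, ss_delay = 40.0, 5.0      # smaller-sooner option
        ll_amount, ll_delay = 100.0, 30.0    # larger-later option

        # Evaluated well in advance (both options pushed back by a precommitment
        # delay), larger-later is preferred ...
        pre = 50.0
        print(hyperbolic_value(ss_amount, ss_delay + pre),   # ~6.2
              hyperbolic_value(ll_amount, ll_delay + pre))   # ~11.1

        # ... but evaluated once the smaller-sooner reward is close, preference
        # reverses toward smaller-sooner.
        print(hyperbolic_value(ss_amount, ss_delay),          # ~26.7
              hyperbolic_value(ll_amount, ll_delay))          # ~25.0

        # Precommitting at the early time point removes the smaller-sooner option,
        # locking in the choice that is preferred before the reversal occurs.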

    Fast Sequences of Non-spatial State Representations in Humans

    Fast internally generated sequences of neural representations are suggested to support learning and online planning. However, these sequences have only been studied in the context of spatial tasks and never in humans. Here, we recorded magnetoencephalography (MEG) while human subjects performed a novel non-spatial reasoning task. The task required selecting paths through a set of six visual objects. We trained pattern classifiers on the MEG activity elicited by direct presentation of the visual objects alone and tested these classifiers on activity recorded during periods when no object was presented. During these object-free periods, the brain spontaneously visited representations of approximately four objects in fast sequences lasting on the order of 120 ms. These sequences followed backward trajectories along the permissible paths in the task. Thus, spontaneous fast sequential representation of states can be measured non-invasively in humans, and these sequences may be a fundamental feature of neural computation across tasks.
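
    The decoding approach can be sketched as follows; the data shapes, classifier family, and regularisation below are placeholders rather than the study's actual pipeline:

        # Sketch: train one-vs-rest classifiers on MEG patterns evoked by each of
        # six objects, then apply them to object-free periods to obtain momentary
        # evidence for each object. All shapes and settings are illustrative.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        n_trials, n_sensors, n_objects = 300, 272, 6
        X_evoked = rng.standard_normal((n_trials, n_sensors))    # sensor pattern per presentation
        y_objects = rng.integers(0, n_objects, n_trials)          # which object was shown

        classifiers = []
        for obj in range(n_objects):
            clf = LogisticRegression(penalty="l1", C=0.1, solver="liblinear")
            clf.fit(X_evoked, (y_objects == obj).astype(int))
            classifiers.append(clf)

        # Apply to object-free (rest / planning) data, timepoints x sensors:
        X_rest = rng.standard_normal((5000, n_sensors))
        probs = np.column_stack([c.predict_proba(X_rest)[:, 1] for c in classifiers])
        # probs[t, i] is the momentary evidence that object i is being spontaneously
        # represented; sequence analyses operate on these time courses.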

    The value of what’s to come: Neural mechanisms coupling prediction error and the utility of anticipation

    Having something to look forward to is a keystone of well-being. Anticipation of future reward, such as an upcoming vacation, can often be more gratifying than the experience itself. Theories suggest the utility of anticipation underpins various behaviors, ranging from beneficial information-seeking to harmful addiction. However, how neural systems compute anticipatory utility remains unclear. We analyzed the brain activity of human participants as they performed a task involving choosing whether to receive information predictive of future pleasant outcomes. Using a computational model, we show that three brain regions orchestrate anticipatory utility. Specifically, ventromedial prefrontal cortex tracks the value of anticipatory utility, dopaminergic midbrain correlates with information that enhances anticipation, while sustained hippocampal activity mediates a functional coupling between these regions. Our findings suggest a previously unidentified neural underpinning for anticipation’s influence over decision-making and unify a range of phenomena associated with risk and time-delay preference.
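
    One standard way to formalise the utility of anticipation is to let the value of a delayed reward include "savouring" accrued while waiting; the sketch below is a generic model of this kind, with an invented functional form and parameters rather than the paper's fitted model:

        # Anticipatory-utility ("savouring") sketch: total value of a delayed reward
        # = discounted consumption + weighted anticipation accrued during the wait.
        # Functional form and parameters are illustrative assumptions only.
        def value_with_anticipation(reward, delay, gamma=0.9, nu=0.3):
            consumption = (gamma ** delay) * reward
            # At each waiting step t, the chooser enjoys anticipating the future
            # reward (itself discounted by the time still remaining), and that
            # pleasure is discounted back to the present.
            anticipation = nu * sum((gamma ** t) * (gamma ** (delay - t)) * reward
                                    for t in range(delay))
            return consumption + anticipation

        # Advance information that makes a pleasant outcome predictable is what
        # enables the anticipation term, which is one way to rationalise paying
        # for non-instrumental information about future rewards.
        print(value_with_anticipation(10, 5), value_with_anticipation(10, 0))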

    Social training reconfigures prediction errors to shape Self-Other boundaries

    Selectively attributing beliefs to specific agents is core to reasoning about other people and imagining oneself in different states. Evidence suggests humans might achieve this by simulating each other’s computations in agent-specific neural circuits, but it is not known how circuits become agent-specific. Here we investigate whether agent-specificity adapts to social context. We train subjects on social learning tasks, manipulating the frequency with which self and other see the same information. Training alters the agent-specificity of prediction error (PE) circuits for at least 24 h, modulating the extent to which another agent’s PE is experienced as one’s own and influencing perspective-taking in an independent task. Ventromedial prefrontal myelin density, indexed by magnetisation transfer, correlates with the strength of this adaptation. We describe a frontotemporal learning network, which exploits relationships between different agents’ computations. Our findings suggest that Self-Other boundaries are learnable variables, shaped by the statistical structure of social experience.
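
    As a toy illustration of agent-specific prediction errors becoming shared, one can let another agent's prediction error leak into one's own value update with a sharing weight; everything below (update rule, parameters, the sharing weight itself) is an illustrative assumption, not the model fitted in the paper:

        # Toy: two Rescorla-Wagner learners, with the other agent's prediction
        # error (PE) partially driving the self update. The `sharing` weight is a
        # hypothetical stand-in for agent-specificity; in the study this is what
        # training on co-observed outcomes is proposed to modulate.
        def rw_update(value, outcome, alpha=0.2):
            pe = outcome - value
            return value + alpha * pe, pe

        v_self, v_other, alpha, sharing = 0.0, 0.0, 0.2, 0.5

        for self_outcome, other_outcome in [(1, 1), (1, 0), (0, 0), (1, 1)]:
            v_other, pe_other = rw_update(v_other, other_outcome, alpha)
            v_self, pe_self = rw_update(v_self, self_outcome, alpha)
            v_self += alpha * sharing * pe_other   # other's PE experienced as one's own

        print(round(v_self, 3), round(v_other, 3))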

    Social discounting of pain

    Impatience can be formalized as a delay discount rate, describing how the subjective value of reward decreases as it is delayed. By analogy, selfishness can be formalized as a social discount rate, representing how the subjective value of rewarding another person decreases with increasing social distance. Delay and social discount rates for reward are correlated across individuals. However, no previous work has examined whether this relationship also holds for aversive outcomes. Neither has previous work described a functional form for social discounting of pain in humans. This is a pertinent question, since preferences over aversive outcomes formally diverge from those for reward. We addressed this issue in an experiment in which healthy adult participants (N = 67) chose the timing and intensity of hypothetical pain for themselves and others. In keeping with previous studies, participants showed a strong preference for immediate over delayed pain. Participants showed greater concern for pain in close others than for their own pain, though this hyperaltruism was steeply discounted with increasing social distance. Impatience for pain and social discounting of pain were weakly correlated across individuals. Our results extend a link between impatience and selfishness to the aversive domain.
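
    Social discounting of reward is conventionally fitted with a hyperbolic function of social distance, v(N) = V / (1 + k·N); a minimal fitting sketch follows, with fabricated data, as one way such a functional form could be estimated for pain:

        # Fit v(N) = V / (1 + k*N), where N is social distance (e.g., rank on a
        # closeness scale), V is the value at zero distance and k the social
        # discount rate. The data points below are fabricated for illustration.
        import numpy as np
        from scipy.optimize import curve_fit

        def social_discount(N, V, k):
            return V / (1.0 + k * N)

        distances = np.array([1, 2, 5, 10, 20, 50, 100], dtype=float)
        valuations = np.array([120, 95, 60, 40, 22, 10, 6], dtype=float)

        (V_hat, k_hat), _ = curve_fit(social_discount, distances, valuations, p0=(100.0, 0.05))
        print(f"V = {V_hat:.1f}, k = {k_hat:.3f}")
        # A V exceeding the value placed on one's own outcome would correspond to
        # "hyperaltruism" at close distances; k indexes how steeply concern falls
        # off with social distance.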

    Dreading the pain of others? Altruistic responses to others' pain underestimate dread

    A dislike of waiting for pain, aptly termed 'dread', is so great that people will increase pain to avoid delaying it. However, despite many accounts of altruistic responses to pain in others, no previous studies have tested whether people take delay into account when attempting to ameliorate others' pain. We examined the impact of delay in two experiments in which participants (total N = 130) specified the intensity and delay of pain either for themselves or for another person. Participants were willing to increase the experimental pain of another participant to avoid delaying it, indicative of dread, though they did so to a lesser extent than was the case for their own pain. We observed a similar attenuation in dread when participants chose the timing of a hypothetical painful medical treatment for a close friend or relative, but no such attenuation when participants chose for a more distant acquaintance. A model in which altruism is biased to privilege pain intensity over the dread of pain parsimoniously accounts for these findings. We refer to this underestimation of others' dread as a 'Dread Empathy Gap'.
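
    One way to express the proposed account is a disutility of delayed pain that adds a dread term growing with the wait, down-weighted when choosing for someone else; this is a hedged sketch with invented parameters, not the authors' fitted model:

        # Disutility of a pain of given intensity delivered after a delay, with a
        # dread term that accumulates over the waiting period. Down-weighting dread
        # when choosing for another person ("Dread Empathy Gap") is modelled by a
        # single scaling factor. All numbers are illustrative assumptions.
        def pain_disutility(intensity, delay, dread_rate=0.05, dread_weight=1.0):
            dread = dread_weight * dread_rate * delay * intensity
            return intensity + dread

        # Choosing for oneself (full dread): a larger immediate pain (8) is preferred
        # to a smaller pain (5) delayed by 30 units.
        print(pain_disutility(8, delay=0), pain_disutility(5, delay=30))        # 8.0 vs 12.5

        # Choosing for another person (dread under-weighted): the intensity term
        # dominates and the smaller delayed pain now looks better.
        print(pain_disutility(8, delay=0, dread_weight=0.3),
              pain_disutility(5, delay=30, dread_weight=0.3))                   # 8.0 vs 7.25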

    Human Replay Spontaneously Reorganizes Experience

    Knowledge abstracted from previous experiences can be transferred to aid new learning. Here, we asked whether such abstract knowledge immediately guides the replay of new experiences. We first trained participants on a rule defining an ordering of objects and then presented a novel set of objects in a scrambled order. Across two studies, we observed that representations of these novel objects were reactivated during a subsequent rest. As in rodents, human "replay" events occurred in sequences accelerated in time, compared to actual experience, and reversed their direction after a reward. Notably, replay did not simply recapitulate visual experience, but instead followed a sequence implied by learned abstract knowledge. Furthermore, each replay contained more than sensory representations of the relevant objects. A sensory code of object representations was preceded, 50 ms earlier, by a code factorized into sequence position and sequence identity. We argue that this factorized representation facilitates the generalization of a previously learned structure to new objects.
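
    A toy version of a factorised code, in which an item's pattern is the sum of a sequence-position component, a sequence-identity component, and a sensory component, illustrates why structure learned on one set of objects can transfer to novel ones (the additive form, dimensions, and random codes are illustrative assumptions, not the study's analysis):

        # Toy factorised representation: pattern = position code + sequence code +
        # object (sensory) code. Because the position and sequence components are
        # shared across objects, position can be read out even for a novel object.
        import numpy as np

        rng = np.random.default_rng(0)
        dim = 50
        pos_codes = rng.standard_normal((4, dim))   # 4 sequence positions
        seq_codes = rng.standard_normal((2, dim))   # 2 sequence identities
        obj_codes = rng.standard_normal((8, dim))   # 8 objects (sensory codes)

        def pattern(position, sequence, obj):
            return pos_codes[position] + seq_codes[sequence] + obj_codes[obj]

        # The pattern for a novel object still correlates most with the correct
        # position code, which is the sense in which learned structure generalises.
        p = pattern(position=2, sequence=1, obj=7)
        corr = [np.corrcoef(p, pos_codes[i])[0, 1] for i in range(4)]
        print(int(np.argmax(corr)))   # expected: 2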

    Temporally delayed linear modelling (TDLM) measures replay in both animals and humans

    There are rich structures in off-task neural activity that are hypothesised to reflect fundamental computations across a broad spectrum of cognitive functions. Here, we develop an analysis toolkit, temporally delayed linear modelling (TDLM), for analysing such activity. TDLM is a domain-general method for finding neural sequences that respect a pre-specified transition graph. It combines nonlinear classification and linear temporal modelling to test for statistical regularities in sequences of task-related reactivations. TDLM was developed for non-invasive neuroimaging data and is designed to control for confounds and maximize sequence detection ability. Notably, as a linear framework, TDLM can be easily extended, without loss of generality, to capture rodent replay in electrophysiology, including in continuous spaces, as well as to address second-order inference questions, e.g., temporally and spatially varying patterns of replay. We hope TDLM will advance a deeper understanding of neural computation and promote a richer convergence between animal and human neuroscience.
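
    The core of the method can be sketched as two stacked regressions: a first-level GLM estimating, at each time lag, how well each state's decoded time course predicts every other state's, and a second-level regression asking how much those empirical "transitions" resemble the task graph run forwards versus backwards. The sketch below is a stripped-down illustration with only minimal confound regressors, not a substitute for the published toolkit:

        # Simplified TDLM sketch. `probs` is a (timepoints x n_states) array of
        # decoded state reactivation probabilities; `T` is the task transition
        # matrix (T[i, j] = 1 if state i can lead to state j).
        import numpy as np

        def sequenceness(probs, T, max_lag=60):
            n_t, n_states = probs.shape
            fwd, bwd = [], []
            design = np.column_stack([T.flatten(),                    # forward transitions
                                      T.T.flatten(),                  # backward transitions
                                      np.eye(n_states).flatten(),     # self-transitions (autocorrelation)
                                      np.ones(n_states * n_states)])  # constant
            for lag in range(1, max_lag + 1):
                # First-level GLM: predict each state's time course from every
                # state's time course shifted by `lag` samples.
                X, Y = probs[:-lag], probs[lag:]
                beta, *_ = np.linalg.lstsq(X, Y, rcond=None)          # empirical state-to-state weights
                # Second-level GLM: project the empirical weights onto the task graph.
                coefs, *_ = np.linalg.lstsq(design, beta.flatten(), rcond=None)
                fwd.append(coefs[0])
                bwd.append(coefs[1])
            # Forward-minus-backward "sequenceness" as a function of lag; significance
            # is typically assessed against permutations of the state identities.
            return np.array(fwd) - np.array(bwd)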

    Discounting Future Reward in an Uncertain World

    Humans discount delayed relative to more immediate reward. A plausible explanation is that impatience arises partly from uncertainty, or risk, implicit in delayed reward. Existing theories of discounting-as-risk focus on a probability that delayed reward will not materialize. By contrast, we examine how uncertainty in the magnitude of delayed reward contributes to delay discounting. We propose a model wherein reward is discounted proportional to the rate of random change in its magnitude across time, termed volatility. We find evidence to support this model across three experiments (total N = 158). First, using a task where participants chose when to sell products, whose price dynamics they had previously learned, we show discounting increases in line with price volatility. Second, we show that this effect pertains over naturalistic delays of up to 4 months. Using functional magnetic resonance imaging, we observe a volatility-dependent decrease in functional hippocampal–prefrontal coupling during intertemporal choice. Third, we replicate these effects in a larger online sample, finding that volatility discounting within each task correlates with baseline discounting outside of the task. We conclude that delay discounting partly reflects time-dependent uncertainty about reward magnitude, that is, volatility. Our model captures how discounting adapts to volatility, thereby partly accounting for individual differences in impatience. Our imaging findings suggest a putative mechanism whereby uncertainty reduces prospective simulation of future outcomes.
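
    The core claim, that volatility in a delayed reward's magnitude produces discounting, can be illustrated with a small Monte Carlo simulation under assumed parameters and an assumed concave (risk-averse) utility; none of these numbers come from the paper:

        # If a reward's magnitude follows a random walk with per-step volatility
        # sigma, a risk-averse chooser's certainty equivalent for the delayed reward
        # falls with both delay and sigma, so volatility behaves like a discount rate.
        # Utility function and parameters are illustrative assumptions.
        import numpy as np

        rng = np.random.default_rng(1)

        def certainty_equivalent(amount, delay, sigma, n_sim=100_000):
            final = amount + sigma * np.sqrt(delay) * rng.standard_normal(n_sim)
            final = np.clip(final, 0, None)        # magnitudes cannot go negative
            utility = np.sqrt(final)               # concave utility -> risk aversion
            return np.mean(utility) ** 2           # invert the utility to get a CE

        for sigma in (0.0, 2.0, 5.0):
            print(sigma, [round(certainty_equivalent(100, d, sigma), 1)
                          for d in (0, 30, 120)])
        # Higher sigma -> the same delayed reward loses subjective value more steeply
        # with delay, mimicking a steeper discount curve.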