Observational learning computations in neurons of the human anterior cingulate cortex
When learning from direct experience, neurons in the primate brain have been shown to encode a teaching signal used by algorithms in artificial intelligence: the reward prediction error (PE)—the difference between how rewarding an event is and how rewarding it was expected to be. However, in humans and other species, learning often takes place by observing other individuals. Here, we show that, when humans observe other players in a card game, neurons in their rostral anterior cingulate cortex (rACC) encode both the expected value of an observed choice and the PE after the outcome is revealed. Notably, during the same task, neurons recorded in the amygdala (AMY) and the rostromedial prefrontal cortex (rmPFC) do not exhibit this type of encoding. Our results suggest that humans learn by observing others, at least in part, through the encoding of observational PEs in single neurons in the rACC.
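The PE described here follows the standard delta-rule form used in reinforcement learning. Below is a minimal sketch, assuming a simple Rescorla-Wagner-style update; the function name, learning rate, and example values are illustrative and not taken from the study.

```python
# Minimal sketch of a reward prediction error (PE) and value update.
# PE = outcome - expectation; values and learning rate are illustrative.

def update_value(expected_value: float, reward: float, learning_rate: float = 0.1):
    """Return the PE and the updated value estimate."""
    prediction_error = reward - expected_value                 # the PE
    new_value = expected_value + learning_rate * prediction_error
    return prediction_error, new_value

# Example: an observed choice was expected to pay 0.5 but paid 1.0.
pe, v = update_value(expected_value=0.5, reward=1.0)
print(pe, v)  # 0.5, 0.55
```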
Counterfactual choice and learning in a neural network centred on human lateral frontopolar cortex
Decision making and learning in a real-world context require organisms to track not only the choices they make and the outcomes that follow but also other untaken, or counterfactual, choices and their outcomes. Although the neural system responsible for tracking the value of choices actually taken is increasingly well understood, whether a neural system tracks counterfactual information is currently unclear. Using a three-alternative decision-making task, a Bayesian reinforcement-learning algorithm, and fMRI, we investigated the coding of counterfactual choices and prediction errors in the human brain. Rather than representing evidence favoring multiple counterfactual choices, lateral frontal polar cortex (lFPC), dorsomedial frontal cortex (DMFC), and posteromedial cortex (PMC) encode the reward-based evidence favoring the best counterfactual option at future decisions. In addition to encoding counterfactual reward expectations, the network carries a signal for learning about counterfactual options when feedback is available: a counterfactual prediction error. Unlike other brain regions that have been associated with the processing of counterfactual outcomes, counterfactual prediction errors within the identified network cannot be related to regret theory. Furthermore, individual variation in counterfactual choice-related activity and prediction error-related activity predicts, respectively, variation in the propensity to switch to profitable choices in the future and the ability to learn from hypothetical feedback. Taken together, these data provide both neural and behavioral evidence to support the existence of a previously unidentified neural system responsible for tracking both counterfactual choice options and their outcomes.
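As a rough illustration of the counterfactual prediction error idea (not the paper's Bayesian reinforcement-learning model), the sketch below tracks values for three options and, when feedback for unchosen options is available, updates them with their own PEs. All names, the learning rate, and the outcomes are invented.

```python
# Illustrative sketch: factual and counterfactual PEs in a
# three-alternative task. Parameters and data are made up.

import numpy as np

values = np.full(3, 0.5)   # value estimates for three options
alpha = 0.1                # hypothetical learning rate

def update(chosen: int, outcomes: np.ndarray, counterfactual_feedback: bool) -> int:
    """Update chosen (and, if shown, unchosen) option values; return the
    best counterfactual option, i.e. the highest-valued unchosen one."""
    pe = outcomes[chosen] - values[chosen]            # factual PE
    values[chosen] += alpha * pe
    if counterfactual_feedback:
        for opt in range(len(values)):
            if opt != chosen:
                cpe = outcomes[opt] - values[opt]     # counterfactual PE
                values[opt] += alpha * cpe
    unchosen = [o for o in range(len(values)) if o != chosen]
    return max(unchosen, key=lambda o: values[o])

best_cf = update(chosen=0, outcomes=np.array([0.0, 1.0, 0.2]),
                 counterfactual_feedback=True)
```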
Value, search, persistence and model updating in anterior cingulate cortex
Dorsal anterior cingulate cortex (dACC) carries a wealth of value-related information necessary for regulating behavioral flexibility and persistence. It signals error and reward events, informing decisions about switching from or staying with current behavior. During decision-making, it encodes the average value of exploring alternative choices (search value), even after controlling for response-selection difficulty, and during learning, it encodes the degree to which internal models of the environment and current task must be updated. dACC value signals are derived in part from the history of recent reward, integrated simultaneously over multiple time scales, thereby enabling comparison of experience over the recent and extended past. Such ACC signals may instigate attentionally demanding and difficult processes, such as behavioral change, via interactions with prefrontal cortex. However, the signal in dACC that instigates behavioral change need not itself be a conflict or difficulty signal.
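Integrating reward history over multiple time scales can be sketched with leaky integrators that decay at different rates. This is a hedged illustration of the idea rather than the circuit-level computation, and the time constants are invented.

```python
# Sketch: reward history tracked simultaneously at several time scales
# via leaky integrators. Time constants (in trials) are illustrative.

taus = [2.0, 10.0, 50.0]              # hypothetical time constants
traces = [0.0 for _ in taus]          # one running average per time scale

def integrate(reward: float) -> list:
    """Update each leaky integrator with the latest reward."""
    for i, tau in enumerate(taus):
        decay = 1.0 - 1.0 / tau
        traces[i] = decay * traces[i] + (1.0 / tau) * reward
    return traces

for r in [1, 0, 1, 1, 0]:             # toy reward sequence
    integrate(r)
# Comparing fast vs. slow traces contrasts recent with extended reward history.
```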
Testing for Fictive Learning in Decision-Making Under Uncertainty
We conduct two experiments where subjects make a sequence of binary choices between risky and ambiguous binary lotteries. Risky lotteries are defined as lotteries where the relative frequencies of outcomes are known. Ambiguous lotteries are lotteries where the relative frequencies of outcomes are not known or may not exist. The trials in each experiment are divided into three phases: pre-treatment, treatment, and post-treatment. The trials in the pre-treatment and post-treatment phases are the same. As such, the trials before and after the treatment phase are dependent, clustered matched pairs, which we analyze with the alternating logistic regression (ALR) package in SAS. In both experiments, we reveal to each subject the outcomes of her actual and counterfactual choices in the treatment phase. The treatments differ in the complexity of the random process used to generate the relative frequencies of the payoffs of the ambiguous lotteries. In the first experiment, the probabilities can be inferred from the converging sample averages of the observed actual and counterfactual outcomes of the ambiguous lotteries. In the second experiment, the sample averages do not converge. If we define fictive learning in an experiment as statistically significant changes in the responses of subjects before and after the treatment phase, then we expect fictive learning in the first experiment but no fictive learning in the second. The surprising finding in this paper is the presence of fictive learning in the second experiment. We attribute this counterintuitive result to apophenia: “seeing meaningful patterns in meaningless or random data.” A refinement of this result is the inference, from a subsequent chi-squared test, that the effects of fictive learning in the first experiment are significantly different from the effects of fictive learning in the second experiment.
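The operational definition of fictive learning above amounts to a pre/post comparison of choice frequencies. A simplified sketch follows, ignoring the matched-pair clustering that the authors' alternating logistic regression handles and using made-up counts.

```python
# Simplified pre/post test of the fictive-learning definition above.
# The paper uses ALR in SAS to respect clustering; this sketch uses an
# ordinary chi-squared test on invented counts and ignores clustering.

from scipy.stats import chi2_contingency

# rows: pre-treatment, post-treatment; columns: risky, ambiguous choices
counts = [[60, 40],
          [45, 55]]
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2={chi2:.2f}, p={p:.3f}")  # small p suggests responses changed
```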
Two Anatomically and Computationally Distinct Learning Signals Predict Changes to Stimulus-Outcome Associations in Hippocampus
Contributions of Ventromedial Prefrontal and Frontal Polar Cortex to Reinforcement Learning and Value-Based Choice
Conceptual Representation and the Making of New Decisions
A key feature of an adaptive decision-making mechanism is its ability to guide behavior even in new situations. In this issue of Neuron, Kumaran et al. report that conceptual representations, which allow generalization from one situation to another through their shared features, can guide decisions via the hippocampus even when new problems are encountered.
Inferences on a Multidimensional Social Hierarchy Use a Grid-like Code
Generalizing experiences to guide decision making in novel situations is a hallmark of flexible behavior. It has been hypothesized that such flexibility depends on a cognitive map of an environment or task, but directly linking the two has proven elusive. Here, we find that discretely sampled abstract relationships between entities in an unseen two-dimensional (2-D) social hierarchy are reconstructed into a unitary 2-D cognitive map in the hippocampus and entorhinal cortex. We further show that humans utilize a grid-like code in several brain regions, including entorhinal cortex and medial prefrontal cortex, for inferred direct trajectories between entities in the reconstructed abstract space during discrete decisions. Moreover, these neural grid-like codes in the entorhinal cortex are associated with neural decision-value computations in the medial prefrontal cortex and temporoparietal junction area during choice. Collectively, these findings show that grid-like codes are used by the human brain to infer novel solutions, even in abstract and discrete problems, and suggest a general mechanism underpinning flexible decision making and generalization.
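Grid-like codes in fMRI are commonly detected via six-fold (hexadirectional) modulation of activity by trajectory direction. Below is a minimal sketch of such a regressor, with invented trajectory directions and grid orientation; it illustrates the standard analysis rather than this study's exact pipeline.

```python
# Sketch of a hexadirectional (six-fold) grid-code regressor.
# Directions and grid orientation are made up for illustration.

import numpy as np

thetas = np.random.uniform(0, 2 * np.pi, size=200)  # inferred trajectory directions
phi = np.deg2rad(15)                                # hypothetical grid orientation

# Activity with a grid-like code should covary with cos(6 * (theta - phi)).
regressor = np.cos(6 * (thetas - phi))

# In practice, phi is estimated on one half of the data (via sin/cos
# regressors) and the cosine regressor is tested on the held-out half.
```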
