Examinations of Biases by Model Misspecification and Parameter Reliability of Reinforcement Learning Models
Reinforcement learning models have the potential to clarify meaningful individual differences in the decision-making process. This study focused on two aspects regarding the nature of a reinforcement learning model and its parameters: the problems of model misspecification and reliability. Online participants, N = 453, completed self-report measures and a probabilistic learning task twice 1.5 months apart, and data from the task were fitted using several reinforcement learning models. To address the problem of model misspecification, we compared the models with and without the influence of choice history, or perseveration. Results showed that the lack of a perseveration term in the model led to a decrease in learning rates for win and loss outcomes, with slightly different influences depending on outcome volatility, and increases in inverse temperature. We also conducted simulations to examine the mechanism of the observed biases and revealed that failure to incorporate perseveration directly affected the estimation bias in the learning rate and indirectly affected that in inverse temperature. Furthermore, in both model fittings and model simulations, the lack of perseveration caused win-stay probability underestimation and loss-shift probability overestimation. We also assessed the parameter reliability. Test-retest reliabilities were poor (learning rates) to moderate (inverse temperature and perseveration magnitude). A learning effect was noted in the inverse temperature and perseveration magnitude parameters, showing an increment of the estimates in the second session. We discuss possible misinterpretations of results and limitations considering the estimation biases and parameter reliability.
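To make the roles of the parameters named in this abstract concrete, the following is a minimal sketch of one common way such a model is written. It is not the authors' specification: the function names, the single-trial perseveration bonus, and the rule that any reward above zero counts as a win are all assumptions made for illustration.

```python
import math

def choice_probs(q_values, prev_choice, beta, phi):
    """Softmax over action values plus a perseveration bonus.

    beta: inverse temperature (assumed to scale the action values).
    phi:  perseveration magnitude (assumed to be added to the logit of
          the option chosen on the previous trial; hypothetical form).
    """
    logits = [
        beta * q + (phi if i == prev_choice else 0.0)
        for i, q in enumerate(q_values)
    ]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update_q(q_values, choice, reward, alpha_win, alpha_loss):
    """Delta-rule update with separate learning rates for wins and losses.

    Assumption: reward > 0 is treated as a win, anything else as a loss.
    """
    alpha = alpha_win if reward > 0 else alpha_loss
    new_q = list(q_values)
    new_q[choice] += alpha * (reward - new_q[choice])
    return new_q

# Example: with equal action values, only perseveration separates the options.
probs = choice_probs([0.0, 0.0], prev_choice=0, beta=3.0, phi=1.0)
q_after_win = update_q([0.5, 0.5], choice=0, reward=1.0, alpha_win=0.4, alpha_loss=0.2)
q_after_loss = update_q([0.5, 0.5], choice=0, reward=0.0, alpha_win=0.4, alpha_loss=0.2)
```

In this sketch, setting `phi` to zero gives the model without a perseveration term. Any tendency to repeat choices in the data then has to be absorbed by the remaining parameters, which is the kind of misspecification the abstract examines.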
Retrospective surprise: A computational component for active inference
In the free energy principle (FEP), proposed by Friston, it is supposed that agents seek to minimize the “surprise”–the negative log (marginal) likelihood of observations (i.e., sensory stimuli)–given the agents' current belief. This is achieved by minimizing the free energy, which provides an upper bound on the surprise. The FEP has been applied to action selection in a framework called “active inference,” where agents are supposed to select an action so that they minimize the “expected free energy” (EFE). While the FEP and active inference have attracted the attention of researchers in a wide range of fields such as psychology and psychiatry, as well as neuroscience, it is not clear which psychological construct EFE is related to. To facilitate the discussion and interpretation of psychological processes underlying active inference, we introduce a computational component termed the “retrospective (or residual) surprise,” which is the surprise of an observation after updating the belief given the observation itself. We show that the predicted retrospective surprise (PRS) provides a lower bound on EFE: EFE is always larger than PRS. We illustrate the properties of EFE and PRS using examples of inference for a binary hidden cause given a binary observation. Essentially, EFE and PRS show similar behavior; however, in certain situations, they provide different predictions regarding action selection. This study also provides insights into the mechanism of active inference based on EFE.
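The quantities defined in this abstract can be illustrated for a binary hidden cause and a binary observation. The sketch below reads "retrospective surprise" as the surprise of the same observation recomputed under the Bayes-updated belief. That reading, the function names, and the numerical values are assumptions for illustration, not the paper's implementation; EFE and PRS themselves are not computed here.

```python
import math

def surprise(belief, likelihood, obs):
    """Negative log marginal likelihood of obs under the current belief.

    belief[s]         : probability of hidden cause s
    likelihood[s][obs]: probability of observation obs given cause s
    """
    p_obs = sum(belief[s] * likelihood[s][obs] for s in range(len(belief)))
    return -math.log(p_obs)

def posterior(belief, likelihood, obs):
    """Bayes update of the belief after seeing obs."""
    joint = [belief[s] * likelihood[s][obs] for s in range(len(belief))]
    z = sum(joint)
    return [j / z for j in joint]

def retrospective_surprise(belief, likelihood, obs):
    """Surprise of obs re-evaluated under the updated belief (assumed reading)."""
    return surprise(posterior(belief, likelihood, obs), likelihood, obs)

def free_energy(q, prior, likelihood, obs):
    """Variational free energy: sum_s q(s) [ln q(s) - ln p(obs, s)]."""
    return sum(
        q[s] * (math.log(q[s]) - math.log(prior[s] * likelihood[s][obs]))
        for s in range(len(q))
        if q[s] > 0
    )

# Hypothetical example values: two hidden causes, two observations.
prior = [0.5, 0.5]
likelihood = [[0.9, 0.1], [0.2, 0.8]]
obs = 0

s0 = surprise(prior, likelihood, obs)
rs = retrospective_surprise(prior, likelihood, obs)
f_prior = free_energy(prior, prior, likelihood, obs)
f_post = free_energy(posterior(prior, likelihood, obs), prior, likelihood, obs)
```

With these values the free energy evaluated at the prior belief exceeds the surprise, and it equals the surprise when the belief is the exact posterior, which is the upper-bound relation stated in the abstract. The retrospective surprise is no larger than the original surprise, because updating on an observation cannot make that same observation less expected.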