
    Parameter estimation in softmax decision-making models with linear objective functions

    With an eye towards human-centered automation, we contribute to the development of a systematic means to infer features of human decision-making from behavioral data. Motivated by the common use of softmax selection in models of human decision-making, we study the maximum likelihood parameter estimation problem for softmax decision-making models with linear objective functions. We present conditions under which the likelihood function is convex. These allow us to provide sufficient conditions for convergence of the resulting maximum likelihood estimator and to construct its asymptotic distribution. In the case of models with nonlinear objective functions, we show how the estimator can be applied by linearizing about a nominal parameter value. We apply the estimator to fit the stochastic UCL (Upper Credible Limit) model of human decision-making to human subject data. We show statistically significant differences in behavior across related, but distinct, tasks.
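
    For a linear objective, the problem described above reduces to a conditional-logit-style likelihood that is convex in the parameters, so numerical maximum likelihood is straightforward. The sketch below illustrates the idea with a generic softmax choice model; the feature layout, option count, and optimizer choice are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of maximum likelihood estimation for a softmax choice model
# with a linear objective f_i(theta) = theta^T x_i (conditional-logit form).
# The variable names and synthetic data are illustrative, not from the paper.
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(theta, features, choices):
    """features: (T, K, d) array -- T trials, K options, d-dim feature vectors.
    choices: (T,) array of chosen option indices."""
    utilities = features @ theta                       # (T, K) linear objectives
    utilities -= utilities.max(axis=1, keepdims=True)  # stabilize the softmax
    log_probs = utilities - np.log(np.exp(utilities).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(choices)), choices].sum()

def fit_softmax(features, choices, d):
    """Convex in theta for linear objectives, so a local optimum is global."""
    result = minimize(neg_log_likelihood, x0=np.zeros(d),
                      args=(features, choices), method="BFGS")
    return result.x

# Toy usage with synthetic choices drawn from a known parameter vector.
rng = np.random.default_rng(0)
T, K, d = 500, 3, 2
X = rng.normal(size=(T, K, d))
theta_true = np.array([1.5, -0.5])
probs = np.exp(X @ theta_true)
probs /= probs.sum(axis=1, keepdims=True)
y = np.array([rng.choice(K, p=p) for p in probs])
print(fit_softmax(X, y, d))   # should approximate theta_true
```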

    Human Dorsal Striatal Activity during Choice Discriminates Reinforcement Learning Behavior from the Gambler’s Fallacy

    Reinforcement learning theory has generated substantial interest in neurobiology, particularly because of the resemblance between phasic dopamine and reward prediction errors. Actor–critic theories have been adapted to account for the functions of the striatum, with parts of the dorsal striatum equated to the actor. Here, we specifically test whether the human dorsal striatum—as predicted by an actor–critic instantiation—is used on a trial-to-trial basis at the time of choice to choose in accordance with reinforcement learning theory, as opposed to a competing strategy: the gambler's fallacy. Using a partial-brain functional magnetic resonance imaging scanning protocol focused on the striatum and other ventral brain areas, we found that the dorsal striatum is more active when choosing consistent with reinforcement learning compared with the competing strategy. Moreover, an overlapping area of dorsal striatum along with the ventral striatum was found to be correlated with reward prediction errors at the time of outcome, as predicted by the actor–critic framework. These findings suggest that the same region of dorsal striatum involved in learning stimulus–response associations may contribute to the control of behavior during choice, thereby using those learned associations. Intriguingly, neither reinforcement learning nor the gambler's fallacy conformed to the optimal choice strategy on the specific decision-making task we used. Thus, the dorsal striatum may contribute to the control of behavior according to reinforcement learning even when the prescriptions of such an algorithm are suboptimal in terms of maximizing future rewards
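
    The actor–critic account invoked here can be summarized with a minimal tabular model: a critic tracks expected reward, the reward prediction error at outcome updates both the critic and the actor's choice preferences, and a softmax over those preferences drives choice. The bandit task, payoff probabilities, and learning rates in the sketch below are illustrative assumptions, not the study's design.

```python
# A minimal tabular actor-critic sketch: the critic learns a value via reward
# prediction errors (RPEs) at outcome, and the actor updates preferences for the
# chosen action with the same RPE. Task parameters here are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n_arms = 2
reward_prob = np.array([0.7, 0.3])   # hypothetical bandit payoffs
preferences = np.zeros(n_arms)       # actor: action preferences
value = 0.0                          # critic: expected reward for the task state
alpha_actor, alpha_critic, beta = 0.1, 0.1, 3.0

for trial in range(1000):
    # softmax (actor) choice over current preferences
    p = np.exp(beta * preferences)
    p /= p.sum()
    a = rng.choice(n_arms, p=p)
    r = float(rng.random() < reward_prob[a])

    # reward prediction error at outcome: delta = r - V
    delta = r - value
    value += alpha_critic * delta          # critic update
    preferences[a] += alpha_actor * delta  # actor update for the chosen action

print(preferences, value)
```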

    Reinforcement learning with motivations for realistic agents

    Believable virtual humans have important applications in various fields, including computer-based video games. The challenge in programming video games is to produce a non-player controlled character that is autonomous and capable of action selections that appear human. In this thesis, motivations are used as a basis for learning using reinforcements. With motives driving the decisions of the agents, their actions will appear less structured and repetitious, and more human in nature. This will also allow developers to easily create game agents with specific motivations, based mostly on their narrative purposes. With minimum and maximum desirable motive values, the agents use reinforcement learning to maximize their rewards across all motives. Results show that an agent can learn to satisfy as many as four motives, even with significantly delayed rewards and motive changes caused by other agents. While the actions tested are simple in nature, they show the potential of a more complicated motivation-driven reinforcement learning system. The game developer need only define an agent's motivations, based on the game narrative, and the agent will learn to act realistically as the game progresses.
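
    As a rough illustration of the motivation-driven approach described above, the sketch below gives an agent an internal reward for keeping several motive values inside a desirable band, with a simple tabular update selecting actions that affect those motives. The specific motives, action effects, and bounds are invented for illustration and are not taken from the thesis.

```python
# A hedged sketch of motivation-driven reinforcement learning: the agent's reward
# reflects how well several internal motives stay within desired bounds, and a
# stateless Q-learning rule picks actions that boost one motive while the others
# decay. Motives, dynamics, and bounds are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(2)
motives = np.array([0.5, 0.5, 0.5, 0.5])   # e.g. hunger, rest, social, safety
bounds = (0.3, 0.8)                        # desirable range for every motive
n_actions = len(motives)                   # each action boosts one motive

def internal_reward(m):
    """Reward is highest (zero penalty) when all motives sit inside the band."""
    low, high = bounds
    return -np.sum(np.clip(low - m, 0, None) + np.clip(m - high, 0, None))

q = np.zeros(n_actions)
alpha, epsilon = 0.1, 0.1

for step in range(5000):
    # epsilon-greedy action selection over learned action values
    a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(q))
    motives *= 0.98                              # all motives decay over time
    motives[a] = min(1.0, motives[a] + 0.1)      # chosen action boosts one motive
    r = internal_reward(motives)
    q[a] += alpha * (r - q[a])                   # stateless Q-learning update

print(q)
```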