Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior
Inferring intent from observed behavior has been studied extensively within
the frameworks of Bayesian inverse planning and inverse reinforcement learning.
These methods infer a goal or reward function that best explains the actions of
the observed agent, typically a human demonstrator. Another agent can use this
inferred intent to predict, imitate, or assist the human user. However, a
central assumption in inverse reinforcement learning is that the demonstrator
is close to optimal. While models of suboptimal behavior exist, they typically
assume that suboptimal actions are the result of some type of random noise or a
known cognitive bias, like temporal inconsistency. In this paper, we take an
alternative approach and model suboptimal behavior as the result of internal
model misspecification: the reason that user actions might deviate from
near-optimal actions is that the user has an incorrect set of beliefs about the
rules -- the dynamics -- governing how actions affect the environment. Our
insight is that while demonstrated actions may be suboptimal in the real world,
they may actually be near-optimal with respect to the user's internal model of
the dynamics. By estimating these internal beliefs from observed behavior, we
arrive at a new method for inferring intent. We demonstrate in simulation and
in a user study with 12 participants that this approach enables us to more
accurately model human intent, and can be used in a variety of applications,
including offering assistance in a shared autonomy framework and inferring
human preferences.

Comment: Accepted at Neural Information Processing Systems (NeurIPS) 2018
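To make the key idea concrete, here is a minimal sketch of belief inference in a tabular setting. It is an illustrative assumption, not the paper's implementation: the toy chain MDP, the make_dynamics helper, the grid search over a single "slip" parameter, and the Boltzmann rationality coefficient beta are all hypothetical choices made for this example. Demonstrated actions are scored against Q-values computed under each candidate internal dynamics model, and the model under which the actions are most nearly optimal is returned.

```python
import numpy as np

def value_iteration(T, R, gamma=0.95, iters=200):
    # T: (S, A, S) believed transition probabilities; R: (S,) state rewards.
    Q = np.zeros(T.shape[:2])
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = T @ (R + gamma * V)  # Q[s, a] = sum_s' T[s, a, s'] * (R[s'] + gamma * V[s'])
    return Q

def log_policy(Q, beta):
    # Log-probabilities of a Boltzmann-rational policy: pi(a|s) propto exp(beta * Q[s, a]).
    logits = beta * Q
    m = logits.max(axis=1, keepdims=True)
    return logits - (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True)))

def make_dynamics(slip, S=5):
    # Hypothetical 1-D chain MDP: action 0 nominally moves left, action 1 moves
    # right, but each move is reversed with probability `slip`.
    T = np.zeros((S, 2, S))
    for s in range(S):
        left, right = max(s - 1, 0), min(s + 1, S - 1)
        T[s, 0, left] += 1 - slip
        T[s, 0, right] += slip
        T[s, 1, right] += 1 - slip
        T[s, 1, left] += slip
    return T

R = np.zeros(5)
R[-1] = 1.0  # reward only at the right end of the chain

# The demonstrator always presses "left" even though the reward is on the
# right: suboptimal in the real world, near-optimal if they believe the
# controls are reversed.
demos = [(0, 0), (1, 0), (2, 0), (3, 0)]  # observed (state, action) pairs

# Maximum-likelihood estimate of the slip probability the demonstrator
# *believes* in: actions are scored against Q-values computed under each
# candidate internal model, not under the true dynamics.
candidates = np.linspace(0.0, 1.0, 11)
log_liks = [
    sum(log_policy(value_iteration(make_dynamics(p), R), beta=5.0)[s, a]
        for s, a in demos)
    for p in candidates
]
believed_slip = candidates[int(np.argmax(log_liks))]
print(f"estimated internal slip belief: {believed_slip:.2f}")
```

Run as written, the grid search selects a believed slip near 1.0, i.e. an internal model in which the controls are reversed, which is exactly the dynamics belief under which the observed "left" presses are near-optimal.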