Comparison of Multi-agent and Single-agent Inverse Learning on a Simulated Soccer Example
We compare the performance of Inverse Reinforcement Learning (IRL) with the
relatively new model of Multi-agent Inverse Reinforcement Learning (MIRL).
Before comparing the methods, we extend a published Bayesian IRL approach,
which applies only when the reward depends on state alone, to a general one
capable of tackling the case where the reward depends on both state and
action. Comparison between IRL and MIRL is made in the context of an abstract
soccer game, using both a game model in which the reward depends only on state
and one in which it depends on both state and action. Results suggest that the
IRL approach performs much worse than the MIRL approach. We speculate that
IRL underperforms because it fails to capture equilibrium information in the
way that MIRL can.
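For intuition, here is a minimal sketch of the kind of extension the abstract describes: a Bayesian IRL posterior over a state-action reward table R(s, a) rather than a state-only R(s), with a Boltzmann demonstration likelihood and a Metropolis-Hastings sampler. The dynamics, rationality coefficient, and demonstrations are invented placeholders, not the paper's actual soccer model.

```python
import numpy as np

n_states, n_actions, gamma, beta = 4, 2, 0.9, 5.0
rng = np.random.default_rng(0)

# Assumed random transition model P[s, a, s'] (rows sum to 1).
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)

def q_values(R, iters=200):
    """Q-iteration for a state-action reward table R of shape (S, A)."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        Q = R + gamma * P @ Q.max(axis=1)
    return Q

def log_likelihood(R, demos):
    """Boltzmann (softmax) likelihood of demonstrated (s, a) pairs."""
    z = beta * q_values(R)
    z = z - z.max(axis=1, keepdims=True)          # numerical stability
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return sum(logp[s, a] for s, a in demos)

def log_posterior(R, demos, sigma=1.0):
    """Gaussian prior over the reward table plus the demo likelihood."""
    return -0.5 * (R ** 2).sum() / sigma ** 2 + log_likelihood(R, demos)

# A crude Metropolis-Hastings walk over reward tables (for illustration).
demos = [(0, 1), (1, 1), (2, 0)]                  # hypothetical demonstrations
R = np.zeros((n_states, n_actions))
lp = log_posterior(R, demos)
for _ in range(1000):
    R_new = R + 0.1 * rng.standard_normal(R.shape)
    lp_new = log_posterior(R_new, demos)
    if np.log(rng.random()) < lp_new - lp:        # accept/reject step
        R, lp = R_new, lp_new
print("one posterior sample of R(s, a):\n", R.round(2))
```

The state-only case falls out as the special case where every row of R is constant across actions; nothing else in the sampler changes.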
Multi-agent Inverse Reinforcement Learning for Two-person Zero-sum Games
The focus of this paper is a Bayesian framework for solving a class of
problems termed multi-agent inverse reinforcement learning (MIRL). Compared to
the well-known inverse reinforcement learning (IRL) problem, MIRL is formalized
in the context of stochastic games, which generalize Markov decision processes
to game theoretic scenarios. We establish a theoretical foundation for
competitive two-agent zero-sum MIRL problems and propose a Bayesian solution
approach in which the generative model is based on an assumption that the two
agents follow a minimax bi-policy. Numerical results are presented comparing
the Bayesian MIRL method with two existing methods in the context of an
abstract soccer game. Investigation centers on relationships between the extent
of prior information and the quality of learned rewards. Results suggest that
covariance structure is more important than mean value in reward priors.
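The minimax bi-policy assumption has a concrete computational core: at each stage, the two agents play the mixed strategies solving a zero-sum matrix game. Below is a minimal sketch of that stage computation as a linear program; the payoff matrix is a placeholder, whereas in the full stochastic game one such matrix per state would be built from Q-values.

```python
import numpy as np
from scipy.optimize import linprog

def minimax_policy(G):
    """Row player's minimax mixed strategy for zero-sum payoff matrix G.

    Solves max_x min_j x^T G[:, j] as an LP over variables (x, v):
    maximize v subject to G^T x >= v, sum(x) = 1, x >= 0.
    """
    m, n = G.shape
    c = np.zeros(m + 1); c[-1] = -1.0          # linprog minimizes, so use -v
    A_ub = np.hstack([-G.T, np.ones((n, 1))])  # v - G^T x <= 0
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # sum(x) = 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    assert res.success
    return res.x[:m], res.x[-1]

# Matching pennies: the minimax bi-policy is uniform for both players.
G = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, value = minimax_policy(G)
print("row strategy:", x.round(3), "game value:", round(value, 3))
```

The column player's strategy comes from the symmetric (dual) program on -G^T; together the pair forms the bi-policy the generative model conditions on.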
When Shall I Be Empathetic? The Utility of Empathetic Parameter Estimation in Multi-Agent Interactions
Human-robot interactions (HRI) can be modeled as dynamic or differential
games with incomplete information, where each agent holds private reward
parameters. Because finding perfect Bayesian equilibria of such games remains
an open challenge, existing studies often consider approximate solutions
composed of parameter estimation and motion planning steps, decoupling the belief
and physical dynamics. In parameter estimation, current approaches often assume
that the reward parameters of the robot are known to the humans. We argue that
by falsely conditioning on this assumption, the robot performs non-empathetic
estimation of the humans' parameters, leading to undesirable values even in the
simplest interactions. We test this argument by studying a two-vehicle
uncontrolled intersection case with short reaction time. Results show that when
both agents are unknowingly aggressive (or non-aggressive), empathy leads to
more effective parameter estimation and higher reward values, suggesting that
empathy is necessary when the agents' true parameters deviate from their
common belief. The proposed estimation and planning algorithms are therefore
more robust than existing approaches because they fully acknowledge the
information asymmetry inherent in HRI. Lastly, we introduce value
approximation techniques for real-time execution of the proposed algorithms.
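To make the empathetic/non-empathetic distinction concrete, here is a toy Bayesian belief update over a human driver's aggressiveness. The parameter set, yielding probabilities, and observed action are all hypothetical and far simpler than the paper's differential-game likelihoods; the point is only how conditioning on the human's assumed belief about the robot changes the posterior.

```python
import numpy as np

THETAS = ["non-aggressive", "aggressive"]

def p_yield(theta_human, theta_robot_assumed):
    """P(human yields | own parameter, its belief about the robot).
    All numbers are invented for illustration."""
    return {("non-aggressive", "non-aggressive"): 0.7,
            ("non-aggressive", "aggressive"): 0.9,
            ("aggressive", "non-aggressive"): 0.2,
            ("aggressive", "aggressive"): 0.5}[(theta_human, theta_robot_assumed)]

def likelihood(action, theta_human, theta_robot_assumed):
    p = p_yield(theta_human, theta_robot_assumed)
    return p if action == "yield" else 1.0 - p

def posterior(prior, action, assumption_weights):
    """Bayes update; assumption_weights is the distribution the robot
    places on what the human believes the robot's parameter to be."""
    lik = np.array([sum(w * likelihood(action, th, a)
                        for a, w in assumption_weights.items())
                    for th in THETAS])
    post = prior * lik
    return post / post.sum()

prior = np.array([0.5, 0.5])
action = "go"  # the human did not yield

# Non-empathetic: the human is assumed to know the robot's true
# (non-aggressive) parameter. Empathetic: the robot also entertains
# the possibility that the human believes it to be aggressive.
print(posterior(prior, action, {"non-aggressive": 1.0}))
print(posterior(prior, action, {"non-aggressive": 0.5, "aggressive": 0.5}))
```

The empathetic posterior is less confident about the human's aggressiveness because it does not rule out that the human's behavior was driven by a mistaken belief about the robot.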
Multi-Agent Generative Adversarial Imitation Learning
Imitation learning algorithms can be used to learn a policy from expert
demonstrations without access to a reward signal. However, most existing
approaches are not applicable in multi-agent settings due to the existence of
multiple (Nash) equilibria and non-stationary environments. We propose a new
framework for multi-agent imitation learning for general Markov games, where we
build upon a generalized notion of inverse reinforcement learning. We further
introduce a practical multi-agent actor-critic algorithm with good empirical
performance. Our method can be used to imitate complex behaviors in
high-dimensional environments with multiple cooperative or competing agents.
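A minimal sketch of the adversarial core, in the spirit of multi-agent GAIL: each agent i gets its own discriminator D_i(s, a_i) trained to separate expert state-action pairs from policy pairs, and a surrogate reward derived from D_i drives that agent's policy update. Dimensions, networks, and data below are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, N_AGENTS = 8, 2, 2

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.Tanh(),
            nn.Linear(64, 1))

    def forward(self, state, action):
        # Returns logits: high = looks like expert data.
        return self.net(torch.cat([state, action], dim=-1))

discs = [Discriminator() for _ in range(N_AGENTS)]
opts = [torch.optim.Adam(d.parameters(), lr=3e-4) for d in discs]
bce = nn.BCEWithLogitsLoss()

def discriminator_step(i, expert_s, expert_a, policy_s, policy_a):
    """One GAN-style update for agent i's discriminator."""
    logits_e = discs[i](expert_s, expert_a)
    logits_p = discs[i](policy_s, policy_a)
    loss = bce(logits_e, torch.ones_like(logits_e)) + \
           bce(logits_p, torch.zeros_like(logits_p))
    opts[i].zero_grad(); loss.backward(); opts[i].step()
    # Surrogate reward for the policy update (actor-critic step not shown).
    with torch.no_grad():
        reward = -torch.log(1 - torch.sigmoid(logits_p) + 1e-8)
    return loss.item(), reward

# Example call with random placeholder batches.
s, a = torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM)
loss, r = discriminator_step(0, s, a,
                             torch.randn(32, STATE_DIM),
                             torch.randn(32, ACTION_DIM))
print(f"disc loss {loss:.3f}, mean surrogate reward {r.mean():.3f}")
```

The multi-agent subtlety is that each agent's surrogate reward is non-stationary from the other agents' perspective, which is exactly why a naive single-agent reduction fails.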
A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress
Inverse reinforcement learning is the problem of inferring the reward
function of an observed agent, given its policy or behavior. Researchers
perceive IRL both as a problem and as a class of methods. By categorically
surveying the current literature in IRL, this article serves as a reference for
researchers and practitioners in machine learning to understand the challenges
of IRL and select the approaches best suited to the problem at hand. The
survey formally introduces the IRL problem along with its central challenges,
which include accurate inference, generalizability, correctness of prior
knowledge, and growth in solution complexity with problem size. The article
elaborates on how current methods mitigate these challenges. We further
discuss extensions of traditional IRL methods that handle (i) inaccurate and
incomplete perception, (ii) an incomplete model, (iii) multiple rewards, and
(iv) non-linear reward functions. The discussion concludes with broad
advances in the research area and currently open research questions.
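The "accurate inference" challenge named above comes down to ill-posedness: many rewards rationalize the same observed behavior. A tiny demonstration, checking the classical optimality condition Q(s, pi(s)) >= Q(s, a) for several candidate rewards in a random MDP (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
S, A, gamma = 3, 2, 0.9
P = rng.random((S, A, S)); P /= P.sum(axis=2, keepdims=True)

def greedy_policy(R, iters=500):
    """Optimal deterministic policy for state reward R via value iteration."""
    Q = np.zeros((S, A))
    for _ in range(iters):
        Q = R[:, None] + gamma * P @ Q.max(axis=1)
    return Q.argmax(axis=1)

def rationalizes(R, pi):
    """Does pi satisfy Q_pi(s, pi(s)) >= Q_pi(s, a) under reward R?"""
    P_pi = P[np.arange(S), pi]                        # dynamics under pi
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, R)  # policy evaluation
    Q = R[:, None] + gamma * P @ V
    return bool(np.all(Q[np.arange(S), pi] >= Q.max(axis=1) - 1e-9))

R_true = np.array([1.0, 0.0, 0.5])
pi = greedy_policy(R_true)  # treat this as the "observed" expert behavior
for R in [R_true, 2 * R_true, R_true + 10.0, np.zeros(S)]:
    print(R, "->", rationalizes(R, pi))  # all True: the reward is not unique
```

Scaled rewards, shifted rewards, and even the degenerate all-zero reward all pass the check, which is why the surveyed methods need priors, margins, or entropy terms to pick one answer.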