65 research outputs found
A Framework for Sequential Planning in Multi-Agent Settings
This paper extends the framework of partially observable Markov decision
processes (POMDPs) to multi-agent settings by incorporating the notion of agent
models into the state space. Agents maintain beliefs over physical states of
the environment and over models of other agents, and they use Bayesian updates
to maintain their beliefs over time. The solutions map belief states to
actions. Models of other agents may include their belief states and are related
to agent types considered in games of incomplete information. We express the
agents' autonomy by postulating that their models are not directly manipulable
or observable by other agents. We show that important properties of POMDPs,
such as convergence of value iteration, the rate of convergence, and piece-wise
linearity and convexity of the value functions, carry over to our framework. Our
approach complements a more traditional approach to interactive settings which
uses Nash equilibria as a solution paradigm. We seek to avoid some of the
drawbacks of equilibria which may be non-unique and do not capture
off-equilibrium behaviors. We do so at the cost of having to represent, process,
and continuously revise models of other agents. Since the agents' beliefs may be
arbitrarily nested, the optimal solutions to decision making problems are only
asymptotically computable. However, approximate belief updates and
approximately optimal plans are computable. We illustrate our framework using a
simple application domain, and we show examples of belief updates and value
functions.
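The Bayesian belief update over interactive states (a physical state paired with a model of the other agent) described in this abstract can be sketched as a standard filtering step. The scenario, state names, and probability tables below are invented for illustration and are not taken from the paper:

```python
# Minimal sketch of a Bayesian belief update over interactive states,
# i.e. pairs (physical state, model of the other agent), as in the
# I-POMDP framework. All names and numbers here are illustrative.

def belief_update(belief, action, observation, T, O):
    """One filtering step: predict with T, correct with O, renormalize.

    belief: dict mapping interactive state -> probability
    T[(s, action)]: dict of successor state -> probability
    O[(s2, action)]: dict of observation -> probability
    """
    new_belief = {}
    for s, p in belief.items():
        for s2, pt in T[(s, action)].items():
            po = O[(s2, action)].get(observation, 0.0)
            new_belief[s2] = new_belief.get(s2, 0.0) + p * pt * po
    total = sum(new_belief.values())
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in new_belief.items()}

# Toy interactive state space: tiger behind door "L" or "R", and the other
# agent modeled as one of two hypothetical types, "m1" or "m2".
S = [("L", "m1"), ("L", "m2"), ("R", "m1"), ("R", "m2")]

# Under a "listen" action the state persists.
T = {(s, "listen"): {s: 1.0} for s in S}

# Observation accuracy is made to depend (hypothetically) on the other
# agent's model, so the update also revises beliefs about that model.
def growl_prob(s2):
    loc, model = s2
    acc = 0.85 if model == "m1" else 0.65
    return {"growl-L": acc if loc == "L" else 1 - acc,
            "growl-R": acc if loc == "R" else 1 - acc}

O = {(s, "listen"): growl_prob(s) for s in S}

b0 = {s: 0.25 for s in S}                       # uniform prior
b1 = belief_update(b0, "listen", "growl-L", T, O)
```

After hearing `growl-L`, the posterior shifts both toward the tiger being behind the left door and toward the other-agent model whose observation accuracy best explains the evidence.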
Near-Optimal Adversarial Policy Switching for Decentralized Asynchronous Multi-Agent Systems
A key challenge in multi-robot and multi-agent systems is generating
solutions that are robust to other self-interested or even adversarial parties
who actively try to prevent the agents from achieving their goals. The
practicality of existing works addressing this challenge is limited to only
small-scale synchronous decision-making scenarios or a single agent planning
its best response against a single adversary with fixed, procedurally
characterized strategies. In contrast, this paper considers a more realistic
class of problems where a team of asynchronous agents with limited observation
and communication capabilities needs to compete against multiple strategic
adversaries with changing strategies. This problem necessitates agents that can
coordinate to detect changes in adversary strategies and plan the best response
accordingly. Our approach first optimizes a set of stratagems that represent
these best responses. These optimized stratagems are then integrated into a
unified policy that can detect and respond when the adversaries change their
strategies. The near-optimality of the proposed framework is established
theoretically as well as demonstrated empirically in simulation and on hardware.
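The detect-and-respond loop this abstract describes can be sketched as Bayesian identification of the active adversary strategy followed by a switch to the matching precomputed best response. The strategy names, likelihood table, and stratagems below are hypothetical stand-ins, not the paper's actual method:

```python
# Hypothetical sketch of policy switching: maintain a posterior over which
# adversary strategy is currently active, then execute the precomputed
# best-response stratagem for the most probable one.

def update_posterior(posterior, observation, likelihood):
    """Bayesian update of P(adversary strategy | observations so far).

    likelihood[strategy][observation] = P(observation | strategy)
    """
    post = {k: p * likelihood[k].get(observation, 1e-9)
            for k, p in posterior.items()}
    total = sum(post.values())
    return {k: p / total for k, p in post.items()}

def select_stratagem(posterior, stratagems):
    """Switch to the best response against the most probable strategy."""
    likeliest = max(posterior, key=posterior.get)
    return stratagems[likeliest]

# Illustrative tables: two candidate adversary strategies, each with a
# characteristic observation signature and a precomputed best response.
likelihood = {"aggressive": {"attack": 0.8, "hold": 0.2},
              "defensive":  {"attack": 0.1, "hold": 0.9}}
stratagems = {"aggressive": "flank", "defensive": "press"}

posterior = {"aggressive": 0.5, "defensive": 0.5}
for obs in ["attack", "attack", "hold"]:
    posterior = update_posterior(posterior, obs, likelihood)
policy = select_stratagem(posterior, stratagems)
```

Two "attack" observations outweigh one "hold", so the posterior favors the aggressive strategy and the loop switches to its best response; if the adversary later settles into "hold" behavior, the same update drives a switch to the other stratagem.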
- …