Distributed Fictitious Play in Potential Games with Time-Varying Communication Networks
We propose a distributed algorithm for multiagent systems that aim to
optimize a common objective when agents differ in their estimates of the
objective-relevant state of the environment. Each agent keeps an estimate of
the environment and a model of the behavior of other agents. The model of other
agents' behavior assumes agents choose their actions randomly based on a
stationary distribution determined by the empirical frequencies of past
actions. At each step, each agent takes the action that maximizes its
expectation of the common objective computed with respect to its estimate of
the environment and its model of others. We propose a weighted averaging rule
with non-doubly stochastic weights for agents to estimate the empirical
frequency of past actions of all other agents by exchanging their estimates
with their neighbors over a time-varying communication network. Under this
averaging rule, we show agents' estimates converge to the actual empirical
frequencies fast enough. This implies convergence of actions to a Nash
equilibrium of the game with identical payoffs given by the expectation of the
common objective with respect to an asymptotically agreed estimate of the state
of the environment.
Comment: 5 pages, 1 figure, to appear in Proceedings of Asilomar Conference on
Signals, Systems, and Computer
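As a rough illustration of the estimation step described in this abstract, the sketch below has each agent average its neighbors' estimates of every other agent's empirical action frequency, while each agent always knows its own frequency exactly. The function names, the uniform row-stochastic weights, and the array layout are assumptions made for this example; the abstract does not specify the paper's actual weighting rule.

```python
import numpy as np

def update_frequency(freq, action, t):
    """Running average of one-hot actions; `t` is the number of actions already counted."""
    onehot = np.zeros_like(freq)
    onehot[action] = 1.0
    return freq + (onehot - freq) / (t + 1)

def averaging_step(est, freq, adjacency):
    """One round of neighbor averaging over the current communication graph.

    est[i, j]  : agent i's estimate of agent j's empirical frequency (length-K vector)
    freq[j]    : agent j's true empirical frequency, known exactly only to agent j
    adjacency  : (N, N) 0/1 matrix of the current links, assumed to have no self-loops
    """
    n = est.shape[0]
    # Assumed weights: uniform over an agent's neighbors and itself.
    # Rows sum to one; columns generally do not (not doubly stochastic).
    weights = adjacency + np.eye(n)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # new_est[i, j] = sum_k weights[i, k] * est[k, j]
    new_est = np.einsum("ik,kja->ija", weights, est)
    for j in range(n):
        new_est[j, j] = freq[j]  # each agent overwrites the estimate of itself with the truth
    return new_est
```

With a graph that alternates between links and stays connected over time, repeated calls to `averaging_step` with a fixed `freq` drive every `est[i, j]` toward `freq[j]`; when actions keep changing, `update_frequency` moves the target by a shrinking step of size 1/(t + 1), which is the setting in which the abstract's tracking claim applies.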
Decentralized Inertial Best-Response with Voluntary and Limited Communication in Random Communication Networks
Multiple autonomous agents interact over a random communication network to
maximize their individual utility functions which depend on the actions of
other agents. We consider decentralized inertial best-response algorithms in
which agents form beliefs about the future actions of other players based on
local information, and either take an action that maximizes their expected
utility computed with respect to these beliefs or continue to take their
previous action. We show convergence of these types of algorithms to a
Nash equilibrium in weakly acyclic games under the condition that the belief
update and information exchange protocols successfully learn the actions of
other players with positive probability in finite time given a static
environment, i.e., when other agents' actions do not change. We design a
decentralized fictitious play algorithm with voluntary and limited
communication (DFP-VL) protocols that satisfy this condition. In the voluntary
communication protocol, each agent decides whom to exchange information with by
assessing the novelty of its information and the potential effect of its
information on others' assessments of their utility functions. The limited
communication protocol entails agents sending only their most frequent action
to agents that they decide to communicate with. Numerical experiments on a
target assignment game demonstrate that the voluntary and limited communication
protocol can more than halve the number of communication attempts while
retaining the same convergence rate as DFP in which agents constantly attempt
to communicate.
Comment: 10 page
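The building blocks named in this abstract can be sketched as follows: an inertial best-response choice, a limited message that carries only the most frequent action, and a simple novelty check that decides whether to communicate. All function names are hypothetical, and the novelty test here (send only when the message has changed) is a deliberately simplified stand-in; the abstract's voluntary protocol also weighs the effect of the information on others' utility assessments, which this sketch does not model.

```python
import numpy as np

def most_frequent_action(counts):
    """Limited message: only the index of the action played most often so far."""
    return int(np.argmax(counts))

def should_send(message, last_sent):
    """Simplified, assumed novelty test: send only if the message differs from the last one sent."""
    return last_sent is None or message != last_sent

def expected_utility(utility, believed_action):
    """Utility of each own action when the belief is a point mass on one reported action.

    utility[a_self, a_other] is the agent's payoff in a two-agent game.
    """
    return utility[:, believed_action]

def inertial_best_response(prev_action, utilities, inertia, rng):
    """Keep the previous action with probability `inertia`; otherwise best-respond."""
    if rng.random() < inertia:
        return prev_action
    return int(np.argmax(utilities))
```

In a two-target assignment toy game where an agent is rewarded for covering a different target than its peer, an agent that receives the message "target 0" and does not stay inert would switch to target 1, and it would skip its next transmission unless its own most frequent action has changed.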