LP-based Covering Games with Low Price of Anarchy
We present a new class of vertex cover and set cover games. Their price of
anarchy bounds match the best known constant-factor approximation guarantees
for the centralized optimization problems, for both linear and submodular
costs -- in contrast to all previously studied covering games, where the price
of anarchy cannot be bounded by a constant (e.g. [6, 7, 11, 5, 2]). In
particular, we describe a vertex cover game with a price of anarchy of 2. The
rules of the games capture the structure of the linear programming relaxations
of the underlying optimization problems, and our bounds are established by
analyzing these relaxations. Furthermore, for linear costs we exhibit linear
time best response dynamics that converge to these almost optimal Nash
equilibria. These dynamics mimic the classical greedy approximation algorithm
of Bar-Yehuda and Even [3].
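The greedy algorithm of Bar-Yehuda and Even referenced above can be sketched in its classic local-ratio form for weighted vertex cover. This is a minimal illustration of that well-known 2-approximation, not the game dynamics from the paper itself:

```python
def bar_yehuda_even_vertex_cover(edges, weights):
    """Local-ratio 2-approximation for weighted vertex cover.

    For each uncovered edge, lower both endpoints' residual weights
    by the smaller of the two; any vertex whose residual weight
    reaches zero enters the cover.
    """
    residual = dict(weights)  # remaining weight per vertex
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # edge already covered
        delta = min(residual[u], residual[v])
        residual[u] -= delta
        residual[v] -= delta
        # at least one endpoint is now "tight"; tight vertices join the cover
        if residual[u] == 0:
            cover.add(u)
        if residual[v] == 0:
            cover.add(v)
    return cover
```

On a unit-weight triangle `[(0, 1), (1, 2), (0, 2)]` the first edge makes both endpoints tight, yielding a cover of size 2, which matches the optimum here and is always within a factor of 2 of it.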
Cyclic game dynamics driven by iterated reasoning
Recent theories from complexity science argue that complex dynamics are
ubiquitous in social and economic systems. These claims emerge from the
analysis of individually simple agents whose collective behavior is
surprisingly complicated. However, economists have argued that iterated
reasoning--what you think I think you think--will suppress complex dynamics by
stabilizing or accelerating convergence to Nash equilibrium. We report stable
and efficient periodic behavior in human groups playing the Mod Game, a
multi-player game similar to Rock-Paper-Scissors. The game rewards subjects for
thinking exactly one step ahead of others in their group. Groups that play this
game exhibit cycles that are inconsistent with any fixed-point solution
concept. These cycles are driven by a "hopping" behavior that is consistent
with other accounts of iterated reasoning: agents are constrained to about two
steps of iterated reasoning and learn an additional one-half step with each
session. If higher-order reasoning can be complicit in complex emergent
dynamics, then cyclic and chaotic patterns may be endogenous features of
real-world social and economic systems.
Comment: 8 pages, 4 figures, and supplementary information
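The Mod Game's one-step-ahead reward described above can be sketched as a scoring function. The exact conventions (the modulus m and per-point rewards) are assumptions for illustration, not taken from the paper:

```python
def mod_game_scores(choices, m=24):
    """Score one round of a Mod Game (choices are integers in 0..m-1).

    Each player earns a point for every other player whose choice
    theirs exceeds by exactly one step, modulo m -- rewarding those
    who think exactly one step ahead of the group.
    """
    scores = []
    for i, c in enumerate(choices):
        pts = sum(1 for j, d in enumerate(choices)
                  if j != i and (c - d) % m == 1)
        scores.append(pts)
    return scores
```

With choices `[3, 2, 2]` the first player is one step ahead of the other two and scores 2; the wrap-around case (choosing 0 against 23 when m = 24) also scores, which is what makes best responses cycle rather than settle at a fixed point.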
Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics
Achieving convergence of multiple learning agents in general n-player games
is imperative for the development of safe and reliable machine learning (ML)
algorithms and their application to autonomous systems. Yet it is known that,
outside the bounds of simple two-player games, convergence cannot be taken for
granted.
To make progress in resolving this problem, we study the dynamics of smooth
Q-Learning, a popular reinforcement learning algorithm which quantifies the
tendency for learning agents to explore their state space or exploit their
payoffs. We show a sufficient condition on the rate of exploration such that
the Q-Learning dynamics is guaranteed to converge to a unique equilibrium in
any game. We connect this result to games for which Q-Learning is known to
converge with arbitrary exploration rates, including weighted potential games
and weighted zero-sum polymatrix games.
Finally, we examine the performance of the Q-Learning dynamic as measured by
the Time Averaged Social Welfare, and compare this with the Social Welfare
achieved by the equilibrium. We provide a sufficient condition whereby the
Q-Learning dynamic will outperform the equilibrium even if the dynamics do not
converge.
Comment: Accepted in AAMAS 202
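A discrete-time caricature of smooth Q-learning with Boltzmann (softmax) exploration can make the role of the exploration rate concrete. The paper studies continuous-time dynamics; the update rule, temperature T, and step size below are simplifying assumptions for illustration:

```python
import numpy as np

def smooth_q_learning(A, B, T=1.5, alpha=0.1, steps=5000, seed=0):
    """Two agents running Boltzmann Q-learning in a bimatrix game.

    A holds the row player's payoffs, B the column player's.
    T is the exploration temperature: higher T means a more
    uniform (exploratory) policy, the regime in which convergence
    guarantees of the kind discussed above apply.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    Qx, Qy = np.zeros(n), np.zeros(m)
    for _ in range(steps):
        # softmax (logit) policies derived from current Q-values
        px = np.exp(Qx / T); px /= px.sum()
        py = np.exp(Qy / T); py /= py.sum()
        i = rng.choice(n, p=px)
        j = rng.choice(m, p=py)
        # move each sampled action's Q-value toward its realised payoff
        Qx[i] += alpha * (A[i, j] - Qx[i])
        Qy[j] += alpha * (B[i, j] - Qy[j])
    return px, py
```

Running this on Matching Pennies (B = -A) with a large T keeps both policies near uniform, whereas shrinking T lets the well-known cycling of the zero-sum dynamics reassert itself.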
Game Manipulators -- the Strategic Implications of Binding Contracts
Commitment devices are powerful tools that can influence and incentivise
certain behaviours by linking them to rewards or punishments. These devices are
particularly useful in decision-making, as they can steer individuals towards
specific choices. In the field of game theory, commitment devices can alter a
player's payoff matrix, ultimately changing the game's Nash equilibria.
Interestingly, agents, whom we term game manipulators and who can be external
to the original game, can leverage such devices to extract fees from players by
making them contingent offers that modify the payoffs of their actions. This
can result in a different Nash equilibrium with potentially lower payoffs for
the players compared to the original game. For this scheme to work, it is
required that all commitments be binding, meaning that once an offer is made,
it cannot be revoked. Consequently, we analyse binding contracts as the
commitment mechanism that enables game manipulation scenarios. The main focus
of this study is to formulate the logic of this setting, expand its scope to
encompass more intricate schemes, and analyse the behaviour of
regret-minimizing agents in scenarios involving game manipulation.
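The payoff-modification mechanism described above can be illustrated on a small bimatrix game. The specific game and offer below are hypothetical, chosen only to show how a binding contingent payment shifts the pure Nash equilibrium set:

```python
import numpy as np

def pure_nash(A, B):
    """Pure Nash equilibria of a bimatrix game (row payoffs A, column payoffs B)."""
    eqs = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            if A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max():
                eqs.append((i, j))
    return eqs

# Hypothetical coordination game: both (0, 0) and (1, 1) are equilibria.
A = np.array([[3.0, 0.0], [0.0, 2.0]])  # row player's payoffs
B = np.array([[3.0, 0.0], [0.0, 2.0]])  # column player's payoffs
print(pure_nash(A, B))  # both diagonal profiles are equilibria

# Binding contingent offer: the manipulator pays the row player 4
# for choosing action 1, regardless of the opponent's play.
A2 = A.copy()
A2[1, :] += 4.0
print(pure_nash(A2, B))  # only (1, 1) survives
```

After the offer, action 1 strictly dominates for the row player, so (1, 1) is the unique equilibrium; the column player's payoff drops from 3 to 2, and the manipulator can charge the row player a fee for the contract, exactly the extraction scheme the abstract describes.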
The graph structure of two-player games
In this paper we analyse two-player games by their response graphs. The nodes
of the response graph are strategy profiles; an arc joins two profiles that
differ in the strategy of a single player, directed toward the profile that
player prefers. Response graphs,
and particularly their sink strongly connected components, play an important
role in modern techniques in evolutionary game theory and multi-agent learning.
We show that the response graph is a simple and well-motivated model of
strategic interaction which captures many non-trivial properties of a game,
despite not depending on cardinal payoffs. We characterise the games which
share a response graph with a zero-sum or potential game respectively, and
demonstrate a duality between these sets. This allows us to understand the
influence of these properties on the response graph. The response graphs of
Matching Pennies and Coordination are shown to play a key role in all
two-player games: every non-iteratively-dominated strategy takes part in a
subgame with these graph structures. As a corollary, any game sharing a
response graph with both a zero-sum game and potential game must be
dominance-solvable. Finally, we demonstrate our results on some larger games.
Comment: 16 pages, 11 figures
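The response-graph construction defined above is straightforward to compute for a bimatrix game. This sketch follows the definition in the abstract (arcs in both directions on payoff ties); the tie convention is an assumption:

```python
import numpy as np
from itertools import product

def response_graph(A, B):
    """Arcs of the response graph of a bimatrix game.

    Nodes are strategy profiles (i, j); for each pair of profiles
    differing in one player's strategy, an arc points toward the
    profile that player weakly prefers (both directions on a tie).
    """
    n, m = A.shape
    arcs = set()
    for i, j in product(range(n), range(m)):
        for i2 in range(n):            # row-player deviations
            if i2 != i:
                if A[i2, j] >= A[i, j]:
                    arcs.add(((i, j), (i2, j)))
                if A[i, j] >= A[i2, j]:
                    arcs.add(((i2, j), (i, j)))
        for j2 in range(m):            # column-player deviations
            if j2 != j:
                if B[i, j2] >= B[i, j]:
                    arcs.add(((i, j), (i, j2)))
                if B[i, j] >= B[i, j2]:
                    arcs.add(((i, j2), (i, j)))
    return arcs
```

On Matching Pennies (B = -A) the four profiles form a single directed 4-cycle, the cyclic sink structure that the abstract identifies as playing a key role in all two-player games.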