38 research outputs found
On minmax theorems for multiplayer games
We prove a generalization of von Neumann's minmax theorem to the class of separable multiplayer zero-sum games, introduced in [Bregman and Fokin 1998]. These games are polymatrix---that is, graphical games in which every edge is a two-player game between its endpoints---in which every outcome has zero total sum of players' payoffs. Our generalization of the minmax theorem implies convexity of equilibria, polynomial-time tractability, and convergence of no-regret learning algorithms to Nash equilibria. Given that Nash equilibria in 3-player zero-sum games are already PPAD-complete, this class of games, i.e. with pairwise separable utility functions, defines essentially the broadest class of multi-player constant-sum games to which we can hope to push tractability results. Our result is obtained by establishing a certain game-class collapse, showing that separable constant-sum games are payoff equivalent to pairwise constant-sum polymatrix games---polymatrix games in which all edges are constant-sum games, and invoking a recent result of [Daskalakis, Papadimitriou 2009] for these games.
We also explore generalizations to classes of non-constant-sum multi-player games. A natural candidate is polymatrix games with strictly competitive games on their edges. In the two-player setting, such games are minmax solvable and recent work has shown that they are merely affine transformations of zero-sum games [Adler, Daskalakis, Papadimitriou 2009]. Surprisingly, we show that a polymatrix game comprising strictly competitive games on its edges is PPAD-complete to solve, proving a striking difference in the complexity of networks of zero-sum and strictly competitive games. Finally, we look at the role of coordination in networked interactions, studying the complexity of polymatrix games with a mixture of coordination and zero-sum games. We show that finding a pure Nash equilibrium in coordination-only polymatrix games is PLS-complete; hence, computing a mixed Nash equilibrium is in PLS ∩ PPAD, but it remains open whether the problem is in P. If, on the other hand, coordination and zero-sum games are combined, we show that the problem becomes PPAD-complete, establishing that coordination and zero-sum games achieve the full generality of PPAD.
National Science Foundation (U.S.) (CAREER Award CCF-0953960); Alfred P. Sloan Foundation (Fellowship)
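To make the polymatrix definition concrete, the following is a minimal sketch, assuming a hypothetical three-player game on a triangle whose edges each carry a zero-sum bimatrix game. The payoff numbers and the names `edge_payoffs`, `payoff`, and `total_payoff` are illustrative and not taken from the paper. The sketch only checks the defining property, namely that every pure outcome has zero total payoff; it does not implement the paper's minmax result.

```python
from itertools import product

# Hypothetical 3-player polymatrix game on a triangle (illustrative numbers).
# edge_payoffs[(i, j)][a][b] is what player i earns on edge (i, j) when i plays a
# and j plays b; player j earns the negative, so every edge is zero-sum.
edge_payoffs = {
    (0, 1): [[1, -1], [-1, 1]],
    (1, 2): [[2, 0], [0, -3]],
    (0, 2): [[0, 1], [4, -2]],
}

def payoff(player, profile):
    """Sum of the player's payoffs over all edges it participates in."""
    total = 0
    for (i, j), matrix in edge_payoffs.items():
        value = matrix[profile[i]][profile[j]]
        if player == i:
            total += value
        elif player == j:
            total -= value
    return total

def total_payoff(profile):
    """Total payoff of all three players at a pure strategy profile."""
    return sum(payoff(p, profile) for p in range(3))

# Pairwise zero-sum edges imply a zero total at every pure outcome.
all_zero_sum = all(
    total_payoff(profile) == 0 for profile in product(range(2), repeat=3)
)
```

The converse direction, that a separable game with zero total payoff can be rewritten with constant-sum edges, is the payoff-equivalence result of the paper and is not shown by this check.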
Smoothed Efficient Algorithms and Reductions for Network Coordination Games
Worst-case hardness results for most equilibrium computation problems have
raised the need for beyond-worst-case analysis. To this end, we study the
smoothed complexity of finding pure Nash equilibria in Network Coordination
Games, a PLS-complete problem in the worst case. This is a potential game where
the sequential-better-response algorithm is known to converge to a pure NE,
albeit in exponential time. First, we prove polynomial (resp. quasi-polynomial)
smoothed complexity when the underlying game graph is a complete (resp.
arbitrary) graph, and every player has constantly many strategies. We note that
the complete graph case is reminiscent of perturbing all parameters, a common
assumption in most known smoothed analysis results.
Second, we define a notion of smoothness-preserving reduction among search
problems, and obtain reductions from $2$-strategy network coordination games to
local-max-cut, and from $k$-strategy games (with arbitrary $k$) to
local-max-cut up to two flips. The former together with the recent result of
[BCC18] gives an alternate $O(n^8)$-time smoothed algorithm for the
$2$-strategy case. This notion of reduction allows for the extension of
smoothed efficient algorithms from one problem to another.
For the first set of results, we develop techniques to bound the probability
that an (adversarial) better-response sequence makes slow improvements on the
potential. Our approach combines and generalizes the local-max-cut approaches
of [ER14,ABPW17] to handle the multi-strategy case: it requires a careful
definition of the matrix which captures the increase in potential, a tighter
union bound on adversarial sequences, and balancing it with good enough rank
bounds. We believe that the approach and notions developed herein could be of
interest in addressing the smoothed complexity of other potential and/or
congestion games.
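The better-response process analyzed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's smoothed analysis: the payoffs are random integers rather than perturbed reals, the graph is a small complete graph, and the names `potential`, `player_payoff`, and `better_response_dynamics` are hypothetical. It shows the two facts the abstract relies on: each improving move strictly increases the potential, and the process stops at a pure Nash equilibrium.

```python
import random

def potential(edges, profile):
    """Potential = sum of the shared payoff on every edge."""
    return sum(m[profile[u]][profile[v]] for (u, v), m in edges.items())

def player_payoff(edges, profile, player):
    """A player's payoff = sum of shared payoffs on its incident edges."""
    return sum(
        m[profile[u]][profile[v]]
        for (u, v), m in edges.items()
        if player in (u, v)
    )

def better_response_dynamics(edges, n_players, n_strategies, start):
    """Let players switch to strictly better strategies until none can improve."""
    profile = list(start)
    potentials = [potential(edges, profile)]
    improved = True
    while improved:
        improved = False
        for player in range(n_players):
            current = player_payoff(edges, profile, player)
            for s in range(n_strategies):
                candidate = profile[:player] + [s] + profile[player + 1:]
                if player_payoff(edges, candidate, player) > current:
                    profile = candidate
                    potentials.append(potential(edges, profile))
                    improved = True
                    break
    return profile, potentials

# Assumed toy instance: complete graph on 5 players, 3 strategies each,
# integer payoffs drawn with a fixed seed.
rng = random.Random(0)
n_players, n_strategies = 5, 3
edges = {
    (u, v): [[rng.randint(-10, 10) for _ in range(n_strategies)]
             for _ in range(n_strategies)]
    for u in range(n_players) for v in range(u + 1, n_players)
}
final_profile, potentials = better_response_dynamics(
    edges, n_players, n_strategies, [0] * n_players
)
```

Integer payoffs are used so that the strict increase of the potential holds exactly; termination follows because the potential rises with every move and there are finitely many profiles.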
Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence
Learning agents that are capable not only of taking tests but also of
innovating are becoming a hot topic in AI. One of the most promising paths
towards this vision is multi-agent learning, where agents act as the
environment for each other, and improving each agent means proposing new
problems for others. However, existing evaluation platforms are either not
compatible with multi-agent settings, or limited to a specific game. That is,
there is not yet a general evaluation platform for research on multi-agent
intelligence. To this end, we introduce Arena, a general evaluation platform
for multi-agent intelligence with 35 games of diverse logics and
representations. Furthermore, multi-agent intelligence is still at the stage
where many problems remain unexplored. Therefore, we provide a building toolkit
for researchers to easily invent and build novel multi-agent problems from the
provided game set based on a GUI-configurable social tree and five basic
multi-agent reward schemes. Finally, we provide Python implementations of five
state-of-the-art deep multi-agent reinforcement learning baselines. Along with
the baseline implementations, we release a set of 100 best agents/teams that we
can train with different training schemes for each game, as the base for
evaluating agents with population performance. As such, the research community
can perform comparisons under a stable and uniform standard. All the
implementations and accompanying tutorials have been open-sourced for the
community at https://sites.google.com/view/arena-unity/
Approximating Nash Equilibria in Tree Polymatrix Games
We develop a quasi-polynomial time Las Vegas algorithm for approximating Nash equilibria in polymatrix games over trees, under a mild renormalizing assumption. Our result, in particular, leads to an expected polynomial-time algorithm for computing approximate Nash equilibria of tree polymatrix games in which the number of actions per player is a fixed constant. Further, for trees with constant degree, the running time of the algorithm matches the best known upper bound for approximating Nash equilibria in bimatrix games (Lipton, Markakis, and Mehta 2003).
Notably, this work closely complements the hardness result of Rubinstein (2015), which establishes the inapproximability of Nash equilibria in polymatrix games over constant-degree bipartite graphs with two actions per player.
Computing Approximate Nash Equilibria in Polymatrix Games
In an $\epsilon$-Nash equilibrium, a player can gain at most $\epsilon$ by
unilaterally changing his behaviour. For two-player (bimatrix) games with
payoffs in $[0,1]$, the best-known $\epsilon$ achievable in polynomial time is
0.3393. In general, for $n$-player games an $\epsilon$-Nash equilibrium can be
computed in polynomial time for an $\epsilon$ that is an increasing function of
$n$ but does not depend on the number of strategies of the players. For
three-player and four-player games the corresponding values of $\epsilon$ are
0.6022 and 0.7153, respectively. Polymatrix games are a restriction of general
$n$-player games where a player's payoff is the sum of payoffs from a number of
bimatrix games. There exists a very small but constant $\epsilon$ such that
computing an $\epsilon$-Nash equilibrium of a polymatrix game is PPAD-hard.
Our main result is that a $(0.5+\delta)$-Nash equilibrium of an $n$-player
polymatrix game can be computed in time polynomial in the input size and
$\frac{1}{\delta}$. Inspired by the algorithm of Tsaknakis and Spirakis, our
algorithm uses gradient descent on the maximum regret of the players. We also
show that this algorithm can be applied to efficiently find a
$(0.5+\delta)$-Nash equilibrium in a two-player Bayesian game.
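The quantity that the algorithm descends on, the maximum regret of the players, can be sketched as follows. This is only an evaluation of that objective under assumed conventions, not the gradient-descent procedure itself: `payoffs[(i, j)][a][b]` is taken to be player i's payoff on the edge to j, and the function names and the matching-pennies example with payoffs in [0, 1] are illustrative. A mixed profile is an approximate Nash equilibrium with approximation value equal to the maximum regret.

```python
def expected_payoff_vector(player, payoffs, strategies):
    """Expected payoff of each pure action of `player` against the others' mixes.

    payoffs[(i, j)][a][b] (assumed layout): payoff to i on edge (i, j) when
    i plays a and j plays b. A player's payoff is the sum over its edges.
    """
    n_actions = len(strategies[player])
    vector = [0.0] * n_actions
    for (i, j), matrix in payoffs.items():
        if i != player:
            continue
        for a in range(n_actions):
            vector[a] += sum(
                matrix[a][b] * strategies[j][b]
                for b in range(len(strategies[j]))
            )
    return vector

def regret(player, payoffs, strategies):
    """Best pure-deviation payoff minus the payoff of the current mix."""
    vector = expected_payoff_vector(player, payoffs, strategies)
    current = sum(p * v for p, v in zip(strategies[player], vector))
    return max(vector) - current

def max_regret(payoffs, strategies):
    """The profile is an eps-Nash equilibrium exactly when this is <= eps."""
    return max(regret(p, payoffs, strategies) for p in range(len(strategies)))

# Illustrative example: matching pennies with payoffs rescaled to [0, 1],
# viewed as a polymatrix game with a single edge.
pennies = {
    (0, 1): [[1, 0], [0, 1]],  # player 0 wants to match
    (1, 0): [[0, 1], [1, 0]],  # player 1 wants to mismatch
}
uniform = [[0.5, 0.5], [0.5, 0.5]]
both_heads = [[1.0, 0.0], [1.0, 0.0]]
regret_uniform = max_regret(pennies, uniform)
regret_both_heads = max_regret(pennies, both_heads)
```

In this example the uniform profile has zero regret, while the pure profile in which both players choose the first action leaves the second player with regret 1.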
Cycles in adversarial regularized learning
Regularized learning is a fundamental technique in online optimization,
machine learning and many other fields of computer science. A natural question
that arises in these settings is how regularized learning algorithms behave
when faced against each other. We study a natural formulation of this problem
by coupling regularized learning dynamics in zero-sum games. We show that the
system's behavior is Poincaré recurrent, implying that almost every
trajectory revisits any (arbitrarily small) neighborhood of its starting point
infinitely often. This cycling behavior is robust to the agents' choice of
regularization mechanism (each agent could be using a different regularizer),
to positive-affine transformations of the agents' utilities, and it also
persists in the case of networked competition, i.e., for zero-sum polymatrix
games.
Comment: 22 pages, 4 figures
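The non-convergent behavior described above can be illustrated with a small simulation. This is a minimal sketch under loud assumptions: it uses discrete-time multiplicative weights (one instance of entropy-regularized learning) on matching pennies, whereas the recurrence result concerns continuous-time dynamics, which a discretization only approximates. The step size, starting point, and function names are hypothetical. The sketch tracks the KL divergence from the uniform equilibrium to the current strategies, which in this discrete-time setting does not shrink, so the trajectory orbits the equilibrium instead of converging to it.

```python
import math

def mwu_step(x, y, eta):
    """One simultaneous multiplicative-weights step in matching pennies.

    x, y: probabilities that the row and column players choose Heads.
    The row player earns +1 on a match and -1 otherwise; the column player
    earns the negative (zero-sum).
    """
    row_heads = 2 * y - 1      # row's expected payoff for Heads (Tails is the negative)
    col_heads = 1 - 2 * x      # column's expected payoff for Heads
    wx_h = x * math.exp(eta * row_heads)
    wx_t = (1 - x) * math.exp(-eta * row_heads)
    wy_h = y * math.exp(eta * col_heads)
    wy_t = (1 - y) * math.exp(-eta * col_heads)
    return wx_h / (wx_h + wx_t), wy_h / (wy_h + wy_t)

def kl_from_equilibrium(x, y):
    """KL divergence from the uniform equilibrium to the current mixed profile."""
    def kl(p):
        return 0.5 * math.log(0.5 / p) + 0.5 * math.log(0.5 / (1 - p))
    return kl(x) + kl(y)

def simulate(x, y, eta=0.02, steps=2000):
    """Run the dynamics and return the list of visited (x, y) pairs."""
    trajectory = [(x, y)]
    for _ in range(steps):
        x, y = mwu_step(x, y, eta)
        trajectory.append((x, y))
    return trajectory

trajectory = simulate(0.7, 0.5)
divergences = [kl_from_equilibrium(x, y) for x, y in trajectory]
row_probs = [x for x, _ in trajectory]
```

Plotting `trajectory` would show closed-looking loops around (0.5, 0.5); with a smaller `eta` the loops stay closer to the constant-divergence orbits of the continuous-time system.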