Computing Ex Ante Coordinated Team-Maxmin Equilibria in Zero-Sum Multiplayer Extensive-Form Games
Computational game theory has many applications in the modern world in both
adversarial situations and the optimization of social good. While there exist
many algorithms for computing solutions in two-player interactions, finding
optimal strategies in multiplayer interactions efficiently remains an open
challenge. This paper focuses on computing the multiplayer Team-Maxmin
Equilibrium with Coordination device (TMECor) in zero-sum extensive-form games.
TMECor models scenarios in which a team of players coordinates ex ante against an
adversary. Such situations can be found in card games (e.g., in Bridge and
Poker), when a team works together to beat a target player but communication is
prohibited; and also in the real world, e.g., in forest-protection operations, when
coordinated groups have limited contact while interdicting illegal loggers.
The existing algorithms struggle to find a TMECor efficiently because of their
high computational costs. To compute a TMECor in larger games, we make the
following key contributions: (1) we propose a hybrid-form strategy
representation for the team, which preserves the set of equilibria; (2) we
introduce a column-generation algorithm with a guaranteed finite-time
convergence in the infinite strategy space based on a novel best-response
oracle; (3) we develop an associated-representation technique for the exact
representation of the multilinear terms in the best-response oracle; and (4) we
experimentally show that our algorithm is several orders of magnitude faster
than prior state-of-the-art algorithms in large games.
Comment: AAAI 2021. This paper is also part of the thesis: Youzhi Zhang,
February 2020. Computing Team-Maxmin Equilibria in Zero-Sum Multiplayer
Games. PhD Thesis,
https://personal.ntu.edu.sg/boan/thesis/Zhang_Youzhi_PhD_Thesis.pd
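The column-generation idea in the abstract above can be illustrated on a very small example. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the toy normal-form game, the function names (`team_payoff`, `solve_restricted`, `column_generation`), and the brute-force best-response oracle are all hypothetical, and the restricted problem is solved exactly only because the adversary has two actions. It keeps a small set of joint team plans (columns), solves the restricted zero-sum game, and asks an oracle for a better joint plan until none exists.

```python
from itertools import product

# Hypothetical toy game (not from the paper): two team members and one
# adversary each pick 0 or 1. The team scores 1 when both members pick the
# same action and the adversary picks the other one, else 0.
def team_payoff(t1, t2, adv):
    return 1.0 if t1 == t2 and adv != t1 else 0.0


JOINT_PLANS = list(product((0, 1), repeat=2))  # pure joint plans of the team


def plan_value(plan, mu):
    """Expected team payoff of a joint plan if the adversary plays 1 w.p. mu."""
    return (1 - mu) * team_payoff(*plan, 0) + mu * team_payoff(*plan, 1)


def solve_restricted(columns):
    """Exact minimax value of the restricted game (two-action adversary only).

    The adversary minimises max_c plan_value(c, mu), a convex piecewise-linear
    function of mu, so a minimiser lies at an endpoint or a crossing point.
    """
    candidates = {0.0, 1.0}
    for c, d in product(columns, repeat=2):
        slope_c = team_payoff(*c, 1) - team_payoff(*c, 0)
        slope_d = team_payoff(*d, 1) - team_payoff(*d, 0)
        if slope_c != slope_d:
            mu = (team_payoff(*d, 0) - team_payoff(*c, 0)) / (slope_c - slope_d)
            if 0.0 <= mu <= 1.0:
                candidates.add(mu)

    def worst(mu):
        return max(plan_value(c, mu) for c in columns)

    best_mu = min(sorted(candidates), key=worst)
    return worst(best_mu), best_mu


def column_generation(initial_plan=(0, 0), tol=1e-9):
    columns = [initial_plan]
    while True:
        value, mu = solve_restricted(columns)
        # Brute-force oracle: enumerate every joint plan. This stands in for
        # the paper's best-response oracle and only scales to toy games.
        best = max(JOINT_PLANS, key=lambda p: plan_value(p, mu))
        if plan_value(best, mu) <= value + tol:
            return value, columns  # no improving column: restricted value is optimal
        columns.append(best)


if __name__ == "__main__":
    value, columns = column_generation()
    print("coordinated team value:", value)
    print("joint plans generated:", columns)
```

In this toy game the loop stops after generating only two of the four joint plans, which is the point of column generation: the full set of joint plans never has to be enumerated by the master problem, only by the oracle.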
A Generic Multi-Player Transformation Algorithm for Solving Large-Scale Zero-Sum Extensive-Form Adversarial Team Games
Many recent practical and theoretical breakthroughs focus on adversarial team
multi-player games (ATMGs) in ex ante correlation scenarios. In this setting,
team members are allowed to coordinate their strategies only before the game
starts. Although there are existing algorithms for solving extensive-form ATMGs,
the size of the game tree generated by the previous algorithms grows
exponentially with the number of players. Therefore, how to deal with
large-scale zero-sum extensive-form ATMGs problems close to the real world is
still a significant challenge. In this paper, we propose a generic multi-player
transformation algorithm, which can transform any multi-player game tree
satisfying the definition of ATMGs into a 2-player game tree, such that finding
a team-maxmin equilibrium with correlation (TMECor) in large-scale ATMGs can be
transformed into solving NE in 2-player games. To achieve this goal, we first
introduce a new structure named private information pre-branch, which consists
of a temporary chance node and coordinator nodes and aims to make decisions for
all potential private information on behalf of the team members. We also show
theoretically that NE in the transformed 2-player game is equivalent to TMECor in
the original multi-player game. This work significantly reduces the growth of
action space and nodes from exponential to constant level. This enables our
work to outperform all the previous state-of-the-art algorithms in finding a
TMECor, with 182.89, 168.47, 694.44, and 233.98 significant improvements in the
different Kuhn Poker and Leduc Poker cases (21K3, 21K4, 21K6 and 21L33). In
addition, this work is the first to practically solve ATMGs in a 5-player case, which
cannot be handled by existing algorithms.
Comment: 9 pages, 5 figures, NIPS 202
Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games
Two-team zero-sum games are one of the most important paradigms in game
theory. In this paper, we focus on finding an unexploitable equilibrium in
large team games. An unexploitable equilibrium is a worst-case policy, where
members in the opponent team cannot increase their team reward by taking any
policy, e.g., cooperatively changing to other joint policies. As an optimal
unexploitable equilibrium in two-team zero-sum games, correlated-team maxmin
equilibrium remains unexploitable even in the worst case where players in the
opponent team can achieve arbitrary cooperation through a joint team policy.
However, finding such an equilibrium in large games is challenging due to the
impracticality of evaluating the exponentially large number of joint policies.
To solve this problem, we first introduce a general solution concept called
restricted correlated-team maxmin equilibrium, which uses a sample factor to
address the impossibility of evaluating all joint policies while avoiding
exploitation under incomplete joint policy evaluation. We then
develop an efficient sequential correlation mechanism, based on which we
propose an algorithm for approximating the unexploitable equilibrium in large
games. We show that our approach achieves lower exploitability than the
state-of-the-art baseline when encountering opponent teams with different
exploitation ability in large team games, including Google Research Football.
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Computing Nash equilibrium policies is a central problem in multi-agent
reinforcement learning that has received extensive attention both in theory and
in practice. However, provable guarantees have been thus far either limited to
fully competitive or cooperative scenarios or impose strong assumptions that
are difficult to meet in most practical applications. In this work, we depart
from those prior results by investigating infinite-horizon \emph{adversarial
team Markov games}, a natural and well-motivated class of games in which a team
of identically-interested players -- in the absence of any explicit
coordination or communication -- is competing against an adversarial player.
This setting allows for a unifying treatment of zero-sum Markov games and
Markov potential games, and serves as a step to model more realistic strategic
interactions that feature both competing and cooperative interests. Our main
contribution is the first algorithm for computing stationary
$\epsilon$-approximate Nash equilibria in adversarial team Markov games with
computational complexity that is polynomial in all the natural parameters of
the game, as well as $1/\epsilon$. The proposed algorithm is particularly
natural and practical, and it is based on performing independent policy
gradient steps for each player in the team, in tandem with best responses from
the side of the adversary; in turn, the policy for the adversary is then
obtained by solving a carefully constructed linear program. Our analysis
leverages non-standard techniques to establish the KKT optimality conditions
for a nonlinear program with nonconvex constraints, thereby leading to a
natural interpretation of the induced Lagrange multipliers. Along the way, we
significantly extend an important characterization of optimal policies in
adversarial (normal-form) team games due to Von Stengel and Koller (GEB '97).
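The update pattern described above (independent gradient steps for each team member, in tandem with adversary best responses) can be sketched on a single-state toy game. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the game, the step size, the starting point, and the names `worst_case` and `independent_gradient_play` are hypothetical, and the final linear-programming step that recovers the adversary's policy is omitted. Each team member only adjusts its own probability using its own partial derivative of the team payoff.

```python
# Hypothetical toy game: two team members and an adversary each pick 0 or 1.
# The team scores 1 when both members pick the same action and the adversary
# picks the other one. p1, p2 are the probabilities that each member plays 1.

def clip01(x):
    """Project a probability back onto [0, 1]."""
    return min(1.0, max(0.0, x))


def worst_case(p1, p2):
    """Team payoff against a best-responding adversary."""
    # Adversary plays 0 -> team scores with prob p1 * p2 (both play 1).
    # Adversary plays 1 -> team scores with prob (1 - p1) * (1 - p2).
    return min(p1 * p2, (1 - p1) * (1 - p2))


def independent_gradient_play(p1=0.3, p2=0.6, step=0.01, iters=5000):
    for _ in range(iters):
        # Adversary best response: the action that minimises the team payoff.
        adv = 0 if p1 * p2 <= (1 - p1) * (1 - p2) else 1
        if adv == 0:
            # Team payoff is p1 * p2; each member uses its own partial derivative.
            g1, g2 = p2, p1
        else:
            # Team payoff is (1 - p1) * (1 - p2).
            g1, g2 = -(1 - p2), -(1 - p1)
        # Independent projected gradient-ascent steps, applied simultaneously.
        p1, p2 = clip01(p1 + step * g1), clip01(p2 + step * g2)
    return p1, p2


if __name__ == "__main__":
    p1, p2 = independent_gradient_play()
    print("team policies:", p1, p2)
    print("worst-case team payoff:", worst_case(p1, p2))
```

Because the team members here randomize independently, the worst-case payoff cannot exceed 1/4 in this toy game; the iterates only hover near that level, with an oscillation on the order of the step size.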
Correlated vs. Uncorrelated Randomness in Adversarial Congestion Team Games
We consider team zero-sum network congestion games with senders playing
against interceptors over a graph $G$. The senders aim to minimize their
collective cost of sending messages over paths in $G$, which is an aggregation
of edge costs, while the interceptors aim to maximize the collective cost by
increasing some of these edge costs. To evade the interceptors, the senders
will usually use randomized strategies. We consider two cases, the correlated
case when senders have access to a shared source of randomness, and the
uncorrelated case, when each sender has access to only its own source of
randomness. We study the additional cost that uncorrelated senders have to
bear, specifically by comparing the costs incurred by senders in cost-minimal
Nash Equilibria when senders can and cannot share randomness.
We prove that for an intuitive strict subset of cost functions, the ratio
between correlated and uncorrelated costs at equilibrium is bounded by a
function of the mincut size of the graph. This bound is much milder than the
upper bound on the ratio known for the most general case. We show that the senders can
approximate their optimal play by playing simple strategies which select paths
uniformly at random from subsets of disjoint paths. We then focus on two
natural cost functions. For the first, we prove that one of the simple
strategies above is an optimal strategy for senders over graphs with disjoint
paths. In complete contrast, for the second cost function we prove that none of
these simple strategies is optimal for the senders over these graphs, unless
the game instance admits a trivial optimal senders strategy.
Public Information Representation for Adversarial Team Games
The peculiarity of adversarial team games resides in the asymmetric
information available to the team members during the play, which makes the
equilibrium computation problem hard even with zero-sum payoffs. The algorithms
available in the literature work with implicit representations of the strategy
space and mainly resort to Linear Programming and column generation techniques
to enlarge incrementally the strategy space. Such representations prevent the
adoption of standard tools such as abstraction generation, game solving, and
subgame solving, which have proved crucial when solving huge, real-world
two-player zero-sum games. Unlike these works, we answer the question
of whether there is any suitable game representation enabling the adoption of
those tools. In particular, our algorithms convert a sequential team game with
adversaries to a classical two-player zero-sum game. In this converted game,
the team is transformed into a single coordinator player who only knows
information common to the whole team and prescribes to the players an action
for any possible private state. Interestingly, we show that our game is more
expressive than the original extensive-form game as any state/action
abstraction of the extensive-form game can be captured by our representation,
while the reverse does not hold. Due to the NP-hard nature of the problem, the
resulting Public Team game may be exponentially larger than the original one.
To limit this explosion, we provide three algorithms, each returning an
information-lossless abstraction that dramatically reduces the size of the
tree. These abstractions can be produced without generating the original game
tree. Finally, we show the effectiveness of the proposed approach by presenting
experimental results on Kuhn and Leduc Poker games, obtained by applying
state-of-the-art algorithms for two-player zero-sum games on the converted games.
Comment: 19 pages, 7 figures, Best Paper Award in Cooperative AI Workshop at
NeurIPS 202
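The coordinator described above "prescribes to the players an action for any possible private state". A minimal sketch of what such prescriptions look like, and why their number grows exponentially with the number of private states, is given below. The function name `prescriptions` and the Kuhn-style card and action labels are assumptions chosen for illustration; they are not taken from the paper's implementation.

```python
from itertools import product


def prescriptions(private_states, actions):
    """Enumerate every mapping from private state to action.

    Each mapping is one coordinator action in the converted two-player game:
    the coordinator does not see the private state, so it commits to an
    action for every state the team member could be in.
    """
    return [
        dict(zip(private_states, choice))
        for choice in product(actions, repeat=len(private_states))
    ]


if __name__ == "__main__":
    # Hypothetical Kuhn-style setting: three possible private cards, two actions.
    cards = ("J", "Q", "K")
    moves = ("check", "bet")
    options = prescriptions(cards, moves)
    print(len(options), "prescriptions, for example:", options[0])
    # The count is len(actions) ** len(private_states), which is the source of
    # the blow-up that information-lossless abstractions aim to limit.
```

With two actions and three private states there are 2^3 prescriptions; adding a fourth private state doubles that count.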
Preplay Communication in Multi-Player Sequential Games: An Overview of Recent Results
The computational study of game-theoretic solution concepts is fundamental to describe the optimal behavior of rational agents interacting in a strategic setting, and to predict the most likely outcome of a game. Equilibrium computation techniques have been applied to numerous real-world problems. Among other applications, they are the key building block of the best poker-playing AI agents [5, 6, 27], and have been applied to physical and cybersecurity problems (see, e.g., [18, 20, 21, 30–32]).
A marriage between adversarial team games and 2-player games: enabling abstractions, no-regret learning, and subgame solving
Ex ante correlation is becoming the mainstream approach for sequential adversarial team games, where a team of players faces another team in a
zero-sum game. It is known that team members’ asymmetric information makes both equilibrium computation APX-hard and team’s strategies not
directly representable on the game tree. This latter issue prevents the adoption of successful tools for huge 2-player zero-sum games such as,
e.g., abstractions, no-regret learning, and subgame solving. This work shows that we can recover from this weakness by bridging the gap between sequential adversarial team games and 2-player games. In particular, we propose a new, suitable game representation that we call team public-information, in which a team is represented as a single coordinator who only knows information common to the whole team and prescribes to each member an action for any possible private state. The resulting representation is highly explainable, being a 2-player tree in
which the team’s strategies are behavioral with a direct interpretation and more expressive than the original extensive form when designing abstractions. Furthermore, we prove payoff equivalence of our representation, and we provide techniques that, starting directly from the extensive form, generate dramatically more compact representations without information loss. Finally, we experimentally evaluate our techniques when applied to a standard testbed, comparing their performance with the current state of the art.