Coarse Correlation in Extensive-Form Games
Coarse correlation models strategic interactions of rational agents
complemented by a correlation device, that is, a mediator that can recommend
behavior but not enforce it. Despite being a classical concept in the theory of
normal-form games for more than forty years, not much is known about the merits
of coarse correlation in extensive-form settings. In this paper, we consider
two instantiations of the idea of coarse correlation in extensive-form games:
normal-form coarse-correlated equilibrium (NFCCE), already defined in the
literature, and extensive-form coarse-correlated equilibrium (EFCCE), which we
introduce for the first time. We show that EFCCE is a subset of NFCCE and a
superset of the related extensive-form correlated equilibrium. We also show
that, in two-player extensive-form games, social-welfare-maximizing EFCCEs and
NFCCEs are bilinear saddle points, and give new efficient algorithms for the
special case of games with no chance moves. In our experiments, our proposed
algorithm for NFCCE is two to four orders of magnitude faster than the prior
state of the art.
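To make the coarse-correlation condition concrete, the following is a minimal sketch for the normal-form case only; it is not the extensive-form algorithm from the paper. It measures how much any player could gain by ignoring the mediator and committing to one fixed action before a recommendation is drawn. The function name `cce_gap`, the dictionary layout, and the Chicken payoffs are assumptions chosen for illustration.

```python
def cce_gap(payoffs, dist):
    """Largest gain any player gets from a coarse deviation.

    payoffs[i][joint] is player i's payoff for the joint action tuple.
    dist[joint] is the mediator's probability of recommending joint.
    The distribution is a coarse correlated equilibrium iff the result <= 0.
    """
    gap = float("-inf")
    for i in range(len(payoffs)):
        own_actions = sorted({joint[i] for joint in payoffs[i]})
        # Expected payoff when player i follows every recommendation.
        follow = sum(p * payoffs[i][joint] for joint, p in dist.items())
        for fixed in own_actions:
            # Player i commits to `fixed` up front; the others still follow.
            deviate = 0.0
            for joint, p in dist.items():
                swapped = joint[:i] + (fixed,) + joint[i + 1:]
                deviate += p * payoffs[i][swapped]
            gap = max(gap, deviate - follow)
    return gap


# Hypothetical payoffs for the game of Chicken: D = dare, C = chicken out.
chicken = [
    {("D", "D"): 0, ("D", "C"): 7, ("C", "D"): 2, ("C", "C"): 6},
    {("D", "D"): 0, ("D", "C"): 2, ("C", "D"): 7, ("C", "C"): 6},
]
mediated = {("D", "C"): 1 / 3, ("C", "D"): 1 / 3, ("C", "C"): 1 / 3}
crash = {("D", "D"): 1.0}

print(cce_gap(chicken, mediated))  # negative: no coarse deviation helps
print(cce_gap(chicken, crash))     # positive: committing to C is better
```

The deviation is chosen before the recommendation is seen, which is what distinguishes coarse correlation from ordinary correlated equilibrium, where a player may condition the deviation on the recommended action.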
No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium
The existence of simple, uncoupled no-regret dynamics that converge to
correlated equilibria in normal-form games is a celebrated result in the theory
of multi-agent systems. Specifically, it has been known for more than 20 years
that when all players seek to minimize their internal regret in a repeated
normal-form game, the empirical frequency of play converges to a normal-form
correlated equilibrium. Extensive-form (that is, tree-form) games generalize
normal-form games by modeling both sequential and simultaneous moves, as well
as private information. Because of the sequential nature and presence of
partial information in the game, extensive-form correlation has significantly
different properties than the normal-form counterpart, many of which are still
open research directions. Extensive-form correlated equilibrium (EFCE) has been
proposed as the natural extensive-form counterpart to normal-form correlated
equilibrium. However, it was previously unknown whether EFCE emerges as the
result of uncoupled agent dynamics. In this paper, we give the first uncoupled
no-regret dynamics that converge to the set of EFCEs in n-player general-sum
extensive-form games with perfect recall. First, we introduce a notion of
trigger regret in extensive-form games, which extends that of internal regret
in normal-form games. When each player has low trigger regret, the empirical
frequency of play is close to an EFCE. Then, we give an efficient
no-trigger-regret algorithm. Our algorithm decomposes trigger regret into local
subproblems at each decision point for the player, and constructs a global
strategy of the player from the local solutions at each decision point.
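As a point of reference for the normal-form notion that trigger regret extends, the sketch below computes a player's internal regret from a history of play. It illustrates only the standard normal-form definition, not the trigger-regret algorithm of the paper. The name `internal_regret`, the data layout, and the example payoffs are assumptions.

```python
def internal_regret(utility, history):
    """Internal regret of one player over a history of play.

    utility[(mine, opp)] is the player's payoff.
    history is a list of (mine, opp) pairs, one per round.
    For each ordered pair (a, b), sum the gain from playing b in every
    round where a was actually played, then take the best such swap.
    """
    actions = sorted({mine for mine, _ in utility})
    best = 0.0
    for a in actions:
        for b in actions:
            if a == b:
                continue
            gain = sum(
                utility[(b, opp)] - utility[(a, opp)]
                for mine, opp in history
                if mine == a
            )
            best = max(best, gain)
    return best


# Hypothetical payoffs for the row player in Chicken (D = dare, C = chicken out).
row_utility = {("D", "D"): 0, ("D", "C"): 7, ("C", "D"): 2, ("C", "C"): 6}
rounds = [("D", "D"), ("D", "D"), ("C", "C")]
print(internal_regret(row_utility, rounds))
```

When every player's internal regret grows sublinearly in the number of rounds, the empirical frequency of play approaches the set of normal-form correlated equilibria, which is the classical result the abstract refers to.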
Hindsight and Sequential Rationality of Correlated Play
Driven by recent successes in two-player, zero-sum game solving and playing,
artificial intelligence work on games has increasingly focused on algorithms
that produce equilibrium-based strategies. However, this approach has been less
effective at producing competent players in general-sum games or those with
more than two players than in two-player, zero-sum games. An appealing
alternative is to consider adaptive algorithms that ensure strong performance
in hindsight relative to what could have been achieved with modified behavior.
This approach also leads to a game-theoretic analysis, but in the correlated
play that arises from joint learning dynamics rather than factored agent
behavior at equilibrium. We develop and advocate for this hindsight rationality
framing of learning in general sequential decision-making settings. To this
end, we re-examine mediated equilibrium and deviation types in extensive-form
games, thereby gaining a more complete understanding and resolving past
misconceptions. We present a set of examples illustrating the distinct
strengths and weaknesses of each type of equilibrium in the literature, and
prove that no tractable concept subsumes all others. This line of inquiry
culminates in the definition of the deviation and equilibrium classes that
correspond to algorithms in the counterfactual regret minimization (CFR)
family, relating them to all others in the literature. Examining CFR in greater
detail further leads to a new recursive definition of rationality in correlated
play that extends sequential rationality in a way that naturally applies to
hindsight evaluation.
Comment: Technical report for a paper in the proceedings of the thirty-fifth
AAAI Conference on Artificial Intelligence (AAAI-21), February 2-9, 2021,
Virtual. 26 pages and 15 figures.