Pure Monte Carlo Counterfactual Regret Minimization
Counterfactual Regret Minimization (CFR) and its variants are the best
algorithms so far for solving large-scale incomplete information games.
Building upon CFR, this paper proposes a new algorithm named Pure CFR (PCFR)
for achieving better performance. PCFR can be seen as a combination of CFR and
Fictitious Play (FP), inheriting the concept of counterfactual regret (value)
from CFR and playing a best-response strategy, rather than a regret-matching
strategy, in the next iteration. We prove that PCFR satisfies Blackwell
approachability, which allows it to be combined with any CFR variant,
including Monte Carlo CFR (MCCFR). The resulting Pure MCCFR (PMCCFR)
significantly reduces time and space complexity; in particular, PMCCFR
converges at least three times faster than MCCFR. In addition, since PMCCFR
does not traverse paths containing strictly dominated strategies, we develop
a new warm-start algorithm inspired by strictly-dominated-strategy
elimination. With this warm start, PMCCFR converges two orders of magnitude
faster than the CFR+ algorithm.
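The core difference the abstract describes can be sketched in a few lines: standard CFR derives the next strategy from accumulated positive regrets via regret matching, while Pure CFR plays a pure best response to the accumulated counterfactual regrets. A minimal illustrative sketch (the function names and the uniform fallback are our own, not taken from the paper):

```python
import numpy as np

def regret_matching(cum_regret):
    """Standard CFR: mix actions in proportion to positive cumulative regret."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    # No positive regret yet: fall back to a uniform strategy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

def pure_best_response(cum_regret):
    """Pure CFR: put all probability on the action with the largest regret."""
    strategy = np.zeros(len(cum_regret))
    strategy[int(np.argmax(cum_regret))] = 1.0
    return strategy

r = np.array([2.0, -1.0, 0.5])
print(regret_matching(r))     # mixed: [0.8, 0.0, 0.2]
print(pure_best_response(r))  # pure:  [1.0, 0.0, 0.0]
```

Because the pure strategy stores a single action index per information set rather than a full probability vector, this choice is what enables the time- and space-complexity savings the abstract claims.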
Solving Imperfect-Information Games via Discounted Regret Minimization
Counterfactual regret minimization (CFR) is a family of iterative algorithms
that are the most popular and, in practice, fastest approach to approximately
solving large imperfect-information games. In this paper we introduce novel CFR
variants that 1) discount regrets from earlier iterations in various ways (in
some cases differently for positive and negative regrets), 2) reweight
iterations in various ways to obtain the output strategies, 3) use a
non-standard regret minimizer and/or 4) leverage "optimistic regret matching".
They lead to dramatically improved performance in many settings. In particular, we
introduce a variant that outperforms CFR+, the prior state-of-the-art
algorithm, in every game tested, including large-scale realistic settings. CFR+
is a formidable benchmark: no other algorithm has been able to outperform it.
Finally, we show that, unlike CFR+, many of the important new variants are
compatible with modern imperfect-information-game pruning techniques and one is
also compatible with sampling in the game tree.
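The discounting in point 1) can be illustrated with a single-action update: before adding the new instantaneous regret at iteration t, the accumulated regret is scaled down by an iteration-dependent factor, with positive and negative regrets treated differently. A hedged sketch; the parameter values alpha=1.5, beta=0 follow commonly cited defaults for this family of variants and should be treated as an assumption, not the paper's only setting:

```python
def dcfr_regret_step(cum_regret, new_regret, t, alpha=1.5, beta=0.0):
    """One discounted-regret accumulation step for a single action.

    Positive accumulated regret is scaled by t^alpha / (t^alpha + 1);
    negative accumulated regret by t^beta / (t^beta + 1) (beta=0 halves it).
    Early iterations therefore fade relative to later ones.
    """
    if cum_regret > 0:
        cum_regret *= t**alpha / (t**alpha + 1)
    else:
        cum_regret *= t**beta / (t**beta + 1)
    return cum_regret + new_regret

print(dcfr_regret_step(10.0, 1.0, t=1))   # 10 * 0.5 + 1 = 6.0
print(dcfr_regret_step(-10.0, 0.0, t=1))  # -10 * 0.5    = -5.0
```

The asymmetry is the point: decaying negative regret faster lets an action that was briefly bad recover quickly, which is one source of the speedups the abstract reports.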
Smoothing Method for Approximate Extensive-Form Perfect Equilibrium
Nash equilibrium is a popular solution concept for solving
imperfect-information games in practice. However, it has a major drawback: it
does not preclude suboptimal play in branches of the game tree that are not
reached in equilibrium. Equilibrium refinements can mend this issue, but have
experienced little practical adoption. This is largely due to a lack of
scalable algorithms.
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective algorithms for computing Nash equilibria in
large-scale two-player zero-sum extensive-form games. In this paper, we
provide, to our knowledge, the first extension of these methods to equilibrium
refinements. We develop a smoothing approach for behavioral perturbations of
the convex polytope that encompasses the strategy spaces of players in an
extensive-form game. This enables one to compute an approximate variant of
extensive-form perfect equilibria. Experiments show that our smoothing approach
leads to solutions with dramatically stronger strategies at information sets
that are reached with low probability in approximate Nash equilibria, while
retaining the overall convergence rate associated with fast algorithms for Nash
equilibrium. This has benefits both in approximate equilibrium finding (such
approximation is necessary in practice in large games) where some probabilities
are low while possibly heading toward zero in the limit, and exact equilibrium
computation where the low probabilities are actually zero.
Comment: Published at IJCAI 1
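For context, the refinement being approximated (extensive-form perfect equilibrium, due to Selten) is defined via behavioral trembles: it is a limit point of equilibria of perturbed games in which every action must receive some minimum probability. In symbols (our notation, not the paper's):

$$\sigma^{\epsilon}(I, a) \;\ge\; \epsilon(I, a) > 0 \quad \text{for all information sets } I,\ a \in A(I), \qquad \epsilon \to 0,$$

so the perturbations shrink the strategy polytope, and the smoothing approach described above operates over this perturbed polytope rather than the full one.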
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
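The combination of tree-search precision and sampling generality is most visible in the UCT selection rule surveyed in the paper: each child is scored by its mean playout value plus a UCB1 exploration bonus. A minimal sketch; the dict-based node representation is our own simplification:

```python
import math

def uct_select(children, c=math.sqrt(2)):
    """Pick the child maximizing mean value plus a UCB1 exploration bonus
    that shrinks as a child is visited more often."""
    parent_visits = sum(ch["visits"] for ch in children)

    def score(ch):
        if ch["visits"] == 0:
            return float("inf")  # always try unvisited children first
        exploit = ch["value"] / ch["visits"]
        explore = c * math.sqrt(math.log(parent_visits) / ch["visits"])
        return exploit + explore

    return max(children, key=score)

children = [
    {"visits": 10, "value": 6.0},  # mean 0.6, well explored
    {"visits": 2,  "value": 1.0},  # mean 0.5, barely explored
    {"visits": 0,  "value": 0.0},  # never tried: selected first
]
best = uct_select(children)
```

A full MCTS iteration would follow this selection with expansion, a random playout, and backpropagation of the playout result along the selected path; only the selection step is shown here.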
Public Information Representation for Adversarial Team Games
The peculiarity of adversarial team games resides in the asymmetric
information available to the team members during play, which makes the
equilibrium computation problem hard even with zero-sum payoffs. The algorithms
available in the literature work with implicit representations of the strategy
space and mainly resort to Linear Programming and column-generation techniques
to incrementally enlarge the strategy space. Such representations prevent the
adoption of standard tools such as abstraction generation, game solving, and
subgame solving, which have proven crucial when solving huge, real-world
two-player zero-sum games. Unlike these works, we answer the question
of whether there is any suitable game representation enabling the adoption of
those tools. In particular, our algorithms convert a sequential team game with
adversaries to a classical two-player zero-sum game. In this converted game,
the team is transformed into a single coordinator player who only knows
information common to the whole team and prescribes an action to the players
for every possible private state. Interestingly, we show that our game is more
expressive than the original extensive-form game as any state/action
abstraction of the extensive-form game can be captured by our representation,
while the reverse does not hold. Due to the NP-hard nature of the problem, the
resulting Public Team game may be exponentially larger than the original one.
To limit this explosion, we provide three algorithms, each returning an
information-lossless abstraction that dramatically reduces the size of the
tree. These abstractions can be produced without generating the original game
tree. Finally, we show the effectiveness of the proposed approach by presenting
experimental results on Kuhn and Leduc Poker games, obtained by applying
state-of-the-art algorithms for two-player zero-sum games on the converted games.
Comment: 19 pages, 7 figures, Best Paper Award in Cooperative AI Workshop at
NeurIPS 202
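The source of the exponential blow-up mentioned in the abstract is easy to see concretely: each coordinator action is a prescription, i.e. a map assigning an action to every possible private state of the team members. A hypothetical toy illustration (the helper and its names are ours, not from the paper):

```python
from itertools import product

def coordinator_actions(private_states, team_actions):
    """Enumerate all prescriptions: maps from every possible private state
    to a concrete action. There are |actions| ** |private states| of them,
    which is the exponential growth the paper's abstractions aim to curb."""
    return [dict(zip(private_states, choice))
            for choice in product(team_actions, repeat=len(private_states))]

plans = coordinator_actions(["s1", "s2"], ["a", "b"])
print(len(plans))  # 2 ** 2 = 4 prescriptions
```

With n private states and k actions the coordinator has k**n pure actions, which is why the information-lossless abstractions that shrink the tree are essential in practice.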