Smoothing Method for Approximate Extensive-Form Perfect Equilibrium
Nash equilibrium is a popular solution concept for solving
imperfect-information games in practice. However, it has a major drawback: it
does not preclude suboptimal play in branches of the game tree that are not
reached in equilibrium. Equilibrium refinements can mend this issue, but have
experienced little practical adoption. This is largely due to a lack of
scalable algorithms.
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective algorithms for computing Nash equilibria in
large-scale two-player zero-sum extensive-form games. In this paper, we
provide, to our knowledge, the first extension of these methods to equilibrium
refinements. We develop a smoothing approach for behavioral perturbations of
the convex polytope that encompasses the strategy spaces of players in an
extensive-form game. This enables one to compute an approximate variant of
extensive-form perfect equilibria. Experiments show that our smoothing approach
leads to solutions with dramatically stronger strategies at information sets
that are reached with low probability in approximate Nash equilibria, while
retaining the overall convergence rate associated with fast algorithms for Nash
equilibrium. This has benefits both in approximate equilibrium finding (such
approximation is necessary in practice in large games) where some probabilities
are low while possibly heading toward zero in the limit, and exact equilibrium
computation where the low probabilities are actually zero.
Comment: Published at IJCAI 1
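To make the idea of a behavioral perturbation concrete, the following is a minimal sketch, not the paper's construction. It assumes a perturbation simply forces every action at an information set to be played with probability at least `epsilon`; the function name `perturb_behavioral` and the uniform lower bound are hypothetical choices made for illustration.

```python
def perturb_behavioral(strategy, epsilon):
    """Map a distribution over actions into the epsilon-perturbed simplex.

    Assumption (illustrative only): every action must receive probability
    at least `epsilon`, the same bound for all actions.
    """
    n = len(strategy)
    if epsilon < 0 or epsilon * n >= 1:
        raise ValueError("epsilon must satisfy 0 <= epsilon < 1/n")
    # Reserve epsilon for each action, then spread the remaining mass
    # according to the original strategy.
    scale = 1.0 - n * epsilon
    return [epsilon + scale * p for p in strategy]


# A pure strategy that never plays actions 1 and 2 ...
pure = [1.0, 0.0, 0.0]
# ... becomes one that plays each of them with probability epsilon.
perturbed = perturb_behavioral(pure, 0.01)
```

Under this assumption every information set is reached with positive probability, so a strategy that is optimal for the perturbed game cannot ignore branches that an unperturbed Nash equilibrium would leave unreached.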
Theoretical and Practical Advances on Smoothing for Extensive-Form Games
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective in solving large-scale two-player zero-sum
extensive-form games. The convergence rates of these methods depend heavily on
the properties of the distance-generating function that they are based on. We
investigate the acceleration of first-order methods for solving extensive-form
games through better design of the dilated entropy function---a class of
distance-generating functions related to the domains associated with the
extensive-form games. By introducing a new weighting scheme for the dilated
entropy function, we develop the first distance-generating function for the
strategy spaces of sequential games that has no dependence on the branching
factor of the player. This result improves the convergence rate of several
first-order methods by a factor of $\Omega(b^d d)$, where $b$ is the branching
factor of the player, and $d$ is the depth of the game tree.
Thus far, counterfactual regret minimization methods have been faster in
practice, and more popular, than first-order methods despite their
theoretically inferior convergence rates. Using our new weighting scheme and
practical tuning we show that, for the first time, the excessive gap technique
can be made faster than the fastest counterfactual regret minimization
algorithm, CFR+, in practice.
Pure Monte Carlo Counterfactual Regret Minimization
Counterfactual Regret Minimization (CFR) and its variants are the best
algorithms so far for solving large-scale incomplete information games.
Building upon CFR, this paper proposes a new algorithm named Pure CFR (PCFR)
for achieving better performance. PCFR can be seen as a combination of CFR and
Fictitious Play (FP), inheriting the concept of counterfactual regret (value)
from CFR, and using the best response strategy instead of the regret matching
strategy for the next iteration. We prove that PCFR achieves Blackwell
approachability, which allows it to be combined with any CFR variant,
including Monte Carlo CFR (MCCFR). The resulting Pure MCCFR (PMCCFR)
can significantly reduce time and space complexity. In particular, PMCCFR
converges at least three times faster than MCCFR. In
addition, since PMCCFR does not pass through the path of strictly dominated
strategies, we develop a new warm-start algorithm inspired by the elimination
of strictly dominated strategies. Consequently, PMCCFR with the new
warm-start algorithm can converge two orders of magnitude faster than the
CFR+ algorithm.
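The contrast the abstract draws between the regret matching strategy and a best response strategy can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: it assumes "best response" means a pure strategy on the action with the largest cumulative counterfactual regret, and the function names are hypothetical.

```python
def regret_matching(cum_regret):
    """Standard regret matching: play in proportion to positive regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        # No action has positive regret: fall back to uniform play.
        return [1.0 / n] * n
    return [p / total for p in positive]


def pure_best_response(cum_regret):
    """Assumed PCFR-style rule: all mass on the highest-regret action."""
    best = max(range(len(cum_regret)), key=lambda a: cum_regret[a])
    return [1.0 if a == best else 0.0 for a in range(len(cum_regret))]


cum_regret = [3.0, 1.0, -2.0]
mixed = regret_matching(cum_regret)    # spreads mass over actions 0 and 1
pure = pure_best_response(cum_regret)  # puts all mass on action 0
```

Because the pure rule selects a single action per information set, a sampled traversal only needs to follow that action, which is one plausible reading of why the abstract reports lower time and space costs.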
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.