Pure Monte Carlo Counterfactual Regret Minimization
Counterfactual Regret Minimization (CFR) and its variants are the best
algorithms so far for solving large-scale incomplete information games.
Building upon CFR, this paper proposes a new algorithm named Pure CFR (PCFR)
for achieving better performance. PCFR can be seen as a combination of CFR and
Fictitious Play (FP), inheriting the concept of counterfactual regret (value)
from CFR, and using the best response strategy instead of the regret matching
strategy for the next iteration. We prove that PCFR achieves Blackwell
approachability, which allows it to be combined with any CFR variant,
including Monte Carlo CFR (MCCFR). The resulting Pure MCCFR (PMCCFR)
significantly reduces time and space complexity. In particular, PMCCFR
converges at least three times faster than MCCFR. In
addition, since PMCCFR does not traverse paths of strictly dominated
strategies, we develop a new warm-start algorithm inspired by the elimination
of strictly dominated strategies. Consequently, PMCCFR with the new
warm-start algorithm converges two orders of magnitude faster than the CFR+
algorithm.
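
The contrast between regret matching and a pure best-response strategy at a single information set can be illustrated with a minimal sketch. It is not taken from the paper: the function names are hypothetical, and it assumes that the "best response" is the action with the largest cumulative counterfactual regret, with ties broken by the lowest action index.

```python
from typing import List


def regret_matching_strategy(cumulative_regret: List[float]) -> List[float]:
    """Standard CFR rule: mix actions in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    n = len(cumulative_regret)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / n] * n
    return [p / total for p in positive]


def pure_best_response_strategy(cumulative_regret: List[float]) -> List[float]:
    """Assumed PCFR-style rule: all probability on the highest-regret action.

    Ties go to the lowest action index (an assumption of this sketch).
    """
    n = len(cumulative_regret)
    best = max(range(n), key=lambda a: cumulative_regret[a])
    return [1.0 if a == best else 0.0 for a in range(n)]


# Hypothetical cumulative regrets for three actions at one information set.
regrets = [2.0, -1.0, 6.0]
rm = regret_matching_strategy(regrets)       # mixed: [0.25, 0.0, 0.75]
pure = pure_best_response_strategy(regrets)  # pure:  [0.0, 0.0, 1.0]
print(rm, pure)
```

Under these assumptions the pure strategy is a one-hot vector, so a sampled traversal would follow a single action at each of the player's information sets rather than a mixture. This is one plausible reading of how the time and space savings described above could arise.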