3,978 research outputs found
The Solution of a Two-Person Poker Variant
This note presents the solution of a two-person poker variant considered by Friedman. The solution is derived using a general algorithm proposed by the author to solve two-person zero-sum games with "almost" perfect information.
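Neither Friedman's variant nor the author's algorithm is reproduced in this abstract. As an illustration of the solution concept involved, here is a minimal solver for a 2x2 two-person zero-sum game in mixed strategies, using the standard closed-form formula (a sketch under the assumption that the game has no pure-strategy saddle point; the example game is illustrative):

```python
def solve_2x2_zero_sum(a, b, c, d):
    """Mixed-strategy solution of a 2x2 zero-sum game with payoff matrix
    [[a, b], [c, d]] for the row (maximizing) player, assuming no
    pure-strategy saddle point exists."""
    denom = (a - b) - (c - d)
    p = (d - c) / denom           # probability the row player picks row 0
    q = (d - b) / denom           # probability the column player picks col 0
    value = (a * d - b * c) / denom
    return p, q, value

# Matching pennies: the unique equilibrium is uniform play with value 0.
p, q, v = solve_2x2_zero_sum(1, -1, -1, 1)
```

Each player's mixture equalizes the opponent's payoffs across pure actions, which is exactly the indifference condition a poker solution like Friedman's must satisfy at every decision point.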
Analysis and Optimization of Deep Counterfactual Value Networks
Recently a strong poker-playing algorithm called DeepStack was published,
which is able to find an approximate Nash equilibrium during gameplay by using
heuristic values of future states predicted by deep neural networks. This paper
analyzes new ways of encoding the inputs and outputs of DeepStack's deep
counterfactual value networks based on traditional abstraction techniques, as
well as an unabstracted encoding, which was able to increase the network's
accuracy.
Comment: Long version of publication appearing at KI 2018: The 41st German
Conference on Artificial Intelligence
(http://dx.doi.org/10.1007/978-3-030-00111-7_26). Corrected typo in title.
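The input/output scheme being varied can be made concrete with a toy stand-in for a counterfactual value network: the input is the public pot size plus both players' hand-range vectors, and the output is one counterfactual value per hand. Everything below (hand count, layer sizes, random weights) is an illustrative assumption; the real networks are deep and trained on solved subgames:

```python
import random

random.seed(0)
N_HANDS = 6   # hypothetical tiny card abstraction; DeepStack buckets far more hands
HIDDEN = 16

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def cfv_network(pot, my_range, opp_range, W1, W2):
    """Toy counterfactual value network: map (pot size, both ranges)
    to a counterfactual value per hand."""
    x = [pot] + list(my_range) + list(opp_range)
    h = [max(0.0, z) for z in matvec(W1, x)]   # ReLU hidden layer
    return matvec(W2, h)                        # one value per hand

W1 = [[random.gauss(0, 1) for _ in range(1 + 2 * N_HANDS)] for _ in range(HIDDEN)]
W2 = [[random.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(N_HANDS)]
uniform = [1.0 / N_HANDS] * N_HANDS
cfv = cfv_network(10.0, uniform, uniform, W1, W2)
```

The encodings compared in the paper differ in how the range vectors are bucketed (traditional abstraction) versus passed through unabstracted, with the output shape kept as one value per hand either way.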
Simplified three player Kuhn poker
We study a very small three player poker game (one-third street Kuhn poker),
and a simplified version of the game that is interesting because it has three
distinct equilibrium solutions. For one-third street Kuhn poker, we are able to
find all of the equilibrium solutions analytically. For large enough pot size,
there is a degree of freedom in the solution that allows one player to
transfer profit between the other two players without changing their own
profit. This has potentially interesting consequences in repeated play of the
game. We also show that a simplified version of the game has either one
equilibrium solution or three distinct equilibrium solutions, depending on the
pot size. This may be the simplest
non-trivial multiplayer poker game with more than one distinct equilibrium
solution and provides us with a test case for theories of dynamic strategy
adjustment over multiple realisations of the game.
We then study a third order system of ordinary differential equations that
models the dynamics of three players who try to maximise their expectation by
continuously varying their betting frequencies. We find that the dynamics of
this system are oscillatory, with two distinct types of solution. We then study
a difference equation model, based on repeated play of the game, in which each
player continually updates their estimates of the other players' betting
frequencies. We find that the dynamics are noisy, but basically oscillatory for
short enough estimation periods and slow enough frequency adjustments, but that
the dynamics can be very different for other parameter values.
Comment: 41 pages, 2 Tables, 17 Figures.
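The flavour of the difference-equation model can be sketched with a deliberately simplified stand-in: two players (not three, and matching pennies rather than Kuhn poker) who each keep a running estimate of the opponent's action frequency and play a pure best response to it. This is an illustrative reduction, not the model in the paper:

```python
def fictitious_play_matching_pennies(steps):
    """Each player best-responds to the opponent's empirical action
    frequency, updated after every round of play."""
    counts = [[1, 1], [1, 1]]  # smoothed heads/tails counts for players 0 and 1
    for _ in range(steps):
        p1_heads = counts[1][0] / sum(counts[1])
        a0 = 0 if p1_heads >= 0.5 else 1      # player 0 tries to match
        p0_heads = counts[0][0] / sum(counts[0])
        a1 = 1 if p0_heads >= 0.5 else 0      # player 1 tries to mismatch
        counts[0][a0] += 1
        counts[1][a1] += 1
    return (counts[0][0] / sum(counts[0]), counts[1][0] / sum(counts[1]))
```

The action sequence oscillates in runs of heads and tails, while the empirical frequencies settle toward the mixed-equilibrium value of 1/2, mirroring the noisy-but-oscillatory behaviour the abstract describes.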
Smoothing Method for Approximate Extensive-Form Perfect Equilibrium
Nash equilibrium is a popular solution concept for solving
imperfect-information games in practice. However, it has a major drawback: it
does not preclude suboptimal play in branches of the game tree that are not
reached in equilibrium. Equilibrium refinements can mend this issue, but have
experienced little practical adoption. This is largely due to a lack of
scalable algorithms.
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective algorithms for computing Nash equilibria in
large-scale two-player zero-sum extensive-form games. In this paper, we
provide, to our knowledge, the first extension of these methods to equilibrium
refinements. We develop a smoothing approach for behavioral perturbations of
the convex polytope that encompasses the strategy spaces of players in an
extensive-form game. This enables one to compute an approximate variant of
extensive-form perfect equilibria. Experiments show that our smoothing approach
leads to solutions with dramatically stronger strategies at information sets
that are reached with low probability in approximate Nash equilibria, while
retaining the overall convergence rate associated with fast algorithms for Nash
equilibrium. This has benefits both in approximate equilibrium finding (such
approximation is necessary in practice in large games) where some probabilities
are low while possibly heading toward zero in the limit, and exact equilibrium
computation where the low probabilities are actually zero.
Comment: Published at IJCAI 1
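The behavioural perturbations behind extensive-form perfect equilibrium can be made concrete at a single information set: every action is forced to be played with probability at least eps, so that no branch of the tree is ever unreached. A minimal sketch of that perturbation map (illustrative; the paper's contribution is a smoothing scheme built around such perturbations, not this formula itself):

```python
def perturb(strategy, eps):
    """Map a behavioral strategy (a probability vector over actions at
    one information set) onto the eps-perturbed simplex, in which every
    action has probability at least eps. Extensive-form perfect
    equilibria arise as limits of equilibria of such perturbed games
    as eps -> 0."""
    n = len(strategy)
    assert eps * n <= 1.0, "eps too large for this many actions"
    return [(1.0 - n * eps) * p + eps for p in strategy]
```

Since the map is affine and probabilities still sum to one, a pure strategy like [1, 0, 0] becomes [1 - 2*eps, eps, eps], giving every deviation positive probability.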
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
Many real-world applications can be described as large-scale games of
imperfect information. To deal with these challenging domains, prior work has
focused on computing Nash equilibria in a handcrafted abstraction of the
domain. In this paper we introduce the first scalable end-to-end approach to
learning approximate Nash equilibria without prior domain knowledge. Our method
combines fictitious self-play with deep reinforcement learning. When applied to
Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium,
whereas common reinforcement learning methods diverged. In Limit Texas Hold'em,
a poker game of real-world scale, NFSP learnt a strategy that approached the
performance of state-of-the-art, superhuman algorithms based on significant
domain expertise.
Comment: updated version, incorporating conference feedback.
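NFSP's defining trick is how it mixes its two learned policies at action time: with a small anticipatory probability it acts from the reinforcement-learned best-response policy, otherwise from the supervised average policy. The sketch below uses plain probability dictionaries in place of the paper's neural networks, and eta = 0.1 as a typical value (an illustrative assumption, not the exact published configuration):

```python
import random

ETA = 0.1  # anticipatory parameter: how often to play the best response

def nfsp_act(best_response_policy, average_policy):
    """NFSP-style action selection. Each policy is a dict mapping
    actions to probabilities; the real agent would query two networks
    and also record best-response actions as supervised targets for
    the average policy."""
    policy = best_response_policy if random.random() < ETA else average_policy
    actions, probs = zip(*policy.items())
    return random.choices(actions, weights=probs)[0]
```

Sampling mostly from the average policy is what makes opponents face a stationary-looking strategy, while the occasional best-response play keeps improving it.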
Simulation of a Texas Hold'Em poker player
Theoretical and Practical Advances on Smoothing for Extensive-Form Games
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective in solving large-scale two-player zero-sum
extensive-form games. The convergence rates of these methods depend heavily on
the properties of the distance-generating function that they are based on. We
investigate the acceleration of first-order methods for solving extensive-form
games through better design of the dilated entropy function---a class of
distance-generating functions related to the domains associated with the
extensive-form games. By introducing a new weighting scheme for the dilated
entropy function, we develop the first distance-generating function for the
strategy spaces of sequential games that has no dependence on the branching
factor of the player. This result improves the convergence rate of several
first-order methods by a factor of b^d * d, where b is the branching
factor of the player, and d is the depth of the game tree.
Thus far, counterfactual regret minimization methods have been faster in
practice, and more popular, than first-order methods despite their
theoretically inferior convergence rates. Using our new weighting scheme and
practical tuning we show that, for the first time, the excessive gap technique
can be made faster than the fastest counterfactual regret minimization
algorithm, CFR+, in practice.
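CFR+'s distinguishing update, regret matching+, can be sketched on a plain matrix game. The example below runs RM+ self-play on rock-paper-scissors (not an extensive-form game, so this shows only the regret update that CFR+ applies at each information set, not the full tree-based algorithm; the asymmetric starting regrets are an illustrative choice to kick the dynamics off equilibrium):

```python
def rm_plus_rps(iters):
    """Regret matching+ self-play on rock-paper-scissors. Cumulative
    regrets are clipped at zero after every update, which is what
    distinguishes RM+ (and hence CFR+) from plain regret matching."""
    payoff = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # row player's payoff
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # illustrative asymmetric start
    strat_sums = [[0.0] * 3, [0.0] * 3]

    def strategy(r):
        pos = [max(x, 0.0) for x in r]
        s = sum(pos)
        return [x / s for x in pos] if s > 0 else [1.0 / 3.0] * 3

    for _ in range(iters):
        s0, s1 = strategy(regrets[0]), strategy(regrets[1])
        for a in range(3):
            strat_sums[0][a] += s0[a]
            strat_sums[1][a] += s1[a]
        # expected utility of each pure action against the opponent's mix
        u0 = [sum(payoff[a][b] * s1[b] for b in range(3)) for a in range(3)]
        u1 = [sum(-payoff[a][b] * s0[a] for a in range(3)) for b in range(3)]
        ev0 = sum(p * u for p, u in zip(s0, u0))
        ev1 = sum(p * u for p, u in zip(s1, u1))
        regrets[0] = [max(r + u - ev0, 0.0) for r, u in zip(regrets[0], u0)]
        regrets[1] = [max(r + u - ev1, 0.0) for r, u in zip(regrets[1], u1)]
    return [[x / iters for x in strat_sums[p]] for p in range(2)]
```

The current strategies oscillate, but the time-averaged strategies converge toward the unique equilibrium (uniform play), which is the guarantee both CFR-family methods and the first-order methods discussed above provide.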