Analysis and Optimization of Deep Counterfactual Value Networks
Recently a strong poker-playing algorithm called DeepStack was published,
which finds an approximate Nash equilibrium during gameplay by using
heuristic values of future states predicted by deep neural networks. This paper
analyzes new ways of encoding the inputs and outputs of DeepStack's deep
counterfactual value networks based on traditional abstraction techniques, as
well as an unabstracted encoding, which increased the network's
accuracy.

Comment: Long version of publication appearing at KI 2018: The 41st German
Conference on Artificial Intelligence
(http://dx.doi.org/10.1007/978-3-030-00111-7_26). Corrected typo in title.
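The abstract describes networks that map a game state and both players' hand distributions to per-hand counterfactual values. A minimal sketch of that input/output interface, with made-up toy dimensions and an untrained one-hidden-layer MLP (this is illustrative only, not DeepStack's actual architecture or encoding):

```python
import numpy as np

# Toy dimensions (assumptions for illustration, not the paper's values).
NUM_HANDS = 10    # private hands per player
PUBLIC_DIM = 5    # encoding of the public state (board cards, pot size)

rng = np.random.default_rng(0)

# One hidden-layer MLP with random, untrained weights.
W1 = rng.normal(size=(PUBLIC_DIM + 2 * NUM_HANDS, 64)) * 0.1
b1 = np.zeros(64)
W2 = rng.normal(size=(64, 2 * NUM_HANDS)) * 0.1
b2 = np.zeros(2 * NUM_HANDS)

def counterfactual_values(public, range_p1, range_p2):
    """Map public state + both players' ranges to per-hand values."""
    x = np.concatenate([public, range_p1, range_p2])
    h = np.maximum(0.0, x @ W1 + b1)          # ReLU hidden layer
    return (h @ W2 + b2).reshape(2, NUM_HANDS)

public = rng.normal(size=PUBLIC_DIM)
r1 = np.full(NUM_HANDS, 1.0 / NUM_HANDS)      # uniform range for player 1
r2 = np.full(NUM_HANDS, 1.0 / NUM_HANDS)      # uniform range for player 2
cfvs = counterfactual_values(public, r1, r2)
print(cfvs.shape)   # (2, 10): one value per hand, per player
```

The encoding question the paper studies is exactly what goes into `public`, `range_p1`, and `range_p2` (abstracted buckets versus unabstracted hands).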
Most Important Fundamental Rule of Poker Strategy
Poker is a large complex game of imperfect information, which has been
singled out as a major AI challenge problem. Recently there has been a series
of breakthroughs culminating in agents that have successfully defeated the
strongest human players in two-player no-limit Texas hold 'em. The strongest
agents are based on algorithms for approximating Nash equilibrium strategies,
which are stored in massive binary files and unintelligible to humans. A recent
line of research has explored approaches for extrapolating knowledge from
strong game-theoretic strategies in a form that can be understood by humans.
This would be useful when humans are the ultimate decision makers, allowing
them to make better decisions from massive algorithmically generated
strategies. Using machine-learning techniques, we have uncovered a new,
simple, fundamental rule of poker strategy that leads to a significant
performance improvement over the best prior rule and can also be applied
easily by human players.
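One common way to extract a human-readable rule from an algorithmically generated strategy is to fit a single-threshold "stump" to samples of its decisions. The feature, actions, and data below are invented for illustration; they are not the paper's rule or data:

```python
# Toy samples of (hand_strength, equilibrium_action) pairs, as if read off
# a large precomputed strategy. All values here are made up.
data = [
    (0.12, "fold"), (0.25, "fold"), (0.31, "fold"),
    (0.48, "bet"), (0.63, "bet"), (0.71, "bet"), (0.90, "bet"),
]

def best_stump(samples):
    """Find the threshold that best separates the two actions."""
    best = (None, -1)
    for t in sorted(set(x for x, _ in samples)):
        # count samples the rule "bet iff hand_strength >= t" gets right
        correct = sum((a == "bet") == (x >= t) for x, a in samples)
        if correct > best[1]:
            best = (t, correct)
    return best

threshold, n_correct = best_stump(data)
print(f"rule: bet if hand_strength >= {threshold} "
      f"({n_correct}/{len(data)} samples matched)")
```

The output is a single inequality a human can memorize and apply at the table, which is the appeal of this line of research over opaque strategy files.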
Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent
Blackwell approachability is a framework for reasoning about repeated games
with vector-valued payoffs. We introduce predictive Blackwell approachability,
where an estimate of the next payoff vector is given, and the decision maker
tries to achieve better performance based on the accuracy of that estimator. In
order to derive algorithms that achieve predictive Blackwell approachability,
we start by showing a powerful connection between four well-known algorithms.
Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the
most prevalent regret minimizers in online convex optimization. In spite of
this prevalence, the regret matching (RM) and regret matching+ (RM+) algorithms
have been preferred in the practice of solving large-scale games (as the local
regret minimizers within the counterfactual regret minimization framework). We
show that RM and RM+ are the algorithms that result from running FTRL and OMD,
respectively, to select the halfspace to force at all times in the underlying
Blackwell approachability game. By applying the predictive variants of FTRL or
OMD to this connection, we obtain predictive Blackwell approachability
algorithms, as well as predictive variants of RM and RM+. In experiments across
18 common zero-sum extensive-form benchmark games, we show that predictive RM+
coupled with counterfactual regret minimization converges vastly faster than
the fastest prior algorithms (CFR+, DCFR, LCFR) on all games except two of the
poker games and Liar's Dice, sometimes by two or more orders of magnitude.
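The predictive RM+ update the abstract refers to can be sketched on a small zero-sum matrix game: play proportionally to the cumulative clipped regret plus a prediction (the standard choice being the previous instantaneous regret). This is a minimal illustration of the update rule only, not the paper's full CFR implementation:

```python
import numpy as np

# A 2x2 zero-sum game (row player's payoffs). Both players' equilibrium
# mixed strategy is (1/3, 2/3); the game matrix is chosen for illustration.
A = np.array([[3.0, -1.0],
              [-1.0, 1.0]])

def normalize(v):
    s = v.sum()
    return v / s if s > 0 else np.full_like(v, 1.0 / len(v))

def prm_plus(T=10000):
    R1 = np.zeros(2); R2 = np.zeros(2)   # cumulative clipped regrets
    m1 = np.zeros(2); m2 = np.zeros(2)   # predictions = last regrets
    avg1 = np.zeros(2); avg2 = np.zeros(2)
    for _ in range(T):
        # play proportionally to the *predicted* cumulative regret
        x = normalize(np.maximum(R1 + m1, 0.0))
        y = normalize(np.maximum(R2 + m2, 0.0))
        # instantaneous regrets: action utility minus realized utility
        u1 = A @ y;    r1 = u1 - x @ u1
        u2 = -A.T @ x; r2 = u2 - y @ u2
        # RM+ update: clip cumulative regret at zero; keep last regret
        R1 = np.maximum(R1 + r1, 0.0); m1 = r1
        R2 = np.maximum(R2 + r2, 0.0); m2 = r2
        avg1 += x; avg2 += y
    return avg1 / T, avg2 / T

x_bar, y_bar = prm_plus()
print(np.round(x_bar, 2), np.round(y_bar, 2))  # both approach (1/3, 2/3)
```

Setting the prediction to zero recovers plain RM+; the paper's contribution is showing that this prediction step corresponds to predictive FTRL/OMD in the underlying Blackwell approachability game.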
Developing Artificial Intelligence Agents for a Turn-Based Imperfect Information Game
Artificial intelligence (AI) is often employed to play games, whether to entertain human opponents, devise and test strategies, or obtain other analytical data. Games with hidden information require specific approaches by the player. As a result, the AI must be equipped with methods of operating without certain important pieces of information while being aware of the resulting potential dangers. The computer game GNaT was designed as a testbed for AI strategies dealing specifically with imperfect information. Its development and functionality are described, and the results of testing several strategies through AI agents are discussed.
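One generic strategy of the kind such a testbed can evaluate is determinization (perfect-information sampling): sample hidden states consistent with the agent's observations, score each move against each sample, and pick the move with the best average outcome. The game, moves, and payoff model below are invented; this is not GNaT's actual code:

```python
import random

random.seed(1)

MOVES = ["attack", "defend", "scout"]

def sample_hidden_state():
    """Sample a hidden opponent strength consistent with observations (toy)."""
    return random.uniform(0.0, 1.0)

def payoff(move, opponent_strength):
    """Invented payoff model for each move against the hidden strength."""
    if move == "attack":
        return 1.0 - 2.0 * opponent_strength   # strong vs weak opponents
    if move == "defend":
        return 0.2                              # safe constant payoff
    return 0.5 - opponent_strength / 2          # scouting: middling, robust

def choose_move(n_samples=500):
    """Average each move's payoff over sampled hidden states; pick the best."""
    totals = {m: 0.0 for m in MOVES}
    for _ in range(n_samples):
        hidden = sample_hidden_state()
        for m in MOVES:
            totals[m] += payoff(m, hidden)
    return max(MOVES, key=lambda m: totals[m] / n_samples)

print(choose_move())
```

A known danger of this approach, which such a testbed can expose, is that averaging over determinized worlds ignores the value of information itself (strategy fusion); comparing it against other agents is exactly the kind of experiment the abstract describes.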