    Analysis and Optimization of Deep Counterfactual Value Networks

    Recently, a strong poker-playing algorithm called DeepStack was published, which is able to find an approximate Nash equilibrium during gameplay by using heuristic values of future states predicted by deep neural networks. This paper analyzes new ways of encoding the inputs and outputs of DeepStack's deep counterfactual value networks based on traditional abstraction techniques, as well as an unabstracted encoding, which was able to increase the network's accuracy. Comment: Long version of publication appearing at KI 2018: The 41st German Conference on Artificial Intelligence (http://dx.doi.org/10.1007/978-3-030-00111-7_26). Corrected typo in title.

    Most Important Fundamental Rule of Poker Strategy

    Poker is a large, complex game of imperfect information, which has been singled out as a major AI challenge problem. Recently there has been a series of breakthroughs culminating in agents that have successfully defeated the strongest human players in two-player no-limit Texas hold 'em. The strongest agents are based on algorithms for approximating Nash equilibrium strategies, which are stored in massive binary files and are unintelligible to humans. A recent line of research has explored approaches for extracting knowledge from strong game-theoretic strategies in a form that humans can understand. This would be useful when humans are the ultimate decision makers, allowing them to make better decisions from massive algorithmically generated strategies. Using techniques from machine learning, we have uncovered a new, simple, fundamental rule of poker strategy that leads to a significant improvement in performance over the best prior rule and can also easily be applied by human players.

    Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent

    Blackwell approachability is a framework for reasoning about repeated games with vector-valued payoffs. We introduce predictive Blackwell approachability, where an estimate of the next payoff vector is given, and the decision maker tries to achieve better performance based on the accuracy of that estimator. In order to derive algorithms that achieve predictive Blackwell approachability, we start by showing a powerful connection between four well-known algorithms. Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the most prevalent regret minimizers in online convex optimization. In spite of this prevalence, the regret matching (RM) and regret matching+ (RM+) algorithms have been preferred in the practice of solving large-scale games (as the local regret minimizers within the counterfactual regret minimization framework). We show that RM and RM+ are the algorithms that result from running FTRL and OMD, respectively, to select the halfspace to force at all times in the underlying Blackwell approachability game. By applying the predictive variants of FTRL or OMD to this connection, we obtain predictive Blackwell approachability algorithms, as well as predictive variants of RM and RM+. In experiments across 18 common zero-sum extensive-form benchmark games, we show that predictive RM+ coupled with counterfactual regret minimization converges vastly faster than the fastest prior algorithms (CFR+, DCFR, LCFR) across all games but two of the poker games and Liar's Dice, sometimes by two or more orders of magnitude.
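For readers unfamiliar with the two regret minimizers named in this abstract, the following is a minimal sketch of plain RM and RM+ in self-play on a small zero-sum matrix game. It is not the paper's predictive variant and not its code. The function names, the 2x2 payoff matrix `GAME`, the simultaneous updates, and the uniform averaging of strategies are all assumptions chosen for illustration.

```python
def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def solve_matrix_game(payoff, iterations=10000, plus=False):
    """Self-play with RM (plus=False) or RM+ (plus=True).

    payoff[i][j] is the row player's payoff; the column player receives
    its negation. Returns the uniformly averaged strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        x = regret_matching_strategy(row_regret)
        y = regret_matching_strategy(col_regret)

        # Expected utility of each pure action against the opponent's mix.
        row_util = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                    for i in range(n_rows)]
        col_util = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                    for j in range(n_cols)]
        row_value = sum(x[i] * row_util[i] for i in range(n_rows))
        col_value = sum(y[j] * col_util[j] for j in range(n_cols))

        for i in range(n_rows):
            row_regret[i] += row_util[i] - row_value
            if plus:  # RM+ clips cumulative regret at zero after each update.
                row_regret[i] = max(row_regret[i], 0.0)
            row_sum[i] += x[i]
        for j in range(n_cols):
            col_regret[j] += col_util[j] - col_value
            if plus:
                col_regret[j] = max(col_regret[j], 0.0)
            col_sum[j] += y[j]

    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


def exploitability(payoff, x, y):
    """Sum of both players' best-response gains; zero at a Nash equilibrium."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols))
                   for i in range(n_rows))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n_rows))
                   for j in range(n_cols))
    return best_row - best_col


# Hypothetical 2x2 zero-sum game used only for illustration.
GAME = [[2.0, -1.0], [-1.0, 1.0]]

if __name__ == "__main__":
    for use_plus in (False, True):
        x, y = solve_matrix_game(GAME, iterations=10000, plus=use_plus)
        print("RM+" if use_plus else "RM", x, y, exploitability(GAME, x, y))
```

The only difference between the two variants in this sketch is the clipping step. Practical CFR+ implementations typically also use alternating updates and weighted averaging, which are omitted here for brevity.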

    Developing Artificial Intelligence Agents for a Turn-Based Imperfect Information Game

    Artificial intelligence (AI) is often employed to play games, whether to entertain human opponents, devise and test strategies, or obtain other analytical data. Games with hidden information require specific approaches by the player. As a result, the AI must be equipped with methods of operating without certain important pieces of information while being aware of the resulting potential dangers. The computer game GNaT was designed as a testbed for AI strategies dealing specifically with imperfect information. Its development and functionality are described, and the results of testing several strategies through AI agents are discussed.