Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

Authors: Heinrich Johannes; Silver David
Publication date: 03/03/2016

Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise. Comment: updated version, incorporating conference feedback

arXiv.org e-Print Archive

UCL Discovery

Traditional Wisdom and Monte Carlo Tree Search Face-to-Face in the Card Game Scopone

Authors: Di Palma Stefano; Lanzi Pier Luca
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

We present the design of a competitive artificial intelligence for Scopone, a popular Italian card game. We compare rule-based players using the most established strategies (one for beginners and two for advanced players) against players using Monte Carlo Tree Search (MCTS) and Information Set Monte Carlo Tree Search (ISMCTS) with different reward functions and simulation strategies. MCTS requires complete information about the game state and thus implements a cheating player, while ISMCTS can deal with incomplete information and thus implements a fair player. Our results show that, as expected, the cheating MCTS outperforms all the other strategies; ISMCTS is stronger than all the rule-based players implementing well-known and most advanced strategies, and it also turns out to be a challenging opponent for human players. Comment: Preprint. Accepted for publication in the IEEE Transactions on Games

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Solving Games with Functional Regret Estimation

Authors: Bagnell J. Andrew; Bowling Michael; Morrill Dustin; Waugh Kevin
Publication date: 31/12/2014

We propose a novel online learning method for minimizing regret in large extensive-form games. The approach learns a function approximator online to estimate the regret for choosing a particular action. A no-regret algorithm uses these estimates in place of the true regrets to define a sequence of policies. We prove the approach sound by providing a bound relating the quality of the function approximation and regret of the algorithm. A corollary is that the method is guaranteed to converge to a Nash equilibrium in self-play so long as the regrets are ultimately realizable by the function approximator. Our technique can be understood as a principled generalization of existing work on abstraction in large games; in our work, both the abstraction as well as the equilibrium are learned during self-play. We demonstrate empirically that the method achieves higher quality strategies than state-of-the-art abstraction techniques given the same resources. Comment: AAAI Conference on Artificial Intelligence 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

What differentiates professional poker players from recreational poker players? A qualitative interview study

Authors: Griffiths MD; McCormack A
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2012

The popularity of poker (and in particular online poker) has grown worldwide in recent years. Some of the factors that may explain this increased popularity include: (i) an increasing number of celebrities endorsing and playing poker, (ii) an increased number of televised poker tournaments, (iii) 24/7 access to poker via the internet, and (iv) the low stakes needed to play online poker. This increase in the popularity of poker has led to the increased incidence of the ‘professional poker player’. However, very little empirical research has been carried out into this relatively new group of gamblers. This research comprised a grounded theory study involving the analysis of data from three professional poker players, one semi-professional poker player and five recreational poker players. Results showed that all players believed that poker was a game of skill. The central theme as to what distinguishes professional poker players from recreational players was that professional poker players were much more disciplined in their gambling behaviour. They treated their poker playing as work, and as such were more likely to be logical and controlled in their behaviour, took fewer risks, and were less likely to chase losses. Recreational players were more likely to engage in chasing behaviour, showed signs of lack of control, took more risks, and engaged in gambling while under the influence of alcohol or drugs. Also of importance was the number of games and time spent playing online. Recreational players only played one or two games at a time, whereas professional poker players were much more likely to engage in multitable poker online, and played longer sessions, thus increasing the potential amount of winnings. Playing poker for a living is very possible for a minority of players but it takes a combination of talent, dedication, patience, discipline and disposition to succeed.

Nottingham Trent Institutional Repository (IRep)