24 research outputs found

    Solving Large Extensive-Form Games with Strategy Constraints

    Extensive-form games are a common model for multiagent interactions with imperfect information. In two-player zero-sum games, the typical solution concept is a Nash equilibrium over the unconstrained strategy set for each player. In many situations, however, we would like to constrain the set of possible strategies. For example, constraints are a natural way to model limited resources, risk mitigation, safety, consistency with past observations of behavior, or other secondary objectives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; however, state-of-the-art algorithms for solving large games cannot handle general constraints. In this work we introduce a generalized form of Counterfactual Regret Minimization that provably finds optimal strategies under any feasible set of convex constraints. We demonstrate the effectiveness of our algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given only partial observations of private information. Comment: Appeared in AAAI 201
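    A minimal sketch of the regret-matching update at the core of Counterfactual Regret Minimization, with a toy probability-cap step standing in for projection onto a convex constraint set; this is illustrative only, not the paper's constrained CFR algorithm, and all action names and numbers are hypothetical:

```python
import numpy as np

def regret_matching(regrets):
    """Turn cumulative regrets into a strategy: positive part, normalized."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0.0:
        return positive / total
    return np.full_like(regrets, 1.0 / len(regrets))

def cap_and_redistribute(strategy, caps):
    """Toy 'constraint' step: clip each action probability at its cap and hand the
    clipped mass to the remaining actions in one pass. A crude stand-in for
    projecting the strategy onto a general convex constraint set."""
    capped = np.minimum(strategy, caps)
    excess = 1.0 - capped.sum()
    free = capped < caps
    capped[free] += excess * strategy[free] / strategy[free].sum()
    return capped

# Example: three actions (fold, call, raise) with a risk-mitigation cap on raising.
regrets = np.array([1.5, 3.0, 6.0])
caps = np.array([1.0, 1.0, 0.4])      # raise at most 40% of the time
strategy = cap_and_redistribute(regret_matching(regrets), caps)
print(strategy)                       # [0.2 0.4 0.4]
```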

    Building a poker playing agent based on game logs using supervised learning

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 201

    Building a no limit Texas hold'em poker agent based on game logs using supervised learning

    The development of competitive artificial Poker players is a challenge to Artificial Intelligence (AI) because the agent must deal with unreliable information and deception, which makes it essential to model the opponents in order to achieve good results. In this paper we propose the creation of an artificial Poker player through the analysis of past games between human players, with money involved. To accomplish this goal, we defined a classification problem that associates a given game state with the action that was performed by the player. To validate and test the defined player model, an agent that follows the learned tactic was created. The agent approximately follows the tactics of the human players, thus validating this model. However, this approach alone is insufficient to create a competitive agent, as the generated strategies are static, meaning that they cannot adapt to different situations. To solve this problem, we created an agent that uses a strategy that combines several tactics from different players. By using the combined strategy, the agent greatly improved its performance against adversaries capable of modeling their opponents.
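    As a rough illustration of the classification problem described above, the sketch below trains a scikit-learn classifier to map hand-crafted game-state features to the action a human player took in that state; the feature encoding, data, and model choice are assumptions for illustration, not the paper's actual representation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical game-state encoding: [hand_strength, pot_odds, position, raises_this_round]
X = np.array([
    [0.82, 0.30, 2, 0],
    [0.15, 0.25, 0, 1],
    [0.55, 0.40, 1, 2],
    [0.90, 0.20, 2, 1],
])
# Action the human player actually took in that state: 0 = fold, 1 = call, 2 = raise
y = np.array([2, 0, 1, 2])

# Learn a tactic: a mapping from game state to the action the player would take.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

new_state = np.array([[0.70, 0.35, 1, 0]])
print(model.predict(new_state))       # action the imitating agent would play
```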

    HoldemML: A framework to generate No Limit Hold'em Poker agents from human player strategies

    Developing computer programs that play Poker at human level is considered a challenge to the AI research community, due to the game's incomplete information and stochastic nature. Because of these characteristics, a competitive agent must manage luck and use opponent modeling to be successful in the short term and therefore be profitable. In this paper we propose the creation of No Limit Hold'em Poker agents that copy the strategies of the best human players, obtained by analyzing past games between them. To accomplish this goal, we first determine the best players in a set of game logs by identifying which ones have the highest winning expectation. Next, we define a classification problem to represent a player's strategy, associating each game state with the performed action. To validate and test the defined player model, the HoldemML framework was created. This framework generates agents by classifying the data present in the game logs, with the goal of copying the best human players' tactics. The created agents approximately follow the tactics of their human counterparts, thus validating the defined player model. However, this approach proved insufficient to create a competitive agent, since the generated strategies are static, which makes them easy prey for opponents that can perform opponent modeling. This issue can be addressed by combining multiple tactics from different players: the agent switches tactics from time to time, using a simple heuristic, in order to confuse the opponents' modeling mechanisms.
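    The tactic-switching idea can be sketched as follows; the class, the fixed hands-per-tactic schedule, and the random choice of the next tactic are illustrative assumptions, not the heuristic actually used by HoldemML:

```python
import random

class SwitchingStrategy:
    """Rotate among several learned tactics so opponents cannot settle on a model.

    `tactics` is a list of objects exposing a .predict(state) method,
    e.g. per-player classifiers trained from game logs."""

    def __init__(self, tactics, hands_per_tactic=50):
        self.tactics = tactics
        self.hands_per_tactic = hands_per_tactic
        self.hands_played = 0
        self.current = random.choice(tactics)

    def act(self, state):
        # Simple heuristic: pick a new tactic every `hands_per_tactic` hands.
        if self.hands_played and self.hands_played % self.hands_per_tactic == 0:
            self.current = random.choice(self.tactics)
        self.hands_played += 1
        return self.current.predict(state)
```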

    Rule based strategies for large extensive-form games: A specification language for No-Limit Texas Hold'em agents

    Poker is used to measure progress in extensive-form games research due to its unique characteristics: it is a game where playing agents have to deal with incomplete information, stochastic scenarios, and a large number of decision points. The development of Poker agents has seen significant advances in one-on-one matches, but there are still no consistent results in multiplayer games or in games against human experts. In order to allow experts to help improve the agents' performance, we have created a high-level strategy specification language. To support strategy definition, we have also developed an intuitive graphical tool. Additionally, we have created a strategy inference system based on a dynamically weighted Euclidean distance. This approach was validated through the creation of simple agents and by successfully inferring strategies from 10 human players. The created agents were able to beat previously developed mid-level agents by a good profit margin.
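    One possible reading of the distance-based inference step is sketched below: observed play statistics are matched to the closest known strategy prototype under a Euclidean distance whose per-feature weights are rescaled from the data; the features, prototypes, and weighting rule are illustrative assumptions, not the system described in the paper:

```python
import numpy as np

def weighted_euclidean(x, y, w):
    """Euclidean distance with per-dimension weights."""
    return np.sqrt(np.sum(w * (x - y) ** 2))

def infer_strategy(observed, prototypes, base_weights):
    """Match observed play statistics to the closest known rule-based strategy.
    Weights are scaled by how much each feature varies across prototypes,
    a simple stand-in for dynamic weighting."""
    spread = np.var(list(prototypes.values()), axis=0) + 1e-9
    weights = base_weights * spread
    return min(prototypes,
               key=lambda name: weighted_euclidean(observed, prototypes[name], weights))

# Hypothetical play statistics: [aggression, fraction of hands played, bluff frequency]
prototypes = {
    "tight-passive":    np.array([0.2, 0.15, 0.05]),
    "loose-aggressive": np.array([0.8, 0.60, 0.30]),
}
observed = np.array([0.7, 0.5, 0.2])
print(infer_strategy(observed, prototypes, np.ones(3)))   # "loose-aggressive"
```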

    Poker Learner: Reinforcement Learning Applied to Texas Hold'em Poker

    Bibliography: pp. 61-66. Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 201

    Using a high-level language to build a poker playing agent

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 200

    Opponent Modelling in Multi-Agent Systems

    Reinforcement Learning (RL) formalises a problem where an intelligent agent needs to learn and achieve certain goals by maximising a long-term return in an environment. Multi-agent reinforcement learning (MARL) extends traditional RL to multiple agents. Many RL algorithms lose their convergence guarantees in non-stationary environments, which arise because of adaptive opponents. Partial observation, caused by agents' different private observations, introduces high variance during training, which exacerbates data inefficiency. In MARL, training an agent to perform well against one set of opponents often leads to bad performance against another set of opponents. Non-stationarity, partial observation and unclear learning objectives are three critical problems in MARL which hinder agents' learning, and they all share a common cause: the lack of knowledge about the other agents. Therefore, in this thesis, we propose to solve these problems with opponent modelling methods. We tailor our solutions by combining opponent modelling with other techniques according to the characteristics of the problems we face. Specifically, we first propose ROMMEO, an algorithm inspired by Bayesian inference, as a solution to alleviate non-stationarity in cooperative games. Then we study the partial observation problem caused by agents' private observations and design an implicit communication training method named PBL. Lastly, we investigate solutions to the non-stationarity and unclear learning objective problems in zero-sum games, and propose EPSOM, which aims to find safe exploitation strategies for playing against non-stationary opponents. We verify the proposed methods through varied experiments and show that they achieve the desired performance. Limitations and future work are discussed in the last chapter of this thesis.
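    As a generic illustration of opponent modelling (not ROMMEO, PBL, or EPSOM, whose details are not given in the abstract), the sketch below keeps smoothed frequency counts of an opponent's observed actions and best-responds to the estimated distribution:

```python
from collections import Counter

class FrequencyOpponentModel:
    """Estimate the opponent's action distribution from observed play and pick a
    best response against that estimate (illustrative only)."""

    def __init__(self, actions):
        self.actions = actions
        self.counts = Counter({a: 1 for a in actions})   # Laplace-smoothed counts

    def observe(self, opponent_action):
        self.counts[opponent_action] += 1

    def predict(self):
        total = sum(self.counts.values())
        return {a: self.counts[a] / total for a in self.actions}

    def best_response(self, payoff):
        """payoff[mine][theirs] is my utility; return the action maximising
        expected utility under the current belief about the opponent."""
        belief = self.predict()
        return max(payoff, key=lambda mine: sum(belief[theirs] * payoff[mine][theirs]
                                                for theirs in self.actions))

# Example: rock-paper-scissors payoff table against an observed opponent.
payoff = {
    "rock":     {"rock": 0, "paper": -1, "scissors": 1},
    "paper":    {"rock": 1, "paper": 0, "scissors": -1},
    "scissors": {"rock": -1, "paper": 1, "scissors": 0},
}
model = FrequencyOpponentModel(["rock", "paper", "scissors"])
for seen in ["rock", "rock", "paper", "rock"]:
    model.observe(seen)
print(model.best_response(payoff))    # "paper", since the opponent favours rock
```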