The Hanabi Challenge: A New Frontier for AI Research

Authors: Bard Nolan; Bellemare Marc G.; Bowling Michael; Burch Neil; Chandar Sarath; Dumoulin Vincent; Dunning Iain; Foerster Jakob N.; Hughes Edward; Lanctot Marc; Larochelle Hugo; Moitra Subhodeep; Mourad Shibl; Parisotto Emilio; Song H. Francis
Publication venue: 'Elsevier BV'
Publication date: 06/12/2019

From the early days of computing, games have been important testbeds for studying how well machines can do sophisticated decision making. In recent years, machine learning has made dramatic advances with artificial agents reaching superhuman performance in challenge domains like Go, Atari, and some variants of poker. As with their predecessors of chess, checkers, and backgammon, these game domains have driven research by providing sophisticated yet well-defined challenges for artificial intelligence practitioners. We continue this tradition by proposing the game of Hanabi as a new challenge domain with novel problems that arise from its combination of purely cooperative gameplay with two to five players and imperfect information. In particular, we argue that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground. We believe developing novel techniques for such theory of mind reasoning will not only be crucial for success in Hanabi, but also in broader collaborative efforts, especially those with human partners. To facilitate future research, we introduce the open-source Hanabi Learning Environment, propose an experimental framework for the research community to evaluate algorithmic advances, and assess the performance of current state-of-the-art techniques.

Comment: 32 pages, 5 figures, In Press (Artificial Intelligence)

arXiv.org e-Print Archive

PolyPublie

ViZDoom Competitions: Playing Doom from Pixels

Authors: Jaśkowski Wojciech; Kempka Michał; Wydmuch Marek
Publication date: 10/09/2018

This paper presents the first two editions of the Visual Doom AI Competition, held in 2016 and 2017. The challenge was to create bots that compete in a multi-player deathmatch in a first-person shooter (FPS) game, Doom. The bots had to make their decisions based solely on visual information, i.e., a raw screen buffer. To play well, the bots needed to understand their surroundings, navigate, explore, and handle the opponents at the same time. These aspects, together with the competitive multi-agent aspect of the game, make the competition a unique platform for evaluating state-of-the-art reinforcement learning algorithms. The paper discusses the rules, solutions, results, and statistics that give insight into the agents' behaviors. The best-performing agents are described in more detail. The results of the competition lead to the conclusion that, although reinforcement learning can produce capable Doom bots, they are not yet able to compete successfully against humans in this game. The paper also revisits the ViZDoom environment, a flexible, easy-to-use, and efficient 3D platform for research on vision-based reinforcement learning, based on the well-recognized first-person perspective game Doom.

arXiv.org e-Print Archive

Poker Learner: Reinforcement Learning Applied to Texas Hold'em Poker

Author: Passos Nuno Miguel da Silva
Publication date: 01/01/2011

Bibliography: p. 61-66. Integrated Master's thesis. Informatics and Computing Engineering. Universidade do Porto. Faculdade de Engenharia. 201

Repositório Aberto da Universidade do Porto

Opponent Modelling in Multi-Agent Systems

Author: Tian Zheng
Publication venue: UCL (University College London)
Publication date: 28/11/2021

Reinforcement Learning (RL) formalises a problem in which an intelligent agent needs to learn to achieve certain goals by maximising a long-term return in an environment. Multi-agent reinforcement learning (MARL) extends traditional RL to multiple agents. Many RL algorithms lose their convergence guarantees in non-stationary environments because the opponents adapt. Partial observation, caused by agents’ different private observations, introduces high variance during training, which exacerbates data inefficiency. In MARL, training an agent to perform well against one set of opponents often leads to poor performance against another set. Non-stationarity, partial observation and an unclear learning objective are three critical problems in MARL that hinder agents’ learning, and they share a common cause: the lack of knowledge of the other agents. Therefore, in this thesis, we propose to solve these problems with opponent modelling methods. We tailor our solutions by combining opponent modelling with other techniques according to the characteristics of the problems we face. Specifically, we first propose ROMMEO, an algorithm inspired by Bayesian inference, as a solution to alleviate non-stationarity in cooperative games. Then we study the partial observation problem caused by agents’ private observations and design an implicit communication training method named PBL. Lastly, we investigate solutions to the non-stationarity and unclear learning objective problems in zero-sum games. We propose a solution named EPSOM, which aims to find safe exploitation strategies for playing against non-stationary opponents. We verify our proposed methods with a variety of experiments and show that they can achieve the desired performance. Limitations and future work are discussed in the last chapter of this thesis.

UCL Discovery

Benelearn 2005: Annual Machine Learning Conference of Belgium and the Netherlands: CTIT Proceedings of the 14th Annual Machine Learning Conference of Belgium and the Netherlands

Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 01/02/2005

University of Twente Research Information