    Fast Approximate Max-n Monte Carlo Tree Search for Ms Pac-Man

    We present an application of Monte Carlo tree search (MCTS) to the game of Ms Pac-Man. Unlike most applications of MCTS to date, Ms Pac-Man requires almost real-time decision making and has no natural end state. We approach the problem by performing Monte Carlo tree searches on a five-player max-n tree representation of the game with limited tree search depth. We performed a number of experiments using both the MCTS game agents (for Pac-Man and the ghosts) and agents from previous work (for the ghosts). Our approach achieves excellent scores, outperforming previous non-MCTS approaches to the game by up to two orders of magnitude. © 2011 IEEE
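
    The max-n formulation above generalises minimax to several self-interested players: each node stores a reward vector, and each player maximises only its own component. Below is a minimal sketch of the selection step under this scheme, assuming five players (Pac-Man plus four ghosts); the class and field names are illustrative, not the authors' implementation.

```python
import math
import random

class MaxNNode:
    def __init__(self, player, parent=None):
        self.player = player       # player to move here (0 = Pac-Man, 1-4 = ghosts)
        self.parent = parent
        self.children = []
        self.visits = 0
        self.rewards = [0.0] * 5   # one cumulative reward per player

    def ucb1(self, c=1.4):
        # The player who moved at the parent judges this child by its
        # own component of the reward vector (the max-n rule).
        mean = self.rewards[self.parent.player] / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits) / self.visits)

def select(node):
    # Descend until a leaf, always taking the child that is best for
    # whichever player moves at the current node.
    while node.children:
        unvisited = [ch for ch in node.children if ch.visits == 0]
        if unvisited:
            return random.choice(unvisited)
        node = max(node.children, key=lambda ch: ch.ucb1())
    return node
```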

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
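
    The core algorithm the survey describes iterates four phases: selection (descend the tree via a bandit rule such as UCB1), expansion (add one node), simulation (random playout), and backpropagation (update statistics along the path). A minimal self-contained sketch follows; the Game interface (moves, apply, is_terminal, reward) is an assumption for illustration, and reward handling for alternating players is deliberately simplified.

```python
import math
import random

class Node:
    def __init__(self, game, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.wins = [], 0, 0.0
        self.untried = list(game.moves(state))  # moves not yet expanded

    def add_child(self, game, move):
        child = Node(game, game.apply(self.state, move), self, move)
        self.untried.remove(move)
        self.children.append(child)
        return child

def uct_search(game, root_state, iterations=1000, c=1.4):
    root = Node(game, root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: follow UCB1 while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch:
                       ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: add one child for a randomly chosen untried move.
        if node.untried:
            node = node.add_child(game, random.choice(node.untried))
        # 3. Simulation: play out randomly to a terminal state.
        state = node.state
        while not game.is_terminal(state):
            state = game.apply(state, random.choice(game.moves(state)))
        reward = game.reward(state)
        # 4. Backpropagation: update statistics along the selected path.
        while node is not None:
            node.visits += 1
            node.wins += reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```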

    Development of Rehabilitation System (RehabGame) through Monte-Carlo Tree Search Algorithm using Kinect and Myo Sensor Interface

    Artificial and computational intelligence in computer games plays an important role and can simulate various aspects of real-life problems. Developing artificial intelligence techniques for real-time decision-making games provides a platform for examining tree search algorithms. In this paper, we present a rehabilitation system, RehabGame, that uses the Monte-Carlo Tree Search algorithm. The objective of the game is to combat the physical impairment of stroke/brain injury casualties in order to improve upper limb movement. During a real-time rehabilitation game, the player decides on paths that her/his upper limb could take to reach virtual goal objects. The system can adjust the difficulty level to the player's ability by learning from the movements made and generating subsequent objects accordingly. The game collects orientation, muscle, and joint activity data and uses them to make decisions on game progression. Limb movements are stored in the search tree, which is used to determine the location of new virtual target fruit objects by accessing data saved in the background from different game plays. The system monitors muscle strain and stress through the Myo armband sensor and provides the next step required for rehabilitation. Results from two samples show the effectiveness of the Monte-Carlo Tree Search in RehabGame: it builds coherent hand motion, progressing from highly achievable paths to less achievable ones, thus configuring and personalizing the rehabilitation process.
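
    The abstract's idea of placing the next target by consulting statistics saved from earlier game plays can be illustrated with a bandit-style rule that prefers highly achievable paths first and only gradually pushes toward less achievable ones. This is a purely illustrative sketch; the node fields and the achievability score are assumptions, not RehabGame's code.

```python
import math

def next_target(candidates, total_trials, c=1.0):
    # candidates: recorded reach paths with attempt count `n` and
    # success count `wins` from earlier plays; a higher success rate
    # means a more achievable path for this player.
    def score(node):
        if node.n == 0:
            return float("inf")  # untried paths are eventually explored
        achievability = node.wins / node.n
        # Exploitation of achievable paths plus slow exploration
        # pressure toward paths the player has rarely attempted.
        return achievability + c * math.sqrt(math.log(total_trials) / node.n)
    return max(candidates, key=score)
```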

    Pac-Man Conquers Academia: Two Decades of Research Using a Classic Arcade Game

    Monte-Carlo Tree Search Algorithm in Pac-Man: Identification of commonalities in 2D video games for realisation in AI (Artificial Intelligence)

    This research is dedicated to game strategy, using the Monte-Carlo Tree Search algorithm for a Pac-Man agent. Two main strategies for Pac-Man's behaviour were researched in depth: Next Level priority and HS (Highest Score priority). The game, best known as STPacman, is a 2D maze game that lets users play with artificial intelligence and smart features such as panic buttons, which players can toggle at will; when activated, Pac-Man is controlled by the artificial intelligence. A variety of experiments were conducted to compare results and determine the efficiency of each strategy. Considerable research also went into identifying a variety of 2D games (Chess, Checkers, Go, etc.) with functionality similar to that of Pac-Man. The main idea was to see how effectively such 2D games could be implemented in the program (classes/methods) and how well the artificial intelligence developed for STPacman would perform across a variety of different 2D games. Substantial time was also dedicated to researching an AI engine able to develop any 2D game from user-submitted requirements via a spreadsheet feature (chapter 3, topic 3.3.1 shows an example), containing nearly everything to do with 2D games, such as the parameters (the API, classes, methods, text descriptions, and more). The spreadsheet feature acts as a tool that scans and examines all of the user's submitted requirements and gives a rough time estimate for developing the chosen 2D game. It has considerable smart functionality: if the game is not unique, like chess or checkers, it automatically recognizes this and alerts the user.

    A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks

    In the domain of strategic game playing, the use of statistical techniques such as the Upper Confidence for Trees (UCT) algorithm has become the norm, as they offer many benefits over classical algorithms. These benefits include requiring no game-specific strategic knowledge and time-scalable performance. UCT does not incorporate any strategic information specific to the game considered, but instead uses repeated sampling to effectively brute-force search through the game tree or search space. The lack of game-specific knowledge in UCT is thus both a benefit and a strategic disadvantage. Pattern recognition techniques, specifically Neural Networks (NNs), were identified as a means of addressing the lack of game-specific knowledge in UCT. Through a novel hybridisation technique which combines UCT and trained NNs for pruning, the UCT-NN algorithm was derived. The NN component of UCT-NN was trained using a UCT self-play scheme to generate game-specific knowledge without the need to construct and manage game databases for training purposes. The UCT-NN algorithm is outlined for pruning in the game of Go-Moku as a candidate case study for this research. The UCT-NN algorithm contains three major parameters, which emerged from the UCT algorithm, the use of NNs, and the pruning schemes considered. Suitable methods for finding candidate values for these three parameters were outlined and applied to the game of Go-Moku on a 5 by 5 board. An empirical investigation of the playing performance of UCT-NN was conducted in comparison to UCT through three benchmarks: a common randomly moving opponent, a common UCT-max player which is given a large amount of playing time, and a pair-wise tournament between UCT-NN and UCT. The results of the performance evaluation for 5 by 5 Go-Moku were promising, which prompted an evaluation on a larger 9 by 9 Go-Moku board. The results of both evaluations indicate that the time allocated to the UCT-NN algorithm directly affects its performance when compared to UCT. The UCT-NN algorithm generally performs better than UCT in games with very limited time constraints in all benchmarks considered, except when playing against a randomly moving player in 9 by 9 Go-Moku. In real-time and near-real-time Go-Moku games, UCT-NN provides statistically significant improvements compared to UCT. The findings of this research contribute to the realisation of applying game-specific knowledge to the UCT algorithm.
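
    The pruning idea behind UCT-NN can be sketched simply: a network trained via UCT self-play scores the candidate moves at a node, and only the most promising survive into the tree, shrinking the branching factor before sampling begins. This is a minimal sketch under stated assumptions; `policy_net.score` and the cutoff `k` are illustrative, not the thesis's exact scheme.

```python
def pruned_moves(policy_net, state, moves, k=5):
    # Score every legal move with the trained network's game-specific
    # knowledge (higher = more promising, in this assumed interface).
    scored = [(policy_net.score(state, move), move) for move in moves]
    scored.sort(key=lambda sm: sm[0], reverse=True)
    # Keep only the k most promising moves; UCT then samples as usual
    # over the reduced set, spending its time budget on fewer branches.
    return [move for _, move in scored[:k]]
```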

    Optimising Agent Behaviours and Game Parameters to Meet Designer’s Objectives

    The game industry is one of the biggest economic sectors in the entertainment business, and its products rely heavily on the quality of interactivity to stay relevant. Non-Player Characters (NPCs) are the main mechanic used for this purpose, and they have to be optimised for their designated behaviour. The development process iteratively circulates results among game designers, game AI developers, and game testers. Automatic optimisation of NPCs to the designer's objective increases the speed of each iteration and reduces overall production time. Previous attempts used entropy-based evaluation metrics whose terms are difficult to translate to the game being optimised, and a slight misinterpretation often leads to incorrect measurements. This thesis proposes an alternative method which evaluates generated game data against reference results from the testers. The thesis first presents a reliable way to extract information for NPC classification called the Relative Region Feature (RRF). RRF provides an excellent data compression method, a way to classify effectively, and a way to optimise objective-oriented adaptive NPCs. The formalised optimisation is also shown to work for classifying player skill against reference hall-of-fame scores. The demonstrations are done on the online competition version of Ms Pac-Man. The games generated from participating entries provide challenging optimisation problems for various evolutionary optimisers. The thesis develops modified versions of CMA-ES and PSO to tackle these problems effectively, and it demonstrates the adaptivity of an MCTS NPC which uses the evaluation method. This NPC performs reasonably well given adequate resources, and no reference NPC is required.
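
    Of the optimisers mentioned, PSO admits a particularly compact illustration of how NPC or game parameters can be tuned against a designer objective. The sketch below is generic particle swarm optimisation, not the thesis's modified variant; the objective (lower is better) and the [-1, 1] initial box are assumptions.

```python
import random

def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    # Random initial positions and zero velocities in an assumed [-1, 1] box.
    pos = [[random.uniform(-1.0, 1.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # each particle's best so far
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm's best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Inertia plus pulls toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest
```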

    A subsumption architecture with predictive adaptive capabilities for Pacman

    Master's dissertation, Informatics Engineering, Faculdade de Ciências e Tecnologia, Univ. do Algarve, 2012. Computer games are a very important domain of study in computational intelligence. Their importance stems from the properties of their environments: multi-agent, competitive, stochastic, and dynamic, with success or failure being easy to verify. Moreover, games and digital entertainment in general are an expanding industry that generates considerable business volume. The goal of this work is to develop an agent to control the famous Pac-Man, capable of taking part in one of the most popular competitions, organized by the IEEE conference on computational intelligence and games. The competition is won by whoever achieves the highest average score over 3 runs per ghost team; the ghost teams' objective is to lower those scores. Pac-Man is difficult because it provides a stochastic, dynamic, partially observable environment, is a predator/prey game with 4 predators, and takes place inside a maze, which constrains movement. The approach proposed in this work extends Brooks's reactive architecture, the so-called subsumption architecture, with adaptive and predictive capabilities. The resulting agent should be able to predict the ghosts' movements over a time horizon into the future, based on a model updated with information gathered in the past, and use those predictions to decide what to do next.
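
    The extended subsumption control described above can be sketched as a stack of prioritised behaviours in which a predictive layer, driven by a ghost-movement model, overrides lower layers when danger is foreseen. Everything here is illustrative under stated assumptions (the `ghost_model` interface in particular); it is not the dissertation's implementation.

```python
class Behaviour:
    """One layer of the subsumption stack."""
    def applies(self, percept):
        raise NotImplementedError
    def action(self, percept):
        raise NotImplementedError

class AvoidPredictedGhosts(Behaviour):
    def __init__(self, ghost_model, horizon=5):
        self.model, self.horizon = ghost_model, horizon
    def applies(self, percept):
        # Fire when the model, updated from past observations, predicts
        # a ghost on Pac-Man's path within the lookahead horizon.
        return self.model.collision_predicted(percept, self.horizon)
    def action(self, percept):
        return self.model.safest_move(percept)

def subsumption_step(layers, percept):
    # Layers are ordered from highest to lowest priority; the first
    # layer whose trigger condition holds subsumes all those below it.
    for layer in layers:
        if layer.applies(percept):
            return layer.action(percept)
    return None  # no behaviour fired this tick
```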

    Monte-Carlo tree search enhancements for one-player and two-player domains

    Learning to Search in Reinforcement Learning

    In this thesis, we investigate the use of search-based algorithms with deep neural networks to tackle a wide range of problems, from board games to video games and beyond. Drawing inspiration from AlphaGo, the first computer program to achieve superhuman performance in the game of Go, we developed a new algorithm, AlphaZero. AlphaZero is a general reinforcement learning algorithm that combines deep neural networks with Monte Carlo tree search for planning and learning. Starting completely from scratch, without any prior human knowledge beyond the basic rules of the game, AlphaZero achieved superhuman performance in Go, chess, and shogi. Building upon the success of AlphaZero, we investigated ways to extend our methods to problems whose rules are not known or cannot be hand-coded. This line of work led to the development of MuZero, a model-based reinforcement learning agent that builds a deterministic internal model of the world and uses it to construct plans in its imagination. We applied our method to Go, chess, shogi, and the classic Atari suite of video games, achieving superhuman performance. MuZero is the first RL algorithm to master a variety of both canonical challenges for high-performance planning and visually complex problems using the same principles. Finally, we describe Stochastic MuZero, a general agent that extends the applicability of MuZero to highly stochastic environments. We show that our method achieves superhuman performance in stochastic domains such as backgammon and the classic game of 2048, while matching the performance of MuZero in deterministic ones like Go.
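
    A piece of machinery shared by AlphaZero, MuZero, and Stochastic MuZero is the PUCT selection rule, which blends the network's move priors with search statistics when descending the tree. The sketch below follows the published formula; the edge fields (prior P, visit count N, mean value Q) are assumptions about representation, not these systems' code.

```python
import math

def puct_select(edges, c_puct=1.5):
    # edges: candidate moves from one node, each carrying a network
    # prior P, a visit count N, and a mean action value Q.
    total_n = sum(edge.N for edge in edges)
    def score(edge):
        # Q exploits what the search has measured; the prior-weighted
        # term explores moves the network believes in but the search
        # has not yet tried much.
        u = c_puct * edge.P * math.sqrt(total_n) / (1 + edge.N)
        return edge.Q + u
    return max(edges, key=score)
```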