Swarm intelligence in cooperative environments: n-step dynamic tree search algorithm overview
Reinforcement learning tree-based planning methods have been gaining popularity in recent years due to their success in single-agent domains where a perfect simulator model is available, for example the strategic board games Go and chess. This paper aims to extend tree search algorithms to the multi-agent setting in a decentralized structure, addressing scalability issues and the exponential growth of computational resources. The n-step dynamic tree search combines forward planning with direct temporal-difference updates, markedly outperforming conventional tabular algorithms such as Q-learning and state-action-reward-state-action (SARSA). Future state transitions and rewards are predicted with a model built and learned from real interactions between the agents and the environment. This paper analyzes the developed algorithm in the hunter–pursuit cooperative game against stochastic and intelligent evaders. The n-step dynamic tree search aims to adapt single-agent tree search learning methods to the multi-agent setting and is demonstrated to be a remarkable advance over conventional temporal-difference techniques.
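The combination described in the abstract (forward planning over a learned model plus direct temporal-difference updates) can be illustrated with a minimal sketch. All class and function names below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: n-step lookahead over a learned tabular model,
# blended into a standard TD update (not the authors' code).
from collections import defaultdict

class LearnedModel:
    """Tabular model estimated from real agent-environment interactions."""
    def __init__(self):
        self.transitions = {}  # (state, action) -> (next_state, reward)

    def update(self, state, action, next_state, reward):
        self.transitions[(state, action)] = (next_state, reward)

    def predict(self, state, action):
        return self.transitions.get((state, action))

def n_step_lookahead_target(model, Q, state, action, n, gamma, actions):
    """Roll the learned model forward up to n steps (greedy w.r.t. Q)
    and return the discounted n-step return, bootstrapping on Q."""
    total, discount = 0.0, 1.0
    s, a = state, action
    for _ in range(n):
        pred = model.predict(s, a)
        if pred is None:  # unmodelled transition: bootstrap early
            return total + discount * max(Q[(s, b)] for b in actions)
        s_next, r = pred
        total += discount * r
        discount *= gamma
        s = s_next
        a = max(actions, key=lambda b: Q[(s, b)])  # greedy action in the model
    return total + discount * max(Q[(s, b)] for b in actions)

# Usage: plan with the model, then apply a TD-style update to Q.
Q = defaultdict(float)
model = LearnedModel()
alpha, gamma, actions = 0.1, 0.95, ["up", "down", "left", "right"]
s, a, s_next, r = "s0", "up", "s1", 1.0
model.update(s, a, s_next, r)
target = n_step_lookahead_target(model, Q, s, a, n=3, gamma=gamma, actions=actions)
Q[(s, a)] += alpha * (target - Q[(s, a)])
```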
Swarm intelligence in cooperative environments: N-step dynamic tree search algorithm extended analysis
Reinforcement learning tree-based planning methods have been gaining popularity in recent years due to their success in single-agent domains where a perfect simulator model is available, e.g., the strategic board games Go and chess. This paper aims to extend tree search algorithms to the multi-agent setting in a decentralized structure, addressing scalability issues and the exponential growth of computational resources. The N-Step Dynamic Tree Search combines forward planning and direct temporal-difference updates, markedly outperforming state-of-the-art algorithms such as Q-Learning and SARSA. Future state transitions and rewards are predicted with a model built and learned from real interactions between the agents and the environment. As an extension of previous work, this paper analyses the developed algorithm in the Hunter-Pursuit cooperative game against intelligent evaders. The N-Step Dynamic Tree Search aims to adapt the most successful single-agent learning methods to the multi-agent setting and is demonstrated to be a remarkable advance compared to conventional temporal-difference techniques.
Engineering and Physical Sciences Research Council (EPSRC): 2454254. BAE Systems.
Competitive multi-agent search
While evolutionary computation is well suited for automatic discovery in engineering, it can also be used to gain insight into how humans and organizations could perform more effectively. Using a real-world problem of innovation search in organizations as the motivating example, this dissertation formalizes human creative problem solving as competitive multi-agent search. It differs from existing single-agent and team-search problems in that the agents interact through knowledge of other agents' searches and through the dynamic changes in the search landscape caused by these searches. The main hypothesis is that evolutionary computation can be used to discover effective strategies for competitive multi-agent search. This hypothesis is verified in experiments using an abstract domain based on the NK model, i.e., partially correlated and tunably rugged fitness landscapes, and a concrete domain in the form of a social innovation game. In both domains, different specialized strategies are evolved for each competitive environment, as well as strategies that generalize across environments. Strategies evolved in the abstract domain are more effective and more complex than hand-designed strategies and one based on traditional tree search. Using a novel spherical visualization of the fitness landscapes of the abstract domain, insight is gained about how successful strategies work, e.g., by tracking positive changes in the landscape. In the concrete game domain, human players were modeled using backpropagation and used as opponents to create environments for evolution. Evolved strategies scored significantly higher than the human models by using a different proportion of actions, providing insights into how performance could be improved in social innovation domains. The work thus provides a possible framework for studying various human creative activities as competitive multi-agent search in the future.
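The NK model referenced in the abstract is a standard construction for tunably rugged fitness landscapes. The following is a minimal sketch of one, assuming illustrative parameter names and interfaces; it is not the dissertation's code.

```python
# Hypothetical sketch of an NK fitness landscape: N binary loci, each locus's
# contribution depends on itself plus K other loci; higher K means more rugged.
import random

def make_nk_landscape(N, K, seed=0):
    rng = random.Random(seed)
    # Each locus i depends on itself plus K other randomly chosen loci.
    neighbours = [
        [i] + rng.sample([j for j in range(N) if j != i], K) for i in range(N)
    ]
    # Contribution tables: one random value per (locus, local configuration).
    tables = [
        {cfg: rng.random() for cfg in range(2 ** (K + 1))} for _ in range(N)
    ]

    def fitness(genome):
        """genome: tuple of N bits; fitness is the mean locus contribution."""
        total = 0.0
        for i in range(N):
            cfg = 0
            for bit_index, locus in enumerate(neighbours[i]):
                cfg |= genome[locus] << bit_index
            total += tables[i][cfg]
        return total / N

    return fitness

# Usage: evaluate a random genome on a moderately rugged landscape.
f = make_nk_landscape(N=10, K=3)
genome = tuple(random.Random(1).randint(0, 1) for _ in range(10))
print(f(genome))
```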
Self-adaptive MCTS for General Video Game Playing
Monte-Carlo Tree Search (MCTS) has shown particular success in General Game Playing (GGP) and General Video Game Playing (GVGP), and many enhancements and variants have been developed. Recently, an on-line adaptive parameter tuning mechanism for MCTS agents has been proposed that almost achieves the same performance as off-line tuning in GGP. In this paper we apply the same approach to GVGP and use the popular General Video Game AI (GVGAI) framework, in which the time allowed to make a decision is only 40 ms. We design three Self-Adaptive MCTS (SA-MCTS) agents that optimize on-line the parameters of a standard non-self-adaptive MCTS agent of GVGAI. The three agents select the parameter values using Naïve Monte-Carlo, an evolutionary algorithm and an N-tuple bandit evolutionary algorithm, respectively, and are tested on 20 single-player games of GVGAI. The SA-MCTS agents achieve more robust results on the tested games. With the same time setting, they perform similarly to the baseline standard MCTS agent in the games for which the baseline agent performs well, and significantly improve the win rate in the games for which the baseline agent performs poorly. As validation, we also test the performance of non-self-adaptive MCTS instances that use the most sampled parameter settings during the on-line tuning of each of the three SA-MCTS agents for each game. Results show that these parameter settings improve the win rate on the games Wait for Breakfast and Escape by 4 times and 150 times, respectively.
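A minimal sketch of on-line, per-dimension parameter selection in the spirit of Naïve Monte-Carlo follows. The parameter names, candidate values, epsilon schedule and the run_mcts placeholder are assumptions, not the GVGAI agents' actual implementation.

```python
# Hypothetical sketch: each decision step, sample one parameter combination,
# run the search with it, and credit the observed reward back to each
# parameter value independently (per-dimension statistics).
import random

class NaiveMonteCarloTuner:
    def __init__(self, search_space, epsilon=0.5):
        self.space = search_space            # dict: param name -> candidate values
        self.epsilon = epsilon
        self.stats = {p: {v: [0.0, 0] for v in vals}  # value -> [sum reward, count]
                      for p, vals in search_space.items()}

    def suggest(self):
        """Pick a value per dimension: explore uniformly or exploit the best mean."""
        combo = {}
        for p, vals in self.space.items():
            if random.random() < self.epsilon:
                combo[p] = random.choice(vals)
            else:
                combo[p] = max(vals, key=lambda v: self._mean(p, v))
        return combo

    def update(self, combo, reward):
        """Credit the reward to every dimension's chosen value."""
        for p, v in combo.items():
            s = self.stats[p][v]
            s[0] += reward
            s[1] += 1

    def _mean(self, p, v):
        total, count = self.stats[p][v]
        return total / count if count else float("inf")  # try unseen values first

# Usage inside the 40 ms decision loop; run_mcts is a stand-in for the agent.
tuner = NaiveMonteCarloTuner({"C": [0.7, 1.0, 1.4], "rollout_depth": [5, 10, 15]})
for _ in range(100):
    params = tuner.suggest()
    reward = random.random()   # placeholder for the value returned by run_mcts(params)
    tuner.update(params, reward)
```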
The effect of simulation bias on action selection in Monte Carlo Tree Search
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in fulfilment of the requirements for the degree of Master of Science. August 2016.
Monte Carlo Tree Search (MCTS) is a family of directed search algorithms that has gained widespread attention in recent years. It combines a traditional tree-search approach with Monte Carlo simulations, using the outcome of these simulations (also known as playouts or rollouts) to evaluate states in a look-ahead tree. That MCTS does not require an evaluation function makes it particularly well-suited to the game of Go (seen by many as chess's successor as a grand challenge of artificial intelligence), with MCTS-based agents recently able to achieve expert-level play on 19×19 boards. Furthermore, its domain-independent nature also makes it a focus in a variety of other fields, such as Bayesian reinforcement learning and general game-playing.
Despite the vast amount of research into MCTS, the dynamics of the algorithm are still not fully understood. In particular, the effect of using knowledge-heavy or biased simulations in MCTS remains unknown, with interesting results indicating that better-informed rollouts do not necessarily result in stronger agents. This research provides support for the notion that MCTS is well-suited to a class of domains possessing a smoothness property. In these domains, biased rollouts are more likely to produce strong agents. Conversely, any error due to incorrect bias is compounded in non-smooth domains, and in particular for low-variance simulations. This is demonstrated empirically in a number of single-agent domains.
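A minimal sketch of the kind of biased simulation step studied in such work is given below, assuming a generic environment interface (legal_actions, step, is_terminal, reward), a toy number-line game, and a softmax bias; none of this is taken from the dissertation's code.

```python
# Hypothetical sketch: MCTS rollout step with an optional heuristic bias.
import math
import random

class LineWalkEnv:
    """Toy environment (an assumption for the example): walk a number line,
    terminal once |state| >= 5, reward 1 only if the positive end is reached."""
    def legal_actions(self, state):
        return [+1, -1]
    def step(self, state, action):
        return state + action
    def is_terminal(self, state):
        return abs(state) >= 5
    def reward(self, state):
        return 1.0 if state >= 5 else 0.0

def biased_rollout(state, env, heuristic=None, temperature=1.0, max_depth=50):
    """Play out to a terminal state (or depth limit) and return the outcome.
    heuristic=None gives the uniform-random baseline; otherwise actions are
    drawn from a softmax over heuristic scores, i.e. a biased rollout."""
    depth = 0
    while not env.is_terminal(state) and depth < max_depth:
        actions = env.legal_actions(state)
        if heuristic is None:
            action = random.choice(actions)              # unbiased baseline
        else:
            scores = [heuristic(state, a) / temperature for a in actions]
            m = max(scores)
            weights = [math.exp(s - m) for s in scores]  # numerically stable softmax
            action = random.choices(actions, weights=weights, k=1)[0]
        state = env.step(state, action)
        depth += 1
    return env.reward(state)

# Uniform vs. biased rollouts from the start state; the bias prefers moving right.
env = LineWalkEnv()
uniform = sum(biased_rollout(0, env) for _ in range(1000)) / 1000
biased = sum(biased_rollout(0, env, heuristic=lambda s, a: a) for _ in range(1000)) / 1000
print(uniform, biased)
```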
Fast Approximate Max-n Monte Carlo Tree Search for Ms Pac-Man
We present an application of Monte Carlo tree search (MCTS) to the game of Ms Pac-Man. Contrary to most applications of MCTS to date, Ms Pac-Man requires almost real-time decision making and does not have a natural end state. We approached the problem by performing Monte Carlo tree searches on a five-player max-n tree representation of the game with limited tree search depth. We performed a number of experiments using both the MCTS game agents (for Ms Pac-Man and the ghosts) and agents used in previous work (for the ghosts). Performance-wise, our approach achieves excellent scores, outperforming previous non-MCTS approaches to the game by up to two orders of magnitude. © 2011 IEEE.
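The max-n idea underlying that tree representation is standard: node values are per-player reward vectors, and the player to move maximizes its own component. The sketch below is an illustrative, depth-limited fragment with an assumed Node structure, not the paper's Ms Pac-Man implementation.

```python
# Hypothetical sketch of a max^n backup over a five-player tree
# (Ms Pac-Man plus four ghosts).
NUM_PLAYERS = 5

class Node:
    def __init__(self, children=None, value=None):
        self.children = children or []
        self.value = value   # reward vector at leaves, one entry per player

def maxn_value(node, player_to_move, evaluate, depth):
    """Return a reward vector of length NUM_PLAYERS for this node."""
    if depth == 0 or not node.children:
        return evaluate(node)                 # heuristic/simulation value vector
    best = None
    for child in node.children:
        value = maxn_value(child, (player_to_move + 1) % NUM_PLAYERS,
                           evaluate, depth - 1)
        if best is None or value[player_to_move] > best[player_to_move]:
            best = value                       # current player keeps its best line
    return best

# Usage: Ms Pac-Man (player 0) picks the branch maximizing her own component.
root = Node(children=[Node(value=(10, 0, 0, 0, 0)), Node(value=(3, 5, 1, 1, 1))])
print(maxn_value(root, 0, lambda n: n.value, depth=2))   # -> (10, 0, 0, 0, 0)
```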
Shallow decision-making analysis in General Video Game Playing
The General Video Game AI competitions have been the testing ground for several game-playing techniques, such as evolutionary computation, tree search algorithms, and hyper-heuristic-based or knowledge-based algorithms. So far, the metrics used to evaluate the performance of agents have been win ratio, game score and length of games. In this paper we provide a wider set of metrics and a comparison method for evaluating and comparing agents. The metrics and the comparison method give shallow introspection into the agent's decision-making process and can be applied to any agent regardless of its algorithmic nature. In this work, the metrics and the comparison method are used to measure the impact of the terms that compose the tree policy of an MCTS-based agent, comparing it with several baseline agents. The results clearly show how promising such a general approach is and how useful it can be for understanding the behaviour of an AI agent; in particular, the comparison with baseline agents can help in understanding the shape of the agent's decision landscape. The presented metrics and comparison method represent a step toward more descriptive ways of logging and analysing agents' behaviours.
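For context, a typical MCTS tree policy is a UCB1-style rule whose terms (exploitation vs. exploration) are the kind of components whose impact such metrics could measure. The sketch below assumes that standard formulation, an illustrative constant C and node fields; it is not the agent evaluated in the paper.

```python
# Hypothetical sketch: UCB1-style tree policy split into its two terms.
import math
from dataclasses import dataclass

@dataclass
class ChildStats:
    total_reward: float
    visits: int

def uct_terms(child, parent_visits, C=1.41):
    exploitation = child.total_reward / child.visits                    # mean value term
    exploration = C * math.sqrt(math.log(parent_visits) / child.visits) # uncertainty term
    return exploitation, exploration

def select_child(children, parent_visits, C=1.41):
    """Standard UCT selection: maximize the sum of the two terms."""
    return max(children, key=lambda ch: sum(uct_terms(ch, parent_visits, C)))

# Usage: the rarely visited child wins here because its exploration term dominates.
kids = [ChildStats(5.0, 10), ChildStats(2.0, 2)]
print(select_child(kids, parent_visits=12))
```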
- …