A Survey of Monte Carlo Tree Search Methods

Authors: Browne, Cameron B.; Colton, Simon; Cowling, Peter I.; Lucas, Simon M.; Perez, Diego; Powley, Edward; Rohlfshagen, Philipp; Samothrakis, Spyridon; Tavener, Stephen; Whitehouse, Daniel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

University of Essex Research Repository

CiteSeerX

Maastricht University Research Portal

Iterative Depth-First Search for Fully Observable Non-Deterministic Planning

Authors: De Giacomo, Giuseppe; Messa, Frederico; Pereira, André G.; Pereira, Ramon Fraga
Publication date: 08/04/2022

Fully Observable Non-Deterministic (FOND) planning models uncertainty through actions with non-deterministic effects. Existing FOND planning algorithms are effective and employ a wide range of techniques. However, most of the existing algorithms are not robust for dealing with both non-determinism and task size. In this paper, we develop a novel iterative depth-first search algorithm that solves FOND planning tasks and produces strong cyclic policies. Our algorithm is explicitly designed for FOND planning, addressing more directly the non-deterministic aspect of FOND planning, and it also exploits the benefits of heuristic functions to make the algorithm more effective during the iterative searching process. We compare our proposed algorithm to well-known FOND planners, and show that it has robust performance over several distinct types of FOND domains considering different metrics.

arXiv.org e-Print Archive

Towards a theory of heuristic and optimal planning for sequential information search

Authors: Jones, M.; Meder, B.; Nelson, J.
Publication venue: Center for Open Science
Publication date: 01/01/2018

MPG.PuRe

Taming Numbers and Durations in the Model Checking Integrated Planning System

Author: Edelkamp S.
Publication venue: AI Access Foundation
Publication date: 30/06/2011

The Model Checking Integrated Planning System (MIPS) is a temporal least commitment heuristic search planner based on a flexible object-oriented workbench architecture. Its design clearly separates explicit and symbolic directed exploration algorithms from the set of on-line and off-line computed estimates and associated data structures. MIPS has shown distinguished performance in the last two international planning competitions. In the last event the description language was extended from pure propositional planning to include numerical state variables, action durations, and plan quality objective functions. Plans were no longer sequences of actions but time-stamped schedules. As a participant of the fully automated track of the competition, MIPS has proven to be a general system; in each track and every benchmark domain it efficiently computed plans of remarkable quality. This article introduces and analyzes the most important algorithmic novelties that were necessary to tackle the new layers of expressiveness in the benchmark problems and to achieve a high level of performance. The extensions include critical path analysis of sequentially generated plans to generate corresponding optimal parallel plans. The linear time algorithm to compute the parallel plan bypasses known NP hardness results for partial ordering by scheduling plans with respect to the set of actions and the imposed precedence relations. The efficiency of this algorithm also allows us to improve the exploration guidance: for each encountered planning state the corresponding approximate sequential plan is scheduled. One major strength of MIPS is its static analysis phase that grounds and simplifies parameterized predicates, functions and operators, that infers knowledge to minimize the state description length, and that detects domain object symmetries. The latter aspect is analyzed in detail. 
MIPS has been developed to serve as a complete and optimal state space planner, with admissible estimates, exploration engines and branching cuts. In the competition version, however, certain performance compromises had to be made, including floating point arithmetic, weighted heuristic search exploration according to an inadmissible estimate and parameterized optimization.
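
The abstract above mentions turning a sequentially generated plan into a parallel plan through critical path analysis over precedence relations. The following is a minimal sketch of that general idea, not the MIPS implementation: it assumes the precedence pairs are already known, and the names `schedule_parallel`, `durations`, and `precedences` are hypothetical. Because the sequential plan order is already a topological order, one forward pass is enough, so the running time is linear in the number of actions plus precedence pairs.

```python
def schedule_parallel(durations, precedences):
    """Assign earliest start times to the actions of a sequential plan.

    durations[i]  -- duration of the i-th action in the sequential plan
    precedences   -- pairs (i, j) with i < j: action i must end before j starts
    Returns (start_times, makespan).
    """
    preds = [[] for _ in durations]
    for i, j in precedences:
        if not 0 <= i < j < len(durations):
            raise ValueError("precedence must respect the sequential order")
        preds[j].append(i)

    start = [0.0] * len(durations)
    # The plan order is a topological order, so every predecessor of j
    # already has its final start time when j is processed.
    for j in range(len(durations)):
        for i in preds[j]:
            start[j] = max(start[j], start[i] + durations[i])

    makespan = max((s + d for s, d in zip(start, durations)), default=0.0)
    return start, makespan


if __name__ == "__main__":
    # Actions 0 and 1 are independent; 2 needs both; 3 needs 2.
    starts, span = schedule_parallel([2, 3, 1, 4], [(0, 2), (1, 2), (2, 3)])
    print(starts, span)
```

In this toy input the first two actions run in parallel, so the schedule is shorter than the sequential total of 10 time units.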

arXiv.org e-Print Archive

Crossref

Airport under Control: Multi-agent scheduling for airport ground handling

Author: Mao X.
Publication venue: TICC Dissertation Series 16
Publication date: 01/01/2011

Tilburg University Repository

Qualitative and Quantitative Solution Diversity in Heuristic-Search and Case-Based Planning

Author: Coman Alexandra
Publication venue: Lehigh Preserve

Planning is a branch of Artificial Intelligence (AI) concerned with projecting courses of actions for executing tasks and reaching goals. AI Planning helps increase the autonomy of artificially intelligent agents and decrease the cognitive load burdening human planners working in challenging domains, such as the Mars exploration projects. Approaches to AI planning include first-principles heuristic search planning and case-based planning. The former conducts a heuristic-guided search in the solution space, while the latter generates new solutions by adapting solutions to previously-solved problems. The ability to generate not just one solution, but a set of meaningfully diverse solutions to each planning problem helps cater to a wider variety of user preferences and needs (which it may be difficult or even unfeasible to acquire and/or represent in their entirety), produce viable alternative courses of action to fall back on in case of failure, counter varied threats in intrusion detection, render computer games more compelling, and provide representative samples of the vast search spaces of planning problems. This work describes a general framework for generating diverse sets of solutions (i.e. courses of action) to planning problems. The general diversity-aware planning algorithm consists of iteratively generating solutions using a composite candidate-solution evaluation criterion taking into account both how promising the candidate solutions appear in their own right and how likely they are to increase the overall diversity of the final set of solutions. This estimate of diversity is based on distance metrics, i.e. measures of the dissimilarity between two solutions. Distance metrics can be quantitative or qualitative. Quantitative distance measures are domain-independent. They require minimum knowledge engineering, but may not reflect dissimilarities that are truly meaningful.
Qualitative distance metrics are domain-specific and reflect, based on the domain knowledge encoded within them, the kind of meaningful dissimilarities that might be identified by a person familiar with the domain. Based on the general framework for diversity-aware planning, three domain-independent planning algorithms have been implemented and are described and evaluated herein. DivFF is a diverse heuristic search planner for deterministic planning domains (i.e. domains for which the assumption is made that any action can only have one possible outcome). DivCBP is a diverse case-based planner, also for deterministic planning domains. DivNDP is a heuristic search planner for nondeterministic planning domains (i.e. domains the descriptions of which include actions with multiple possible outcomes). The experimental evaluation of the three algorithms is conducted on a computer game domain, chosen for its challenging characteristics, which include nondeterminism and dynamism. The generated courses of action are run in the game in order to ascertain whether they affect the game environment in diverse ways. This constitutes the test of their genuine diversity, which cannot be evaluated accurately based solely on their low-level structure. It is shown that all proposed planning systems successfully generate sets of diverse solutions using varied criteria for assessing solution dissimilarity. Qualitatively-diverse solution sets are demonstrated to constantly produce more diverse effects in the game environment than quantitatively-diverse solution sets. A comparison between the two planning systems for deterministic domains, DivCBP and DivFF, reveals the former to be more successful at consistently generating diverse sets of solutions. The reasons for this are investigated, thus contributing to the literature of comparative studies of first-principles and case-based planning approaches.
Finally, an application of diversity in planning is showcased: simulating personality-trait variation in computer game characters. Sets of diverse solutions to both deterministic and nondeterministic planning problems are shown to successfully create diverse character behavior in the evaluation environment.
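
The composite evaluation criterion described in the abstract above (candidate quality combined with expected gain in set diversity) can be illustrated with a small sketch. This is an assumption-laden simplification and not DivFF, DivCBP, or DivNDP: it selects from a fixed pool of finished plans rather than guiding search or case retrieval, and `plan_distance`, `select_diverse`, and the weight `alpha` are hypothetical names. The distance used here is a simple quantitative one, the share of actions that appear in only one of the two plans.

```python
def plan_distance(plan_a, plan_b):
    """Quantitative distance: fraction of actions not shared by both plans."""
    a, b = set(plan_a), set(plan_b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


def select_diverse(candidates, quality, k, alpha=0.5):
    """Greedily pick k plans by alpha * quality + (1 - alpha) * diversity.

    quality -- hypothetical callable mapping a plan to a score in [0, 1]
    alpha   -- assumed trade-off weight between quality and diversity
    """
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < k:

        def score(plan):
            if chosen:
                # Average distance to the plans selected so far.
                diversity = sum(plan_distance(plan, c) for c in chosen) / len(chosen)
            else:
                diversity = 0.0
            return alpha * quality(plan) + (1 - alpha) * diversity

        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    scores = {("a", "b"): 1.0, ("a", "c"): 0.9, ("x", "y"): 0.5}
    print(select_diverse(list(scores), scores.get, k=2, alpha=0.3))
```

A lower `alpha` favors plans that differ from those already chosen; with `alpha=1.0` the selection reduces to ranking by quality alone.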

Lehigh University: Lehigh Preserve

Dagstuhl News January - December 2001

Author: Wilhelm Reinhard
Publication venue: Dagstuhl Publications. Dagstuhl News
Publication date: 01/01/2001

"Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News gives a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a short abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic.

Dagstuhl Research Online Publication Server

A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making

Authors: Fernández-Olivares, Juan; Mesejo, Pablo; Núñez-Molina, Carlos
Publication date: 20/04/2023

The field of Sequential Decision Making (SDM) provides tools for solving Sequential Decision Processes (SDPs), where an agent must make a series of decisions in order to complete a task or achieve a goal. Historically, two competing SDM paradigms have vied for supremacy. Automated Planning (AP) proposes to solve SDPs by performing a reasoning process over a model of the world, often represented symbolically. Conversely, Reinforcement Learning (RL) proposes to learn the solution of the SDP from data, without a world model, and represent the learned knowledge subsymbolically. In the spirit of reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques that learn to plan) and for learning aspects of their structure (e.g., world models, state invariants and landmarks). To the best of our knowledge, no other review in the field provides the same scope. As an additional contribution, we discuss what properties an ideal method for SDM should exhibit and argue that neurosymbolic AI is the current approach which most closely resembles this ideal method. Finally, we outline several proposals to advance the field of SDM via the integration of symbolic and subsymbolic AI.

arXiv.org e-Print Archive

Heuristinen yhteistyöhaku ohjelmistoagenttien avulla (Heuristic cooperative search using software agents)

Author: Halme Antti
Publication date: 02/06/2014

Parallel algorithms extend the notion of sequential algorithms by permitting the simultaneous execution of independent computational steps. When the independence constraint is lifted and executions can freely interact and intertwine, parallel algorithms become concurrent and may behave in a nondeterministic way. Parallelism has over the years slowly risen to be a standard feature of high-performance computing, but concurrency, being even harder to reason about, is still considered somewhat notorious and undesirable. As such, the implicit randomness available in concurrency is rarely made use of in algorithms. This thesis explores concurrency as a means to facilitate algorithmic cooperation in a heuristic search setting. We use agents, cooperating software entities, to build a single-source shortest path (SSSP) search algorithm based on parallelized A∗, dubbed A!. We show how asynchronous information sharing gives rise to implicit randomness, which cooperating agents use in A! to maintain a collective secondary ranking heuristic and focus search space exploration. We experimentally show that A! consistently outperforms both vanilla A∗ and a noncooperative, explicitly randomized A∗ variant in the standard n-puzzle sliding tile problem context. The results indicate that A! performance increases with the addition of more agents, but that the returns are diminishing. A! is observed to be sensitive to heuristic improvement, but also constrained by search overhead from limited path diversity. A hybrid approach combining both implicit and explicit randomness is also evaluated and found to not be an improvement over A! alone. The studied A! implementation based on vanilla A∗ is not as such competitive against state-of-the-art parallel A∗ algorithms, but rather a first step in applying concurrency to speed up heuristic SSSP search. 
The empirical results imply that concurrency and nondeterministic cooperation can successfully be harnessed in algorithm design, inviting further inquiry into algorithms of this kind.
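
For orientation, the baseline that the abstract above compares against, vanilla A∗ on the n-puzzle, can be sketched as follows. This is a plain sequential A∗ with the Manhattan-distance heuristic, not the agent-based A! algorithm from the thesis. The goal layout (tiles in order with the blank last) and the function names are assumptions made for this example.

```python
import heapq


def manhattan(state, n):
    """Sum of tile distances to their goal cells (the blank, 0, is ignored)."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = tile - 1
        total += abs(idx // n - goal_idx // n) + abs(idx % n - goal_idx % n)
    return total


def neighbors(state, n):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, n)
    for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        new_row, new_col = row + d_row, col + d_col
        if 0 <= new_row < n and 0 <= new_col < n:
            swap = new_row * n + new_col
            nxt = list(state)
            nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
            yield tuple(nxt)


def astar(start, n):
    """Return the length of a shortest solution, or None if none exists."""
    goal = tuple(list(range(1, n * n)) + [0])
    open_heap = [(manhattan(start, n), 0, start)]
    best_g = {start: 0}
    while open_heap:
        _, g, state = heapq.heappop(open_heap)
        if state == goal:
            return g
        if g > best_g[state]:
            continue  # stale heap entry; a cheaper path was found later
        for nxt in neighbors(state, n):
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(open_heap, (new_g + manhattan(nxt, n), new_g, nxt))
    return None


if __name__ == "__main__":
    print(astar((1, 2, 3, 4, 5, 6, 0, 7, 8), 3))
```

The heap is ordered by f = g + h, and because the Manhattan heuristic never overestimates, the first time the goal is popped its cost is optimal.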

Aaltodoc Publication Archive