A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work
Iterative Depth-First Search for Fully Observable Non-Deterministic Planning
Fully Observable Non-Deterministic (FOND) planning models uncertainty through
actions with non-deterministic effects. Existing FOND planning algorithms are
effective and employ a wide range of techniques. However, most of the existing
algorithms are not robust for dealing with both non-determinism and task size.
In this paper, we develop a novel iterative depth-first search algorithm that
solves FOND planning tasks and produces strong cyclic policies. Our algorithm
is explicitly designed for FOND planning, addressing more directly the
non-deterministic aspect of FOND planning, and it also exploits the benefits of
heuristic functions to make the algorithm more effective during the iterative
searching process. We compare our proposed algorithm to well-known FOND
planners, and show that it has robust performance over several distinct types
of FOND domains considering different metrics
Taming Numbers and Durations in the Model Checking Integrated Planning System
The Model Checking Integrated Planning System (MIPS) is a temporal least
commitment heuristic search planner based on a flexible object-oriented
workbench architecture. Its design clearly separates explicit and symbolic
directed exploration algorithms from the set of on-line and off-line computed
estimates and associated data structures. MIPS has shown distinguished
performance in the last two international planning competitions. In the last
event the description language was extended from pure propositional planning to
include numerical state variables, action durations, and plan quality objective
functions. Plans were no longer sequences of actions but time-stamped
schedules. As a participant of the fully automated track of the competition,
MIPS has proven to be a general system; in each track and every benchmark
domain it efficiently computed plans of remarkable quality. This article
introduces and analyzes the most important algorithmic novelties that were
necessary to tackle the new layers of expressiveness in the benchmark problems
and to achieve a high level of performance. The extensions include critical
path analysis of sequentially generated plans to generate corresponding optimal
parallel plans. The linear time algorithm to compute the parallel plan bypasses
known NP hardness results for partial ordering by scheduling plans with respect
to the set of actions and the imposed precedence relations. The efficiency of
this algorithm also allows us to improve the exploration guidance: for each
encountered planning state the corresponding approximate sequential plan is
scheduled. One major strength of MIPS is its static analysis phase that grounds
and simplifies parameterized predicates, functions and operators, that infers
knowledge to minimize the state description length, and that detects domain
object symmetries. The latter aspect is analyzed in detail. MIPS has been
developed to serve as a complete and optimal state space planner, with
admissible estimates, exploration engines and branching cuts. In the
competition version, however, certain performance compromises had to be made,
including floating point arithmetic, weighted heuristic search exploration
according to an inadmissible estimate and parameterized optimization
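The critical path step described in this abstract can be illustrated with a minimal sketch. It is not the MIPS implementation. It assumes that each action appears once in the sequential plan, that durations are known, and that every precedence pair lists the earlier action first. The function name `schedule_sequential_plan` and the example actions are hypothetical. Because the sequential plan is already a valid ordering of the precedence relation, a single pass over the plan yields the earliest start times, in time linear in the number of actions plus precedence pairs.

```python
from typing import Dict, List, Tuple


def schedule_sequential_plan(
    plan: List[str],
    duration: Dict[str, float],
    precedes: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the earliest start time of each action in `plan`.

    Assumes every pair (before, after) in `precedes` has `before`
    occurring earlier in `plan` than `after`.
    """
    # Collect the predecessors of each action.
    preds: Dict[str, List[str]] = {action: [] for action in plan}
    for before, after in precedes:
        preds[after].append(before)

    start: Dict[str, float] = {}
    # The plan order is a topological order of the precedence graph,
    # so every predecessor is already scheduled when we reach an action.
    for action in plan:
        start[action] = max(
            (start[p] + duration[p] for p in preds[action]),
            default=0.0,
        )
    return start


# Hypothetical example: refuelling is independent of the other actions.
plan = ["load", "drive", "refuel", "unload"]
duration = {"load": 2.0, "drive": 5.0, "refuel": 3.0, "unload": 2.0}
precedes = [("load", "drive"), ("drive", "unload")]

starts = schedule_sequential_plan(plan, duration, precedes)
makespan = max(starts[a] + duration[a] for a in plan)
```

In this example the sequential plan takes 12 time units, while the scheduled parallel plan finishes after 9, because `refuel` runs alongside `load` and `drive`.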
Qualitative and Quantitative Solution Diversity in Heuristic-Search and Case-Based Planning
Planning is a branch of Artificial Intelligence (AI) concerned with projecting courses of actions for executing tasks and reaching goals. AI Planning helps increase the autonomy of artificially intelligent agents and decrease the cognitive load burdening human planners working in challenging domains, such as the Mars exploration projects. Approaches to AI planning include first-principles heuristic search planning and case-based planning. The former conducts a heuristic-guided search in the solution space, while the latter generates new solutions by adapting solutions to previously-solved problems. The ability to generate not just one solution, but a set of meaningfully diverse solutions to each planning problem helps cater to a wider variety of user preferences and needs (which it may be difficult or even unfeasible to acquire and/or represent in their entirety), produce viable alternative courses of action to fall back on in case of failure, counter varied threats in intrusion detection, render computer games more compelling, and provide representative samples of the vast search spaces of planning problems. This work describes a general framework for generating diverse sets of solutions (i.e. courses of action) to planning problems. The general diversity-aware planning algorithm consists of iteratively generating solutions using a composite candidate-solution evaluation criterion taking into account both how promising the candidate solutions appear in their own right and how likely they are to increase the overall diversity of the final set of solutions. This estimate of diversity is based on distance metrics, i.e. measures of the dissimilarity between two solutions. Distance metrics can be quantitative or qualitative. Quantitative distance measures are domain-independent. They require minimum knowledge engineering, but may not reflect dissimilarities that are truly meaningful.
Qualitative distance metrics are domain-specific and reflect, based on the domain knowledge encoded within them, the kind of meaningful dissimilarities that might be identified by a person familiar with the domain. Based on the general framework for diversity-aware planning, three domain-independent planning algorithms have been implemented and are described and evaluated herein. DivFF is a diverse heuristic search planner for deterministic planning domains (i.e. domains for which the assumption is made that any action can only have one possible outcome). DivCBP is a diverse case-based planner, also for deterministic planning domains. DivNDP is a heuristic search planner for nondeterministic planning domains (i.e. domains the descriptions of which include actions with multiple possible outcomes). The experimental evaluation of the three algorithms is conducted on a computer game domain, chosen for its challenging characteristics, which include nondeterminism and dynamism. The generated courses of action are run in the game in order to ascertain whether they affect the game environment in diverse ways. This constitutes the test of their genuine diversity, which cannot be evaluated accurately based solely on their low-level structure. It is shown that all proposed planning systems successfully generate sets of diverse solutions using varied criteria for assessing solution dissimilarity. Qualitatively-diverse solution sets are demonstrated to constantly produce more diverse effects in the game environment than quantitatively-diverse solution sets. A comparison between the two planning systems for deterministic domains, DivCBP and DivFF, reveals the former to be more successful at consistently generating diverse sets of solutions. The reasons for this are investigated, thus contributing to the literature of comparative studies of first-principles and case-based planning approaches.
Finally, an application of diversity in planning is showcased: simulating personality-trait variation in computer game characters. Sets of diverse solutions to both deterministic and nondeterministic planning problems are shown to successfully create diverse character behavior in the evaluation environment
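The composite evaluation criterion described in this abstract can be sketched as a greedy selection loop. This is an illustration only, not DivFF, DivCBP or DivNDP. The weighting parameter `alpha`, the function names, the Jaccard-style action-set distance (a stand-in for a quantitative, domain-independent metric) and the toy quality function are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Plan = Tuple[str, ...]


def action_set_distance(a: Plan, b: Plan) -> float:
    """Quantitative distance: share of actions not common to both plans."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def select_diverse(
    candidates: List[Plan],
    quality: Callable[[Plan], float],
    distance: Callable[[Plan, Plan], float],
    k: int,
    alpha: float = 0.5,
) -> List[Plan]:
    """Greedily pick up to k plans, trading off quality against diversity."""
    chosen: List[Plan] = []
    remaining = list(candidates)

    def score(plan: Plan) -> float:
        if not chosen:
            # Nothing to be diverse from yet: rank by quality alone.
            return quality(plan)
        mean_dist = sum(distance(plan, c) for c in chosen) / len(chosen)
        return (1.0 - alpha) * quality(plan) + alpha * mean_dist

    while remaining and len(chosen) < k:
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical candidate plans; shorter plans count as more promising.
candidates = [("a", "b", "c"), ("a", "b", "d"), ("x", "y", "z", "w")]


def shorter_is_better(plan: Plan) -> float:
    return 1.0 / len(plan)


diverse = select_diverse(candidates, shorter_is_better, action_set_distance, k=2)
greedy = select_diverse(
    candidates, shorter_is_better, action_set_distance, k=2, alpha=0.0
)
```

With `alpha=0.5` the second pick is the longer plan that shares no actions with the first one. With `alpha=0.0` the criterion reduces to quality alone and the near-duplicate short plan is chosen instead. A qualitative metric would replace `action_set_distance` with a function that encodes domain knowledge.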
Dagstuhl News January - December 2001
"Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic
A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making
The field of Sequential Decision Making (SDM) provides tools for solving
Sequential Decision Processes (SDPs), where an agent must make a series of
decisions in order to complete a task or achieve a goal. Historically, two
competing SDM paradigms have vied for supremacy. Automated Planning (AP)
proposes to solve SDPs by performing a reasoning process over a model of the
world, often represented symbolically. Conversely, Reinforcement Learning (RL)
proposes to learn the solution of the SDP from data, without a world model, and
represent the learned knowledge subsymbolically. In the spirit of
reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods
for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques
that learn to plan) and for learning aspects of their structure (e.g., world
models, state invariants and landmarks). To the best of our knowledge, no other
review in the field provides the same scope. As an additional contribution, we
discuss what properties an ideal method for SDM should exhibit and argue that
neurosymbolic AI is the current approach which most closely resembles this
ideal method. Finally, we outline several proposals to advance the field of SDM
via the integration of symbolic and subsymbolic AI
Heuristinen yhteistyöhaku ohjelmistoagenttien avulla (Heuristic Cooperative Search Using Software Agents)
Parallel algorithms extend the notion of sequential algorithms by permitting the simultaneous execution of independent computational steps. When the independence constraint is lifted and executions can freely interact and intertwine, parallel algorithms become concurrent and may behave in a nondeterministic way. Parallelism has over the years slowly risen to be a standard feature of high-performance computing, but concurrency, being even harder to reason about, is still considered somewhat notorious and undesirable. As such, the implicit randomness available in concurrency is rarely made use of in algorithms.
This thesis explores concurrency as a means to facilitate algorithmic cooperation in a heuristic search setting. We use agents, cooperating software entities, to build a single-source shortest path (SSSP) search algorithm based on parallelized A∗, dubbed A!. We show how asynchronous information sharing gives rise to implicit randomness, which cooperating agents use in A! to maintain a collective secondary ranking heuristic and focus search space exploration.
We experimentally show that A! consistently outperforms both vanilla A∗ and a noncooperative, explicitly randomized A∗ variant in the standard n-puzzle sliding tile problem context. The results indicate that A! performance increases with the addition of more agents, but that the returns are diminishing. A! is observed to be sensitive to heuristic improvement, but also constrained by search overhead from limited path diversity. A hybrid approach combining both implicit and explicit randomness is also evaluated and found to not be an improvement over A! alone.
The studied A! implementation based on vanilla A∗ is not as such competitive against state-of-the-art parallel A∗ algorithms, but rather a first step in applying concurrency to speed up heuristic SSSP search. The empirical results imply that concurrency and nondeterministic cooperation can successfully be harnessed in algorithm design, inviting further inquiry into algorithms of this kind.
Parallel algorithms allow several independent program instructions to be executed simultaneously. When the independence constraint is removed and the order in which instructions execute is not controlled, parallel algorithms can behave nondeterministically because the instructions execute concurrently. Over the years parallelism has risen to an important role in computing, and at the same time uncontrolled concurrency has come to be widely regarded as problematic and undesirable. The implicit randomness that arises from concurrency is rarely exploited in algorithms.
This thesis addresses the use of concurrent instruction execution as part of heuristic cooperative search. Using agents, cooperative software components, the thesis implements a new search algorithm, A!. A! is based on the parallel A∗ algorithm, which solves the single-source shortest path search problem. The thesis shows how unsynchronized communication between agents leads to implicit randomness, which the A! agents collectively exploit to maintain a secondary ranking heuristic and, in turn, to focus the search.
The thesis shows experimentally that A! performs better than both the ordinary and the randomized A∗ algorithm in solving the n-puzzle. The results show that the performance of A! grows as agents are added, but also that the gain is relatively smaller after each addition. A! proves more sensitive than the baselines with respect to exploiting the heuristic, but also modest in terms of the diversity of its search paths. A simple hybrid algorithm combining explicit and implicit randomness is not found to bring additional performance compared with A!.
The empirical experiments show that uncontrolled concurrency and nondeterministic cooperation can be successfully exploited in algorithm design, which encourages further research on algorithms that apply them
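For orientation, the vanilla A∗ baseline that this abstract compares against can be sketched on the 8-puzzle as follows. This is a minimal sequential sketch, not the thesis's A! algorithm or its agent framework. The Manhattan-distance heuristic, the 3×3 board size and all function names are assumptions chosen for the example.

```python
import heapq
from typing import List, Optional, Tuple

State = Tuple[int, ...]  # nine tiles listed row by row, 0 is the blank
GOAL: State = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def manhattan(state: State) -> int:
    """Sum of the tiles' grid distances from their goal positions."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = tile - 1
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total


def neighbours(state: State) -> List[State]:
    """States reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    result = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            swap = r * 3 + c
            tiles = list(state)
            tiles[blank], tiles[swap] = tiles[swap], tiles[blank]
            result.append(tuple(tiles))
    return result


def astar(start: State) -> Optional[int]:
    """Return the length of a shortest solution, or None if none exists."""
    open_heap = [(manhattan(start), 0, start)]  # entries are (f, g, state)
    best_g = {start: 0}
    while open_heap:
        _, g, state = heapq.heappop(open_heap)
        if state == GOAL:
            return g
        if g > best_g[state]:
            continue  # stale entry; a cheaper path was found later
        for nxt in neighbours(state):
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(open_heap, (new_g + manhattan(nxt), new_g, nxt))
    return None


moves = astar((1, 2, 3, 4, 5, 6, 0, 7, 8))
```

In the example the blank starts in the bottom-left corner and two slides solve the puzzle. A cooperative variant in the spirit of the abstract would run several such searches concurrently and let them share information, which this sketch does not attempt.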