Iterative Budgeted Exponential Search
We tackle two long-standing problems related to re-expansions in heuristic search algorithms. For graph search, A* can require Ω(2^n) expansions, where n is the number of states within the final f bound. Existing algorithms that address this problem, such as B and B', improve this bound to Ω(n^2). For tree search, IDA* can also require Ω(n^2) expansions. We describe a new algorithmic framework that iteratively controls an expansion budget and solution cost limit, giving rise to new graph and tree search algorithms for which the number of expansions is O(n log C*), where C* is the optimal solution cost. Our experiments show that the new algorithms are robust in scenarios where existing algorithms fail. In the case of tree search, our new algorithms have no overhead over IDA* in scenarios to which IDA* is well suited and can therefore be recommended as a general replacement for IDA*.
Policy-Guided Heuristic Search with Guarantees
The use of a policy and a heuristic function for guiding search can be quite
effective in adversarial problems, as demonstrated by AlphaGo and its
successors, which are based on the PUCT search algorithm. While PUCT can also
be used to solve single-agent deterministic problems, it lacks guarantees on
its search effort and it can be computationally inefficient in practice.
Combining the A* algorithm with a learned heuristic function tends to work
better in these domains, but A* and its variants do not use a policy. Moreover,
the purpose of using A* is to find solutions of minimum cost, while we seek
instead to minimize the search loss (e.g., the number of search steps). LevinTS
is guided by a policy and provides guarantees on the number of search steps
that relate to the quality of the policy, but it does not make use of a
heuristic function. In this work we introduce Policy-guided Heuristic Search
(PHS), a novel search algorithm that uses both a heuristic function and a
policy and has theoretical guarantees on the search loss that relate to both
the quality of the heuristic and of the policy. We show empirically on the
sliding-tile puzzle, Sokoban, and a puzzle from the commercial game "The
Witness" that PHS enables the rapid learning of both a policy and a heuristic
function and compares favorably with A*, Weighted A*, Greedy Best-First Search,
LevinTS, and PUCT in terms of number of problems solved and search time in all
three domains tested.
A Guide to Budgeted Tree Search
Budgeted Tree Search (BTS), a variant of Iterative Budgeted Exponential Search, is a new algorithm that has the same performance as IDA* on problems where the state space grows exponentially, but has far better performance than IDA* in other cases where IDA* fails. The goal of this paper is to provide a detailed guide to BTS with worked examples to make the algorithm more accessible to practitioners in heuristic search.