8 research outputs found

    Machine Learning for Classical Planning: Neural Network Heuristics, Online Portfolios, and State Space Topologies

    State space search solves navigation tasks and many other real-world problems. Heuristic search, especially greedy best-first search, is one of the most successful algorithms for state space search. We improve the state of the art in heuristic search in three directions. In Part I, we present methods to train neural networks as powerful heuristics for a given state space. We present a universal approach to generate training data using random walks from a (partial) state. We demonstrate that our heuristics trained for a specific task are often better than heuristics trained for a whole domain. We show that the performance of all trained heuristics is highly complementary: there is no clear pattern for which trained heuristic to prefer for a specific task. In general, model-based planners still outperform planners with trained heuristics, but our approaches exceed the model-based algorithms in the Storage domain. To our knowledge, a learning-based planner has exceeded the state-of-the-art model-based planners only once before, in the Spanner domain. A priori, it is unknown whether a heuristic, or in the more general case a planner, performs well on a task. Hence, we train online portfolios to select the best planner for a task. Today, all online portfolios are based on handcrafted features. In Part II, we present new online portfolios based on neural networks, which receive the complete task as input, not just a few handcrafted features. Additionally, our portfolios can reconsider their choices. Both extensions greatly improve the state of the art of online portfolios. Finally, we show that explainable machine learning techniques, as an alternative to neural networks, also make good online portfolios, and we present methods to improve our trust in their predictions. Even if we select the best search algorithm, we cannot solve some tasks in reasonable time. We can speed up the search if we know how it behaves in the future.
In Part III, we inspect the behavior of greedy best-first search with a fixed heuristic on simple tasks of a domain to learn its behavior for any task of the same domain. Once greedy best-first search has expanded a progress state, it expands only states with lower heuristic values. We learn to identify progress states and present two methods to exploit this knowledge. Building upon this, we extract the bench transition system of a task and generalize it in such a way that we can apply it to any task of the same domain. We can use this generalized bench transition system to split a task into a sequence of simpler searches. In all three research directions, we contribute new approaches and insights to the state of the art, and we indicate interesting topics for future work.
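The random-walk idea from Part I can be sketched as follows. This is a minimal illustration, not the thesis implementation: `predecessors` is a hypothetical helper returning the predecessor states of a state, and the number of steps walked backward from the goal serves as a noisy distance label for training a heuristic.

```python
import random

def random_walk_samples(goal_state, predecessors, walk_length, num_walks):
    """Generate (state, step_count) training pairs by walking backward
    from the goal; the step count approximates goal distance."""
    samples = []
    for _ in range(num_walks):
        state = goal_state
        for steps in range(1, walk_length + 1):
            preds = predecessors(state)
            if not preds:
                break  # dead end: no predecessor to walk to
            state = random.choice(preds)
            # the walk length so far is a (noisy) heuristic label
            samples.append((state, steps))
    return samples

# Toy state space: integer states, goal 0, each state s reached from s + 1.
samples = random_walk_samples(0, lambda s: [s + 1], walk_length=3, num_walks=2)
```

On this toy chain the walk is deterministic, so each of the two walks yields the pairs (1, 1), (2, 2), (3, 3); in a real state space the labels are only an upper-bound estimate of the true goal distance.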


    Search behavior of greedy best-first search

    Greedy best-first search (GBFS) is a sibling of A* in the family of best-first state-space search algorithms. While A* is guaranteed to find optimal solutions of search problems, GBFS does not provide any guarantees but typically finds satisficing solutions more quickly than A*. A classical result of optimal best-first search shows that A* with an admissible and consistent heuristic expands every state whose f-value is below the optimal solution cost and no state whose f-value is above the optimal solution cost. Theoretical results of this kind are useful for the analysis of heuristics in different search domains and for the improvement of algorithms. For satisficing algorithms, a similarly clear understanding is currently lacking. We examine the search behavior of GBFS in order to make progress towards such an understanding. We introduce the concept of high-water mark benches, which separate the search space into areas that are searched by GBFS in sequence. High-water mark benches allow us to exactly determine the set of states that GBFS expands under at least one tie-breaking strategy. We show that benches contain craters. Once GBFS enters a crater, it has to expand every state in the crater before being able to escape. Benches and craters allow us to characterize the best-case and worst-case behavior of GBFS in given search instances. We show that computing the best-case or worst-case behavior of GBFS is NP-complete in general but can be computed in polynomial time for undirected state spaces. We present algorithms for extracting the set of states that GBFS potentially expands and for computing the best-case and worst-case behavior. We use the algorithms to analyze GBFS on benchmark tasks from planning competitions under a state-of-the-art heuristic. Experimental results reveal interesting characteristics of the heuristic on the given tasks and demonstrate the importance of tie-breaking in GBFS.
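A minimal GBFS skeleton makes the role of tie-breaking concrete: states with equal heuristic values are ordered only by the tie-breaking rule (here a FIFO counter), which is exactly the degree of freedom that the best-case and worst-case analysis above quantifies. The function names are illustrative, not from the paper.

```python
import heapq
from itertools import count

def gbfs(start, goal, successors, h):
    """Greedy best-first search: always expand the open state with
    the lowest heuristic value; ties broken FIFO via a counter."""
    tie = count()
    frontier = [(h(start), next(tie), start)]
    parent = {start: None}
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if state == goal:
            # reconstruct the path by walking the parent pointers
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for succ in successors(state):
            if succ not in parent:
                parent[succ] = state
                heapq.heappush(frontier, (h(succ), next(tie), succ))
    return None  # goal unreachable

# Toy example: integer states, goal 3, heuristic = distance to goal.
path = gbfs(0, 3, lambda n: [n + 1, n - 1], lambda n: abs(3 - n))
```

On this chain the heuristic is perfect, so the search expands only states on the returned path [0, 1, 2, 3]; with a less informed heuristic, different tie-breaking choices can lead into or around craters.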

    New perspectives on cost partitioning for optimal classical planning

    Admissible heuristics are the main ingredient when solving classical planning tasks optimally with heuristic search. There are many such heuristics, and each has its own strengths and weaknesses. As higher admissible heuristic values are more accurate, the maximum over several admissible heuristics dominates each individual one. Operator cost partitioning is a well-known technique to combine admissible heuristics in a way that dominates their maximum and remains admissible. But are there better options to combine the heuristics? We make three main contributions towards this question: Extensions to the cost partitioning framework can produce higher estimates from the same set of heuristics. Cost partitioning traditionally uses non-negative cost functions. We prove that this restriction is not necessary, and that allowing negative values as well makes the framework more powerful: the resulting heuristic values can be exponentially higher, and unsolvability can be detected even if all component heuristics have a finite value. We also generalize operator cost partitioning to transition cost partitioning, which can differentiate between different contexts in which an operator is used. Operator-counting heuristics reason about the number of times each operator is used in a plan. Many existing heuristics can be expressed in this framework, which gives new theoretical insight into their relationship. Different operator-counting heuristics can be easily combined within the framework in a way that dominates their maximum. Potential heuristics compute a heuristic value as a weighted sum over state features and are a fast alternative to operator-counting heuristics. Admissible and consistent potential heuristics for certain feature sets can be described in a compact way which means that the best heuristic from this class can be extracted in polynomial time. Both operator-counting and potential heuristics are closely related to cost partitioning. 
They offer a new perspective on cost-partitioned heuristics and have already sparked research beyond their use as classical planning heuristics.
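A toy numeric example illustrates why a cost partitioning can dominate the maximum of the component heuristics. Assume a task whose optimal plan applies operators o1 (cost 3) and o2 (cost 4), and two projection-style component heuristics that each account for one of the operators. All names and numbers here are invented for illustration; they are not from the thesis.

```python
def h_needs(op, cost_fn):
    """Toy projection heuristic: the state still requires one
    application of `op`, so its estimate is that operator's cost."""
    return cost_fn[op]

original = {"o1": 3, "o2": 4}

# Maximum over the two components under the full cost function:
# each heuristic sees only part of the problem, so the max is 4,
# well below the optimal plan cost of 7.
h_max = max(h_needs("o1", original), h_needs("o2", original))  # 4

# Cost partitioning: split each operator's cost between the components
# so the per-operator shares sum to at most the original cost. Then the
# SUM of the components is still admissible.
part_A = {"o1": 3, "o2": 0}
part_B = {"o1": 0, "o2": 4}
h_cp = h_needs("o1", part_A) + h_needs("o2", part_B)  # 3 + 4 = 7
```

The partitioned sum recovers the full optimal cost of 7 because each component is "paid" exactly for the operators it accounts for; the negative cost partitioning discussed above further relaxes the non-negativity of the shares.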

    Building a Heuristic for Greedy Search

    Suboptimal heuristic search algorithms such as greedy best-first search allow us to find solutions when constraints of time, memory, or both prevent the application of optimal algorithms such as A*. Guidelines for building an effective heuristic for A* are well established in the literature, but we show that if those rules are applied to greedy best-first search, performance can actually degrade. Observing what went wrong for greedy best-first search leads us to a quantitative metric appropriate for greedy heuristics, called Goal Distance Rank Correlation (GDRC). We demonstrate that GDRC can be used to build effective heuristics for greedy best-first search automatically.
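A rank-correlation sketch conveys the intuition behind GDRC: for greedy search, what matters is whether the heuristic ranks states in the same order as their true goal distances, not whether its magnitudes are accurate. Spearman's rho is used here purely for illustration (no tie handling); the paper's exact GDRC definition may differ.

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation; assumes no ties, for illustration."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# A heuristic whose values are far from the true goal distances but
# rank states identically is still perfect guidance for greedy search.
true_dist = [0, 1, 2, 3, 4]
heuristic = [0, 10, 20, 30, 40]
assert spearman_rho(true_dist, heuristic) == 1.0
```

An A*-oriented accuracy metric would penalize this heuristic heavily, which is the kind of mismatch between A* guidelines and greedy search that the abstract describes.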