1,536 research outputs found

    Online Planner Selection with Graph Neural Networks and Adaptive Scheduling

    Automated planning is one of the foundational areas of AI. Since no single planner can work well for all tasks and domains, portfolio-based techniques have become increasingly popular in recent years. In particular, deep learning has emerged as a promising methodology for online planner selection. Owing to the recent development of structural graph representations of planning tasks, we propose a graph neural network (GNN) approach to selecting candidate planners. GNNs have two advantages over a straightforward alternative, convolutional neural networks: they are invariant to node permutations, and they incorporate node labels for better inference. Additionally, for cost-optimal planning, we propose a two-stage adaptive scheduling method to further improve the likelihood that a given task is solved in time. The scheduler may switch at halftime to a different planner, conditioned on the observed performance of the first one. Experimental results validate the effectiveness of the proposed method against strong baselines, both deep learning and non-deep learning based. Comment: AAAI 2020. Code is released at https://github.com/matenure/GNN_planner. Data set is released at https://github.com/IBM/IPC-graph-dat
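The two-stage schedule described in this abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: planners are modeled as plain functions of a time budget, and `first_choice`, `second_choice`, and `make_planner` are hypothetical stand-ins for the learned selection models and real planner runs.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a planner is a function that, given a time budget in seconds,
# returns a plan (here a string) if it finishes in time, or None otherwise.
Planner = Callable[[float], Optional[str]]


def two_stage_schedule(
    planners: Dict[str, Planner],
    first_choice: Callable[[], str],
    second_choice: Callable[[str], str],
    time_limit: float,
) -> Tuple[Optional[str], List[str]]:
    """Run one planner for half the budget, then re-decide at halftime.

    first_choice() names the planner to start with. second_choice(name)
    names the planner for the second half, given that `name` has not
    solved the task yet. Both selectors are hypothetical placeholders.
    """
    half = time_limit / 2.0
    first = first_choice()
    trace = [first]
    plan = planners[first](half)
    if plan is not None:
        return plan, trace

    second = second_choice(first)
    trace.append(second)
    if second == first:
        # Staying with the same planner: in this simulation that is
        # equivalent to giving it the full budget without a restart.
        return planners[first](time_limit), trace
    # Switching: the new planner only gets the remaining half.
    return planners[second](half), trace


def make_planner(seconds_needed: float, plan: str) -> Planner:
    """Simulated planner that succeeds iff the budget is large enough."""
    return lambda budget: plan if budget >= seconds_needed else None


planners = {
    "A": make_planner(1000.0, "plan-A"),
    "B": make_planner(600.0, "plan-B"),
}
# A needs 1000 s but only gets 900 s, so the scheduler switches to B.
result = two_stage_schedule(planners, lambda: "A", lambda first: "B", 1800.0)
stay = two_stage_schedule(planners, lambda: "A", lambda first: first, 1800.0)
```

In the simulated run, switching to planner B at halftime solves the task within the remaining 900 seconds; staying with A also succeeds here because A fits into the full 1800-second budget.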

    Marvin: A Heuristic Search Planner with Online Macro-Action Learning

    This paper describes Marvin, a planner that competed in the Fourth International Planning Competition (IPC 4). Marvin uses action-sequence-memoisation techniques to generate macro-actions, which are then used during search for a solution plan. We provide an overview of its architecture and search behaviour, detailing the algorithms used. We also empirically demonstrate the effectiveness of its features in various planning domains; in particular, the effects on performance due to the use of macro-actions, the novel features of its search behaviour, and the native support of ADL and Derived Predicates

    On the inference and management of macro-actions in forward-chaining planning

    In this paper we discuss techniques for online generation of macro-actions as part of the planning process and demonstrate their use in a forward-chaining search planning framework. The macro-actions learnt are specifically created at places in the search space where the heuristic is not informative. We present results to show that using macro-actions generated during planning can improve planning performance
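To make the notion of a macro-action concrete, the following sketch composes two STRIPS-style actions into a single action with the same net effect. It uses the standard textbook composition rule and assumes delete-then-add semantics; it is not taken from the paper, and the `Action` class and the blocks-world facts are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Action:
    name: str
    pre: FrozenSet[str]      # facts that must hold before execution
    add: FrozenSet[str]      # facts made true
    delete: FrozenSet[str]   # facts made false


def apply(state: FrozenSet[str], a: Action) -> Optional[FrozenSet[str]]:
    """Apply a to state (deletes first, then adds); None if inapplicable."""
    if not a.pre <= state:
        return None
    return (state - a.delete) | a.add


def compose(a: Action, b: Action) -> Optional[Action]:
    """Return the macro 'a then b', or None if b can never follow a."""
    # b is blocked if a removes one of its preconditions for good.
    if b.pre & (a.delete - a.add):
        return None
    pre = a.pre | (b.pre - a.add)            # what a does not supply itself
    add = (a.add - b.delete) | b.add         # adds that survive b
    delete = (a.delete | b.delete) - add     # adds take precedence
    return Action(f"{a.name}+{b.name}", pre, add, delete)


pickup = Action(
    "pickup_a",
    pre=frozenset({"clear_a", "ontable_a", "handempty"}),
    add=frozenset({"holding_a"}),
    delete=frozenset({"clear_a", "ontable_a", "handempty"}),
)
stack = Action(
    "stack_a_b",
    pre=frozenset({"holding_a", "clear_b"}),
    add=frozenset({"on_a_b", "clear_a", "handempty"}),
    delete=frozenset({"holding_a", "clear_b"}),
)

macro = compose(pickup, stack)
start = frozenset({"clear_a", "ontable_a", "handempty", "clear_b", "ontable_b"})
step_by_step = apply(apply(start, pickup), stack)
in_one_step = apply(start, macro)
```

Applying the macro to `start` yields the same state as applying `pickup_a` and then `stack_a_b`, which is the property a planner relies on when it treats the sequence as one search step.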

    Machine learning for classical planning : neural network heuristics, online portfolios, and state space topologies

    State space search solves navigation tasks and many other real world problems. Heuristic search, especially greedy best-first search, is one of the most successful algorithms for state space search. We improve the state of the art in heuristic search in three directions. In Part I, we present methods to train neural networks as powerful heuristics for a given state space. We present a universal approach to generate training data using random walks from a (partial) state. We demonstrate that our heuristics trained for a specific task are often better than heuristics trained for a whole domain. We show that the performance of all trained heuristics is highly complementary. There is no clear pattern, which trained heuristic to prefer for a specific task. In general, model-based planners still outperform planners with trained heuristics. But our approaches exceed the model-based algorithms in the Storage domain. To our knowledge, only once before in the Spanner domain, a learning-based planner exceeded the state-of-the-art model-based planners. A priori, it is unknown whether a heuristic, or in the more general case a planner, performs well on a task. Hence, we trained online portfolios to select the best planner for a task. Today, all online portfolios are based on handcrafted features. In Part II, we present new online portfolios based on neural networks, which receive the complete task as input, and not just a few handcrafted features. Additionally, our portfolios can reconsider their choices. Both extensions greatly improve the state-of-the-art of online portfolios. Finally, we show that explainable machine learning techniques, as the alternative to neural networks, are also good online portfolios. Additionally, we present methods to improve our trust in their predictions. Even if we select the best search algorithm, we cannot solve some tasks in reasonable time. We can speed up the search if we know how it behaves in the future. 
In Part III, we inspect the behavior of greedy best-first search with a fixed heuristic on simple tasks of a domain to learn its behavior for any task of the same domain. Once greedy best-first search has expanded a progress state, it expands only states with lower heuristic values. We learn to identify progress states and present two methods to exploit this knowledge. Building upon this, we extract the bench transition system of a task and generalize it in such a way that we can apply it to any task of the same domain. We can use this generalized bench transition system to split a task into a sequence of simpler searches. In all three research directions, we contribute new approaches and insights to the state of the art, and we indicate interesting topics for future work.
German abstract (translated): Many everyday problems can be solved with state space search. Heuristic search, in particular greedy best-first search, is one of the most successful algorithms for state space search. We improve the state of the art in heuristic search in three ways. One of the most important components of heuristic search is the heuristic. With a good heuristic, the search finds a solution quickly. Modeling a good heuristic for a problem is laborious. In Part I, we present methods to learn good heuristics for a problem automatically. To this end, we generate the training data by means of random walks starting from (partial) states of the problem. We show that the heuristics we train for a single state space are often better than heuristics trained for a problem class. Furthermore, we show that the quality of all trained heuristics varies strongly with the problem class, that no heuristic dominates another, and that it cannot be known in advance whether a trained heuristic works well. We find that in almost all tested problem classes the model-based search algorithms are superior to the trained heuristics. Only in the Storage problem class are our heuristics superior. It is often unclear which heuristic or search algorithm one should use for a problem. Therefore, we train online portfolios that predict the best algorithm for a given problem. So far, the input to online portfolios has always consisted of human-selected features of the problem. In Part II, we present new online portfolios that receive the entire problem as input. Moreover, our online portfolios can correct their decision once. Both changes considerably improve the quality of online portfolios. Furthermore, we show that good online portfolios can also be trained with explainable machine learning techniques. Even if we select the best algorithm for a problem, the problem may be too hard to solve in acceptable time. In Part III, we show how to predict the behavior of greedy best-first search on harder problems of a problem class from its behavior on simple problems of the same class. We use this knowledge to improve the search. First, we show how to recognize progress states. Whenever greedy best-first search expands a progress state, we know that it will never again expand a state with an equal or higher heuristic value. We present two methods that use this knowledge. Building on this work, we learn from one problem how to split any problem of the same problem class into a sequence of simpler searches

    GENERATING PLANS IN CONCURRENT, PROBABILISTIC, OVER-SUBSCRIBED DOMAINS

    Planning in realistic domains typically involves reasoning under uncertainty, operating under time and resource constraints, and finding the optimal subset of goals to work on. Creating optimal plans that consider all of these features is a computationally complex, challenging problem. This dissertation develops an AO* search based planner named CPOAO* (Concurrent, Probabilistic, Over-subscription AO*) which incorporates durative actions, time and resource constraints, concurrent execution, over-subscribed goals, and probabilistic actions. To handle concurrent actions, action combinations rather than individual actions are taken as plan steps. Plan optimization is explored by adding two novel aspects to plans. First, parallel steps that serve the same goal are used to increase the plan’s probability of success. Traditionally, only parallel steps that serve different goals are used to reduce plan execution time. Second, actions that are executing but are no longer useful can be terminated to save resources and time. Conventional planners assume that all actions that were started will be carried out to completion. To reduce the size of the search space, several domain independent heuristic functions and pruning techniques were developed. The key ideas are to exploit dominance relations for candidate action sets and to develop relaxed planning graphs to estimate the expected rewards of states. This thesis contributes (1) an AO* based planner to generate parallel plans, (2) domain independent heuristics to increase planner efficiency, and (3) the ability to execute redundant actions and to terminate useless actions to increase plan efficiency

    Machine Learning for Classical Planning: Neural Network Heuristics, Online Portfolios, and State Space Topologies

    State space search solves navigation tasks and many other real world problems. Heuristic search, especially greedy best-first search, is one of the most successful algorithms for state space search. We improve the state of the art in heuristic search in three directions. In Part I, we present methods to train neural networks as powerful heuristics for a given state space. We present a universal approach to generate training data using random walks from a (partial) state. We demonstrate that our heuristics trained for a specific task are often better than heuristics trained for a whole domain. We show that the performance of all trained heuristics is highly complementary. There is no clear pattern, which trained heuristic to prefer for a specific task. In general, model-based planners still outperform planners with trained heuristics. But our approaches exceed the model-based algorithms in the Storage domain. To our knowledge, only once before in the Spanner domain, a learning-based planner exceeded the state-of-the-art model-based planners. A priori, it is unknown whether a heuristic, or in the more general case a planner, performs well on a task. Hence, we trained online portfolios to select the best planner for a task. Today, all online portfolios are based on handcrafted features. In Part II, we present new online portfolios based on neural networks, which receive the complete task as input, and not just a few handcrafted features. Additionally, our portfolios can reconsider their choices. Both extensions greatly improve the state-of-the-art of online portfolios. Finally, we show that explainable machine learning techniques, as the alternative to neural networks, are also good online portfolios. Additionally, we present methods to improve our trust in their predictions. Even if we select the best search algorithm, we cannot solve some tasks in reasonable time. We can speed up the search if we know how it behaves in the future. 
In Part III, we inspect the behavior of greedy best-first search with a fixed heuristic on simple tasks of a domain to learn its behavior for any task of the same domain. Once greedy best-first search expanded a progress state, it expands only states with lower heuristic values. We learn to identify progress states and present two methods to exploit this knowledge. Building upon this, we extract the bench transition system of a task and generalize it in such a way that we can apply it to any task of the same domain. We can use this generalized bench transition system to split a task into a sequence of simpler searches. In all three research directions, we contribute new approaches and insights to the state of the art, and we indicate interesting topics for future work
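The progress-state property stated in this abstract can be checked after the fact on a single search run. The sketch below is an illustration only: it runs a plain greedy best-first search on a tiny hand-made graph and marks an expanded state as a progress state of that run if every state expanded afterwards has a strictly lower heuristic value. The graph, the heuristic values, and both function names are assumptions, and the thesis defines and learns progress states in a more general way than this single-run check.

```python
import heapq
from typing import Callable, Hashable, Iterable, List


def gbfs_expansions(
    start: Hashable,
    goal: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
    h: Callable[[Hashable], float],
) -> List[Hashable]:
    """Greedy best-first search; returns the states in expansion order.

    The search stops when the goal is selected, so the goal itself is
    not counted as expanded. Ties are broken by insertion order.
    """
    counter = 0
    open_list = [(h(start), counter, start)]
    seen = {start}
    order: List[Hashable] = []
    while open_list:
        _, _, state = heapq.heappop(open_list)
        if state == goal:
            return order
        order.append(state)
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                counter += 1
                heapq.heappush(open_list, (h(nxt), counter, nxt))
    return order


def progress_states_in_run(
    order: List[Hashable], h: Callable[[Hashable], float]
) -> List[Hashable]:
    """States after which only strictly lower h-values were expanded."""
    result = []
    max_later = float("-inf")  # highest h among states expanded later
    for state in reversed(order):
        if h(state) > max_later:
            result.append(state)
        max_later = max(max_later, h(state))
    result.reverse()
    return result


# Hypothetical toy task: edges and heuristic values are made up.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": ["g"], "g": []}
h_values = {"s": 4, "a": 2, "b": 3, "c": 5, "d": 1, "g": 0}

order = gbfs_expansions("s", "g", graph.__getitem__, h_values.__getitem__)
progress = progress_states_in_run(order, h_values.__getitem__)
```

In this run the search expands `s`, `a`, `b`, `d`. State `a` is not a progress state of the run because `b`, with a higher heuristic value, is expanded after it.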

    Best-First Width Search for Lifted Classical Planning

    Lifted planners are useful to solve tasks that are too hard to ground. Still, computing informative lifted heuristics is difficult: directly adapting ground heuristics to the lifted setting is often too expensive, and extracting heuristics from the lifted representation can be uninformative. A natural alternative for lifted planners is to use width-based search. These algorithms are among the strongest for ground planning, even the variants that do not access the action model. In this work, we adapt best-first width search to the lifted setting and show that this yields state-of-the-art performance for hard-to-ground planning tasks

    Homomorphisms of Lifted Planning Tasks: The Case for Delete-free Relaxation Heuristics
