280 research outputs found

    Conflict-driven learning in AI planning state-space search

    Get PDF
    Many combinatorial computation problems in computer science can be cast as a reachability problem in an implicitly described, potentially huge, graph: the state space. State-space search is a versatile and widespread method to solve such reachability problems, but it requires some form of guidance to prevent exploring that combinatorial space exhaustively. Conflict-driven learning is an indispensable search ingredient for solving constraint satisfaction problems (most prominently, Boolean satisfiability). It guides search towards solutions by identifying conflicts during the search, i.e., search branches not leading to any solution, learning from them knowledge to avoid similar conflicts in the remainder of the search. This thesis adapts the conflict-driven learning methodology to more general classes of reachability problems. Specifically, our work is placed in AI planning. We consider goal-reachability objectives in classical planning and in planning under uncertainty. The canonical form of "conflicts" in this context are dead-end states, i.e., states from which the desired goal property cannot be reached. We pioneer methods for learning sound and generalizable dead-end knowledge from conflicts encountered during forward state-space search. This embraces the following core contributions: When acting under uncertainty, the presence of dead-end states may make it impossible to satisfy the goal property with absolute certainty. The natural planning objective then is MaxProb, maximizing the probability of reaching the goal. However, algorithms for MaxProb probabilistic planning are severely underexplored. We close this gap by developing a large design space of probabilistic state-space search methods, contributing new search algorithms, admissible state-space reduction techniques, and goal-probability bounds suitable for heuristic state-space search. We systematically explore this design space through an extensive empirical evaluation. The key to our conflict-driven learning algorithm adaptation are unsolvability detectors, i.e., goal-reachability overapproximations. We design three complementary families of such unsolvability detectors, building upon known techniques: critical-path heuristics, linear-programming-based heuristics, and dead-end traps. We develop search methods to identify conflicts in deterministic and probabilistic state spaces, and we develop suitable refinement methods for the different unsolvability detectors so to recognize these states. Arranged in a depth-first search, our techniques approach the elegance of conflict-driven learning in constraint satisfaction, featuring the ability to learn to refute search subtrees, and intelligent backjumping to the root cause of a conflict. We provide a comprehensive experimental evaluation, demonstrating that the proposed techniques yield state-of-the-art performance for finding plans for solvable classical planning tasks, proving classical planning tasks unsolvable, and solving MaxProb in probabilistic planning, on benchmarks where dead-end states abound.Viele kombinatorisch komplexe Berechnungsprobleme in der Informatik lassen sich als Erreichbarkeitsprobleme in einem implizit dargestellten, potenziell riesigen, Graphen - dem Zustandsraum - verstehen. Die Zustandsraumsuche ist eine weit verbreitete Methode, um solche Erreichbarkeitsprobleme zu lösen. Die Effizienz dieser Methode hängt aber maßgeblich von der Verwendung strikter Suchkontrollmechanismen ab. Das konfliktgesteuerte Lernen ist eine essenzielle Suchkomponente für das Lösen von Constraint-Satisfaction-Problemen (wie dem Erfüllbarkeitsproblem der Aussagenlogik), welches von Konflikten, also Fehlern in der Suche, neue Kontrollregeln lernt, die ähnliche Konflikte zukünftig vermeiden. In dieser Arbeit erweitern wir die zugrundeliegende Methodik auf Zielerreichbarkeitsfragen, wie sie im klassischen und probabilistischen Planen, einem Teilbereich der Künstlichen Intelligenz, auftauchen. Die kanonische Form von „Konflikten“ in diesem Kontext sind sog. Sackgassen, Zustände, von denen aus die Zielbedingung nicht erreicht werden kann. Wir präsentieren Methoden, die es ermöglichen, während der Zustandsraumsuche von solchen Konflikten korrektes und verallgemeinerbares Wissen über Sackgassen zu erlernen. Unsere Arbeit umfasst folgende Beiträge: Wenn der Effekt des Handelns mit Unsicherheiten behaftet ist, dann kann die Existenz von Sackgassen dazu führen, dass die Zielbedingung nicht unter allen Umständen erfüllt werden kann. Die naheliegendste Planungsbedingung in diesem Fall ist MaxProb, das Maximieren der Wahrscheinlichkeit, dass die Zielbedingung erreicht wird. Planungsalgorithmen für MaxProb sind jedoch wenig erforscht. Um diese Lücke zu schließen, erstellen wir einen umfangreichen Bausatz für Suchmethoden in probabilistischen Zustandsräumen, und entwickeln dabei neue Suchalgorithmen, Zustandsraumreduktionsmethoden, und Abschätzungen der Zielerreichbarkeitswahrscheinlichkeit, wie sie für heuristische Suchalgorithmen gebraucht werden. Wir explorieren den resultierenden Gestaltungsraum systematisch in einer breit angelegten empirischen Studie. Die Grundlage unserer Adaption des konfliktgesteuerten Lernens bilden Unerreichbarkeitsdetektoren. Wir konzipieren drei Familien solcher Detektoren basierend auf bereits bekannten Techniken: Kritische-Pfad Heuristiken, Heuristiken basierend auf linearer Optimierung, und Sackgassen-Fallen. Wir entwickeln Suchmethoden, um Konflikte in deterministischen und probabilistischen Zustandsräumen zu erkennen, sowie Methoden, um die verschiedenen Unerreichbarkeitsdetektoren basierend auf den erkannten Konflikten zu verfeinern. Instanziiert als Tiefensuche weisen unsere Techniken ähnliche Eigenschaften auf wie das konfliktgesteuerte Lernen für Constraint-Satisfaction-Problemen. Wir evaluieren die entwickelten Methoden empirisch, und zeigen dabei, dass das konfliktgesteuerte Lernen unter gewissen Voraussetzungen zu signifikanten Suchreduktionen beim Finden von Plänen in lösbaren klassischen Planungsproblemen, Beweisen der Unlösbarkeit von klassischen Planungsproblemen, und Lösen von MaxProb im probabilistischen Planen, führen kann

    Taming Numbers and Durations in the Model Checking Integrated Planning System

    Full text link
    The Model Checking Integrated Planning System (MIPS) is a temporal least commitment heuristic search planner based on a flexible object-oriented workbench architecture. Its design clearly separates explicit and symbolic directed exploration algorithms from the set of on-line and off-line computed estimates and associated data structures. MIPS has shown distinguished performance in the last two international planning competitions. In the last event the description language was extended from pure propositional planning to include numerical state variables, action durations, and plan quality objective functions. Plans were no longer sequences of actions but time-stamped schedules. As a participant of the fully automated track of the competition, MIPS has proven to be a general system; in each track and every benchmark domain it efficiently computed plans of remarkable quality. This article introduces and analyzes the most important algorithmic novelties that were necessary to tackle the new layers of expressiveness in the benchmark problems and to achieve a high level of performance. The extensions include critical path analysis of sequentially generated plans to generate corresponding optimal parallel plans. The linear time algorithm to compute the parallel plan bypasses known NP hardness results for partial ordering by scheduling plans with respect to the set of actions and the imposed precedence relations. The efficiency of this algorithm also allows us to improve the exploration guidance: for each encountered planning state the corresponding approximate sequential plan is scheduled. One major strength of MIPS is its static analysis phase that grounds and simplifies parameterized predicates, functions and operators, that infers knowledge to minimize the state description length, and that detects domain object symmetries. The latter aspect is analyzed in detail. MIPS has been developed to serve as a complete and optimal state space planner, with admissible estimates, exploration engines and branching cuts. In the competition version, however, certain performance compromises had to be made, including floating point arithmetic, weighted heuristic search exploration according to an inadmissible estimate and parameterized optimization

    Star-topology decoupled state-space search in AI planning and model checking

    Get PDF
    State-space search is a widely employed concept in many areas of computer science. The well-known state explosion problem, however, imposes a severe limitation to the effective implementation of search in state spaces that are exponential in the size of a compact system description, which captures the state-transition semantics. Decoupled state-space search, decoupled search for short, is a novel approach to tackle the state explosion. It decomposes the system such that the dependencies between components take the form of a star topology with a center and several leaf components. Decoupled search exploits that the leaves in that topology are conditionally independent. Such independence naturally arises in many kinds of factored model representations, where the overall state space results from the product of several system components. In this work, we introduce decoupled search in the context of artificial intelligence planning and formal verification using model checking. Building on common formalisms, we develop the concept of the decoupled state space and prove its correctness with respect to capturing reachability of the underlying model exactly. This allows us to connect decoupled search to any search algorithm, and, important for planning, adapt any heuristic function to the decoupled state representation. Such heuristics then guide the search towards states that satisfy a desired goal condition. In model checking, we address the problems of verifying safety properties, which express system states that must never occur, and liveness properties, that must hold in any infinite system execution. Many approaches have been proposed in the past to tackle the state explosion problem. Most prominently partial-order reduction, symmetry breaking, Petri-net unfolding, and symbolic state representations. Like decoupled search, all of these are capable of exponentially reducing the search effort, either by pruning part of the state space (the former two), or by representing large state sets compactly (the latter two). For all these techniques, we prove that decoupled search can be exponentially more efficient, confirming that it is indeed a novel concept that exploits model properties in a unique way. Given such orthogonality, we combine decoupled search with several complementary methods. Empirically, we show that decoupled search favourably compares to state-of-the-art planners in common algorithmic planning problems using standard benchmarks. In model checking, decoupled search outperforms well-established tools, both in the context of the verification of safety and liveness properties.Die Zustandsraumsuche ist ein weit verbreitetes Konzept in vielen Bereichen der Informatik, deren effektive Anwendung jedoch durch das Problem der Zustandsexplosion deutlich erschwert wird. Die Zustandsexplosion ist dadurch charakterisiert dass kompakte Systemmodelle exponentiell große Zustandsräume beschreiben. Entkoppelte Zustandsraumsuche (entkoppelte Suche) beschreibt einen neuartigen Ansatz der Zustandsexplosion entgegenzuwirken indem die Struktur des Modells, insbesondere die bedingte Unabhängigkeit von Systemkomponenten in einer Sterntopologie, ausgenutzt wird. Diese Unabhängigkeit ergibt sich bei vielen faktorisierten Modellen deren Zustandsraum sich aus dem Produkt mehrerer Komponenten zusammensetzt. In dieser Arbeit wird die entkoppelte Suche in der Planung, als Teil der Künstlichen Intelligenz, und der Verifikation mittels Modellprüfung eingeführt. In etablierten Formalismen wird das Konzept des entkoppelten Zustandsraums entwickelt und dessen Korrektheit bezüglich der exakten Erfassung der Erreichbarkeit von Modellzuständen bewiesen. Dies ermöglicht die Kombination der entkoppelten Suche mit beliebigen Suchalgorithmen. Wichtig für die Planung ist zudem die Nutzung von Heuristiken, die die Suche zu Zuständen führen, die eine gewünschte Zielbedingung erfüllen, mit der entkoppelten Zustandsdarstellung. Im Teil zur Modellprüfung wird die Verifikation von Sicherheits- sowie Lebendigkeitseigenschaften betrachtet, die unerwünschte Zustände, bzw. Eigenschaften, die bei unendlicher Systemausführung gelten müssen, beschreiben. Es existieren diverse Ansätze um die Zustandsexplosion anzugehen. Am bekanntesten sind die Reduktion partieller Ordnung, Symmetriereduktion, Entfaltung von Petri-Netzen und symbolische Suche. Diese können, wie die entkoppelte Suche, den Suchaufwand exponentiell reduzieren. Dies geschieht durch Beschneidung eines Teils des Zustandsraums, oder durch die kompakte Darstellung großer Zustandsmengen. Für diese Verfahren wird bewiesen, dass die entkoppelte Suche exponentiell effizienter sein kann. Dies belegt dass es sich um ein neuartiges Konzept handelt, das sich auf eigene Art der Modelleigenschaften bedient. Auf Basis dieser Beobachtung werden, mit Ausnahme der Entfaltung, Kombinationen mit entkoppelter Suche entwickelt. Empirisch kann die entkoppelte Suche im Vergleich zu modernen Planern zu deutlichen Vorteilen führen. In der Modellprüfung werden, sowohl bei der Überprüfung von Sicherheit-, als auch Lebendigkeitseigenschaften, etablierte Programme übertroffen.Deutsche Forschungsgesellschaft; Star-Topology Decoupled State Space Searc

    New perspectives on cost partitioning for optimal classical planning

    Get PDF
    Admissible heuristics are the main ingredient when solving classical planning tasks optimally with heuristic search. There are many such heuristics, and each has its own strengths and weaknesses. As higher admissible heuristic values are more accurate, the maximum over several admissible heuristics dominates each individual one. Operator cost partitioning is a well-known technique to combine admissible heuristics in a way that dominates their maximum and remains admissible. But are there better options to combine the heuristics? We make three main contributions towards this question: Extensions to the cost partitioning framework can produce higher estimates from the same set of heuristics. Cost partitioning traditionally uses non-negative cost functions. We prove that this restriction is not necessary, and that allowing negative values as well makes the framework more powerful: the resulting heuristic values can be exponentially higher, and unsolvability can be detected even if all component heuristics have a finite value. We also generalize operator cost partitioning to transition cost partitioning, which can differentiate between different contexts in which an operator is used. Operator-counting heuristics reason about the number of times each operator is used in a plan. Many existing heuristics can be expressed in this framework, which gives new theoretical insight into their relationship. Different operator-counting heuristics can be easily combined within the framework in a way that dominates their maximum. Potential heuristics compute a heuristic value as a weighted sum over state features and are a fast alternative to operator-counting heuristics. Admissible and consistent potential heuristics for certain feature sets can be described in a compact way which means that the best heuristic from this class can be extracted in polynomial time. Both operator-counting and potential heuristics are closely related to cost partitioning. They offer a new look on cost-partitioned heuristics and already sparked research beyond their use as classical planning heuristics

    Explainable and Interpretable Decision-Making for Robotic Tasks

    Get PDF
    Future generations of robots, such as service robots that support humans with household tasks, will be a pervasive part of our daily lives. The human\u27s ability to understand the decision-making process of robots is thereby considered to be crucial for establishing trust-based and efficient interactions between humans and robots. In this thesis, we present several interpretable and explainable decision-making methods that aim to improve the human\u27s understanding of a robot\u27s actions, with a particular focus on the explanation of why robot failures were committed.In this thesis, we consider different types of failures, such as task recognition errors and task execution failures. Our first goal is an interpretable approach to learning from human demonstrations (LfD), which is essential for robots to learn new tasks without the time-consuming trial-and-error learning process. Our proposed method deals with the challenge of transferring human demonstrations to robots by an automated generation of symbolic planning operators based on interpretable decision trees. Our second goal is the prediction, explanation, and prevention of robot task execution failures based on causal models of the environment. Our contribution towards the second goal is a causal-based method that finds contrastive explanations for robot execution failures, which enables robots to predict, explain and prevent even timely shifted action failures (e.g., the current action was successful but will negatively affect the success of future actions). Since learning causal models is data-intensive, our final goal is to improve the data efficiency by utilizing prior experience. This investigation aims to help robots learn causal models faster, enabling them to provide failure explanations at the cost of fewer action execution experiments.In the future, we will work on scaling up the presented methods to generalize to more complex, human-centered applications

    Towards full-scale autonomy for multi-vehicle systems planning and acting in extreme environments

    Get PDF
    Currently, robotic technology offers flexible platforms for addressing many challenging problems that arise in extreme environments. These problems’ nature enhances the use of heterogeneous multi-vehicle systems which can coordinate and collaborate to achieve a common set of goals. While such applications have previously been explored in limited contexts, long-term deployments in such settings often require an advanced level of autonomy to maintain operability. The success of planning and acting approaches for multi-robot systems are conditioned by including reasoning regarding temporal, resource and knowledge requirements, and world dynamics. Automated planning provides the tools to enable intelligent behaviours in robotic systems. However, whilst many planning approaches and plan execution techniques have been proposed, these solutions highlight an inability to consistently build and execute high-quality plans. Motivated by these challenges, this thesis presents developments advancing state-of-the-art temporal planning and acting to address multi-robot problems. We propose a set of advanced techniques, methods and tools to build a high-level temporal planning and execution system that can devise, execute and monitor plans suitable for long-term missions in extreme environments. We introduce a new task allocation strategy, called HRTA, that optimises the task distribution amongst the heterogeneous fleet, relaxes the planning problem and boosts the plan search. We implement the TraCE planner that enforces contingent planning considering propositional temporal and numeric constraints to deal with partial observability about the initial state. Our developments regarding robust plan execution and mission adaptability include the HLMA, which efficiently optimises the task allocation and refines the planning model considering the experience from robots’ previous mission executions. We introduce the SEA failure solver that, combined with online planning, overcomes unexpected situations during mission execution, deals with joint goals implementation, and enhances mission operability in long-term deployments. Finally, we demonstrate the efficiency of our approaches with a series of experiments using a new set of real-world planning domains.Engineering and Physical Sciences Research Council (EPSRC) grant EP/R026173/
    corecore