25 research outputs found

    Stochastic Planning with Lifted Symbolic Trajectory Optimization

    Get PDF
    This paper investigates online stochastic planning for problems with large factored state and action spaces. One promising approach in recent work estimates the quality of applicable actions in the current state through aggregate simulation from the states they reach. This leads to significant speedup, compared to search over concrete states and actions, and suffices to guide decision making in cases where the performance of a random policy is informative of the quality of a state. The paper makes two significant improvements to this approach. The first, taking inspiration from lifted belief propagation, exploits the structure of the problem to derive a more compact computation graph for aggregate simulation. The second improvement replaces the random policy embedded in the computation graph with symbolic variables that are optimized simultaneously with the search for high quality actions. This expands the scope of the approach to problems that require deep search and where information is lost quickly with random steps. An empirical evaluation shows that these ideas significantly improve performance, leading to state of the art performance on hard planning problems

    GENERATING PLANS IN CONCURRENT, PROBABILISTIC, OVER-SUBSCRIBED DOMAINS

    Get PDF
    Planning in realistic domains typically involves reasoning under uncertainty, operating under time and resource constraints, and finding the optimal subset of goals to work on. Creating optimal plans that consider all of these features is a computationally complex, challenging problem. This dissertation develops an AO* search based planner named CPOAO* (Concurrent, Probabilistic, Over-subscription AO*) which incorporates durative actions, time and resource constraints, concurrent execution, over-subscribed goals, and probabilistic actions. To handle concurrent actions, action combinations rather than individual actions are taken as plan steps. Plan optimization is explored by adding two novel aspects to plans. First, parallel steps that serve the same goal are used to increase the plan’s probability of success. Traditionally, only parallel steps that serve different goals are used to reduce plan execution time. Second, actions that are executing but are no longer useful can be terminated to save resources and time. Conventional planners assume that all actions that were started will be carried out to completion. To reduce the size of the search space, several domain independent heuristic functions and pruning techniques were developed. The key ideas are to exploit dominance relations for candidate action sets and to develop relaxed planning graphs to estimate the expected rewards of states. This thesis contributes (1) an AO* based planner to generate parallel plans, (2) domain independent heuristics to increase planner efficiency, and (3) the ability to execute redundant actions and to terminate useless actions to increase plan efficiency

    Conflict-driven learning in AI planning state-space search

    Get PDF
    Many combinatorial computation problems in computer science can be cast as a reachability problem in an implicitly described, potentially huge, graph: the state space. State-space search is a versatile and widespread method to solve such reachability problems, but it requires some form of guidance to prevent exploring that combinatorial space exhaustively. Conflict-driven learning is an indispensable search ingredient for solving constraint satisfaction problems (most prominently, Boolean satisfiability). It guides search towards solutions by identifying conflicts during the search, i.e., search branches not leading to any solution, learning from them knowledge to avoid similar conflicts in the remainder of the search. This thesis adapts the conflict-driven learning methodology to more general classes of reachability problems. Specifically, our work is placed in AI planning. We consider goal-reachability objectives in classical planning and in planning under uncertainty. The canonical form of "conflicts" in this context are dead-end states, i.e., states from which the desired goal property cannot be reached. We pioneer methods for learning sound and generalizable dead-end knowledge from conflicts encountered during forward state-space search. This embraces the following core contributions: When acting under uncertainty, the presence of dead-end states may make it impossible to satisfy the goal property with absolute certainty. The natural planning objective then is MaxProb, maximizing the probability of reaching the goal. However, algorithms for MaxProb probabilistic planning are severely underexplored. We close this gap by developing a large design space of probabilistic state-space search methods, contributing new search algorithms, admissible state-space reduction techniques, and goal-probability bounds suitable for heuristic state-space search. We systematically explore this design space through an extensive empirical evaluation. The key to our conflict-driven learning algorithm adaptation are unsolvability detectors, i.e., goal-reachability overapproximations. We design three complementary families of such unsolvability detectors, building upon known techniques: critical-path heuristics, linear-programming-based heuristics, and dead-end traps. We develop search methods to identify conflicts in deterministic and probabilistic state spaces, and we develop suitable refinement methods for the different unsolvability detectors so to recognize these states. Arranged in a depth-first search, our techniques approach the elegance of conflict-driven learning in constraint satisfaction, featuring the ability to learn to refute search subtrees, and intelligent backjumping to the root cause of a conflict. We provide a comprehensive experimental evaluation, demonstrating that the proposed techniques yield state-of-the-art performance for finding plans for solvable classical planning tasks, proving classical planning tasks unsolvable, and solving MaxProb in probabilistic planning, on benchmarks where dead-end states abound.Viele kombinatorisch komplexe Berechnungsprobleme in der Informatik lassen sich als Erreichbarkeitsprobleme in einem implizit dargestellten, potenziell riesigen, Graphen - dem Zustandsraum - verstehen. Die Zustandsraumsuche ist eine weit verbreitete Methode, um solche Erreichbarkeitsprobleme zu lösen. Die Effizienz dieser Methode hängt aber maßgeblich von der Verwendung strikter Suchkontrollmechanismen ab. Das konfliktgesteuerte Lernen ist eine essenzielle Suchkomponente für das Lösen von Constraint-Satisfaction-Problemen (wie dem Erfüllbarkeitsproblem der Aussagenlogik), welches von Konflikten, also Fehlern in der Suche, neue Kontrollregeln lernt, die ähnliche Konflikte zukünftig vermeiden. In dieser Arbeit erweitern wir die zugrundeliegende Methodik auf Zielerreichbarkeitsfragen, wie sie im klassischen und probabilistischen Planen, einem Teilbereich der Künstlichen Intelligenz, auftauchen. Die kanonische Form von „Konflikten“ in diesem Kontext sind sog. Sackgassen, Zustände, von denen aus die Zielbedingung nicht erreicht werden kann. Wir präsentieren Methoden, die es ermöglichen, während der Zustandsraumsuche von solchen Konflikten korrektes und verallgemeinerbares Wissen über Sackgassen zu erlernen. Unsere Arbeit umfasst folgende Beiträge: Wenn der Effekt des Handelns mit Unsicherheiten behaftet ist, dann kann die Existenz von Sackgassen dazu führen, dass die Zielbedingung nicht unter allen Umständen erfüllt werden kann. Die naheliegendste Planungsbedingung in diesem Fall ist MaxProb, das Maximieren der Wahrscheinlichkeit, dass die Zielbedingung erreicht wird. Planungsalgorithmen für MaxProb sind jedoch wenig erforscht. Um diese Lücke zu schließen, erstellen wir einen umfangreichen Bausatz für Suchmethoden in probabilistischen Zustandsräumen, und entwickeln dabei neue Suchalgorithmen, Zustandsraumreduktionsmethoden, und Abschätzungen der Zielerreichbarkeitswahrscheinlichkeit, wie sie für heuristische Suchalgorithmen gebraucht werden. Wir explorieren den resultierenden Gestaltungsraum systematisch in einer breit angelegten empirischen Studie. Die Grundlage unserer Adaption des konfliktgesteuerten Lernens bilden Unerreichbarkeitsdetektoren. Wir konzipieren drei Familien solcher Detektoren basierend auf bereits bekannten Techniken: Kritische-Pfad Heuristiken, Heuristiken basierend auf linearer Optimierung, und Sackgassen-Fallen. Wir entwickeln Suchmethoden, um Konflikte in deterministischen und probabilistischen Zustandsräumen zu erkennen, sowie Methoden, um die verschiedenen Unerreichbarkeitsdetektoren basierend auf den erkannten Konflikten zu verfeinern. Instanziiert als Tiefensuche weisen unsere Techniken ähnliche Eigenschaften auf wie das konfliktgesteuerte Lernen für Constraint-Satisfaction-Problemen. Wir evaluieren die entwickelten Methoden empirisch, und zeigen dabei, dass das konfliktgesteuerte Lernen unter gewissen Voraussetzungen zu signifikanten Suchreduktionen beim Finden von Plänen in lösbaren klassischen Planungsproblemen, Beweisen der Unlösbarkeit von klassischen Planungsproblemen, und Lösen von MaxProb im probabilistischen Planen, führen kann

    Artificial general intelligence: Proceedings of the Second Conference on Artificial General Intelligence, AGI 2009, Arlington, Virginia, USA, March 6-9, 2009

    Get PDF
    Artificial General Intelligence (AGI) research focuses on the original and ultimate goal of AI – to create broad human-like and transhuman intelligence, by exploring all available paths, including theoretical and experimental computer science, cognitive science, neuroscience, and innovative interdisciplinary methodologies. Due to the difficulty of this task, for the last few decades the majority of AI researchers have focused on what has been called narrow AI – the production of AI systems displaying intelligence regarding specific, highly constrained tasks. In recent years, however, more and more researchers have recognized the necessity – and feasibility – of returning to the original goals of the field. Increasingly, there is a call for a transition back to confronting the more difficult issues of human level intelligence and more broadly artificial general intelligence

    Exploiting Contextual Independence In Probabilistic Inference

    Full text link
    Bayesian belief networks have grown to prominence because they provide compact representations for many problems for which probabilistic inference is appropriate, and there are algorithms to exploit this compactness. The next step is to allow compact representations of the conditional probabilities of a variable given its parents. In this paper we present such a representation that exploits contextual independence in terms of parent contexts; which variables act as parents may depend on the value of other variables. The internal representation is in terms of contextual factors (confactors) that is simply a pair of a context and a table. The algorithm, contextual variable elimination, is based on the standard variable elimination algorithm that eliminates the non-query variables in turn, but when eliminating a variable, the tables that need to be multiplied can depend on the context. This algorithm reduces to standard variable elimination when there is no contextual independence structure to exploit. We show how this can be much more efficient than variable elimination when there is structure to exploit. We explain why this new method can exploit more structure than previous methods for structured belief network inference and an analogous algorithm that uses trees

    Differentiable world programs

    Full text link
    L'intelligence artificielle (IA) moderne a ouvert de nouvelles perspectives prometteuses pour la création de robots intelligents. En particulier, les architectures d'apprentissage basées sur le gradient (réseaux neuronaux profonds) ont considérablement amélioré la compréhension des scènes 3D en termes de perception, de raisonnement et d'action. Cependant, ces progrès ont affaibli l'attrait de nombreuses techniques ``classiques'' développées au cours des dernières décennies. Nous postulons qu'un mélange de méthodes ``classiques'' et ``apprises'' est la voie la plus prometteuse pour développer des modèles du monde flexibles, interprétables et exploitables : une nécessité pour les agents intelligents incorporés. La question centrale de cette thèse est : ``Quelle est la manière idéale de combiner les techniques classiques avec des architectures d'apprentissage basées sur le gradient pour une compréhension riche du monde 3D ?''. Cette vision ouvre la voie à une multitude d'applications qui ont un impact fondamental sur la façon dont les agents physiques perçoivent et interagissent avec leur environnement. Cette thèse, appelée ``programmes différentiables pour modèler l'environnement'', unifie les efforts de plusieurs domaines étroitement liés mais actuellement disjoints, notamment la robotique, la vision par ordinateur, l'infographie et l'IA. Ma première contribution---gradSLAM--- est un système de localisation et de cartographie simultanées (SLAM) dense et entièrement différentiable. En permettant le calcul du gradient à travers des composants autrement non différentiables tels que l'optimisation non linéaire par moindres carrés, le raycasting, l'odométrie visuelle et la cartographie dense, gradSLAM ouvre de nouvelles voies pour intégrer la reconstruction 3D classique et l'apprentissage profond. Ma deuxième contribution - taskography - propose une sparsification conditionnée par la tâche de grandes scènes 3D encodées sous forme de graphes de scènes 3D. Cela permet aux planificateurs classiques d'égaler (et de surpasser) les planificateurs de pointe basés sur l'apprentissage en concentrant le calcul sur les attributs de la scène pertinents pour la tâche. Ma troisième et dernière contribution---gradSim--- est un simulateur entièrement différentiable qui combine des moteurs physiques et graphiques différentiables pour permettre l'estimation des paramètres physiques et le contrôle visuomoteur, uniquement à partir de vidéos ou d'une image fixe.Modern artificial intelligence (AI) has created exciting new opportunities for building intelligent robots. In particular, gradient-based learning architectures (deep neural networks) have tremendously improved 3D scene understanding in terms of perception, reasoning, and action. However, these advancements have undermined many ``classical'' techniques developed over the last few decades. We postulate that a blend of ``classical'' and ``learned'' methods is the most promising path to developing flexible, interpretable, and actionable models of the world: a necessity for intelligent embodied agents. ``What is the ideal way to combine classical techniques with gradient-based learning architectures for a rich understanding of the 3D world?'' is the central question in this dissertation. This understanding enables a multitude of applications that fundamentally impact how embodied agents perceive and interact with their environment. This dissertation, dubbed ``differentiable world programs'', unifies efforts from multiple closely-related but currently-disjoint fields including robotics, computer vision, computer graphics, and AI. Our first contribution---gradSLAM---is a fully differentiable dense simultaneous localization and mapping (SLAM) system. By enabling gradient computation through otherwise non-differentiable components such as nonlinear least squares optimization, ray casting, visual odometry, and dense mapping, gradSLAM opens up new avenues for integrating classical 3D reconstruction and deep learning. Our second contribution---taskography---proposes a task-conditioned sparsification of large 3D scenes encoded as 3D scene graphs. This enables classical planners to match (and surpass) state-of-the-art learning-based planners by focusing computation on task-relevant scene attributes. Our third and final contribution---gradSim---is a fully differentiable simulator that composes differentiable physics and graphics engines to enable physical parameter estimation and visuomotor control, solely from videos or a still image
    corecore