229,556 research outputs found

    Efficient Open World Reasoning for Planning

    Full text link
    We consider the problem of reasoning and planning with incomplete knowledge and deterministic actions. We introduce a knowledge representation scheme called PSIPLAN that can effectively represent incompleteness of an agent's knowledge while allowing for sound, complete and tractable entailment in domains where the set of all objects is either unknown or infinite. We present a procedure for state update resulting from taking an action in PSIPLAN that is correct, complete and has only polynomial complexity. State update is performed without considering the set of all possible worlds corresponding to the knowledge state. As a result, planning with PSIPLAN is done without direct manipulation of possible worlds. The PSIPLAN representation underlies the PSIPOP planning algorithm, which handles quantified goals with or without exceptions, a capability that no other domain-independent planner has been shown to achieve. PSIPLAN has been implemented in Common Lisp and used in an application on planning in a collaborative interface. Comment: 39 pages, 13 figures. To appear in Logical Methods in Computer Science.
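    To make the open-world idea concrete, here is a minimal toy sketch, not the paper's PSIPLAN formalism: a universally quantified negative fact is stored with a finite exception set, so entailment and state update reduce to set operations instead of enumerating possible worlds. All names (PsiForm, after_adding, the in_box predicate) are illustrative assumptions.

```python
# Toy illustration (not the paper's PSIPLAN): represent "no object is in
# the box, except possibly these" without enumerating possible worlds.

from dataclasses import dataclass, field


@dataclass
class PsiForm:
    """Universally quantified negative fact with exceptions:
    'for all x: not pred(x)', except for the objects in `exceptions`."""
    pred: str
    exceptions: frozenset = field(default_factory=frozenset)

    def entails(self, other: "PsiForm") -> bool:
        # self entails other iff both constrain the same predicate and
        # self allows no more exceptions than other does.
        return self.pred == other.pred and self.exceptions <= other.exceptions

    def after_adding(self, obj: str) -> "PsiForm":
        # State update for an action that may make pred(obj) true:
        # obj becomes an exception; everything else stays known-false.
        return PsiForm(self.pred, self.exceptions | {obj})


if __name__ == "__main__":
    nothing_in_box = PsiForm("in_box")                    # forall x: not in_box(x)
    state = nothing_in_box.after_adding("cup")            # after put(cup, box)
    goal = PsiForm("in_box", frozenset({"cup", "pen"}))   # cup and pen may be exceptions
    print(state.entails(goal))             # True: only 'cup' can violate the goal's pattern
    print(state.entails(nothing_in_box))   # False: the cup might now be in the box
```

    Both checks run in time polynomial in the number of exceptions, which is the flavor of tractability the abstract claims, though the real representation is considerably richer.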

    ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

    Full text link
    For robots to perform a wide variety of tasks, they require a 3D representation of the world that is semantically rich, yet compact and efficient for task-driven perception and planning. Recent approaches have attempted to leverage features from large vision-language models to encode semantics in 3D representations. However, these approaches tend to produce maps with per-point feature vectors, which do not scale well in larger environments, nor do they contain semantic spatial relationships between entities in the environment, which are useful for downstream planning. In this work, we propose ConceptGraphs, an open-vocabulary graph-structured representation for 3D scenes. ConceptGraphs is built by leveraging 2D foundation models and fusing their output to 3D by multi-view association. The resulting representations generalize to novel semantic classes, without the need to collect large 3D datasets or finetune models. We demonstrate the utility of this representation through a number of downstream planning tasks that are specified through abstract (language) prompts and require complex reasoning over spatial and semantic concepts. (Project page: https://concept-graphs.github.io/ Explainer video: https://youtu.be/mRhNkQwRYnc ) Comment: Project page: https://concept-graphs.github.io/ Explainer video: https://youtu.be/mRhNkQwRYnc
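    As a rough illustration of a graph-structured, open-vocabulary scene representation (not the ConceptGraphs implementation), the sketch below stores a per-object embedding and spatial relations, and answers a language query by embedding similarity. The embeddings, object names, and relation labels are made up for the example.

```python
# Minimal sketch of an object-level 3D scene graph with open-vocabulary
# queries; real systems obtain the embeddings by fusing 2D foundation-model
# detections across views, which is omitted here.

import math
from dataclasses import dataclass


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class ObjectNode:
    name: str          # label kept only for readability; queries use embeddings
    centroid: tuple    # (x, y, z) in the map frame
    embedding: list    # pooled vision-language feature for this object


class SceneGraph:
    def __init__(self):
        self.nodes = []
        self.edges = []    # (object, relation, object), e.g. ("mug", "on", "table")

    def add(self, node):
        self.nodes.append(node)

    def relate(self, a, rel, b):
        self.edges.append((a, rel, b))

    def query(self, text_embedding, k=1):
        """Return the k objects whose embeddings best match a text query."""
        ranked = sorted(self.nodes,
                        key=lambda n: cosine(n.embedding, text_embedding),
                        reverse=True)
        return ranked[:k]


if __name__ == "__main__":
    g = SceneGraph()
    g.add(ObjectNode("mug", (1.0, 0.2, 0.8), [0.9, 0.1, 0.0]))
    g.add(ObjectNode("table", (1.0, 0.0, 0.0), [0.1, 0.8, 0.2]))
    g.relate("mug", "on", "table")             # spatial relation usable by a planner
    print(g.query([0.8, 0.2, 0.1])[0].name)    # "mug" (closest in embedding space)
```

    The point of the edges is the abstract's second claim: a planner can read off relations such as "mug on table" directly, rather than re-deriving them from a dense per-point feature map.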

    AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

    Full text link
    We propose a novel framework for learning high-level cognitive capabilities in robot manipulation tasks, such as making a smiley face using building blocks. These tasks often involve complex multi-step reasoning, presenting significant challenges due to the limited paired data connecting human instructions (e.g., making a smiley face) and robot actions (e.g., end-effector movement). Existing approaches alleviate this challenge by adopting an open-loop paradigm that decomposes high-level instructions into simple sub-task plans and executes them step-by-step using low-level control models. However, these approaches lack instantaneous observations during multi-step reasoning, leading to sub-optimal results. To address this issue, we propose to automatically collect a cognitive robot dataset with Large Language Models (LLMs). The resulting dataset, AlphaBlock, consists of 35 comprehensive high-level tasks with multi-step text plans and paired observation sequences. To enable efficient data acquisition, we employ carefully designed multi-round prompts that effectively reduce the burden of extensive human involvement. We further propose a closed-loop multi-modal embodied planning model that autoregressively generates plans by taking image observations as input. To facilitate effective learning, we leverage MiniGPT-4 with a frozen visual encoder and LLM, and finetune an additional vision adapter and Q-former to enable fine-grained spatial perception for manipulation tasks. Experiments verify the superiority over existing open- and closed-loop methods, achieving significant success-rate increases of 21.4% and 14.5% over ChatGPT- and GPT-4-based baselines, respectively. Real-world demos are shown at https://www.youtube.com/watch?v=ayAzID1_qQk
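    The closed-loop idea the abstract contrasts with open-loop execution can be sketched as a loop that re-queries a planner with the latest observation before every sub-step. The sketch below is a stand-in under stated assumptions: propose_next_step and execute are stubs over a scripted task, not AlphaBlock's finetuned model or controller.

```python
# Hedged sketch of closed-loop planning: re-plan each sub-step from the latest
# observation instead of committing to a fixed plan up front.

def propose_next_step(instruction, observation, history):
    """Placeholder for a learned multimodal planner: maps (instruction, image,
    history) to the next sub-task string, or None when the task looks done."""
    script = ["pick yellow block", "place at (0, 1)",
              "pick red block", "place at (1, 1)"]
    return script[len(history)] if len(history) < len(script) else None


def execute(sub_task):
    """Placeholder for a low-level controller; returns the new observation."""
    print("executing:", sub_task)
    return {"last_action": sub_task}           # stand-in for a camera image


def closed_loop_plan(instruction, observation, max_steps=10):
    history = []
    for _ in range(max_steps):
        step = propose_next_step(instruction, observation, history)
        if step is None:
            break                              # planner judges the goal reached
        observation = execute(step)            # fresh observation feeds the next decision
        history.append(step)
    return history


if __name__ == "__main__":
    print(closed_loop_plan("make a smiley face", observation={"last_action": None}))
```

    In an open-loop variant the whole script would be fixed before execution; the difference here is only that each decision sees the observation produced by the previous step.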

    Task planning using physics-based heuristics on manipulation actions

    Get PDF
    In order to solve mobile manipulation problems, an efficient combination of task and motion planning is usually required. Moreover, physics-based information has recently been incorporated in order to plan tasks in a more realistic way. In the present paper, a task and motion planning framework is proposed based on a modified version of the Fast-Forward task planner that is guided by physics-based knowledge. The proposal uses manipulation knowledge for reasoning on symbolic literals (both in offline and online modes), taking geometric information into account in order to evaluate the applicability and feasibility of actions while computing the heuristic cost. This results in an efficient search of the state space and in low-cost, physically feasible plans. The proposal has been implemented and is illustrated with a manipulation problem consisting of a mobile robot and some fixed and manipulable objects.
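    The following is a hedged sketch of the general idea of folding a physics/geometry feasibility test into a heuristic forward search. A crude goal-count heuristic stands in for FF's relaxed-plan heuristic, and the feasibility check, operators, and literals are invented for illustration; none of this is the paper's planner.

```python
# Greedy best-first search over symbolic states where physically infeasible
# actions are pruned before they enter the search frontier.

import heapq


def feasible(action, state):
    """Stub for a physics/geometry check, e.g. 'is the object reachable and pushable?'"""
    return action != "push(heavy_crate)"       # pretend this action always fails the check


def goal_count(state, goal):
    return len(goal - state)                   # number of unmet goal literals


def successors(state):
    # (action name, resulting symbolic state); a real planner derives these
    # from operator schemas rather than hand-written branches.
    if "at(robot,table)" not in state:
        yield "move(table)", state | {"at(robot,table)"}
    elif "holding(cup)" not in state:
        yield "pick(cup)", state | {"holding(cup)"}
        yield "push(heavy_crate)", state | {"moved(crate)"}
    else:
        yield "place(cup,shelf)", state | {"on(cup,shelf)"}


def plan(start, goal):
    frontier = [(goal_count(start, goal), 0, start, [])]
    seen = {start}
    counter = 1
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if goal <= state:
            return path
        for action, nxt in successors(state):
            if nxt in seen or not feasible(action, state):
                continue                       # infeasible actions never enter the search
            seen.add(nxt)
            heapq.heappush(frontier, (goal_count(nxt, goal), counter, nxt, path + [action]))
            counter += 1
    return None


if __name__ == "__main__":
    print(plan(frozenset(), frozenset({"on(cup,shelf)"})))
    # ['move(table)', 'pick(cup)', 'place(cup,shelf)']
```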

    Contingent task and motion planning under uncertainty for human–robot interactions

    Get PDF
    Manipulation planning under incomplete information is a highly challenging task for mobile manipulators. Uncertainty can be resolved by robot perception modules or by using human knowledge during execution. Human operators can also collaborate with robots in executing some difficult actions, or act as helpers sharing task knowledge. In this scope, a contingency-based task and motion planning approach is proposed that takes into account robot uncertainty and human–robot interactions, resulting in a tree-shaped set of geometrically feasible plans. Different sorts of geometric reasoning processes are embedded inside the planner to cope with task constraints, such as detecting occluding objects when a robot needs to grasp an object. The proposal has been evaluated with different challenging scenarios in simulation and in a real environment.
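    The tree-shaped plans mentioned above can be pictured with a small toy structure in which each sensing action branches on its possible outcomes and execution follows the branch that is actually observed. The node type, outcomes, and actions below are illustrative assumptions, not the proposed planner.

```python
# Toy contingent plan tree: sensing actions branch per outcome; execution
# resolves the uncertainty at run time by following the observed branch.

from dataclasses import dataclass, field


@dataclass
class PlanNode:
    action: str
    branches: dict = field(default_factory=dict)   # observation outcome -> PlanNode


def execute(node, observe):
    """Walk the plan tree; `observe(action)` returns the sensed outcome."""
    trace = []
    while node is not None:
        trace.append(node.action)
        if not node.branches:
            node = None                             # leaf: this contingency is finished
        else:
            outcome = observe(node.action)          # e.g. perception or asking the human
            node = node.branches[outcome]
    return trace


if __name__ == "__main__":
    # "Look into the shelf; if an occluding box is seen, ask the human to
    # remove it before grasping, otherwise grasp directly."
    plan = PlanNode("look(shelf)", {
        "occluded": PlanNode("ask_human(remove box)", {"done": PlanNode("grasp(cup)")}),
        "clear": PlanNode("grasp(cup)"),
    })
    print(execute(plan, observe=lambda a: {"look(shelf)": "occluded",
                                           "ask_human(remove box)": "done"}[a]))
```

    Each branch of such a tree only needs to be geometrically feasible under the assumption that its triggering observation occurs, which is what keeps contingent planning tractable compared with planning for every possible world at once.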

    Learning and Reasoning for Robot Sequential Decision Making under Uncertainty

    Full text link
    Robots frequently face complex tasks that require more than one action, where sequential decision-making (SDM) capabilities become necessary. The key contribution of this work is a robot SDM framework, called LCORPP, that supports the simultaneous capabilities of supervised learning for passive state estimation, automated reasoning with declarative human knowledge, and planning under uncertainty toward achieving long-term goals. In particular, we use a hybrid reasoning paradigm to refine the state estimator and provide informative priors for the probabilistic planner. In experiments, a mobile robot is tasked with estimating human intentions using their motion trajectories, declarative contextual knowledge, and human-robot interaction (dialog-based and motion-based). Results suggest that, in efficiency and accuracy, our framework performs better than its no-learning and no-reasoning counterparts in an office environment. Comment: In Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020.
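    A toy-scale sketch of the learn-reason-plan pipeline described above, assuming stand-in components throughout: a stub "learned" likelihood, a stub declarative prior, a Bayes combination, and a confidence threshold in place of a real planner under uncertainty. It is meant only to show how the three pieces fit together, not LCORPP itself.

```python
# Learning supplies a likelihood, declarative knowledge supplies a prior,
# Bayes combines them, and a (very crude) planner decides whether to act
# or to gather more information.

def learned_likelihood(trajectory):
    """Stand-in for a supervised state estimator: P(observation | intention)."""
    heading_to_door = trajectory[-1][0] > trajectory[0][0]
    return {"leaving": 0.8, "staying": 0.2} if heading_to_door else \
           {"leaving": 0.3, "staying": 0.7}


def contextual_prior(context):
    """Stand-in for declarative knowledge, e.g. 'near the end of office hours,
    people are more likely to be leaving'."""
    return {"leaving": 0.7, "staying": 0.3} if context["after_5pm"] else \
           {"leaving": 0.4, "staying": 0.6}


def posterior(prior, likelihood):
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}


def choose_action(belief, confidence=0.85):
    """Crude stand-in for planning under uncertainty: act when confident,
    otherwise gather more information (e.g. start a dialog)."""
    intention, p = max(belief.items(), key=lambda kv: kv[1])
    return f"assist({intention})" if p >= confidence else "ask_human()"


if __name__ == "__main__":
    traj = [(0.0, 0.0), (0.5, 0.1), (1.2, 0.2)]          # moving toward the door
    belief = posterior(contextual_prior({"after_5pm": True}),
                       learned_likelihood(traj))
    print(belief, "->", choose_action(belief))
```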

    Physics-based Motion Planning with Temporal Logic Specifications

    Get PDF
    One of the main foci of robotics nowadays is providing robots with a great degree of autonomy. A fundamental step in this direction is to give them the ability to plan in discrete and continuous spaces to find the motions required to complete a complex task. In this line, some recent approaches describe tasks with Linear Temporal Logic (LTL) and reason on discrete actions to guide sampling-based motion planning, with the aim of finding dynamically-feasible motions that satisfy the temporal-logic task specifications. The present paper proposes an LTL planning approach that, on the one hand, is enhanced with ontologies to describe and reason about the task, and, on the other hand, includes physics-based motion planning to allow the purposeful manipulation of objects. The proposal has been implemented and is illustrated with didactic examples with a mobile robot in simple scenarios where some of the goals are occupied by objects that must be removed in order to fulfill the task. Comment: The 20th World Congress of the International Federation of Automatic Control (IFAC), 9-14 July 2017.
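    The discrete side of such approaches can be illustrated with a tiny hand-written automaton searched jointly with a region graph. The regions, automaton, and task ("visit A, then B") below are assumptions for the example; real systems compile the LTL formula to an automaton automatically and use the resulting discrete plan to guide physics-aware, sampling-based motion planning, which is omitted here.

```python
# Toy product search: each search node pairs a workspace region with an
# automaton state, and the automaton tracks progress toward the LTL-like task
# "eventually reach A, then eventually reach B".

from collections import deque

# Region connectivity of a toy workspace (which regions the robot can move between).
REGIONS = {"start": ["corridor"], "corridor": ["start", "A", "B"],
           "A": ["corridor"], "B": ["corridor"]}


def progress(q, region):
    """Hand-written automaton: q0 --reach A--> q1 --reach B--> accept."""
    if q == "q0" and region == "A":
        return "q1"
    if q == "q1" and region == "B":
        return "accept"
    return q


def plan(start_region="start"):
    """BFS over (region, automaton state) pairs until the task is satisfied."""
    queue = deque([(start_region, "q0", [start_region])])
    seen = {(start_region, "q0")}
    while queue:
        region, q, path = queue.popleft()
        if q == "accept":
            return path
        for nxt in REGIONS[region]:
            nq = progress(q, nxt)
            if (nxt, nq) not in seen:
                seen.add((nxt, nq))
                queue.append((nxt, nq, path + [nxt]))
    return None


if __name__ == "__main__":
    print(plan())   # ['start', 'corridor', 'A', 'corridor', 'B']
```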