Efficient Open World Reasoning for Planning
We consider the problem of reasoning and planning with incomplete knowledge
and deterministic actions. We introduce a knowledge representation scheme
called PSIPLAN that can effectively represent incompleteness of an agent's
knowledge while allowing for sound, complete and tractable entailment in
domains where the set of all objects is either unknown or infinite. We present
a procedure for state update resulting from taking an action in PSIPLAN that is
correct, complete and has only polynomial complexity. State update is performed
without considering the set of all possible worlds corresponding to the
knowledge state. As a result, planning with PSIPLAN is done without direct
manipulation of possible worlds. The PSIPLAN representation underlies the
PSIPOP planning algorithm, which handles quantified goals with or without
exceptions, a capability that no other domain-independent planner has been
shown to achieve. PSIPLAN has
been implemented in Common Lisp and used in an application on planning in a
collaborative interface.Comment: 39 pages, 13 figures. to appear in Logical Methods in Computer
Scienc
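To make the flavor of the representation concrete, below is a minimal Python sketch of a PSIPLAN-style knowledge state, assuming a simplified reading of the abstract: known ground literals plus quantified negative statements with explicit exception lists, updated directly rather than by enumerating possible worlds. All names here (KnowledgeState, QuantifiedNegation, entails_clear) are illustrative assumptions, not the paper's actual interface.

from dataclasses import dataclass, field

@dataclass
class QuantifiedNegation:
    """'No object satisfies `predicate` at `place`, except those listed.'"""
    predicate: str
    place: str
    exceptions: set = field(default_factory=set)

@dataclass
class KnowledgeState:
    known: set = field(default_factory=set)        # ground literals known true
    negations: list = field(default_factory=list)  # quantified negative knowledge

    def entails_clear(self, predicate, place, allowed_exceptions):
        # Entails "nothing satisfies predicate at place, except possibly
        # allowed_exceptions" iff some stored negation covers the query.
        return any(n.predicate == predicate and n.place == place
                   and n.exceptions <= set(allowed_exceptions)
                   for n in self.negations)

    def update(self, add=(), delete=()):
        # Deterministic effects: direct bookkeeping on literals and
        # exception lists, with no possible-world enumeration.
        for lit in delete:
            self.known.discard(lit)
        for pred, place, obj in add:
            self.known.add((pred, place, obj))
            for n in self.negations:
                if n.predicate == pred and n.place == place:
                    n.exceptions.add(obj)  # obj may now violate the negation

# Example: initially, "nothing is on the table, except possibly a".
ks = KnowledgeState(negations=[QuantifiedNegation("on", "table", {"a"})])
ks.update(add=[("on", "table", "b")])               # placing b adds an exception
print(ks.entails_clear("on", "table", {"a", "b"}))  # True
print(ks.entails_clear("on", "table", {"a"}))       # False: b is known to be there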
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
For robots to perform a wide variety of tasks, they require a 3D
representation of the world that is semantically rich, yet compact and
efficient for task-driven perception and planning. Recent approaches have
attempted to leverage features from large vision-language models to encode
semantics in 3D representations. However, these approaches tend to produce maps
with per-point feature vectors, which do not scale well in larger environments,
nor do they contain semantic spatial relationships between entities in the
environment, which are useful for downstream planning. In this work, we propose
ConceptGraphs, an open-vocabulary graph-structured representation for 3D
scenes. ConceptGraphs is built by leveraging 2D foundation models and fusing
their output to 3D by multi-view association. The resulting representations
generalize to novel semantic classes, without the need to collect large 3D
datasets or finetune models. We demonstrate the utility of this representation
through a number of downstream planning tasks that are specified through
abstract (language) prompts and require complex reasoning over spatial and
semantic concepts. (Project page: https://concept-graphs.github.io/ Explainer
video: https://youtu.be/mRhNkQwRYnc )
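As a rough illustration of the kind of map the abstract describes (not the authors' code), the sketch below pairs object nodes carrying fused open-vocabulary features with relation edges, and supports a cosine-similarity query against a language embedding; the embedding model itself is assumed and not shown.

import numpy as np

class ObjectNode:
    def __init__(self, node_id, centroid, feature):
        self.node_id = node_id
        self.centroid = np.asarray(centroid, dtype=float)  # 3D position
        self.feature = np.asarray(feature, dtype=float)    # fused open-vocab vector
        self.views = 1                                     # detections fused so far

    def fuse(self, feature, centroid):
        # Running average over 2D detections associated across views.
        self.feature = (self.views * self.feature + np.asarray(feature)) / (self.views + 1)
        self.centroid = (self.views * self.centroid + np.asarray(centroid)) / (self.views + 1)
        self.views += 1

class SceneGraph:
    def __init__(self):
        self.nodes = {}    # node_id -> ObjectNode
        self.edges = []    # (id_a, relation, id_b), e.g., ("mug", "on", "table")

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_relation(self, a, relation, b):
        self.edges.append((a, relation, b))

    def query(self, text_feature, top_k=1):
        # Open-vocabulary lookup: rank nodes by cosine similarity to the
        # embedding of a language query.
        def cos(u, v):
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))
        ranked = sorted(self.nodes.values(),
                        key=lambda n: cos(n.feature, text_feature), reverse=True)
        return ranked[:top_k]

# Example with toy 2-D features standing in for real embeddings.
g = SceneGraph()
g.add_node(ObjectNode("mug", centroid=[0.4, 0.1, 0.8], feature=[0.9, 0.1]))
g.add_node(ObjectNode("table", centroid=[0.5, 0.0, 0.7], feature=[0.1, 0.9]))
g.add_relation("mug", "on", "table")
print(g.query(np.array([1.0, 0.0]))[0].node_id)  # -> "mug"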
AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation
We propose a novel framework for learning high-level cognitive capabilities
in robot manipulation tasks, such as making a smiley face using building
blocks. These tasks often involve complex multi-step reasoning, presenting
significant challenges due to the limited paired data connecting human
instructions (e.g., making a smiley face) and robot actions (e.g., end-effector
movement). Existing approaches mitigate this challenge by adopting an open-loop
paradigm that decomposes high-level instructions into simple sub-task plans and
executes them step-by-step using low-level control models. However, these
approaches lack immediate observations during multi-step reasoning, leading to
sub-optimal results. To address this issue, we propose to automatically
collect a cognitive robot dataset using Large Language Models (LLMs). The
resulting dataset, AlphaBlock, consists of 35 comprehensive high-level tasks
with multi-step text plans and paired observation sequences. To enable efficient
data acquisition, we employ elaborate multi-round prompt designs that
effectively reduce the burden of extensive human involvement. We further
propose a closed-loop multi-modal embodied planning model that autoregressively
generates plans by taking image observations as input. To facilitate effective
learning, we leverage MiniGPT-4 with a frozen visual encoder and LLM, and
finetune an additional vision adapter and Q-Former to enable fine-grained spatial
perception for manipulation tasks. We conduct experiments to verify the
superiority over existing open- and closed-loop methods, achieving significant
increases in success rate of 21.4% and 14.5% over ChatGPT- and GPT-4-based
robot baselines. Real-world demos are shown at
https://www.youtube.com/watch?v=ayAzID1_qQk
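The open-loop/closed-loop distinction can be summarized in a short sketch. The planner and controller interfaces below are stand-ins, not the paper's fine-tuned MiniGPT-4 models; the point is only that each sub-task is generated after the latest image observation rather than committed up front.

def closed_loop_execute(instruction, env, planner, controller, max_steps=20):
    history = []                             # sub-task plans generated so far
    obs = env.reset()                        # initial image observation
    for _ in range(max_steps):
        # Autoregressive step: condition on the instruction, the *current*
        # image, and previously generated plan steps.
        step = planner.next_step(instruction, obs, history)
        if step == "DONE":
            return True
        obs = controller.execute(step, env)  # low-level control; fresh image back
        history.append(step)
    return False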
Task planning using physics-based heuristics on manipulation actions
In order to solve mobile manipulation problems, the efficient combination of task and motion planning is usually required. Moreover, physics-based information has recently been incorporated in order to plan tasks in a more realistic way. In the present paper, a task and motion planning framework is proposed based on a modified version of the Fast-Forward task planner that is guided by physics-based knowledge.
The proposal uses manipulation knowledge for reasoning on symbolic literals (in both offline and online modes), taking geometric information into account to evaluate the applicability as well as the feasibility of actions while computing the heuristic cost. This results in an efficient search of the state space and in low-cost, physically feasible plans. The proposal has been implemented and is illustrated with a manipulation problem consisting of a mobile robot and some fixed and manipulable objects.
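A minimal sketch of the general idea, assuming set-valued states and actions with precondition/effect sets: a delete-relaxation reachability pass in which each candidate action is screened by a physics-based feasibility callback and charged a physics-informed cost. The callbacks are hypothetical placeholders, not the paper's implementation.

def guided_heuristic(state, goal, actions, physics_feasible, physics_cost):
    """Greedy delete-relaxation reachability with physics-screened actions."""
    reached, cost = set(state), 0.0
    progress = True
    while progress and not goal <= reached:
        progress = False
        for a in actions:
            if a.preconditions <= reached and not a.effects <= reached:
                if not physics_feasible(a, reached):
                    continue                 # physically infeasible: pruned
                reached |= a.effects         # delete relaxation: only add effects
                cost += physics_cost(a)      # physics-informed action cost
                progress = True
    return cost if goal <= reached else float("inf")  # inf for unreachable goals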
Contingent task and motion planning under uncertainty for human–robot interactions
Manipulation planning under incomplete information is a highly challenging task for mobile manipulators. Uncertainty can be resolved by robot perception modules or by using human knowledge during the execution process. Human operators can also collaborate with robots in the execution of some difficult actions, or as helpers who share task knowledge. In this scope, a contingent task and motion planning approach is proposed that takes into account robot uncertainty and human–robot interactions, resulting in a tree-shaped set of geometrically feasible plans. Different sorts of geometric reasoning processes are embedded inside the planner to cope with task constraints, such as detecting occluding objects when a robot needs to grasp an object. The proposal has been evaluated in different challenging scenarios, both in simulation and in a real environment.
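The tree-shaped plans mentioned above can be pictured with a small data-structure sketch, where action nodes branch on observation outcomes; field names are illustrative assumptions rather than the planner's actual representation.

from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class PlanNode:
    action: str                              # symbolic (or sensing) action
    motion: Optional[object] = None          # geometrically validated trajectory
    branches: Dict[str, "PlanNode"] = field(default_factory=dict)

    def add_branch(self, observation, node):
        # A branch is kept only if its motions are geometrically feasible.
        self.branches[observation] = node

# Example: branch on whether an occluding object blocks the grasp.
root = PlanNode("observe(target)")
root.add_branch("occluded", PlanNode("ask_human_to_remove(occluder)"))
root.add_branch("clear", PlanNode("pick(target)"))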
Learning and Reasoning for Robot Sequential Decision Making under Uncertainty
Robots frequently face complex tasks that require more than one action, where
sequential decision-making (SDM) capabilities become necessary. The key
contribution of this work is a robot SDM framework, called LCORPP, that
supports the simultaneous capabilities of supervised learning for passive state
estimation, automated reasoning with declarative human knowledge, and planning
under uncertainty toward achieving long-term goals. In particular, we use a
hybrid reasoning paradigm to refine the state estimator, and provide
informative priors for the probabilistic planner. In experiments, a mobile
robot is tasked with estimating human intentions using their motion
trajectories, declarative contextual knowledge, and human-robot interaction
(dialog-based and motion-based). Results suggest that, in efficiency and
accuracy, our framework performs better than its no-learning and no-reasoning
counterparts in an office environment.
Comment: In proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020.
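Schematically, one step of the learn-reason-plan pipeline might look like the sketch below, with all three components as stand-ins rather than the paper's actual models: a learned classifier for passive intention estimation, a declarative reasoner that turns the estimate plus context into an informative prior, and a POMDP planner seeded with that prior.

def lcorpp_step(trajectory, context, classifier, reasoner, pomdp_planner):
    # 1. Supervised learning: passive intention estimation from motion data.
    likelihoods = classifier.predict_proba(trajectory)

    # 2. Declarative reasoning: refine the estimate with contextual knowledge
    # (e.g., location, time of day) into an informative prior.
    prior = reasoner.refine(likelihoods, context)

    # 3. Planning under uncertainty: seed the POMDP belief with the prior,
    # then choose an action (e.g., ask a question, or commit and move).
    belief = pomdp_planner.init_belief(prior)
    return pomdp_planner.best_action(belief)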
Physics-based Motion Planning with Temporal Logic Specifications
One of the main foci of robotics nowadays is providing robots with a high
degree of autonomy. A fundamental step in this direction is to give
them the ability to plan in discrete and continuous spaces to find the required
motions to complete a complex task. In this line, some recent approaches
describe tasks with Linear Temporal Logic (LTL) and reason on discrete actions
to guide sampling-based motion planning, with the aim of finding
dynamically-feasible motions that satisfy the temporal-logic task
specifications. The present paper proposes an LTL planning approach that, on
the one hand, is enhanced with ontologies to describe and reason about the
task and, on the other hand, includes physics-based motion planning to allow
the purposeful manipulation of objects. The proposal has been implemented
and is illustrated with didactic examples with a mobile robot in simple
scenarios where some of the goals are occupied with objects that must be
removed in order to fulfill the task.
Comment: The 20th World Congress of the International Federation of Automatic Control, 9-14 July 2017.
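In the spirit of the approach (not the authors' code), the sketch below grows a search tree in the product of robot states and automaton states, assuming the LTL specification has been compiled offline to an automaton with step/accepting methods; motions are screened by a physics-validity callback that could, for instance, reject illegal object pushes.

import random

def ltl_guided_plan(start, automaton, steer, physics_valid, max_iters=5000):
    # Each tree node pairs a robot state with an automaton state ("product").
    tree = [(start, automaton.initial, None)]  # (state, automaton state, parent)
    for _ in range(max_iters):
        idx = random.randrange(len(tree))      # a real planner would bias
        q, a, _ = tree[idx]                    # expansion toward automaton progress
        q_new, labels = steer(q)               # local motion + observed propositions
        if q_new is None or not physics_valid(q, q_new):
            continue                           # e.g., an illegal object push
        a_new = automaton.step(a, labels)      # advance the specification
        if a_new is None:
            continue                           # motion would violate the task
        tree.append((q_new, a_new, idx))
        if automaton.accepting(a_new):
            return backtrack(tree, len(tree) - 1)
    return None

def backtrack(tree, idx):
    path = []
    while idx is not None:
        q, _, idx = tree[idx]                  # walk parent links to the root
        path.append(q)
    return list(reversed(path))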