CAPE: Corrective Actions from Precondition Errors using Large Language Models
Extracting commonsense knowledge from a large language model (LLM) offers a
path to designing intelligent robots. Existing approaches that leverage LLMs
for planning are unable to recover when an action fails and often resort to
retrying failed actions, without resolving the error's underlying cause.
We introduce a novel approach (CAPE) that attempts to propose corrective
actions to resolve precondition errors during planning. CAPE improves the
quality of generated plans by leveraging few-shot reasoning from action
preconditions. Our approach enables embodied agents to execute more tasks than
baseline methods while ensuring semantic correctness and minimizing
re-prompting. In VirtualHome, CAPE generates executable plans while improving a
human-annotated plan correctness metric from 28.89% to 49.63% over SayCan. Our
improvements transfer to a Boston Dynamics Spot robot initialized with a set of
skills (specified in language) and associated preconditions, where CAPE
improves the correctness metric of the executed task plans by 76.49% compared
to SayCan. Our approach enables the robot to follow natural language commands
and robustly recover from failures, which baseline approaches largely either
cannot resolve or address inefficiently.
Comment: 8 pages, 3 figures, Under Review at ICRA 202
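The recovery idea in this abstract (repair the violated precondition instead of retrying the failed action) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the skill names, the `Skill` structure, and `propose_correction`, which stands in for the LLM re-prompt with a simple lookup over skill effects. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    preconditions: frozenset  # facts that must hold before the skill can run
    effects: frozenset        # facts that hold after the skill has run


# Hypothetical skill library; names and facts are illustrative only.
SKILLS = {
    "walk_to_fridge": Skill("walk_to_fridge", frozenset(), frozenset({"at_fridge"})),
    "open_fridge": Skill("open_fridge", frozenset({"at_fridge"}), frozenset({"fridge_open"})),
    "grab_milk": Skill("grab_milk", frozenset({"fridge_open"}), frozenset({"holding_milk"})),
}


def propose_correction(missing_fact):
    """Stand-in for the LLM re-prompt: return a skill that achieves the missing fact."""
    for skill in SKILLS.values():
        if missing_fact in skill.effects:
            return skill.name
    return None


def execute_with_corrections(plan, max_corrections=5):
    """Run a plan; on a precondition error, insert a corrective action, then retry."""
    state = set()
    queue = list(plan)
    executed = []
    corrections = 0
    while queue:
        skill = SKILLS[queue[0]]
        missing = sorted(skill.preconditions - state)
        if missing:
            # Precondition error: address its cause instead of blindly retrying.
            fix = propose_correction(missing[0])
            if fix is None or corrections >= max_corrections:
                raise RuntimeError(f"cannot recover from missing fact: {missing[0]}")
            queue.insert(0, fix)
            corrections += 1
            continue
        queue.pop(0)
        state |= skill.effects
        executed.append(skill.name)
    return executed


plan = execute_with_corrections(["grab_milk"])
```

In this sketch the single-step plan `["grab_milk"]` fails twice on preconditions, and each failure prepends one corrective skill before the original action is retried.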
Active vision for dexterous grasping of novel objects
How should a robot direct active vision so as to ensure reliable grasping? We
answer this question for the case of dexterous grasping of unfamiliar objects.
By dexterous grasping we simply mean grasping by any hand with more than two
fingers, such that the robot has some choice about where to place each finger.
Such grasps typically fail in one of two ways: either unmodeled objects in the
scene cause collisions or object reconstruction is insufficient to ensure that
the grasp points provide a stable force closure. These problems can be solved
more easily if active sensing is guided by the anticipated actions. Our
approach has three stages. First, we take a single view and generate candidate
grasps from the resulting partial object reconstruction. Second, we drive the
active vision approach to maximise surface reconstruction quality around the
planned contact points. During this phase, the anticipated grasp is continually
refined. Third, we direct gaze to improve the safety of the planned reach to
grasp trajectory. We show, on a dexterous manipulator with a camera on the
wrist, that our approach (80.4% success rate) outperforms a randomised
algorithm (64.3% success rate).
Comment: IROS 2016. Supplementary video: https://youtu.be/uBSOO6tMzw
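The second stage described in this abstract, choosing views that improve reconstruction around the planned contact points, can be illustrated with a deliberately simplified sketch. The scoring rule is an assumption, not the paper's criterion: a candidate camera position scores one point for each contact whose outward surface normal faces the camera. The names `view_score` and `next_view` and all coordinates are hypothetical.

```python
def view_score(camera, contacts):
    """Count contact points whose outward normal faces the camera position."""
    score = 0
    for point, normal in contacts:
        to_camera = tuple(c - p for c, p in zip(camera, point))
        # A positive dot product means the surface patch is turned toward the camera.
        if sum(a * b for a, b in zip(to_camera, normal)) > 0:
            score += 1
    return score


def next_view(candidates, contacts):
    """Pick the candidate camera position that sees the most planned contacts."""
    return max(candidates, key=lambda cam: view_score(cam, contacts))


# Planned contacts as (position, outward normal) pairs; values are illustrative.
contacts = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.0, 1.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)),
]
candidates = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 0.0, 2.0)]
best = next_view(candidates, contacts)
```

A real system would score views against the partial reconstruction and account for occlusion; this sketch only shows how the anticipated grasp can drive the choice of view.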