Large language models encode a vast amount of semantic knowledge and possess
remarkable understanding and reasoning capabilities. Previous research has
explored how to ground language models in robotic tasks to ensure that the
sequences generated by the language model are both logically correct and
practically executable. However, low-level execution may deviate from the
high-level plan due to environmental perturbations or imperfect controller
design. In this paper, we propose DoReMi, a novel language model grounding
framework that enables immediate Detection and Recovery from Misalignments
between plan and execution. Specifically, LLMs are leveraged for both planning
and generating constraints for each planned step. These constraints signal
plan-execution misalignments, and we use a vision question answering (VQA) model
to check them during low-level skill execution. If a misalignment is detected,
our method calls the language model to re-plan in order to recover. Experiments
on a range of complex tasks with robot arms and humanoid robots demonstrate that
our method achieves higher task success rates and shorter task completion times.
Videos of DoReMi are available at
https://sites.google.com/view/doremi-paper.

Comment: 21 pages, 13 figures
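The detect-and-recover loop described above can be sketched as follows. This is
a minimal illustration of the abstract's pipeline, not the paper's actual
implementation; all function names (`llm_plan`, `llm_constraints`, `vqa_check`,
`execute_step`, `observe`) are hypothetical placeholders for the LLM planner,
the LLM constraint generator, the VQA checker, the low-level controller, and
the camera observation.

```python
def run_task(goal, llm_plan, llm_constraints, vqa_check, execute_step,
             observe, max_replans=3):
    """Execute a plan step by step; when the VQA model reports that a
    constraint generated for the current step is violated, ask the LLM
    to re-plan from the current observation. Returns the number of
    re-planning events."""
    plan = llm_plan(goal, observe())   # initial high-level plan
    replans = 0
    i = 0
    while i < len(plan):
        step = plan[i]
        # Constraints the LLM attaches to this step,
        # e.g. "the robot is holding the object".
        constraints = llm_constraints(step)
        execute_step(step)             # low-level skill execution
        # Check every constraint against the current observation.
        violated = [c for c in constraints
                    if not vqa_check(observe(), c)]
        if violated and replans < max_replans:
            replans += 1
            plan = llm_plan(goal, observe())  # recover by re-planning
            i = 0
            continue
        i += 1
    return replans
```

In the full method the constraints are checked continuously during skill
execution rather than only after each step completes; the post-step check here
is a simplification.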