Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs

Abbas, Adeeb; Bekris, Kostas; Boularias, Abdeslam; Boyalakuntla, Kowndinya; Cai, Siwei; Chang, Haonan; Geng, Shijie; Jing, Eric; Keskar, Shreesh; Lu, Shiyang; Zhou, Lifeng

Publication date: 27 September 2023

Abstract

We present the Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as "pick up a cup on a kitchen table" or "navigate to a sofa on which someone is sitting". In contrast to existing research on 3D scene graphs, OVSG supports free-form text input and open-vocabulary querying. Through a series of comparative experiments on the ScanNet dataset and a self-collected dataset, we demonstrate that our approach significantly outperforms previous semantic-based localization techniques. Moreover, we highlight the practical application of OVSG in real-world robot navigation and manipulation experiments.

Comment: The code and dataset used for evaluation can be found at https://github.com/changhaonan/OVSG. This paper has been accepted by CoRL202
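
To illustrate the distinction the abstract draws between semantic-only and context-aware localization, here is a minimal toy sketch. It is not the OVSG implementation: the `SceneGraph` class, both function names, and the example scene are hypothetical, and exact string comparison stands in for the open-vocabulary matching the paper describes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SceneGraph:
    # Hypothetical toy structure: node id -> text label (e.g. "cup").
    labels: Dict[int, str] = field(default_factory=dict)
    # Relation triples: (subject id, relation, object id).
    edges: Set[Tuple[int, str, int]] = field(default_factory=set)


def ground_by_label(graph: SceneGraph, label: str) -> List[int]:
    """Semantic-only localization: return every node carrying the label."""
    return sorted(n for n, name in graph.labels.items() if name == label)


def ground_with_context(
    graph: SceneGraph, label: str, relation: str, anchor_label: str
) -> List[int]:
    """Context-aware localization: return nodes with `label` that are linked
    by `relation` to some node labelled `anchor_label`."""
    return sorted({
        subj
        for (subj, rel, obj) in graph.edges
        if rel == relation
        and graph.labels[subj] == label
        and graph.labels[obj] == anchor_label
    })


# Assumed example scene: two cups, only one of which is on the kitchen table.
scene = SceneGraph(
    labels={0: "cup", 1: "cup", 2: "kitchen table", 3: "desk"},
    edges={(0, "on", 2), (1, "on", 3)},
)

candidates = ground_by_label(scene, "cup")  # ambiguous: both cups match
target = ground_with_context(scene, "cup", "on", "kitchen table")  # one cup
```

In this toy scene, the label alone cannot separate the two cups, while the relation to the kitchen table can. A real system would replace the string equality checks with similarity over learned text or visual features.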

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2309.15940

Last time updated on 10/12/2023