    <b>D</b>ataset for <b>O</b>pen <b>V</b>ocabulary <b>E</b>ntity <b>G</b>rounding (DOVE-G)

    To accommodate the richness of open-vocabulary queries, we introduce a custom dataset, DOVE-G (Dataset for Open-Vocabulary Entity Grounding). The dataset comprises 8 scenes: kitchen, kitchenette, room1, room2, room3, bathroom, computer lab, and hallway. It is designed to let users query for objects within a scene using natural language. For each scene in DOVE-G, we manually labeled the ground truth and created 50 natural language queries (Lq). To augment this query set, we used LLMs to generate four additional sets of natural language queries. This approach yielded a total of 250 queries for each scene, and cumulatively, we have 4000 queries to evaluate OVSG’s performance. With this setup, we assess how our OVSG framework performs on open-vocabulary queries, one of our key research questions, providing a critical testbed for its effectiveness in handling diverse natural language expressions.