Visually Grounding Instruction for History-Dependent Manipulation
This paper emphasizes the importance of a robot's ability to refer to its task
history when it executes a series of pick-and-place manipulations following
text instructions given one by one. The advantages of referring to the
manipulation history are twofold: (1) instructions that omit details or use
co-referential expressions can be interpreted, and (2) the visual information
of objects occluded by previous manipulations can be inferred. To address this
challenge, we introduce the task of history-dependent manipulation, which is to
visually ground a series of text instructions for proper manipulations
depending on the task history. We also present a relevant dataset and a
methodology based on a deep neural network, and show that our network, trained
on a synthetic dataset, can be applied to the real world by transferring real
images into a synthetic style using CycleGAN.

Comment: 8 pages, 6 figures