Visual action planning particularly excels in applications where the state of
the system cannot be computed explicitly, such as manipulation of deformable
objects, as it enables planning directly from raw images. While deep learning
techniques have significantly accelerated the field, a crucial requirement for
their success is the availability of large amounts of training data. In
this work, we propose the Augment-Connect-Explore (ACE) paradigm to enable
visual action planning in cases of data scarcity.
We build upon the Latent Space Roadmap (LSR) framework, which performs
planning with a graph built in a low-dimensional latent space. In particular,
ACE is used to i) Augment the available training dataset by autonomously
creating new pairs of datapoints, ii) create new unobserved Connections among
representations of states in the latent graph, and iii) Explore new regions of
the latent space in a targeted manner. We validate the proposed approach on
both a simulated box stacking task and a real-world folding task,
demonstrating its applicability to rigid and deformable object manipulation,
respectively.