Motion Reasoning for Goal-Based Imitation Learning
We address goal-based imitation learning, where the aim is to output the
symbolic goal from a third-person video demonstration. This enables the robot
to plan for execution and reproduce the same goal in a completely different
environment. The key challenge is that the goal of a video demonstration is
often ambiguous at the level of semantic actions: human demonstrators may
unintentionally achieve certain subgoals in the course of a demonstration.
Our main contribution is a motion reasoning framework that
combines task and motion planning to disambiguate the true intention of the
demonstrator in the video demonstration. This allows us to robustly recognize
the goals that cannot be disambiguated by previous action-based approaches. We
evaluate our approach by collecting a dataset of 96 video demonstrations in a
mockup kitchen environment. We show that our motion reasoning plays an
important role in recognizing the actual goal of the demonstrator and improves
the success rate by over 20%. We further show that by using the automatically
inferred goal from the video demonstration, our robot is able to reproduce the
same task in a real kitchen environment.
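A minimal sketch of how motion-level reasoning can disambiguate between candidate symbolic goals, assuming hypothetical planner and scoring interfaces rather than the paper's actual implementation: a candidate goal survives only if a feasible motion plan exists for it, and among the survivors the goal whose planned motion best matches the demonstration is selected.

    # Illustrative sketch only: plan_motion and demo_cost are assumed callables,
    # not interfaces from the paper. A goal with no feasible motion is rejected;
    # among feasible goals, the one whose motion best explains the demo wins.
    from typing import Callable, Optional, Sequence

    def disambiguate_goal(
        candidate_goals: Sequence[str],
        plan_motion: Callable[[str], Optional[list]],  # trajectory achieving a goal, or None if infeasible
        demo_cost: Callable[[list], float],            # mismatch between a planned motion and the demo
    ) -> Optional[str]:
        """Return the candidate goal whose feasible motion plan best matches the demonstration."""
        best_goal, best_cost = None, float("inf")
        for goal in candidate_goals:
            motion = plan_motion(goal)
            if motion is None:      # motion planner found no way to achieve this goal
                continue
            cost = demo_cost(motion)
            if cost < best_cost:
                best_goal, best_cost = goal, cost
        return best_goal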
Hierarchical Planning for Long-Horizon Manipulation with Geometric and Symbolic Scene Graphs
We present a visually grounded hierarchical planning algorithm for
long-horizon manipulation tasks. Our algorithm offers a joint framework of
neuro-symbolic task planning and low-level motion generation conditioned on the
specified goal. At the core of our approach is a two-level scene graph
representation: a geometric scene graph and a symbolic scene graph. This
hierarchical representation serves as a structured, object-centric abstraction
of manipulation scenes. Our model uses graph neural networks to process these
scene graphs for predicting high-level task plans and low-level motions. We
demonstrate that our method scales to long-horizon tasks and generalizes well
to novel task goals. We validate our method in a kitchen storage task in both
physical simulation and the real world. Our experiments show that our method
achieves an over 70% success rate and a nearly 90% subgoal completion rate on
the real robot, while being four orders of magnitude faster in computation
time than a standard search-based task-and-motion planner.
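As a rough, self-contained illustration of the two-level representation (the relation names, thresholds, and abstraction rule below are assumptions made for the sketch, not the paper's definitions), a geometric scene graph can store continuous object poses while a symbolic scene graph stores discrete relations lifted from them; a planner can then reason over the symbolic graph and condition motion generation on the geometric one.

    # Toy two-level scene representation: continuous poses (geometric) and
    # discrete relations derived from them (symbolic). The 'on' rule and its
    # thresholds are illustrative assumptions.
    from dataclasses import dataclass, field
    from typing import Dict, List, Tuple

    @dataclass
    class GeometricSceneGraph:
        positions: Dict[str, Tuple[float, float, float]]  # object name -> (x, y, z)

    @dataclass
    class SymbolicSceneGraph:
        relations: List[Tuple[str, str, str]] = field(default_factory=list)  # e.g. ("on", "mug", "shelf")

    def lift_to_symbolic(geo: GeometricSceneGraph,
                         xy_tol: float = 0.10, z_gap: float = 0.05) -> SymbolicSceneGraph:
        """Derive coarse 'on' relations from relative positions (toy abstraction rule)."""
        sym = SymbolicSceneGraph()
        for a, (ax, ay, az) in geo.positions.items():
            for b, (bx, by, bz) in geo.positions.items():
                if a != b and abs(ax - bx) < xy_tol and abs(ay - by) < xy_tol and 0.0 < az - bz <= z_gap:
                    sym.relations.append(("on", a, b))
        return sym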
SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks
Recent advances in deep reinforcement learning (RL) have demonstrated its
potential to learn complex robotic manipulation tasks. However, RL still
requires the robot to collect a large amount of real-world experience. To
address this problem, recent works have proposed learning from expert
demonstrations (LfD), particularly via inverse reinforcement learning (IRL),
given its ability to achieve robust performance with only a small number of
expert demonstrations. Nevertheless, deploying IRL on real robots is still
challenging due to the large number of robot experiences it requires. This
paper aims to address this scalability challenge with a robust,
sample-efficient, and general meta-IRL algorithm, SQUIRL, that performs a new
but related long-horizon task robustly given only a single video demonstration.
First, this algorithm bootstraps the learning of a task encoder and a
task-conditioned policy using behavioral cloning (BC). It then collects
real-robot experiences and bypasses reward learning by directly recovering a
Q-function from the combined robot and expert trajectories. Next, this
algorithm uses the Q-function to re-evaluate all cumulative experiences
collected by the robot to improve the policy quickly. In the end, the policy
performs more robustly (90%+ success) than BC on new tasks while requiring no
trial and error at test time. Finally, our real-robot and simulated
experiments demonstrate our algorithm's generality across different state
spaces, action spaces, and vision-based manipulation tasks, e.g.,
pick-pour-place and pick-carry-drop.
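An outline of the three-stage recipe sketched in the abstract, with every callable treated as an assumed placeholder rather than SQUIRL's actual interface: a behavioral-cloning bootstrap, rounds of real-robot data collection, direct Q-function recovery from the combined expert and robot trajectories, and policy improvement over all experience gathered so far.

    # Hypothetical skeleton of the training loop described above; the callables
    # passed in are assumed stand-ins, not SQUIRL's real components.
    from typing import Callable, List

    def train_squirl_like(
        expert_trajs: List[list],
        behavioral_cloning: Callable[[List[list]], object],         # stage 1: bootstrap a policy from demos
        collect_rollouts: Callable[[object, int], List[list]],      # run the current policy on the robot
        fit_q_function: Callable[[List[list]], object],             # recover Q from expert + robot data
        improve_policy: Callable[[object, object, List[list]], object],
        num_rounds: int = 3,
        rollouts_per_round: int = 10,
    ):
        policy = behavioral_cloning(expert_trajs)                      # BC bootstrap from demonstrations
        all_trajs: List[list] = list(expert_trajs)
        for _ in range(num_rounds):
            all_trajs += collect_rollouts(policy, rollouts_per_round)  # gather real-robot experience
            q_function = fit_q_function(all_trajs)                     # bypass reward learning entirely
            policy = improve_policy(policy, q_function, all_trajs)     # re-evaluate all cumulative experience
        return policy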