Autonomous excavation is a challenging task. The unknown contact dynamics
between the excavator bucket and the terrain can easily produce large
contact forces and jamming during excavation. Traditional model-based
methods struggle with these problems because the contact dynamics are hard to model. In
this paper, we formulate the excavation skills with three novel manipulation
primitives. We propose to learn the manipulation primitives with offline
reinforcement learning (RL) to avoid large amounts of online robot
interactions. The proposed method can learn efficient penetration skills from
sub-optimal demonstrations, which contain sub-trajectories that can be
``stitched'' together to form an optimal trajectory without causing
jamming. We evaluate the proposed method with extensive experiments on
excavating a variety of rigid objects and demonstrate that the learned policy
outperforms the demonstrations. We also show that the learned policy can
quickly adapt to unseen and challenging fragmented rocks with online
fine-tuning.

Comment: Submitted to IROS 202