Deep reinforcement learning (deep RL) has emerged as an effective tool for
developing controllers for legged robots. However, vanilla deep RL often
requires a tremendous number of training samples and struggles to produce
robust behaviors. To take advantage of human experts' knowledge without
time-consuming interactive teaching, researchers have investigated policy
architectures such as Policies Modulating Trajectory Generators (PMTG), which
builds a recurrent control loop by combining a parametric trajectory generator
(TG) with a feedback policy network to achieve more robust behaviors using
intuitive prior knowledge. In this work,
we propose Policies Modulating Finite State Machine (PM-FSM), which replaces the
TGs with contact-aware finite state machines (FSMs). Compared with TGs, FSMs
provide high-level management of each leg's motion generator and allow a
flexible state arrangement, which makes the learned behavior less vulnerable to
unseen perturbations or challenging
terrains. This design gives the policy an explicit notion of contact events,
which it can use to negotiate unexpected perturbations. We demonstrate that the
proposed architecture achieves more robust behaviors in various scenarios, such
as challenging terrains or external perturbations, on both simulated and real
robots. The supplemental video can be found at: https://youtu.be/78cboMqTkJQ
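
To make the idea of a contact-aware FSM modulated by a policy more concrete,
the following is a minimal Python sketch, not the implementation used in this
work. The two-state (stance/swing) layout, the names LegFSM and control_step,
the early_touchdown_phase threshold, and the assumption that the policy outputs
one state duration per leg are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class LegState(Enum):
    STANCE = 0
    SWING = 1


@dataclass
class LegFSM:
    """Hypothetical two-state machine for one leg (illustrative only)."""
    state: LegState = LegState.STANCE
    phase: float = 0.0  # progress through the current state, in [0, 1]

    def step(self, dt, duration, contact, early_touchdown_phase=0.5):
        """Advance the phase; switch state on a timeout or a contact event."""
        self.phase = min(1.0, self.phase + dt / duration)
        if self.state is LegState.SWING:
            # Contact-aware transition: an early touchdown ends the swing.
            touchdown = contact and self.phase >= early_touchdown_phase
            if touchdown or self.phase >= 1.0:
                self.state, self.phase = LegState.STANCE, 0.0
        elif self.phase >= 1.0:
            # Stance simply times out in this sketch.
            self.state, self.phase = LegState.SWING, 0.0
        return self.state


def control_step(fsms, contacts, policy, dt=0.01):
    """One pass of the loop: the policy reads the FSM state and modulates it."""
    observation = [(f.state.value, f.phase, c) for f, c in zip(fsms, contacts)]
    durations = policy(observation)  # assumed output: one state duration per leg
    return [f.step(dt, d, c) for f, d, c in zip(fsms, durations, contacts)]
```

In this sketch the FSM state is fed back to the policy at every step, which
mirrors the recurrent loop described above, while the contact flag lets a leg
leave the swing state as soon as it touches down instead of waiting for a
fixed trajectory to finish.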