This paper investigates a novel method combining Scalable Evolution
Strategies (S-ES) and Hierarchical Reinforcement Learning (HRL).
S-ES, named for its excellent scalability, was popularised by demonstrating
performance comparable to that of state-of-the-art policy gradient
methods. However, S-ES has not been tested in conjunction with
HRL methods, which enable temporal abstraction, allowing
agents to tackle more challenging problems. We introduce a novel
method merging S-ES and HRL, yielding an algorithm that is both highly scalable
and efficient in compute time. We demonstrate that the
proposed method benefits from S-ES's scalability and indifference
to delayed rewards. This results in our main contribution: significantly
faster learning than, and performance competitive with, gradient-based HRL
methods across a range of tasks.