Many real-world reinforcement learning problems have a hierarchical nature,
and often exhibit some degree of partial observability. While hierarchy and
partial observability are usually tackled separately (for instance by combining
recurrent neural networks and options), we show that addressing both problems
simultaneously is simpler and more efficient in many cases. More specifically,
we make the initiation set of options conditional on the previously-executed
option, and show that options with such Option-Observation Initiation Sets
(OOIs) are at least as expressive as Finite State Controllers (FSCs), a
state-of-the-art approach for learning in POMDPs. OOIs are easy to design based
on an intuitive description of the task, lead to explainable policies and keep
the top-level and option policies memoryless. Our experiments show that OOIs
allow agents to learn optimal policies in challenging POMDPs, while being much
more sample-efficient than a recurrent neural network over options.
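The core mechanism, an initiation set conditioned on the previously-executed option, can be sketched minimally as follows. All names (`Option`, `admissible`, the example options) are illustrative assumptions, not identifiers from the paper:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Option:
    name: str
    # OOI: names of previously-executed options after which this option
    # may be initiated; None stands for the start of an episode.
    initiation_set: frozenset

def admissible(options, prev_option):
    """Return the options whose OOI admits the previously-executed option.

    The memoryless top-level policy then chooses among these candidates
    using only the current observation, so no recurrent state is needed.
    """
    return [o for o in options if prev_option in o.initiation_set]

# Hypothetical two-option example: 'go_left' may start an episode or
# follow 'go_right'; 'go_right' may only follow 'go_left'.
go_left = Option("go_left", frozenset({None, "go_right"}))
go_right = Option("go_right", frozenset({"go_left"}))
options = [go_left, go_right]
```

Because the condition depends only on the identity of the previous option, both the top-level and option policies remain memoryless while the option sequence itself carries the memory.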