Recent work on designing an appropriate distribution of environments has
shown promise for training effective generally capable agents. Its success is
partly because of a form of adaptive curriculum learning that generates
environment instances (or levels) at the frontier of the agent's capabilities.
However, such an environment design framework often struggles to find effective
levels in challenging design spaces and requires costly interactions with the
environment. In this paper, we aim to introduce diversity in the Unsupervised
Environment Design (UED) framework. Specifically, we propose a task-agnostic
method to identify observed/hidden states that are representative of a given
level. The outcome of this method is then utilized to characterize the
diversity between two levels, which as we show can be crucial to effective
performance. In addition, to improve sampling efficiency, we incorporate the
self-play technique that allows the environment generator to automatically
generate environments that are of great benefit to the training agent.
Quantitatively, our approach, Diversity-induced Environment Design via
Self-Play (DivSP), shows compelling performance over existing methods