Bootstrap State Representation using Style Transfer for Better Generalization in Deep Reinforcement Learning
Deep Reinforcement Learning (RL) agents often overfit the training
environment, leading to poor generalization performance. In this paper, we
propose Thinker, a bootstrapping method that removes the adversarial effects of
confounding features from observations in an unsupervised way and thereby
improves RL agents' generalization. Thinker first groups experience
trajectories into several clusters. These trajectories are then bootstrapped by
applying a style transfer generator, which translates the trajectories from one
cluster's style to another while maintaining the content of the observations.
The bootstrapped trajectories are then used for policy learning. Thinker is
applicable to many RL settings. Experimental results reveal that
Thinker leads to better generalization capability in the Procgen benchmark
environments compared to base algorithms and several data augmentation
techniques.
Comment: Accepted at ECML-PKDD 202
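
The cluster, translate, and reuse pipeline described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract names a learned style transfer generator, which is replaced here by a simple stand-in that matches per-cluster mean and standard deviation. All function names (`kmeans_1d`, `cluster_stats`, `translate`, `bootstrap`), the 1-D brightness feature, and the flat-list observation format are assumptions made for illustration only.

```python
import random
import statistics


def kmeans_1d(values, k, iters=20):
    """Plain k-means on scalar features; returns one cluster label per value."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: spread the centers across the sorted values.
    centers = [ordered[i * (n - 1) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def cluster_stats(observations, labels, k):
    """Per-cluster (mean, std) of pixel values: the hypothetical 'style'."""
    stats = {}
    for c in range(k):
        pixels = [p for obs, lab in zip(observations, labels) if lab == c for p in obs]
        if pixels:  # skip empty clusters
            stats[c] = (statistics.fmean(pixels), statistics.pstdev(pixels) or 1.0)
    return stats


def translate(obs, src, tgt, stats):
    """Stand-in for the style transfer generator (ASSUMPTION, not the paper's model).

    Re-scales the observation from the source cluster's statistics to the
    target cluster's, leaving the normalised pattern (the 'content') unchanged.
    """
    mu_s, sd_s = stats[src]
    mu_t, sd_t = stats[tgt]
    return [(p - mu_s) / sd_s * sd_t + mu_t for p in obs]


def bootstrap(observations, k=2, seed=0):
    """Cluster observations, then translate each one into another cluster's style."""
    features = [statistics.fmean(obs) for obs in observations]
    labels = kmeans_1d(features, k)
    stats = cluster_stats(observations, labels, k)
    rng = random.Random(seed)
    translated = []
    for obs, src in zip(observations, labels):
        targets = [c for c in stats if c != src]
        tgt = rng.choice(targets) if targets else src
        translated.append((translate(obs, src, tgt, stats), tgt))
    return labels, translated


# Toy data: two "dark" and two "bright" observations sharing the same pattern.
observations = [
    [0.1, 0.2, 0.1, 0.2],
    [0.2, 0.1, 0.2, 0.1],
    [0.8, 0.9, 0.8, 0.9],
    [0.9, 0.8, 0.9, 0.8],
]
labels, translated = bootstrap(observations)
```

In a full pipeline the translated observations would be fed to the policy learner alongside, or in place of, the originals. That step is omitted here because the abstract does not specify how the bootstrapped trajectories enter the policy update.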
- …