Enhancing deep reinforcement learning for scale flexibility in real-time strategy games

Abstract

Real-time strategy (RTS) games present a unique challenge for AI agents because they combine several fundamental AI problems. While Deep Reinforcement Learning (DRL) has shown promise in the development of autonomous agents for the genre, existing architectures often struggle with games featuring maps of varying dimensions. This limitation hinders the agent’s ability to generalize its learned strategies across different scenarios. This paper proposes a novel approach that overcomes this problem by incorporating Spatial Pyramid Pooling (SPP) within a DRL framework. We leverage the GridNet architecture’s encoder–decoder structure and integrate an SPP layer into the critic network of the Proximal Policy Optimization (PPO) algorithm. This SPP layer dynamically generates a fixed-size representation of the game state, regardless of the initial observation size, which allows the agent to adapt its decision-making process to any map configuration. Our evaluations demonstrate that the proposed method significantly enhances the model’s flexibility and efficiency in training agents for various RTS game scenarios, albeit with some discernible limitations when applied to very small maps. This approach paves the way for more robust and adaptable AI agents capable of excelling in sequential decision problems with variable-size observations.
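
To illustrate why an SPP layer yields a representation whose length does not depend on the map size, the following is a minimal pure-Python sketch of spatial pyramid max pooling. It is not the paper's implementation: the function names `adaptive_max_pool` and `spp`, the pyramid levels `(1, 2, 4)`, and the toy feature maps are assumptions chosen for illustration, and the code ignores how the layer is wired into the PPO critic.

```python
def adaptive_max_pool(grid, bins):
    """Max-pool one 2D feature map (a list of rows) into bins * bins values."""
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(bins):
        # Integer floor/ceil bin edges: bins may overlap by one cell but are
        # never empty, even when bins exceeds the map height or width.
        r0, r1 = (i * h) // bins, ((i + 1) * h + bins - 1) // bins
        for j in range(bins):
            c0, c1 = (j * w) // bins, ((j + 1) * w + bins - 1) // bins
            out.append(max(grid[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
    return out


def spp(feature_maps, levels=(1, 2, 4)):
    """Concatenate the pooled values of every channel at every pyramid level.

    The output length is channels * sum(b * b for b in levels), whatever the
    height and width of the input (assumed pyramid levels, not the paper's).
    """
    vec = []
    for bins in levels:
        for channel in feature_maps:
            vec.extend(adaptive_max_pool(channel, bins))
    return vec


def toy_maps(channels, h, w):
    """Hypothetical stand-in for encoder output on an h x w map."""
    return [[[k + r * w + c for c in range(w)] for r in range(h)]
            for k in range(channels)]


# Two maps of different sizes give vectors of the same length:
# 2 channels * (1 + 4 + 16) bins = 42 values each.
print(len(spp(toy_maps(2, 8, 8))))
print(len(spp(toy_maps(2, 16, 24))))
```

Because the vector length is fixed, the fully connected layers that follow can be shared across map sizes. On very small maps the finest pyramid level has more bins than cells, so neighbouring bins repeat the same values, which may be related to the limitation on very small maps noted above, although this sketch cannot confirm that.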
