
    Macro action selection with deep reinforcement learning in StarCraft

    StarCraft (SC) is one of the most popular and successful real-time strategy (RTS) games. In recent years, SC has also been widely accepted as a challenging testbed for AI research because of its enormous state space, partially observed information, and need for multi-agent collaboration. With the help of the annual AIIDE and CIG competitions, a growing number of SC bots have been proposed and continuously improved. However, a large gap remains between the top-level bots and professional human players. One vital reason is that current SC bots mainly rely on predefined rules to select macro actions during their games. These rules are neither scalable nor efficient enough to cope with the enormous yet partially observed state space of the game. In this paper, we propose a deep reinforcement learning (DRL) framework to improve the selection of macro actions. Our framework combines Ape-X DQN with Long Short-Term Memory (LSTM). We use this framework to build our bot, named LastOrder. Our evaluation, based on training against all bots from the AIIDE 2017 StarCraft AI competition set, shows that LastOrder achieves an 83% win rate, outperforming 26 of the 28 entrants.
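
    As a rough illustration of what "selecting a macro action" from learned values can look like, here is a minimal sketch of epsilon-greedy selection over per-action Q-values, the action-selection step commonly used by DQN-style agents. It is not the LastOrder implementation: the action names, the Q-values, and the function `select_macro_action` are all hypothetical, and the LSTM and distributed Ape-X training are left out entirely.

```python
import random

# Hypothetical macro actions; the abstract does not list the real action set.
MACRO_ACTIONS = ["expand_base", "train_workers", "build_army", "attack", "defend"]


def select_macro_action(q_values, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else take the argmax."""
    if len(q_values) != len(MACRO_ACTIONS):
        raise ValueError("expected one Q-value per macro action")
    if rng.random() < epsilon:
        # Exploration: pick a uniformly random macro action index.
        return rng.randrange(len(MACRO_ACTIONS))
    # Exploitation: pick the index with the highest estimated value.
    return max(range(len(q_values)), key=lambda i: q_values[i])


rng = random.Random(0)
q = [0.1, 0.4, 0.9, 0.3, 0.2]  # made-up Q-values, for illustration only
greedy_index = select_macro_action(q, epsilon=0.0, rng=rng)
print(MACRO_ACTIONS[greedy_index])  # epsilon=0.0 always returns the argmax
```

    In a full agent the Q-values would come from a network evaluated on the current (partially observed) game state rather than from a fixed list.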

    Macro action selection with deep reinforcement learning in StarCraft

    StarCraft (SC) is one of the most popular and successful real-time strategy (RTS) games. In recent years, SC has also come to be considered a testbed for AI research because of its enormous state space, hidden information, and need for multi-agent collaboration. Thanks to the annual AIIDE and CIG competitions, a growing number of bots have been proposed and continuously improved. However, a big gap still remains between the top bots and professional human players. One vital reason is that current bots mainly rely on predefined rules to perform macro actions. These rules are neither scalable nor efficient enough to cope with the large but partially observed macro state space in SC. In this paper, we propose a DRL-based framework for macro action selection. Our framework combines the reinforcement learning approach Ape-X DQN with Long Short-Term Memory (LSTM) to improve the bot's macro action selection. We evaluate our bot, named LastOrder, on the AIIDE 2017 StarCraft AI competition bot set. Our bot achieves an overall 83% win rate, outperforming 26 of the 28 entrants.

    On Reinforcement Learning for Full-length Game of StarCraft

    StarCraft II poses a grand challenge for reinforcement learning. Its main difficulties include a huge state and action space and a long time horizon. In this paper, we investigate a hierarchical reinforcement learning approach for StarCraft II. The hierarchy involves two levels of abstraction. One is the set of macro-actions automatically extracted from expert trajectories, which reduces the action space by an order of magnitude yet remains effective. The other is a two-layer hierarchical architecture that is modular and easy to scale, enabling a curriculum that transfers from simpler tasks to more complex ones. The reinforcement training algorithm for this architecture is also investigated. On a 64x64 map and using restrictive units, we achieve a winning rate of more than 99% against the difficulty level-1 built-in AI. Through the curriculum transfer learning algorithm and a mixture of combat models, we achieve a winning rate of over 93% for Protoss against the most difficult non-cheating built-in AI (level-7) of Terran, training within two days on a single machine with only 48 CPU cores and 8 K40 GPUs. The approach also generalizes well when tested against never-seen opponents, including the cheating-level built-in AIs and all levels of the Zerg and Protoss built-in AI. We hope this study sheds some light on future research in large-scale reinforcement learning. Comment: Appeared in AAAI 201

    Deep learning for video game playing

    In this article, we review recent deep learning advances in the context of how they have been applied to play different types of video games, such as first-person shooters, arcade games, and real-time strategy games. We analyze the unique requirements that different game genres pose to a deep learning system and highlight important open challenges in applying these machine learning methods to video games, such as general game playing and dealing with extremely large decision spaces and sparse rewards.

    A Study of AI Population Dynamics with Million-agent Reinforcement Learning

    We conduct an empirical study on discovering the ordered collective dynamics exhibited by a population of intelligent agents, driven by million-agent reinforcement learning. Our intention is to put intelligent agents into a simulated natural context and verify whether the principles developed in the real world can also be used to understand an artificially created intelligent population. To achieve this, we simulate a large-scale predator-prey world, where the laws of the world are designed using only findings, or their logical equivalents, that have been discovered in nature. We endow the agents with intelligence based on deep reinforcement learning (DRL). To scale the population size up to millions of agents, we propose a large-scale DRL training platform with a redesigned experience buffer. Our results show that the population dynamics of AI agents, driven only by each agent's individual self-interest, reveal an ordered pattern similar to the Lotka-Volterra model studied in population biology. We further discover emergent behaviors of collective adaptation by studying how the agents' grouping behaviors change with the environmental resources. Both findings can be explained by the self-organization theory in nature. Comment: Full version of the paper presented at AAMAS 2018 (International Conference on Autonomous Agents and Multiagent Systems)
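
    For reference, the Lotka-Volterra model named in this abstract is conventionally written as the pair of equations below. The symbols are the usual textbook ones and are an assumption here, not notation taken from the paper: x is the prey population, y the predator population, and α, β, γ, δ are positive rate constants.

```latex
\begin{aligned}
\frac{dx}{dt} &= \alpha x - \beta x y, \\
\frac{dy}{dt} &= \delta x y - \gamma y.
\end{aligned}
```

    Prey grow at rate α and are consumed in proportion to predator-prey encounters (βxy); predators grow through those same encounters (δxy) and die off at rate γ, which produces the characteristic out-of-phase oscillation of the two populations.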

    On Efficient Reinforcement Learning for Full-length Game of StarCraft II

    StarCraft II (SC2) poses a grand challenge for reinforcement learning (RL); its main difficulties include a huge state space, a varying action space, and a long time horizon. In this work, we investigate a set of RL techniques for the full-length game of StarCraft II. We investigate a hierarchical RL approach involving extracted macro-actions and a hierarchical architecture of neural networks. We investigate a curriculum transfer training procedure and train the agent on a single machine with 4 GPUs and 48 CPU threads. On a 64x64 map and using restrictive units, we achieve a win rate of 99% against the level-1 built-in AI. Through the curriculum transfer learning algorithm and a mixture of combat models, we achieve a 93% win rate against the most difficult non-cheating built-in AI (level-7). In this extended version of the paper, we improve our architecture to train the agent against the cheating-level AIs and achieve win rates of 96%, 97%, and 94% against the level-8, level-9, and level-10 AIs, respectively. Our code is available at https://github.com/liuruoze/HierNet-SC2. To provide an AlphaStar-based baseline for our work, as well as for the research and open-source community, we reproduce a scaled-down version of AlphaStar, mini-AlphaStar (mAS). The latest version of mAS is 1.07, which can be trained on the raw action space of 564 actions. It is designed to be trained on a single commodity machine by making the hyper-parameters adjustable. We then compare our work with mAS using the same resources and show that our method is more effective. The code for mini-AlphaStar is available at https://github.com/liuruoze/mini-AlphaStar. We hope our study sheds some light on future research in efficient reinforcement learning for SC2 and other large-scale games. Comment: 48 pages, 21 figures