
    Regular Boardgames

    We propose a new General Game Playing (GGP) language called Regular Boardgames (RBG), which is based on the theory of regular languages. The objective of RBG is to join key properties such as expressiveness, efficiency, and naturalness of description in one GGP formalism, compensating for certain drawbacks of the existing languages. This often makes RBG more suitable for various research and practical developments in GGP. While intended mostly for describing board games, RBG is universal for the class of all finite deterministic turn-based games with perfect information. We establish the foundations of RBG and analyze it theoretically and experimentally, focusing on the efficiency of reasoning. Regular Boardgames is the first GGP language that allows efficient encoding and playing of games with complex rules and a large branching factor (e.g., amazons, arimaa, large chess variants, go, international checkers, paper soccer).
    Comment: AAAI 2019

    Dynamic Difficulty Adjustment

    One of the challenges a computer game developer faces when creating a new game is getting the difficulty right. Providing a game with the ability to automatically scale its difficulty to the current player would make games more engaging over a longer time. In this work we aim at a dynamic difficulty adjustment algorithm that can be used as a black box: universal, nonintrusive, and with guarantees on its performance. While a few commercial games claim to have such a system, and a few results on this topic have been published, to the best of our knowledge none of them satisfies all three of these properties.

    On the way to our destination we first consider a game as an interaction between a player and her opponent. In this context, assuming their goals are mutually exclusive, difficulty adjustment consists of tuning the skill of the opponent to match the skill of the player. We propose a way to estimate the latter and adjust the former based on ranking the moves available to each player. Two sets of empirical experiments demonstrate the power, but also the limitations, of this approach; most importantly, the assumptions we make restrict the class of games it can be applied to.

    Looking for universality, we drop the constraints on the types of games we consider. We rely on the power of supervised learning and use data collected from game testers to learn models of difficulty adjustment, as well as a mapping from game traces to models. Given a short game trace, the corresponding model tells the game what difficulty adjustment should be used. Using a self-developed game, we show that the predicted adjustments match players' preferences.

    The quality of the difficulty models depends on the quality of the existing training data. The desire to dispense with the need for such data leads us to the last approach. We formalize dynamic difficulty adjustment as a novel learning problem in the context of online learning and provide an algorithm to solve it, together with an upper bound on its performance. We show empirical results obtained in simulation and in two qualitatively different games with human participants. Due to its general nature, this algorithm can indeed be used as a black box for dynamic difficulty adjustment: it is applicable to any game with various difficulty states; it does not interfere with the player's experience; and it has a theoretical guarantee on how many mistakes it can possibly make.
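    A minimal sketch of the move-ranking idea from the first approach: estimate the player's skill from how highly their chosen moves rank under some evaluation function, then have the opponent pick a move of comparable rank. The running-average skill estimate, the toy evaluation function, and all parameter values below are assumptions made for illustration, not the thesis's actual algorithm.

```python
# Illustrative sketch only: rank-matching difficulty adjustment.
import random

def rank_of(move, moves, evaluate):
    """Normalized rank of `move` among `moves` under `evaluate`: 0.0 = best, 1.0 = worst."""
    ordered = sorted(moves, key=evaluate, reverse=True)
    return ordered.index(move) / max(len(moves) - 1, 1)

class RankMatchingOpponent:
    def __init__(self):
        self.player_skill = 0.5  # normalized rank estimate: 0 = expert, 1 = novice (assumed prior)

    def observe_player(self, chosen_move, available_moves, evaluate, lr=0.1):
        # Running estimate of the player's skill from the ranks of their chosen moves.
        r = rank_of(chosen_move, available_moves, evaluate)
        self.player_skill = (1 - lr) * self.player_skill + lr * r

    def choose_move(self, available_moves, evaluate):
        # Pick the opponent move whose rank is closest to the player's estimated rank.
        ordered = sorted(available_moves, key=evaluate, reverse=True)
        idx = round(self.player_skill * (len(ordered) - 1))
        return ordered[idx]

# Toy usage: "moves" are numbers and their value is the evaluation itself.
evaluate = lambda m: m
opp = RankMatchingOpponent()
for _ in range(20):
    moves = random.sample(range(10), 5)
    chosen = sorted(moves)[-2]  # a fairly strong, but not perfect, simulated player
    opp.observe_player(chosen, moves, evaluate)
print(opp.choose_move(list(range(10)), evaluate))  # an opponent move of similar rank
```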

    Pgx: Hardware-accelerated Parallel Game Simulators for Reinforcement Learning

    We propose Pgx, a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators. By leveraging the auto-vectorization and Just-In-Time (JIT) compilation of JAX, Pgx can efficiently scale to thousands of parallel executions over accelerators. In our experiments on a DGX-A100 workstation, we discovered that Pgx can simulate RL environments 10-100x faster than existing Python RL libraries. Pgx includes RL environments commonly used as benchmarks in RL research, such as backgammon, chess, shogi, and Go. Additionally, Pgx offers miniature game sets and baseline models to facilitate rapid research cycles. We demonstrate the efficient training of the Gumbel AlphaZero algorithm with Pgx environments. Overall, Pgx provides high-performance environment simulators for researchers to accelerate their RL experiments. Pgx is available at https://github.com/sotetsuk/pgx.
    Comment: 9 pages
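    The scaling claim above rests on a standard JAX pattern: write the environment step for a single state, then let jax.vmap vectorize it over a batch and jax.jit compile the result for the accelerator. The sketch below illustrates that pattern with a made-up toy environment; the dynamics, batch size, and function names are ours for illustration and are not Pgx's actual API.

```python
# Toy illustration of vmap + jit scaling, not Pgx itself.
import jax
import jax.numpy as jnp

def step(state, action):
    # Single-state dynamics of a made-up counting game: action 1 adds a point,
    # action 0 removes one; the episode "wins" once the score reaches 10.
    new_state = state + jnp.where(action == 1, 1, -1)
    reward = jnp.where(new_state >= 10, 1.0, 0.0)
    done = new_state >= 10
    return new_state, reward, done

# Vectorize over a batch of states/actions, then JIT-compile for the accelerator.
batched_step = jax.jit(jax.vmap(step))

batch_size = 4096  # thousands of environments advance in one call
states = jnp.zeros(batch_size, dtype=jnp.int32)
key = jax.random.PRNGKey(0)
actions = jax.random.bernoulli(key, 0.5, (batch_size,)).astype(jnp.int32)
states, rewards, dones = batched_step(states, actions)
```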

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
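    For readers unfamiliar with the method, the sketch below shows the four phases the survey organizes its discussion around: selection (here using the UCB1/UCT rule), expansion, random simulation, and backpropagation. The single-pile Nim game, the exploration constant, and the iteration budget are illustrative choices, not taken from the survey.

```python
# Minimal UCT sketch on a toy game (single-pile Nim), for illustration only.
import math, random

class NimState:
    """Single-pile Nim: players alternately take 1-3 stones; taking the last stone wins."""
    def __init__(self, stones=15, player=1):
        self.stones, self.player = stones, player
    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]
    def play(self, n):
        return NimState(self.stones - n, -self.player)
    def terminal(self):
        return self.stones == 0
    def winner(self):
        # The player who just moved took the last stone and wins.
        return -self.player

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.untried = [], state.legal_moves()
        self.visits, self.wins = 0, 0.0

def uct_search(root_state, iterations=2000, c=1.4):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes by the UCB1 value.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: add one child for a random untried move.
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.state.play(move), parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout to a terminal state.
        state = node.state
        while not state.terminal():
            state = state.play(random.choice(state.legal_moves()))
        # 4. Backpropagation: update visit and win counts along the path.
        winner = state.winner()
        while node is not None:
            node.visits += 1
            if node.parent is not None and node.parent.state.player == winner:
                node.wins += 1  # credit the player who moved into this node
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

print(uct_search(NimState(15)))  # typically 3, leaving a multiple of 4
```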

    Selective search in games of different complexity


    CH-Go: Online Go System Based on Chunk Data Storage

    Full text link
    The training and running of an online Go system require effective data management to deal with vast amounts of data, such as the initial Go game records, the feature data set obtained by representation learning, the experience data set from self-play, the randomly sampled Monte Carlo tree, and so on. Previous work has rarely mentioned this problem, but the ability and efficiency of the data management system determine the accuracy and speed of the Go system. To tackle this issue, we propose an online Go game system based on a chunk data storage method (CH-Go), which processes the 160k Go game records released by the Kiseido Go Server (KGS) and designs a Go encoder with 11 planes, plus a parallel processor and generator for better memory performance. Specifically, we store the data in chunks, use a chunk size of 1024 as a batch, and save the features and labels of each chunk as binary files. A small set of data is then randomly sampled each time for neural network training, accessed batch by batch through a yield method. The training part of the prototype includes three modules: a supervised learning module, a reinforcement learning module, and an online module. First, we apply Zobrist-guided hash coding to speed up Go board construction. Then we train a supervised learning policy network to initialize self-play for the generation of experience data, using the 160k Go game records released by KGS. Finally, we conduct reinforcement learning based on the REINFORCE algorithm. Experiments show that the training accuracy of CH-Go on the sampled 150 games is 99.14%, and the accuracy on the test set is as high as 98.82%. Under the condition of limited local computing power and time, we achieve a relatively good level of intelligence. Given that classical systems such as GOLAXY are not free and open, CH-Go realizes and maintains complete Internet openness.
    Comment: The 8th International Conference on Data Science and Systems
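    The chunked-storage idea described above can be illustrated with a short sketch: features and labels are written in chunks of 1024 positions as binary files, and a generator later yields one randomly chosen chunk per training step. The file layout, array shapes, and the use of NumPy .npy files here are our assumptions for illustration, not CH-Go's actual on-disk format.

```python
# Illustrative sketch of chunked binary storage with a yield-based batch generator.
import glob
import os
import numpy as np

CHUNK_SIZE = 1024  # one chunk is used as one training batch

def write_chunks(features, labels, out_dir="chunks"):
    """Split features and labels into chunk-sized binary files."""
    os.makedirs(out_dir, exist_ok=True)
    for i in range(0, len(features), CHUNK_SIZE):
        np.save(f"{out_dir}/feat_{i // CHUNK_SIZE:05d}.npy", features[i:i + CHUNK_SIZE])
        np.save(f"{out_dir}/lab_{i // CHUNK_SIZE:05d}.npy", labels[i:i + CHUNK_SIZE])

def batch_generator(chunk_dir="chunks", seed=0):
    """Yield one randomly chosen chunk (features, labels) per training step."""
    rng = np.random.default_rng(seed)
    feat_files = sorted(glob.glob(f"{chunk_dir}/feat_*.npy"))
    lab_files = sorted(glob.glob(f"{chunk_dir}/lab_*.npy"))
    while True:
        idx = rng.integers(len(feat_files))
        yield np.load(feat_files[idx]), np.load(lab_files[idx])

# Example: fake data standing in for encoded Go positions (assumed 11 feature planes).
features = np.zeros((4096, 11, 19, 19), dtype=np.float32)
labels = np.zeros(4096, dtype=np.int64)
write_chunks(features, labels)
batches = batch_generator()
x, y = next(batches)  # one chunk-sized batch, loaded from disk on demand
```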