15 research outputs found

    Playing Cassino with Reinforcement Learning

    Reinforcement learning algorithms have been used to create game-playing agents for various games, mostly deterministic games such as chess, shogi, and Go. This study used deep Q-learning to create an agent that plays Cassino, a non-deterministic card game. The agent's performance was compared against that of a Cassino mobile app. Results showed that the trained models did not perform well and had particular trouble learning build actions, which are important in Cassino. Future research could experiment with other reinforcement learning algorithms to see whether they handle build actions better.
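
    The abstract includes no code, but the setup it describes maps onto a standard deep Q-learning loop. The sketch below is a minimal illustration in Python; the state encoding, action enumeration, network shape, and hyper-parameters are assumptions for a card game like Cassino, not details from the study.

```python
# Minimal deep Q-learning sketch in the spirit of the study above, not its
# actual code. STATE_DIM, N_ACTIONS, the network shape, and all
# hyper-parameters are illustrative assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 64   # assumed fixed-length encoding of hand, table cards, and builds
N_ACTIONS = 32   # assumed enumeration of trail / capture / build actions

q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())  # target net is re-synced periodically
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                   # (s, a, r, s_next, done) tuples
GAMMA, EPSILON = 0.99, 0.1

def act(state, legal_actions):
    """Epsilon-greedy over legal moves only; illegal moves are masked out."""
    if random.random() < EPSILON:
        return random.choice(legal_actions)
    with torch.no_grad():
        q = q_net(torch.as_tensor(state, dtype=torch.float32))
    mask = torch.full((N_ACTIONS,), float("-inf"))
    mask[legal_actions] = 0.0
    return int((q + mask).argmax())

def train_step(batch_size=32):
    """One DQN update toward the TD target r + gamma * max_a' Q_target(s', a')."""
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = zip(*random.sample(replay, batch_size))
    s = torch.tensor(s, dtype=torch.float32)
    s2 = torch.tensor(s2, dtype=torch.float32)
    r = torch.tensor(r, dtype=torch.float32)
    done = torch.tensor(done, dtype=torch.float32)
    q = q_net(s).gather(1, torch.tensor(a).unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * (1.0 - done) * target_net(s2).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

    One plausible reading of the reported difficulty is visible in this framing: build actions enlarge the action space and pay off only several turns later, so their Q-values are slow to learn.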

    A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks

    In the domain of strategic game playing, the use of statistical techniques such as the Upper Confidence for Trees (UCT) algorithm has become the norm, as they offer many benefits over classical algorithms. These benefits include requiring no game-specific strategic knowledge and offering time-scalable performance. UCT does not incorporate any strategic information specific to the game considered, but instead uses repeated sampling to effectively brute-force search through the game tree or search space. The lack of game-specific knowledge in UCT is thus both a benefit and a strategic disadvantage. Pattern recognition techniques, specifically Neural Networks (NNs), were identified as a means of addressing the lack of game-specific knowledge in UCT. Through a novel hybridisation technique which combines UCT with trained NNs for pruning, the UCT-NN algorithm was derived. The NN component of UCT-NN was trained using a UCT self-play scheme to generate game-specific knowledge without the need to construct and manage game databases for training purposes. The UCT-NN algorithm is outlined for pruning in the game of Go-Moku as a candidate case study for this research. The UCT-NN algorithm contains three major parameters, which emerge from the UCT algorithm, the use of NNs, and the pruning schemes considered. Suitable methods for finding candidate values for these three parameters were outlined and applied to the game of Go-Moku on a 5 by 5 board. An empirical investigation of the playing performance of UCT-NN in comparison to UCT was conducted using three benchmarks: a common randomly moving opponent, a common UCTmax player that is given a large amount of playing time, and a pair-wise tournament between UCT-NN and UCT. The results of the performance evaluation for 5 by 5 Go-Moku were promising, which prompted an evaluation on a larger 9 by 9 Go-Moku board. The results of both evaluations indicate that the time allocated to the UCT-NN algorithm directly affects its performance relative to UCT. The UCT-NN algorithm generally performs better than UCT in games with very limited time constraints in all benchmarks considered, except when playing against a randomly moving player in 9 by 9 Go-Moku. In real-time and near-real-time Go-Moku games, UCT-NN provides statistically significant improvements over UCT. The findings of this research contribute to the realisation of applying game-specific knowledge to the UCT algorithm.
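
    As a rough illustration of the hybridisation described above, the following Python sketch prunes the UCT expansion step with a move-scoring network. The Node class, the game and net interfaces, the keep-top-k pruning rule, and all constants are assumptions for illustration, not the thesis's actual design or parameters.

```python
# Illustrative UCT search with NN-based pruning at expansion, loosely after
# the UCT-NN idea above.
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}        # move -> Node
        self.visits, self.value = 0, 0.0

def ucb1(child, parent_visits, c=1.4):
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def expand(node, game, net, k=5):
    """The hybrid step: rank legal moves with the trained network and keep
    only the top k as children, pruning the rest from the search tree."""
    moves = game.legal_moves(node.state)
    ranked = sorted(moves, key=lambda m: net.score(node.state, m), reverse=True)
    for m in ranked[:k]:
        node.children[m] = Node(game.play(node.state, m), parent=node)

def uct_nn_search(root_state, game, net, n_iter=10_000):
    root = Node(root_state)
    for _ in range(n_iter):
        node = root
        while node.children:   # selection by UCB1
            node = max(node.children.values(), key=lambda ch: ucb1(ch, node.visits))
        if not game.is_terminal(node.state):
            expand(node, game, net)
            if node.children:
                node = random.choice(list(node.children.values()))
        reward = game.rollout(node.state)   # random playout from the leaf
        while node is not None:             # backpropagation (single-perspective
            node.visits += 1                # reward, for brevity)
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda m: root.children[m].visits)
```

    The trade-off the abstract reports falls out of this structure: pruning shrinks the branching factor, which helps under tight time budgets, but a mistaken network ranking can cut good moves out of the tree entirely.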

    Solving The Flexible Job Shop Problem using Multi-Objective Optimizer with Solution Characteristic Extraction

    It is difficult to find optimal scheduling solutions for abstract scheduling problems with massively parallel tasks on multiprocessors because these problems are NP-complete. In this paper, a solution-searching strategy called solution characteristic extraction is proposed as a multi-objective optimizer for solving flexible job shop problems (FJSP). These problems are concerned with finishing the assigned jobs with minimal critical machine workload, total workload, and completion time. A suitable job assignment must simultaneously consider processor performance, job complexity, and the suitability of each job for each individual processor. To test the efficiency and robustness of the proposed method, the experiments contain two groups of benchmarks: with and without release-time constraints. Each benchmark includes a number of heterogeneous processors and different jobs for execution. The results indicate that the proposed method can find more potential solutions and outperforms related methods.
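
    The three objectives named in the abstract can be made concrete with a small Pareto-dominance check, which any multi-objective optimizer for the FJSP needs in some form. The sketch below is one plausible rendering; the schedule encoding (operations as dicts with machine, start, and duration) is an assumption, not the paper's representation.

```python
# Sketch of the three FJSP objectives plus the Pareto-dominance test a
# multi-objective optimizer uses to keep its candidate solution set.
def objectives(schedule):
    """Return (completion time, total workload, critical machine workload)."""
    load, finish = {}, {}
    for op in schedule:   # op: {"machine": ..., "start": ..., "duration": ...}
        m = op["machine"]
        load[m] = load.get(m, 0) + op["duration"]
        finish[m] = max(finish.get(m, 0), op["start"] + op["duration"])
    return max(finish.values()), sum(load.values()), max(load.values())

def dominates(f, g):
    """f Pareto-dominates g: no worse on every objective, better on at least one."""
    return all(x <= y for x, y in zip(f, g)) and any(x < y for x, y in zip(f, g))

def pareto_front(schedules):
    """Keep the non-dominated schedules; these are the candidate trade-offs."""
    scored = [(objectives(s), s) for s in schedules]
    return [s for f, s in scored if not any(dominates(g, f) for g, _ in scored)]
```

    Because the three objectives conflict (loading one fast machine heavily can lower completion time while raising critical machine workload), the optimizer returns a front of trade-offs rather than a single best schedule.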

    Searching by learning: Exploring artificial general intelligence on small board games by deep reinforcement learning

    In deep reinforcement learning, searching and learning techniques are two important components. They can be used independently and in combination to deal with different problems in AI, and these results have inspired research into artificial general intelligence (AGI). We study table-based classic Q-learning on the General Game Playing (GGP) system, showing that classic Q-learning works on GGP, although convergence is slow and learning complex games is computationally expensive. This dissertation uses an AlphaZero-like self-play framework to explore AGI on small games. By tuning different hyper-parameters, the role, effects, and contributions of searching and learning are studied. A further experiment shows that search techniques can contribute as experts to generate better training examples that speed up the start phase of training. To extend the AlphaZero-like self-play approach to complex single-player games, the Morpion Solitaire game is implemented in combination with the Ranked Reward method; our first AlphaZero-based approach is able to achieve a near-human best record. Overall, in this thesis, both searching and learning techniques are studied (by themselves and in combination) in GGP and AlphaZero-like self-play systems. We do so for the purpose of making steps towards artificial general intelligence: towards systems that exhibit intelligent behavior in more than one domain.
    China Scholarship Council; Algorithms and the Foundations of Software Technology
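
    For reference, the table-based classic Q-learning the dissertation studies follows the textbook update Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The sketch below shows that update; the game interface is an assumed stand-in, since GGP games are really specified in the Game Description Language rather than Python.

```python
# Table-based classic Q-learning, sketched under an assumed game interface
# (hashable states, legal_moves, step). Hyper-parameters are illustrative.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1
Q = defaultdict(float)   # (state, action) -> estimated value

def choose(s, actions):
    """Epsilon-greedy over the legal actions."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def run_episode(game):
    """One episode applying the classic update
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    s = game.initial_state()
    while not game.is_terminal(s):
        a = choose(s, game.legal_moves(s))
        s2, r = game.step(s, a)   # successor state and immediate reward
        best_next = 0.0 if game.is_terminal(s2) else max(
            Q[(s2, a2)] for a2 in game.legal_moves(s2))
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2
```

    The table Q grows with the number of distinct (state, action) pairs visited, which is exactly why the dissertation finds convergence slow and complex games computationally expensive to learn this way.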

    The Wellesley News (05-01-1969)

    https://repository.wellesley.edu/wcnews/1234/thumbnail.jp

    April 12, 2007

    The Breeze is the student newspaper of James Madison University in Harrisonburg, Virginia.

    Monte-Carlo tree search enhancements for one-player and two-player domains

    Monte Carlo Tree Search for games with Hidden Information and Uncertainty

    Monte Carlo Tree Search (MCTS) is an AI technique that has been successfully applied to many deterministic games of perfect information, leading to large advances in a number of domains, such as Go and General Game Playing. Imperfect information games are less well studied in the field of AI despite being popular and of significant commercial interest, for example in the case of computer and mobile adaptations of turn-based board and card games. This is largely because hidden information and uncertainty lead to a large increase in complexity compared to perfect information games. In this thesis, MCTS is extended to games with hidden information and uncertainty through the introduction of the Information Set MCTS (ISMCTS) family of algorithms. It is demonstrated that ISMCTS can handle hidden information and uncertainty in a variety of complex board and card games. This is achieved whilst preserving the general applicability of MCTS and using computational budgets appropriate for use in a commercial game. The ISMCTS algorithm is shown to outperform the existing approach of Perfect Information Monte Carlo (PIMC) search. Additionally, it is shown that ISMCTS can be used to solve two known issues with PIMC search, namely strategy fusion and non-locality. ISMCTS has been integrated into a commercial game, Spades by AI Factory, with over 2.5 million downloads. The Information Capture And ReUSe (ICARUS) framework is also introduced in this thesis. The ICARUS framework generalises MCTS enhancements in terms of information capture (from MCTS simulations) and reuse (to improve MCTS tree and simulation policies). The ICARUS framework is used to express existing enhancements, to provide a tool for designing new ones, and to rigorously define how MCTS enhancements can be combined. The ICARUS framework is tested across a wide variety of games.
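
    A rough sketch of a single-observer ISMCTS loop may help fix the idea: each iteration samples a determinization (a fully specified state consistent with the searcher's information set) and searches a shared information-set tree, restricting moves to those legal in that determinization and tracking availability counts for the UCB term. All interfaces below (sample_determinization, legal_moves, play, rollout) and the exploration constant are assumptions; the thesis defines the algorithm family precisely.

```python
# Single-observer ISMCTS sketch: one determinization per iteration, one
# shared information-set tree, availability-normalised UCB selection.
import math
import random

class ISNode:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}   # move -> ISNode
        self.visits, self.value, self.availability = 0, 0.0, 0

def select_move(node, legal, c=0.7):
    """UCB over moves legal in this determinization, normalised by how often
    each move was available rather than by the parent's visit count."""
    for m in legal:
        if m in node.children:
            node.children[m].availability += 1
    untried = [m for m in legal if m not in node.children]
    if untried:
        return random.choice(untried), True
    def ucb(m):
        ch = node.children[m]
        return ch.value / ch.visits + c * math.sqrt(math.log(ch.availability) / ch.visits)
    return max(legal, key=ucb), False

def ismcts(info_state, game, n_iter=10_000):
    root = ISNode()
    for _ in range(n_iter):
        state = game.sample_determinization(info_state)  # e.g. deal hidden cards at random
        node = root
        while not game.is_terminal(state):
            move, is_new = select_move(node, game.legal_moves(state))
            if is_new:
                node.children[move] = ISNode(parent=node)
            state = game.play(state, move)
            node = node.children[move]
            if is_new:
                break                        # expand one node per iteration
        reward = game.rollout(state)         # single-perspective reward, for brevity
        while node is not None:              # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda m: root.children[m].visits)
```

    Because statistics accumulate in one tree over many determinizations rather than in a separate perfect-information search per deal, this structure is what lets ISMCTS avoid the strategy fusion problem attributed to PIMC search above.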