15 research outputs found

    Playing Cassino with Reinforcement Learning

    Reinforcement learning algorithms have been used to create game-playing agents for various games, mostly deterministic games such as chess, shogi, and Go. This study used deep Q-learning to create an agent that plays Cassino, a non-deterministic card game. The agent's performance was compared against that of a Cassino mobile app. Results showed that the trained models did not perform well and had particular trouble learning build actions, which are important in Cassino. Future research could experiment with other reinforcement learning algorithms to see whether they handle build actions better.
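
    The abstract includes no code, but the setup it describes maps onto a standard deep Q-learning loop. The sketch below is a minimal illustration in Python; the state encoding, action enumeration, network shape, and hyper-parameters are assumptions for a card game like Cassino, not details from the study.

```python
# Minimal deep Q-learning sketch in the spirit of the study above, not its
# actual code. STATE_DIM, N_ACTIONS, the network shape, and all
# hyper-parameters are illustrative assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 64   # assumed fixed-length encoding of hand, table cards, and builds
N_ACTIONS = 32   # assumed enumeration of trail / capture / build actions

q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())  # target net is re-synced periodically
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                   # (s, a, r, s_next, done) tuples
GAMMA, EPSILON = 0.99, 0.1

def act(state, legal_actions):
    """Epsilon-greedy over legal moves only; illegal moves are masked out."""
    if random.random() < EPSILON:
        return random.choice(legal_actions)
    with torch.no_grad():
        q = q_net(torch.as_tensor(state, dtype=torch.float32))
    mask = torch.full((N_ACTIONS,), float("-inf"))
    mask[legal_actions] = 0.0
    return int((q + mask).argmax())

def train_step(batch_size=32):
    """One DQN update toward the TD target r + gamma * max_a' Q_target(s', a')."""
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = zip(*random.sample(replay, batch_size))
    s = torch.tensor(s, dtype=torch.float32)
    s2 = torch.tensor(s2, dtype=torch.float32)
    r = torch.tensor(r, dtype=torch.float32)
    done = torch.tensor(done, dtype=torch.float32)
    q = q_net(s).gather(1, torch.tensor(a).unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * (1.0 - done) * target_net(s2).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

    One plausible reading of the reported difficulty is visible in this framing: build actions enlarge the action space and pay off only several turns later, so their Q-values are slow to learn.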

    A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks

    In the domain of strategic game playing, the use of statistical techniques such as the Upper Confidence for Trees (UCT) algorithm has become the norm, as they offer many benefits over classical algorithms. These benefits include requiring no game-specific strategic knowledge and offering time-scalable performance. UCT does not incorporate any strategic information specific to the game considered, but instead uses repeated sampling to effectively brute-force search through the game tree or search space. The lack of game-specific knowledge in UCT is thus both a benefit and a strategic disadvantage. Pattern recognition techniques, specifically Neural Networks (NNs), were identified as a means of addressing the lack of game-specific knowledge in UCT. Through a novel hybridisation technique which combines UCT with trained NNs for pruning, the UCT-NN algorithm was derived. The NN component of UCT-NN was trained using a UCT self-play scheme to generate game-specific knowledge without the need to construct and manage game databases for training purposes. The UCT-NN algorithm is outlined for pruning in the game of Go-Moku as a candidate case study for this research. The UCT-NN algorithm contains three major parameters, which emerge from the UCT algorithm, the use of NNs, and the pruning schemes considered. Suitable methods for finding candidate values for these three parameters were outlined and applied to the game of Go-Moku on a 5 by 5 board. An empirical investigation of the playing performance of UCT-NN in comparison to UCT was conducted using three benchmarks: a common randomly moving opponent, a common UCTmax player that is given a large amount of playing time, and a pair-wise tournament between UCT-NN and UCT. The results of the performance evaluation for 5 by 5 Go-Moku were promising, which prompted an evaluation on a larger 9 by 9 Go-Moku board. The results of both evaluations indicate that the time allocated to the UCT-NN algorithm directly affects its performance relative to UCT. The UCT-NN algorithm generally performs better than UCT in games with very limited time constraints in all benchmarks considered, except when playing against a randomly moving player in 9 by 9 Go-Moku. In real-time and near-real-time Go-Moku games, UCT-NN provides statistically significant improvements over UCT. The findings of this research contribute to the realisation of applying game-specific knowledge to the UCT algorithm.
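
    As a rough illustration of the hybridisation described above, the following Python sketch prunes the UCT expansion step with a move-scoring network. The Node class, the game and net interfaces, the keep-top-k pruning rule, and all constants are assumptions for illustration, not the thesis's actual design or parameters.

```python
# Illustrative UCT search with NN-based pruning at expansion, loosely after
# the UCT-NN idea above.
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}        # move -> Node
        self.visits, self.value = 0, 0.0

def ucb1(child, parent_visits, c=1.4):
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def expand(node, game, net, k=5):
    """The hybrid step: rank legal moves with the trained network and keep
    only the top k as children, pruning the rest from the search tree."""
    moves = game.legal_moves(node.state)
    ranked = sorted(moves, key=lambda m: net.score(node.state, m), reverse=True)
    for m in ranked[:k]:
        node.children[m] = Node(game.play(node.state, m), parent=node)

def uct_nn_search(root_state, game, net, n_iter=10_000):
    root = Node(root_state)
    for _ in range(n_iter):
        node = root
        while node.children:   # selection by UCB1
            node = max(node.children.values(), key=lambda ch: ucb1(ch, node.visits))
        if not game.is_terminal(node.state):
            expand(node, game, net)
            if node.children:
                node = random.choice(list(node.children.values()))
        reward = game.rollout(node.state)   # random playout from the leaf
        while node is not None:             # backpropagation (single-perspective
            node.visits += 1                # reward, for brevity)
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda m: root.children[m].visits)
```

    The trade-off the abstract reports falls out of this structure: pruning shrinks the branching factor, which helps under tight time budgets, but a mistaken network ranking can cut good moves out of the tree entirely.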

    Solving The Flexible Job Shop Problem using Multi-Objective Optimizer with Solution Characteristic Extraction

    It is difficult to find optimal scheduling solutions for abstract scheduling problems with massively parallel tasks on multiprocessors because these problems are NP-complete. In this paper, a solution-searching strategy called solution characteristic extraction is proposed as a multi-objective optimizer for solving flexible job shop problems (FJSP). These problems are concerned with finishing the assigned jobs with minimal critical machine workload, total workload, and completion time. A suitable job assignment must simultaneously consider processor performance, job complexity, and the suitability of each job for each individual processor. To test the efficiency and robustness of the proposed method, the experiments contain two groups of benchmarks: with and without release-time constraints. Each benchmark includes a number of heterogeneous processors and different jobs for execution. The results indicate that the proposed method can find more potential solutions and outperforms related methods.
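
    The three objectives named in the abstract can be made concrete with a small Pareto-dominance check, which any multi-objective optimizer for the FJSP needs in some form. The sketch below is one plausible rendering; the schedule encoding (operations as dicts with machine, start, and duration) is an assumption, not the paper's representation.

```python
# Sketch of the three FJSP objectives plus the Pareto-dominance test a
# multi-objective optimizer uses to keep its candidate solution set.
def objectives(schedule):
    """Return (completion time, total workload, critical machine workload)."""
    load, finish = {}, {}
    for op in schedule:   # op: {"machine": ..., "start": ..., "duration": ...}
        m = op["machine"]
        load[m] = load.get(m, 0) + op["duration"]
        finish[m] = max(finish.get(m, 0), op["start"] + op["duration"])
    return max(finish.values()), sum(load.values()), max(load.values())

def dominates(f, g):
    """f Pareto-dominates g: no worse on every objective, better on at least one."""
    return all(x <= y for x, y in zip(f, g)) and any(x < y for x, y in zip(f, g))

def pareto_front(schedules):
    """Keep the non-dominated schedules; these are the candidate trade-offs."""
    scored = [(objectives(s), s) for s in schedules]
    return [s for f, s in scored if not any(dominates(g, f) for g, _ in scored)]
```

    Because the three objectives conflict (loading one fast machine heavily can lower completion time while raising critical machine workload), the optimizer returns a front of trade-offs rather than a single best schedule.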

    Searching by learning: Exploring artificial general intelligence on small board games by deep reinforcement learning

    In deep reinforcement learning, searching and learning techniques are two important components. They can be used independently and in combination to deal with different problems in AI, and these results have inspired research into artificial general intelligence (AGI). We study table-based classic Q-learning on the General Game Playing (GGP) system, showing that classic Q-learning works on GGP, although convergence is slow and learning complex games is computationally expensive. This dissertation uses an AlphaZero-like self-play framework to explore AGI on small games. By tuning different hyper-parameters, the role, effects, and contributions of searching and learning are studied. A further experiment shows that search techniques can contribute as experts to generate better training examples that speed up the start phase of training. To extend the AlphaZero-like self-play approach to complex single-player games, the Morpion Solitaire game is implemented in combination with the Ranked Reward method; our first AlphaZero-based approach is able to achieve a near-human best record. Overall, in this thesis, both searching and learning techniques are studied (by themselves and in combination) in GGP and AlphaZero-like self-play systems. We do so for the purpose of making steps towards artificial general intelligence: towards systems that exhibit intelligent behavior in more than one domain.
    China Scholarship Council; Algorithms and the Foundations of Software Technology
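
    For reference, the table-based classic Q-learning the dissertation studies follows the textbook update Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The sketch below shows that update; the game interface is an assumed stand-in, since GGP games are really specified in the Game Description Language rather than Python.

```python
# Table-based classic Q-learning, sketched under an assumed game interface
# (hashable states, legal_moves, step). Hyper-parameters are illustrative.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1
Q = defaultdict(float)   # (state, action) -> estimated value

def choose(s, actions):
    """Epsilon-greedy over the legal actions."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def run_episode(game):
    """One episode applying the classic update
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    s = game.initial_state()
    while not game.is_terminal(s):
        a = choose(s, game.legal_moves(s))
        s2, r = game.step(s, a)   # successor state and immediate reward
        best_next = 0.0 if game.is_terminal(s2) else max(
            Q[(s2, a2)] for a2 in game.legal_moves(s2))
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2
```

    The table Q grows with the number of distinct (state, action) pairs visited, which is exactly why the dissertation finds convergence slow and complex games computationally expensive to learn this way.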

    The Wellesley News (05-01-1969)

    https://repository.wellesley.edu/wcnews/1234/thumbnail.jp

    April 12, 2007

    The Breeze is the student newspaper of James Madison University in Harrisonburg, Virginia.

    Monte-Carlo tree search enhancements for one-player and two-player domains

    Monte Carlo Tree Search for games with Hidden Information and Uncertainty

    Monte Carlo Tree Search (MCTS) is an AI technique that has been successfully applied to many deterministic games of perfect information, leading to large advances in a number of domains, such as Go and General Game Playing. Imperfect information games are less well studied in the field of AI despite being popular and of significant commercial interest, for example in the case of computer and mobile adaptations of turn-based board and card games. This is largely because hidden information and uncertainty lead to a large increase in complexity compared to perfect information games. In this thesis, MCTS is extended to games with hidden information and uncertainty through the introduction of the Information Set MCTS (ISMCTS) family of algorithms. It is demonstrated that ISMCTS can handle hidden information and uncertainty in a variety of complex board and card games. This is achieved whilst preserving the general applicability of MCTS and using computational budgets appropriate for use in a commercial game. The ISMCTS algorithm is shown to outperform the existing approach of Perfect Information Monte Carlo (PIMC) search. Additionally, it is shown that ISMCTS can be used to solve two known issues with PIMC search, namely strategy fusion and non-locality. ISMCTS has been integrated into a commercial game, Spades by AI Factory, with over 2.5 million downloads. The Information Capture And ReUSe (ICARUS) framework is also introduced in this thesis. The ICARUS framework generalises MCTS enhancements in terms of information capture (from MCTS simulations) and reuse (to improve MCTS tree and simulation policies). The ICARUS framework is used to express existing enhancements, to provide a tool for designing new ones, and to rigorously define how MCTS enhancements can be combined. The ICARUS framework is tested across a wide variety of games.
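
    A rough sketch of a single-observer ISMCTS loop may help fix the idea: each iteration samples a determinization (a fully specified state consistent with the searcher's information set) and searches a shared information-set tree, restricting moves to those legal in that determinization and tracking availability counts for the UCB term. All interfaces below (sample_determinization, legal_moves, play, rollout) and the exploration constant are assumptions; the thesis defines the algorithm family precisely.

```python
# Single-observer ISMCTS sketch: one determinization per iteration, one
# shared information-set tree, availability-normalised UCB selection.
import math
import random

class ISNode:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}   # move -> ISNode
        self.visits, self.value, self.availability = 0, 0.0, 0

def select_move(node, legal, c=0.7):
    """UCB over moves legal in this determinization, normalised by how often
    each move was available rather than by the parent's visit count."""
    for m in legal:
        if m in node.children:
            node.children[m].availability += 1
    untried = [m for m in legal if m not in node.children]
    if untried:
        return random.choice(untried), True
    def ucb(m):
        ch = node.children[m]
        return ch.value / ch.visits + c * math.sqrt(math.log(ch.availability) / ch.visits)
    return max(legal, key=ucb), False

def ismcts(info_state, game, n_iter=10_000):
    root = ISNode()
    for _ in range(n_iter):
        state = game.sample_determinization(info_state)  # e.g. deal hidden cards at random
        node = root
        while not game.is_terminal(state):
            move, is_new = select_move(node, game.legal_moves(state))
            if is_new:
                node.children[move] = ISNode(parent=node)
            state = game.play(state, move)
            node = node.children[move]
            if is_new:
                break                        # expand one node per iteration
        reward = game.rollout(state)         # single-perspective reward, for brevity
        while node is not None:              # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda m: root.children[m].visits)
```

    Because statistics accumulate in one tree over many determinizations rather than in a separate perfect-information search per deal, this structure is what lets ISMCTS avoid the strategy fusion problem attributed to PIMC search above.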