30 research outputs found

    Depth, balancing, and limits of the Elo model

    Much work has been devoted to the computational complexity of games. However, such measures are not necessarily relevant for estimating complexity in human terms. Therefore, human-centered measures have been proposed, e.g. the depth. This paper discusses the depth of various games and extends it to a continuous measure. We provide new depth results and present tools (given first move, pie rule, size extension) for increasing it. We also use these measures to analyze games and opening moves in Y, NoGo, and Killall Go, and the effect of pie rules.
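
    As background for the depth measure discussed above, a common formalization from the games literature (an assumption here, not necessarily this paper's exact definition) counts the number of distinguishable Elo skill classes between the weakest and strongest possible players:

        % Elo model: probability that a player rated Delta E points higher wins
        P(\text{win}) = \frac{1}{1 + 10^{-\Delta E / 400}}
        % gap separating consecutive skill classes, chosen so the stronger
        % side wins with probability p^* (e.g. p^* = 0.75):
        \delta = 400 \, \log_{10} \frac{p^*}{1 - p^*}
        % depth as a class count, and its continuous extension (drop the floor):
        d = \left\lfloor \frac{E_{\max} - E_{\min}}{\delta} \right\rfloor + 1,
        \qquad d_{\mathrm{cont}} = \frac{E_{\max} - E_{\min}}{\delta}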

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
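
    To make the core algorithm concrete, below is a minimal UCT sketch in Python. It is illustrative only: the game interface (legal_moves, play, is_terminal, result) is assumed rather than taken from the survey, and the single-perspective reward omits the sign flip needed at alternating levels of a two-player tree.

        import math
        import random

        class Node:
            def __init__(self, state, parent=None, move=None):
                self.state, self.parent, self.move = state, parent, move
                self.children = []
                self.untried = list(state.legal_moves())  # moves not yet expanded
                self.visits, self.value = 0, 0.0

            def ucb1(self, c=1.4):
                # UCB1: average reward plus an exploration bonus for
                # children that have been visited rarely.
                return (self.value / self.visits
                        + c * math.sqrt(math.log(self.parent.visits) / self.visits))

        def uct_search(root_state, iterations=1000):
            root = Node(root_state)
            for _ in range(iterations):
                node = root
                # 1. Selection: descend while fully expanded and non-terminal.
                while not node.untried and node.children:
                    node = max(node.children, key=Node.ucb1)
                # 2. Expansion: add one untried child.
                if node.untried:
                    move = node.untried.pop(random.randrange(len(node.untried)))
                    node = Node(node.state.play(move), parent=node, move=move)
                    node.parent.children.append(node)
                # 3. Simulation: random playout to a terminal state.
                state = node.state
                while not state.is_terminal():
                    state = state.play(random.choice(state.legal_moves()))
                reward = state.result()  # e.g. 1 if the root player won, else 0
                # 4. Backpropagation: update statistics along the path.
                while node is not None:
                    node.visits += 1
                    node.value += reward
                    node = node.parent
            # Play the most-visited root move, the usual robust choice.
            return max(root.children, key=lambda n: n.visits).move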

    The Computational Intelligence of MoGo Revealed in Taiwan's Computer Go Tournaments

    The authors are extremely grateful to Grid5000 for helping in designing and experimenting around Monte-Carlo Tree Search. In order to promote computer Go and stimulate further development and research in the field, the event activities "Computational Intelligence Forum" and "World 9×9 Computer Go Championship" were held in Taiwan. This study focuses on the invited games played in the tournament "Taiwanese Go players versus the computer program MoGo," held at National University of Tainan (NUTN). Several Taiwanese Go players, including one 9-Dan professional Go player and eight amateur Go players, were invited by NUTN to play against MoGo from August 26 to October 4, 2008. The MoGo program combines All Moves As First (AMAF)/Rapid Action Value Estimation (RAVE) values, online "UCT-like" values, offline values extracted from databases, and expert rules. Additionally, four properties of MoGo are analyzed: (1) the weakness in corners, (2) the scaling over time, (3) the behavior in handicap games, and (4) the main strength of MoGo in contact fights. The results reveal that MoGo can reach the level of 3 Dan, with (1) good skills for fights, (2) weaknesses in corners, in particular in "semeai" situations, and (3) weaknesses in favorable situations such as handicap games. It is hoped that advances in artificial intelligence and computational power will enable considerable progress in the field of computer Go, with the aim of reaching the same level as computer chess or Chinese chess in the future.
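
    The AMAF/RAVE combination mentioned above is, in typical implementations, a visit-count-weighted blend of the node's tree value and its all-moves-as-first value. A minimal sketch following the commonly used schedule of Gelly and Silver (an assumption stated for illustration, not a claim about MoGo's exact internals):

        import math

        def blended_value(q, n, q_rave, k=1000.0):
            """Blend the node's tree estimate q (from n real visits) with
            its AMAF/RAVE estimate q_rave. beta starts near 1, so the
            fast-but-biased RAVE value dominates early, and decays toward
            0 as real visits accumulate."""
            beta = math.sqrt(k / (3.0 * n + k))
            return (1.0 - beta) * q + beta * q_rave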

    AlphaZe∗∗: AlphaZero-like baselines for imperfect information games are surprisingly strong

    In recent years, deep neural networks for strategy games have made significant progress. AlphaZero-like frameworks, which combine Monte-Carlo tree search with reinforcement learning, have been successfully applied to numerous games with perfect information. However, they have not been developed for domains where uncertainty and unknowns abound, and are therefore often considered unsuitable due to imperfect observations. Here, we challenge this view and argue that they are a viable alternative for games with imperfect information, a domain currently dominated by heuristic approaches or methods explicitly designed for hidden information, such as oracle-based techniques. To this end, we introduce a novel algorithm based solely on reinforcement learning, called AlphaZe∗∗, which is an AlphaZero-based framework for games with imperfect information. We examine its learning convergence on the games Stratego and DarkHex and show that it is a surprisingly strong baseline while using a model-based approach: it achieves win rates similar to those of other Stratego bots like Pipeline Policy-Space Response Oracle (P2SRO), though it does not win in direct comparison against P2SRO or reach the much stronger numbers of DeepNash. Compared to heuristic and oracle-based approaches, AlphaZe∗∗ can easily deal with rule changes, e.g., when more information than usual is given, and drastically outperforms other approaches in this respect.
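
    For reference, the move-selection rule at the heart of AlphaZero-style frameworks such as this one is PUCT, which biases exploration by the network's policy prior. A minimal sketch (variable names are illustrative, not taken from the paper):

        import math

        def puct_score(q, n_sa, p_sa, n_s, c_puct=1.5):
            """AlphaZero-style PUCT: mean action value q plus an exploration
            bonus proportional to the policy prior p_sa for the action,
            where n_s is the parent's visit count and n_sa the action's."""
            return q + c_puct * p_sa * math.sqrt(n_s) / (1 + n_sa)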

    Guiding Monte Carlo tree searches with neural networks in the game of go

    A dissertation submitted in fulfillment of the requirements for the degree of Master in Computer Science and Computer Engineering. Abstract: The game of Go remains one of the few deterministic perfect-information games where computer players still struggle against professional human players. In this work two methods of deriving artificial neural networks, by genetic evolution of symbiotic populations and by training multilayer perceptron networks with backpropagation, are analyzed for the production of a neural network suitable for guiding a Monte Carlo tree search algorithm. This family of algorithms has been the most successful in computer Go software in the last decade. Using a neural network to reduce the branching complexity of the search is an approach that is currently being revitalized with the advent of deep convolutional neural networks, which, however, require computational resources that many computers still lack. This work explores the impact of simpler neural networks for the purpose of guiding Monte Carlo tree searches, and the production of a state-of-the-art computer Go program. To this end, several improvements to Monte Carlo tree searches are also explored, along with considerations related to the parallelization of the search and the addition of other components necessary for competitive programs, such as time-control mechanisms and opening books. Time considerations for playing against humans are also proposed for an extra psychological advantage. The final software, named Matilda, is not only the sum of a series of experimental parts surrounding Monte Carlo tree search applied to Go, but also an attempt at the strongest possible solution for shared-memory systems.
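
    One simple way a lightweight policy network can cut the branching complexity described above is to let it rank the legal moves and expand only the best few. A hedged sketch (Matilda's actual mechanism may differ; the policy_net API is assumed):

        def candidate_moves(state, policy_net, top_k=8):
            """Keep only the top_k moves by the network's prior score,
            shrinking the effective branching factor of the tree search."""
            moves = state.legal_moves()
            scores = policy_net.predict(state)  # assumed: maps move -> prior score
            ranked = sorted(moves, key=lambda m: scores[m], reverse=True)
            return ranked[:top_k]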

    Artificial Intelligence in Computer Games (Umjetna inteligencija u računalnim igrama)

    Today, the highly developed and competitive computer games industry needs to produce ever better games in order to beat the competition. To keep players engaged for as long as possible, manufacturers use a variety of techniques to make games interesting and challenging. This is greatly aided by research in the field of artificial intelligence, which is exceptionally well suited to computer games. Games need to be as complex and unpredictable as possible to provide maximum entertainment. This article explores and gives an overview of the most popular techniques that can be applied in this area.

    Framework for Monte Carlo Tree Search-related strategies in Competitive Card Based Games

    In recent years, Monte Carlo Tree Search (MCTS) has been successfully applied as a new artificial intelligence strategy in game playing, with excellent results in the popular board game Go, real-time strategy games, and card games. The MCTS algorithm was developed as an alternative to established adversarial search algorithms, i.e., Minimax (MM), and to knowledge-based approaches. MCTS can achieve good results with nothing more than information about the game rules, and can achieve breakthroughs in domains of high complexity, whereas in traditional AI approaches developers might struggle to find heuristics through expertise in each specific game. Every algorithm has its caveats, and MCTS is no exception, as stated by Browne et al.: "Although basic implementations of MCTS provide effective play for some domains, results can be weak if the basic algorithm is not enhanced. (...) There is currently no better way than a manual, empirical study of the effect of enhancements to obtain acceptable performance in a particular domain." Thus, the first objective of this dissertation is to research various state-of-the-art MCTS enhancements in the context of card games and then to apply, experiment with, and fine-tune them in order to achieve a highly competitive implementation, validated and tested against other algorithms such as MM. In trick-taking card games such as Sueca and Bisca, where players take turns placing cards face up on the table, there are similarities that allow the development of an MCTS-based implementation featuring effective enhancements for multiple game variations, since they are nondeterministic, imperfect-information problems. Good results have been achieved in this domain with the algorithm, in games such as Spades and Hearts. The end result aims toward a framework that offers a competitive AI implementation for at least three different card games (validated through analysis and comparison against other approaches), allowing developers to integrate their own card games and benefit from a working AI, and also serving as a testing ground to rank different agent implementations.
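
    A standard way MCTS copes with the hidden hands in trick-taking games such as Sueca or Bisca is determinization: sample plausible assignments of the unseen cards, search each resulting perfect-information game, and aggregate. A minimal sketch of that aggregation step (sample_determinization is an assumed helper; uct_search can be any perfect-information MCTS, e.g. the sketch shown earlier in this listing):

        from collections import Counter

        def determinized_move(info_state, n_worlds=20, iters=500):
            """Run perfect-information MCTS in several sampled 'worlds'
            consistent with what the player has seen, then play the move
            that wins the most votes across worlds."""
            votes = Counter()
            for _ in range(n_worlds):
                world = info_state.sample_determinization()  # assumed helper
                votes[uct_search(world, iterations=iters)] += 1
            return votes.most_common(1)[0][0]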

    Turn-Based War Chess Model and Its Search Algorithm per Turn

    War chess gaming has so far received insufficient attention, yet it is a significant component of turn-based strategy (TBS) games, and it is studied in this paper. First, a common game model is derived from various existing war chess types. Based on the model, we propose a theoretical framework involving combinatorial optimization on the one hand and game-tree search on the other. We also discuss a key problem, namely that the branching factor of each turn in the game tree is huge. We then propose two algorithms for searching within one turn to address the problem: (1) enumeration by order and (2) enumeration by recursion. The main difference between the two is the permutation method used: the former uses the dictionary-sequence method, while the latter uses recursive permutation. Finally, we prove that both algorithms are optimal and analyze the difference in their efficiency. An important factor is the total number of expansions each unit makes over its reachable positions; the conclusion, stated in terms of this factor, is that enumeration by recursion is better than enumeration by order in all situations.
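
    To make the two enumeration schemes concrete, the difference lies in how the per-turn orderings of unit actions are generated; a small illustrative sketch (not the paper's code):

        from itertools import permutations

        def orders_by_dictionary(units):
            """'Enumeration by order': produce every acting order of the
            units in lexicographic (dictionary) sequence."""
            return list(permutations(units))

        def orders_by_recursion(units):
            """'Enumeration by recursion': build each acting order by
            recursively picking the next unit from those remaining.
            Both functions yield the same n! orderings; the paper's
            efficiency comparison concerns how the surrounding search
            expands each unit's reachable positions."""
            if not units:
                return [()]
            result = []
            for i, u in enumerate(units):
                for rest in orders_by_recursion(units[:i] + units[i + 1:]):
                    result.append((u,) + rest)
            return result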

    Reinforcement Learning for Determining Spread Dynamics of Spatially Spreading Processes with Emphasis on Forest Fires

    Machine learning algorithms have increased tremendously in power in recent years but have yet to be fully utilized in many ecology and sustainable resource management domains such as wildlife reserve design, forest fire management, and invasive species spread. One thing these domains have in common is that they contain dynamics that can be characterized as a Spatially Spreading Process (SSP), which requires many parameters to be set precisely to model the dynamics, spread rates, and directional biases of the spreading elements. We introduce a novel approach for learning in SSP domains such as wildfires using Reinforcement Learning (RL), where fire is the agent at any cell in the landscape and the set of actions the fire can take from a location at any point in time includes spreading into any point in the 3 × 3 grid around it (including not spreading). This approach inverts the usual RL setup, since the dynamics of the corresponding Markov Decision Process (MDP) is a known function for immediate wildfire spread; instead, we learn an agent policy that serves as a predictive model of the dynamics of a complex spatially spreading process. Rewards are provided for correctly classifying which cells are on fire, compared to satellite and other related data. We use three demonstrative domains to show the ability of our approach. The first is a popular online wildfire simulator; the second involves a pair of forest fires in Northern Alberta, the Fort McMurray fire of 2016, which led to an unprecedented evacuation of almost 90,000 people, and the Richardson fire of 2011; and the third deals with historical Saskatchewan fires previously compared by others to a physics-based simulator. The standard RL algorithms considered across all domains include Monte Carlo Tree Search, Asynchronous Advantage Actor-Critic (A3C), Deep Q-Learning (DQN), and Deep Q-Learning with prioritized experience replay. We also introduce a novel combination of Monte Carlo Tree Search (MCTS) and A3C that shows the best performance across the different test domains and testing environments. Additionally, other algorithms such as Value Iteration, Policy Iteration, and Q-Learning are applied to the Alberta fires domain to show the performance of these simple model-based and model-free approaches. We also compare to a Gaussian-process-based supervised learning approach and discuss the relation to state-of-the-art methods from forest wildfire modelling. The results show that we can learn predictive, agent-based policies as models of spatial dynamics using RL on readily available datasets such as satellite images that are at least as good as other methods and have many additional advantages in terms of generalizability and interpretability.
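
    The inverted setup described above can be made concrete through its action set and reward: the fire-agent chooses spread actions within the 3 × 3 neighbourhood of a burning cell and is rewarded for matching the observed burn map. A minimal sketch under those assumptions (names are illustrative, not the paper's code):

        import numpy as np

        # The 3 x 3 neighbourhood: eight spread directions plus (0, 0),
        # i.e. the "do not spread" action.
        ACTIONS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

        def step_reward(predicted_burn, observed_burn):
            """Per-cell classification reward: +1 for every cell whose
            predicted burning/not-burning state matches the observed
            (e.g. satellite-derived) map, -1 for every mismatch."""
            match = (predicted_burn == observed_burn)
            return float(np.where(match, 1.0, -1.0).sum())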

    Integrating Across Conceptual Spaces

    It has been shown that structure is shared across multiple modalities in the real world: if we speak about two items in similar ways, then they are also likely to appear in similar visual contexts. Such similarity relationships are recapitulated across modalities for entire systems of concepts. This provides a signal that can be used to identify the correct mapping between modalities without relying on event-based learning, by a process of systems alignment. Because it depends on relationships within a modality, systems alignment can operate asynchronously, meaning that learning may not require direct labelling events (e.g., seeing a truck and hearing someone say the word ‘truck’). Instead, learning can occur based on linguistic and visual information which is received at different points in time (e.g., having overheard a conversation about trucks, and seeing one on the road the next day). This thesis explores the value of alignment in learning to integrate between conceptual systems. It takes a joint experimental and computational approach, which simultaneously facilitates insights on alignment processes in controlled environments and at scale. The role of alignment in learning is explored from three perspectives, yielding three distinct contributions. In Chapter 2, signatures of alignment are identified in a real-world setting: children’s early concept learning. Moving to a controlled experimental setting, Chapter 3 demonstrates that humans benefit from alignment signals in cross-system learning, and finds that models which attempt the asynchronous alignment of systems best capture human behaviour. Chapter 4 implements these insights in machine-learning systems, using alignment to tackle cross-modal learning problems at scale. Alignment processes prove valuable to human learning across conceptual systems, providing a fresh perspective on learning that complements prevailing event-based accounts. This research opens doors for machine learning systems to harness alignment mechanisms for cross-modal learning, thus reducing their reliance on extensive supervision by drawing inspiration from both human learning and the structure of the environment.
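
    The alignment idea lends itself to a compact computational illustration: given within-modality similarity matrices over the same small set of concepts, one can search for the item-to-item mapping that makes the two relational structures agree, with no paired examples at all. A brute-force sketch (illustrative only; O(n!) enumeration is feasible just for a handful of items):

        from itertools import permutations
        import numpy as np

        def align_systems(sim_a, sim_b):
            """Find the permutation of system B's items whose similarity
            structure best correlates with system A's, using relational
            structure alone (no labelled pairings)."""
            n = sim_a.shape[0]
            best_perm, best_score = None, -np.inf
            for perm in permutations(range(n)):
                p = list(perm)
                score = np.corrcoef(sim_a.ravel(),
                                    sim_b[np.ix_(p, p)].ravel())[0, 1]
                if score > best_score:
                    best_perm, best_score = p, score
            return best_perm, best_score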