17 research outputs found

    On Monte Carlo Tree Search and Reinforcement Learning

    Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL techniques within tree search has not been thoroughly studied yet. In this paper we re-examine in depth this close relation between the two fields; our goal is to improve the cross-awareness between the two communities. We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, for which the traditional MCTS is only one of the variants. We confirm that planning methods inspired by RL in conjunction with online search demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search.

    Desenvolvendo um benchmark para deep learning sobre a plataforma Jetson TX2 (Developing a benchmark for deep learning on the Jetson TX2 platform)

    Undergraduate thesis (TCC) - Universidade Federal de Santa Catarina. Centro Tecnológico. Ciências da Computação. With recent advances in visual navigation and artificial intelligence (deep learning), many fields are changing the way they approach and solve their problems, as is the case for visual navigation and autonomous vehicles. However, not only have theoretical techniques advanced; embedded systems aimed at accelerating the prototyping of new solutions have also kept pace with these changes. This work focuses on developing a benchmark for embedded systems specialized in artificial intelligence algorithms, carrying out a comparative study of state-of-the-art models on the Jetson TX2 platform, in order to support decisions about the most appropriate tools for implementing real-world deep learning applications, such as autonomous cars.

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

    Generalised Player Modelling : Why Artificial Intelligence in Games Should Incorporate Meaning, with a Formalism for so Doing

    General game-playing artificial intelligence (AI) has recently seen important advances due to the various techniques known as ‘deep learning’. However, in terms of human-computer interaction, the advances conceal a major limitation: these algorithms do not incorporate any sense of what human players find meaningful in games. I argue that adaptive game AI will be enhanced by a generalised player model, because games are inherently human artefacts which require some encoding of the human perspective in order to respond naturally to individual players. The player model provides constraints on the adaptive AI, which allow it to encode aspects of what human players find meaningful. I propose that a general player model requires parameters for the subjective experience of play, including: player psychology, game structure, and actions of play. I argue that such a player model would enhance efficiency of per-game solutions, and also support study of game-playing by allowing (within-player) comparison between games, or (within-game) comparison between players (human and AI). Here we detail requirements for functional adaptive AI, arguing from first principles drawn from games research literature, and propose a formal specification for a generalised player model based on our ‘Behavlets’ method for psychologically-derived player modelling. Peer reviewed

    A novel computer Scrabble engine based on probability that performs at championship level

    The thesis starts by giving an introduction to the game of Scrabble, then mentions state-of-the-art computer Scrabble programs and presents some characteristics of our developed Scrabble engine Heuri. Some brief notions of Game Theory are given, along with the history of some games in Artificial Intelligence; the fundamental algorithms for game playing, as well as state-of-the-art engines and the algorithms used by them, are presented. Basic elements of Scrabble, such as the Scrabble rules and the letter distribution, are given. The history and state of the art of computer Scrabble are discussed. For instance, the move-generation methods based on the DAWG (Directed Acyclic Word Graph) data structure and its variant GADDAG are recalled. These methods are used by the state-of-the-art Scrabble engines Quackle and Maven. Then, the contributions of this thesis are presented. A Spanish lexicon for playing Scrabble has been built that is used by the Heuri engines. From this construction, a detailed study and classification of Spanish irregular verbs has been provided. A novel Scrabble move generator based on anagrams has been designed and implemented, which has been shown to be faster than the GADDAG-based generator used in the Quackle engine. This method is similar to the way Scrabble players look for a move, searching for anagrams and a spot to play on the board. Next, we address the evaluation of moves when playing Scrabble; the quality of play depends on deciding what move should be played given a certain board and a rack of tiles. In Heuri, this decision was initially made by trying several heuristics, which led to the construction of several engines. We explain the heuristics used in these engines, all of them based on probabilities. None of these initial heuristic evaluation functions (up to six) looks ahead; they are static evaluators. 
After testing, they have shown increasing playing performance, which allows Heuri to beat top-level expert human players in Spanish without using sampling and simulation techniques. These heuristics mainly consider the possibility of achieving a bingo on the actual board, whereas Quackle uses pre-calculated values (superleaves) that do not take this into account. Then, to improve Heuri's quality of play even further, some additional engines are presented in which look-ahead is employed. The HeuriSamp engine, which evaluates a 2-ply search, yields a defense value. The HeuriSim engine uses a 3-ply adversarial search tree; it considers the best first moves (according to the heuristic of Heuri's sixth engine) from Player 1, then some replies to these moves (Player 2 moves), and then some replies to these replies (Player 1 moves). Finally, to improve these engines, opponent modeling is used; this technique makes predictions about some of the opponent's tiles based on the last play made by the opponent. We present results obtained by playing thousands of Heuri vs Heuri games, collecting important information: general statistics of the Scrabble game, like a 16-point handicap of the second player, and word statistics in Spanish, like a list of the most frequently played bingos (words that use all 7 tiles of a player's rack). In addition, we present results of matches played by Heuri against top-level humans in Spanish and results obtained by massive playing of different Heuri engines against the Quackle engine in Spanish, French and English. All these match results demonstrate the championship-level performance of the Heuri engines in the three languages, especially of the last developed engine, which includes simulation and opponent modeling techniques. 
From here, the conclusions of the thesis are drawn and future work is outlined. Postprint (published version)

    Symbolic Search in Planning and General Game Playing

    Search is an important topic in many areas of AI. Search problems often result in an immense number of states. This work addresses this by using a special data structure, BDDs, which can represent large sets of states efficiently, often saving space compared to explicit representations. The first part is concerned with an analysis of the complexity of BDDs for some search problems, resulting in lower or upper bounds on BDD sizes for these. The second part is concerned with action planning, an area where the programmer does not know in advance what the search problem will look like. This part presents symbolic algorithms for finding optimal solutions for two different settings, classical and net-benefit planning, as well as several improvements to these algorithms. The resulting planner was able to win the International Planning Competition IPC 2008. The third part is concerned with general game playing, which is similar to planning in that the programmer does not know in advance what game will be played. This work proposes algorithms for instantiating the input and solving games symbolically. For playing, a hybrid player based on UCT and the solver is presented.

    Enhancing player experience in computer games: A computational Intelligence approach.

    Ph.D (DOCTOR OF PHILOSOPHY)

    A Refinement-Based Heuristic Method for Decision Making in the Context of Ayo Game

    Games of strategy, such as chess, have served as a convenient test of skill at devising efficient search algorithms, formalizing knowledge, and bringing the power of computation to bear on “intractable” problems. Generally, minimax search has been the fundamental approach to solving game problems. However, using minimax search to solve the Ayo game has a number of limitations. Among these limitations are: (i) improper design of a suitable evaluator for moves before the moves are made, and (ii) inability to select a correct move without assuming that players will play optimally. This study investigated the extent to which the minimax search technique could be enhanced with a refinement-based heuristic method for playing the Ayo game. This is complemented by the CDG (an end game strategy) for generating procedures such that only good moves are generated at any instance of playing the Ayo game, taking cognizance of the opponent's strategy of play. The study was motivated by the need to advance the African board game Ayo, to see how it could be played by humans across the globe, by creating both a theoretical and a product-oriented framework. This framework provides local Ayo game promotion initiatives in accordance with state-of-the-art practices in the global game playing domain. To accomplish this task, both theoretical and empirical approaches were used. The theoretical approach reveals some mathematical properties of the Ayo game, with specific emphasis on the CDG as an end game strategy and on means of obtaining the minimal and maximal CDG configurations. Similarly, a theoretical analysis of the minimax search was given and was enhanced with the refinement-based heuristics. 
For the empirical approach, we simulated Ayo game playing on a digital computer, studied the behaviour of the various heuristic metrics used, and compared the play strategies of the simulation with AWALE (the widely known standard Ayo game-playing software). Furthermore, an empirical assessment of how experts play the Ayo game was carried out as a means of evaluating the performance of the heuristics used to evolve the Ayo player in the simulation, which gives room for statistical interpretation. This offers a novel means of solving the problem of move selection in computer play of the Ayo game. The study shows how an indigenous game like Ayo can generate integer sequences, and consequently yield self-replicating patterns that repeat at different iterations. More importantly, the study provides efficient and usable operation support tools in the prototype simulation of Ayo game playing that improves on AWALE.

    Guiding Monte Carlo Tree Search simulations through Bayesian Opponent Modeling in The Octagon Theory

    Board games present a very challenging problem in the decision-making topic of Artificial Intelligence. Although classical tree search approaches have been successful in various board games, such as Chess, these approaches are still very limited by modern technology when applied to higher complexity games such as Go. In light of this, it was not until the appearance of Monte Carlo Tree Search (MCTS) methods that higher complexity games became the main focus of research, as solution perspectives started to appear in this domain. This thesis builds on the current state of the art in MCTS methods by investigating the integration of Opponent Modeling with MCTS. The goal of this integration is to guide the simulations of the MCTS algorithm according to knowledge about the opponent, obtained in real time through Bayesian Opponent Modeling, with the intention of reducing the number of irrelevant computations that are performed in purely stochastic, domain-independent methods. For this research, the two-player deterministic board game The Octagon Theory was used, as its rules, fixed problem length and board configuration present not only a difficult challenge for both the creation of opponent models and the execution of the MCTS method itself, but also a clear benchmark for comparison between algorithms. 
Through the analysis of a performed computation on the game-tree complexity, the large board version of the game is believed to be in the same complexity class as Shogi and the 19x19 version of Go, making it a suitable board game for research in this area. Throughout this report, several MCTS policies and enhancements are presented and compared with not only the proposed variation, but also standard Monte Carlo search and the best known greedy approach for The Octagon Theory. The experiments reveal that a combination of Move Groups, Decisive Moves, Upper Confidence Bounds for Trees (UCT), Limited Simulation Lengths and an Opponent Modeling based simulation policy turns a formerly losing MCTS agent into the best performing one in a domain with an estimated game-tree complexity of 10^293, even when the provided computational budget is kept low.