6 research outputs found

    Generalizing Agent Plans and Behaviors with Automated Staged Observation in The Real-Time Strategy Game Starcraft

    In this thesis we investigate the processes involved in learning to play a game. The work was inspired by two observations about how human players learn. First, learning the domain is intertwined with goal pursuit. Second, games are designed to ramp up in complexity, walking players through a gradual cycle of acquiring, refining, and generalizing knowledge about the domain. Our approach does not rely on traces of expert play. We created an integrated planning, learning, and execution system that uses StarCraft as its domain. The planning module creates command/event groupings based on the data received. Observations of unit behavior are collected during execution and returned to the learning module, which tests generalization hypotheses. The planner uses those test results to generate events that both pursue the goal and facilitate learning the domain. We demonstrate across multiple scenarios that this approach can efficiently learn the subtle traits of commands.
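    The sketch below is only an illustration of the plan-execute-learn cycle described above, not the thesis implementation; the command names, the stand-in executor, and the hypothesis structure are all invented for the example.

        # Minimal sketch, assuming commands are opaque labels and each produces one
        # observable effect per episode: the planner schedules goal commands plus any
        # command whose generalization hypothesis is still untested, and the learner
        # marks each hypothesis supported or refuted from the returned observations.
        import random

        def plan(goal_commands, hypotheses):
            """Schedule goal commands plus any command whose effect is still untested."""
            untested = [h["command"] for h in hypotheses if h["supported"] is None]
            return list(dict.fromkeys(goal_commands + untested))  # de-duplicated, ordered

        def execute(commands):
            """Stand-in for the game executor: report one observed effect per command."""
            return {c: random.choice(["moved", "attacked", "no_effect"]) for c in commands}

        def learn(hypotheses, observations):
            """Test each hypothesis against what was actually observed."""
            for h in hypotheses:
                obs = observations.get(h["command"])
                if obs is not None:
                    h["supported"] = (obs == h["predicted_effect"])

        hypotheses = [{"command": "attack_move", "predicted_effect": "attacked", "supported": None}]
        for episode in range(3):
            learn(hypotheses, execute(plan(["gather", "attack_move"], hypotheses)))
        print(hypotheses)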

    Exploiting Opponent Modeling For Learning In Multi-agent Adversarial Games

    An issue with learning effective policies in multi-agent adversarial games is that the size of the search space can be prohibitively large when the actions of both teammates and opponents are considered simultaneously. Opponent modeling, predicting an opponent's actions in advance of execution, is one approach for selecting actions in adversarial settings, but it is often performed in an ad hoc way. In this dissertation, we introduce several methods for using opponent modeling, in the form of predictions about the players' physical movements, to learn team policies. To explore the problem of decision-making in multi-agent adversarial scenarios, we use our approach for both offline play generation and real-time team response in the Rush 2008 American football simulator. Simultaneously predicting the movement trajectories, future reward, and play strategies of multiple players in real time is a daunting task, but we illustrate how it is possible to divide and conquer this problem with an assortment of data-driven models. By leveraging spatio-temporal traces of player movements, we learn discriminative models of defensive play for opponent modeling. With the reward information from previous play matchups, we use a modified version of UCT (Upper Confidence bounds applied to Trees) to create new offensive plays and to learn play repairs that counter predicted opponent actions. In team games, players must coordinate effectively to accomplish tasks while foiling their opponents, either in a preplanned or an emergent manner. An effective team policy must generate the necessary coordination, yet considering all possibilities for creating coordinating subgroups is computationally infeasible. Automatically identifying and preserving the coordination between key subgroups of teammates can make search more productive by pruning policies that disrupt these relationships. We demonstrate that combining opponent modeling with automatic subgroup identification can be used to create team policies with a higher average yardage than either the baseline game or domain-specific heuristics.
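    As an illustration of the selection rule behind UCT (not the dissertation's modified version), the sketch below applies the standard UCB1 formula to choose among candidate offensive plays given reward statistics from earlier matchups; the play names and yardage numbers are invented.

        # UCB1 balances exploitation (average reward so far) against exploration
        # (a bonus that grows for rarely tried options).
        import math

        def ucb1(total_reward, visits, parent_visits, c=1.4):
            if visits == 0:
                return float("inf")          # always try an untested play first
            return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)

        def select_play(stats):
            """stats maps play name -> (total yardage gained, times the play was tried)."""
            parent_visits = sum(tried for _, tried in stats.values()) or 1
            return max(stats, key=lambda p: ucb1(*stats[p], parent_visits))

        stats = {"sweep_left": (42.0, 10), "screen_pass": (55.0, 12), "draw_play": (0.0, 0)}
        print(select_play(stats))            # the untried play wins via the exploration bonus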

    Neural-network-based artificial intelligence for a real-time strategy game

    The master's dissertation spans 118 pages and includes 44 figures, 5 tables, and 31 bibliographic references. It consists of the following sections: introduction, 5 chapters for the main part, conclusions, list of references, and 8 appendices. Keywords: neural networks, artificial intelligence, hierarchical task network, real-time strategy, distributed neural network training system. The dissertation is devoted to the development and description of an artificial intelligence that uses neural networks to play a real-time strategy game. The relevance of the chosen topic lies in increasing the effectiveness of artificial intelligence agents for real-time strategy games, and in using a distributed system with a centralized server to improve the effectiveness of the artificial intelligence modules. The aim of the work is to create a system of artificial intelligence agents that use artificial neural networks to provide highly effective bots for real-time strategy games. The object of research is artificial intelligence in real-time strategy games, in particular the micromanagement (tactical) and macromanagement (strategic) modules, which use artificial neural networks in combination with other approaches. The subject of research is an artificial intelligence for a real-time strategy game built with artificial neural networks, together with a distributed training system for that artificial intelligence that uses a centralized server.
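    As a purely illustrative sketch of the tactical/strategic split described above (not the dissertation's agents), the code below scores per-unit actions with a tiny neural network in a micromanagement module and picks a global objective with a rule-based macromanagement module; all features, weights, and action names are invented.

        import numpy as np

        rng = np.random.default_rng(0)
        W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # toy micro-network weights
        W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)
        ACTIONS = ["attack", "retreat", "hold"]

        def micro_policy(unit_features):
            """Tactical module: score the actions for one unit with a 2-layer network."""
            hidden = np.tanh(unit_features @ W1 + b1)
            return ACTIONS[int(np.argmax(hidden @ W2 + b2))]

        def macro_policy(minerals, army_supply, enemy_army_supply):
            """Strategic module: a simple rule choosing the next global objective."""
            if army_supply < enemy_army_supply:
                return "train_units"
            return "expand" if minerals > 400 else "gather"

        unit = np.array([0.8, 0.2, 0.5, 0.1])   # e.g. health, distance, nearby allies, threat
        print(macro_policy(minerals=500, army_supply=20, enemy_army_supply=15), micro_policy(unit))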

    Managing case-based reasoning with reinforcement learning for a time-constrained game

    In this work, we try to improve the behavioral aspects of video games using case-based reasoning (CBR), a technique from the field of artificial intelligence that reproduces human behavior: reasoning by similarity, and remembering and forgetting previous experiences. CBR solves new problems by retrieving similar past experiences from a case base and adapting their solutions to the new problem. We use CBR to automate decisions made by components of the game. Building a CBR module requires accumulating many game episodes to form the module's case base. However, as the number of episodes stored in the case base grows, the system's response time degrades. We therefore face the challenge of improving the response time of the CBR module while keeping an acceptable level of system performance. In this master's thesis, we use the game of Tetris for our study; this game is of particular interest because its decisions are constrained in time. We propose to answer the following questions: How can a CBR system be formulated to play Tetris? What performance can be expected from a CBR system applied to this game? What level of play can be reached by estimating case values through reinforcement learning? And, since Tetris is a time-constrained game, how much does performance degrade when the size of the case base is reduced?
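    The following sketch is only an illustration of the retrieve-and-prune idea discussed above, not the thesis system: each case stores a value estimate of the kind reinforcement learning could assign, retrieval returns the action of the most similar case, and the case base is capped at a fixed size by discarding the lowest-valued cases to keep response time bounded. The board-feature encoding, actions, and values are invented.

        def similarity(a, b):
            """Negative squared distance between board-feature vectors."""
            return -sum((x - y) ** 2 for x, y in zip(a, b))

        def retrieve(case_base, query):
            """Return the action of the most similar case, breaking ties by case value."""
            best = max(case_base, key=lambda c: (similarity(c["features"], query), c["value"]))
            return best["action"]

        def prune(case_base, max_size):
            """Shrink the case base, keeping the cases judged most valuable."""
            return sorted(case_base, key=lambda c: c["value"], reverse=True)[:max_size]

        case_base = [
            {"features": [3, 1, 0], "action": "rotate_left", "value": 0.9},
            {"features": [0, 4, 2], "action": "drop_col_4", "value": 0.2},
            {"features": [3, 2, 1], "action": "drop_col_1", "value": 0.6},
        ]
        case_base = prune(case_base, max_size=2)
        print(retrieve(case_base, query=[3, 1, 1]))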

    Oroboro: a model for the temporal description of the hydrogen atom using the Feature Resonance Neural Network

    Advisor: Prof. Dr. Roberto Tadeu Raittz. Co-advisor: Prof. Dr. Dietmar William Foryta. Master's dissertation - Universidade Federal do Paraná, Setor de Educação Profissional e Tecnológica, Programa de Pós-Graduação em Bioinformática. Defense: Curitiba, 15/06/2020. Includes references: p. 123-132. Area of concentration: Artificial Intelligence. Abstract: The study of molecular modeling and dynamics is of broad and unquestionable importance; it constantly needs advances, and the development of new techniques is welcome. Machine learning has been recognized in the literature as a great ally in the simulation of quantum-chemical systems. With no analytical solution for N-body systems, the Schrödinger equation must be solved by approximations, which have flaws. Among the reviewed papers, we did not find any work that seeks to obtain trajectories or a pointwise representation of the electron over time using machine-learning resources, nor any application of neural networks to predict the energies of an electron bound to a nucleus. We therefore propose a model based on artificial intelligence, using neural networks, to represent the behavior of the electron under the Copenhagen interpretation. As no learning method in the literature satisfies the requirements of the electron-behavior model, the Feature Resonance Neural Network (FRes) was developed. FRes is a generalist (non-specialized) network with a customizable cost function, evolutionary optimization by genetic algorithm, a resonance-based architecture, and local (supervised) and/or global (sparse reinforcement) learning. We validated FRes with exclusive-or (XOR), classification, and augmentation tests. These tests showed that resonance is essential for solving nonlinear problems and that the network can solve both simple and complex problems. We obtained average accuracies of (93.99 ± 2.96)% and (96.69 ± 1.12)% on the Iris and breast cancer classification data respectively, and an accuracy of (96.32 ± 0.84)% for classifying the data produced by the FRes generator networks. After validating FRes, we present an alternative model that represents the electron in time and pointwise, but not through literal trajectories. We developed an unsupervised model capable of evaluating the particle's behavior only globally, respecting Schrödinger's probability density. After initial tests to determine suitable parameters, we obtained 12 networks with an average error below 10% on the evaluated parameters. The best-performing network was FRes14B#34, with percentage errors of 3.51 ± 0.44%, 6.91 ± 0.39%, 5.36 ± 0.43%, and 0.72 ± 0.42% in the comparison of the radial histogram, the angular histograms (θ and φ), and the mean radius, respectively. It also showed a center-of-mass deviation of 0.0597 ± 0.0054 Å. Analysis of the results revealed an attractor property of the networks - chaotic attractors. We were able to devise and develop a successful preliminary model of the hydrogen atom that still requires further study but opens the door to a new alternative model for particle interactions. Beyond the one-electron atom model, the FRes network has potential for other applications. Keywords: Computational Physics. Resonance. Neural Networks. Learning by Optimization.
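    As a minimal, purely illustrative sketch (not FRes itself, whose resonance architecture is not reproduced here), the code below evolves the weights of a tiny network with a genetic algorithm against a customizable cost function, shown on the XOR task mentioned among the validation tests; the architecture, population size, and mutation scale are invented.

        import numpy as np

        rng = np.random.default_rng(1)
        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
        Y = np.array([0, 1, 1, 0], float)

        def forward(w, x):
            """A 2-3-1 tanh network whose weights are packed in a flat 13-vector."""
            W1, b1 = w[:6].reshape(2, 3), w[6:9]
            W2, b2 = w[9:12], w[12]
            return np.tanh(np.tanh(x @ W1 + b1) @ W2 + b2)

        def cost(w):
            """Customizable cost function; here, mean squared error on XOR."""
            return float(np.mean((forward(w, X) - Y) ** 2))

        population = rng.normal(size=(40, 13))
        for generation in range(300):
            ranked = sorted(population, key=cost)             # evaluate and rank
            parents = np.array(ranked[:10])                   # selection (elitism)
            children = parents[rng.integers(0, 10, size=30)] \
                + rng.normal(scale=0.3, size=(30, 13))        # reproduction + mutation
            population = np.vstack([parents, children])

        best = min(population, key=cost)
        print(np.round(forward(best, X)), "cost:", round(cost(best), 4))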

    Learning continuous action models in a real-time strategy environment

    Although several researchers have integrated methods for reinforcement learning (RL) with case-based reasoning (CBR) to model continuous action spaces, existing integrations typically employ discrete approximations of these models. This limits the set of actions that can be modeled, and may lead to non-optimal solutions. We introduce the Continuous Action and State Space Learner (CASSL), an integrated RL/CBR algorithm that uses continuous models directly. Our empirical study shows that CASSL significantly outperforms two baseline approaches for selecting actions on a task from a real-time strategy gaming environment.
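    As an illustration of acting over a continuous action space without discretizing it (a sketch of the general idea only, not CASSL's actual algorithm), the code below blends the continuous actions of retrieved cases, weighting each case by its similarity to the current state and by its observed value; the state features, cases, and bandwidth are invented.

        import numpy as np

        cases = [  # (state features, continuous action taken, observed value)
            (np.array([0.2, 0.9]), 0.35, 0.7),
            (np.array([0.8, 0.1]), 0.80, 0.9),
            (np.array([0.5, 0.5]), 0.55, 0.4),
        ]

        def select_action(state, bandwidth=0.5):
            """Weight cases by similarity and value, then blend their stored actions
            into one continuous action instead of picking a discrete bin."""
            weights, actions = [], []
            for features, action, value in cases:
                sim = np.exp(-np.sum((features - state) ** 2) / bandwidth)
                weights.append(sim * max(value, 1e-6))
                actions.append(action)
            return float(np.dot(weights, actions) / np.sum(weights))

        print(select_action(np.array([0.7, 0.2])))   # a continuous action, not a bin index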