Using Cultural Coevolution for Learning in General Game Playing
Traditionally, the construction of game-playing agents relies on pre-programmed heuristics and architectures tailored to a specific game. General Game Playing (GGP) provides a challenging alternative to this approach, with the aim being to construct players that are able to play any game, given just the rules. This thesis describes the construction of a General Game Player that is able to learn and build knowledge about the game in a multi-agent setup using cultural coevolution and reinforcement learning. We also describe how this knowledge can be used to complement UCT search, a Monte-Carlo tree search that has already been used successfully in GGP. Experiments are conducted to test the effectiveness of the knowledge by playing several games between our player and a player using random moves, as well as a player using standard UCT search. The results show a marked improvement in performance when using the knowledge.
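The UCT search mentioned above selects moves by scoring each child node with the UCB1 formula. A minimal sketch, assuming per-node statistics of total value and visit count (the exploration constant `c=1.4` and the sample values are illustrative, not from the thesis):

```python
import math

def ucb1(total_value, visits, parent_visits, c=1.4):
    """UCB1 score used by UCT to balance exploitation and exploration."""
    if visits == 0:
        return float("inf")  # unvisited children are tried first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

# pick the child with the highest UCB1 score
children = [(10.0, 20), (4.0, 5), (0.0, 0)]  # (total_value, visits) pairs
parent_visits = 25
best = max(range(len(children)),
           key=lambda i: ucb1(children[i][0], children[i][1], parent_visits))
```

Learned knowledge of the kind described in the abstract is typically injected as a bias term added to this score.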
Agents Explore the Environment Beyond Good Actions to Improve Their Model for Better Decisions
Improving the decision-making capabilities of agents is a key challenge on
the road to artificial intelligence. To improve the planning skills needed to
make good decisions, MuZero's agent combines prediction by a network model and
planning by a tree search using the predictions. MuZero's learning process can
fail when predictions are poor but planning requires them. We use this as an
impetus to get the agent to explore parts of the decision tree in the
environment that it otherwise would not explore. The agent achieves this in
three steps. First, it plans normally to come up with an improved policy.
Second, it randomly deviates from this policy at the beginning of each training
episode. Third, it switches back to the improved policy at a random time step
to experience the rewards from the environment associated with it, which is the
basis for learning the correct value expectation. The simple board game
Tic-Tac-Toe is used to illustrate how this approach can improve the agent's
decision-making ability. The source code, written entirely in Java, is
available at https://github.com/enpasos/muzero.
Comment: Submitted to NeurIPS 202
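The three-step scheme above (plan normally, deviate randomly at the start of an episode, switch back at a random time step) could be sketched as follows. The `env`, `improved_policy`, and `random_policy` interfaces are hypothetical stand-ins, not from the MuZero codebase:

```python
import random

def run_episode(env, improved_policy, random_policy, max_steps=50, rng=random):
    """Sketch of the described exploration scheme: random deviation early,
    improved policy after a randomly chosen switch point.

    Assumes env exposes reset() and step(action) -> (state, reward, done),
    and that each policy maps a state to an action.
    """
    switch_step = rng.randrange(max_steps)  # when to return to the improved policy
    state, trajectory = env.reset(), []
    for t in range(max_steps):
        # before the switch point, deviate randomly to reach subtrees the
        # improved policy would never visit; afterwards, follow the improved
        # policy so its true environment rewards are observed
        action = random_policy(state) if t < switch_step else improved_policy(state)
        next_state, reward, done = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory
```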
Mobile application platform heterogeneity: Android vs Windows phone vs iOS vs Firefox OS
Modern smartphones have a rich spectrum of increasingly sophisticated features, opening opportunities for software-led innovation. Of the large number of platforms to develop new software on, in this paper we look closely at three platforms identified as market leaders for the smartphone market by Gartner Group in 2013 and one platform, Firefox OS, representing a new paradigm for operating systems based on web technologies. We compare the platforms in several different categories, such as software architecture, application development, platform capabilities and constraints, and, finally, developer support. Using the implementation of a mobile version of the tic-tac-toe game on all four platforms, we investigate the strengths, weaknesses and challenges of mobile application development on these platforms. Large differences emerge when inspecting community environments, hardware capabilities and platform maturity. These inevitably impact developer choices when deciding on mobile platform development strategies.
Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play
Recent advances in Competitive Self-Play (CSP) have achieved, or even
surpassed, human level performance in complex game environments such as Dota 2
and StarCraft II using Distributed Multi-Agent Reinforcement Learning (MARL).
One core component of these methods relies on creating a pool of learning
agents -- consisting of the Main Agent, past versions of this agent, and
Exploiter Agents -- where Exploiter Agents learn counter-strategies to the Main
Agents. A key drawback of these approaches is the large computational cost and
physical time that is required to train the system, making them impractical to
deploy in highly iterative real-life settings such as video game productions.
In this paper, we propose the Minimax Exploiter, a game theoretic approach to
exploiting Main Agents that leverages knowledge of its opponents, leading to
significant increases in data efficiency. We validate our approach in a
diversity of settings, including simple turn based games, the arcade learning
environment, and For Honor, a modern video game. The Minimax Exploiter
consistently outperforms strong baselines, demonstrating improved stability and
data efficiency, leading to a robust CSP-MARL method that is both flexible and
easy to deploy.
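The game-theoretic intuition behind an exploiter, computing a best response against a known, fixed opponent policy rather than against a worst-case adversary, can be sketched as a simple search. The `game` and `opponent_policy` interfaces below are hypothetical and not from the paper:

```python
def best_response_value(state, opponent_policy, game):
    """Value of playing a best response against a *known* opponent policy.

    Hedged sketch of the exploiter idea: where classical minimax would
    minimize over the opponent's moves, the exploiter instead predicts the
    Main Agent's move from its known policy and maximizes against that.
    Assumes game exposes is_terminal, utility, to_move, actions, result.
    """
    if game.is_terminal(state):
        return game.utility(state)
    if game.to_move(state) == "exploiter":
        # exploiter picks the move with the highest achievable value
        return max(best_response_value(game.result(state, a), opponent_policy, game)
                   for a in game.actions(state))
    # the Main Agent's move is predicted, not minimized over
    a = opponent_policy(state)
    return best_response_value(game.result(state, a), opponent_policy, game)
```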
Relational Representations in Reinforcement Learning: Review and Open Problems
This paper is about representation in RL. We discuss some of the concepts in representation and generalization in reinforcement learning and argue for higher-order representations, instead of the commonly used propositional representations. The paper contains a small review of current reinforcement learning systems using higher-order representations, followed by a brief discussion. The paper ends with research directions and open problems.
Deep Learning Applied to Turn-Based Board Games (Aprendizaje profundo aplicado a juegos de tablero por turnos)
Final Degree Project (Trabajo fin de Grado), Double Degree in Computer Science and Mathematics, Facultad de Informática UCM, Departamento de Ingeniería del Software e Inteligencia Artificial, academic year 2020-2021.
Due to the astonishing growth rate in computational power, artificial intelligence is achieving milestones that were considered inconceivable just a few decades ago. One of them is AlphaZero, an algorithm capable of reaching superhuman performance in chess, shogi and Go, with just a few hours of self-play and given no domain knowledge except the game rules.
In this paper, we review the fundamentals, explain how the algorithm works, and develop our own version of it, capable of being executed on a personal computer. Despite the limited computational resources available, we have managed to master less complex games such as Tic-Tac-Toe and Connect 4. To verify learning, we test our implementation against other strategies and analyze the results obtained.
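AlphaZero-style search differs from plain UCT in that a learned network prior biases exploration toward moves the policy network already favors, via the PUCT rule. A one-function sketch (the constant `c` is illustrative):

```python
import math

def puct(q, prior, parent_visits, visits, c=1.5):
    """PUCT score used in AlphaZero-style tree search.

    q: mean value of the child so far; prior: network policy probability
    for the move; the second term decays as the child accumulates visits.
    """
    return q + c * prior * math.sqrt(parent_visits) / (1 + visits)
```

During search, the child maximizing this score is descended; the network's value output replaces the random rollout of classic MCTS.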
In search of no-loss strategies for the game of Tic-tac-toe using a customized genetic algorithm
The game of Tic-tac-toe is one of the most commonly known games. This game does not allow one to win all the time, and a significant proportion of games played results in a draw. Thus, the best a player can hope for is not to lose the game. This study is aimed at evolving a number of no-loss strategies using genetic algorithms and comparing them with existing methodologies. To efficiently evolve no-loss strategies, we have developed innovative ways of representing and evaluating a solution, initializing the GA population, and developing GA operators, including an elite-preserving scheme. Interestingly, our GA implementation is able to find more than 72 thousand no-loss strategies for playing the game. Moreover, an analysis of these solutions has given us insights into how to play the game so as not to lose it. Based on this experience, we have developed specialized efficient strategies with a high win-to-draw ratio. The study and its results are interesting and should encourage the application of these techniques to other board games for finding efficient strategies.
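A generic GA skeleton with elite preservation, in the spirit of the customized algorithm described above. The integer genome and the fitness function are placeholders; in the actual study a strategy genome would map board situations to moves, and fitness would reflect avoiding losses:

```python
import random

def evolve(fitness, genome_len, pop_size=30, generations=50, elite=2, rng=random):
    """Minimal GA sketch: truncation selection, one-point crossover,
    point mutation, and an elite-preserving scheme (best genomes copied
    unchanged into the next generation)."""
    pop = [[rng.randint(0, 8) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:elite]                           # elite preservation
        while len(next_pop) < pop_size:
            p1, p2 = rng.sample(pop[:pop_size // 2], 2)  # truncation selection
            cut = rng.randrange(1, genome_len)           # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.1:                       # point mutation
                child[rng.randrange(genome_len)] = rng.randint(0, 8)
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)
```

Because the elite genomes survive unchanged, the best fitness found can never decrease between generations.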
Automatic Generation of Alternative Starting Positions for Simple Traditional Board Games
Simple board games, like Tic-Tac-Toe and CONNECT-4, play an important role
not only in the development of mathematical and logical skills, but also in the
emotional and social development. In this paper, we address the problem of
generating targeted starting positions for such games. This can facilitate new
approaches for bringing novice players to mastery, and also leads to discovery
of interesting game variants. We present an approach that generates starting
states of varying hardness levels for player 1 in a two-player board game,
given the rules of the board game, the desired number of steps required for
player 1 to win, and the expertise levels of the two players. Our approach
leverages symbolic methods and iterative simulation to efficiently search the
extremely large state space. We present experimental results that include
discovery of states of varying hardness levels for several simple grid-based
board games. The presence of such states for standard game variants like Tic-Tac-Toe opens up new games to be played that have never been played, as the default start state is heavily biased.
Comment: A conference version of the paper will appear in AAAI 201