Abalearn: a risk-sensitive approach to self-play learning in Abalone
This paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play
without needing expert-labeled training examples, deep searches, or exposure to competent play.
Our approach is based on a reinforcement learning algorithm that is risk-seeking, since defensive players in Abalone tend never to end a game.
We show that it is this risk-sensitivity that allows successful self-play
training. We also propose a set of features that seem relevant for achieving a good level of play.
We evaluate our approach by using a fixed heuristic opponent as a benchmark, pitting our agents against human players online, and comparing
samples of our agents at different times of training.
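The abstract does not spell out its risk-seeking update rule. One standard way to make a temporal-difference update risk-sensitive (in the style of Mihatsch and Neuneier) is to weight positive and negative TD errors asymmetrically; a minimal tabular sketch under that reading, with all names and parameter values invented for illustration:

```python
# Risk-sensitive tabular TD(0) sketch: kappa in (-1, 1) scales positive and
# negative TD errors asymmetrically. kappa < 0 makes the agent risk-seeking
# (positive surprises are upweighted), matching the motivation of avoiding
# overly defensive play that never ends a game. Names are illustrative,
# not taken from Abalearn itself.

def td0_risk_update(V, s, s_next, reward, alpha=0.1, gamma=1.0, kappa=-0.5):
    """One risk-sensitive TD(0) backup on a dict-based value table V."""
    delta = reward + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    # Asymmetric weighting: with kappa < 0, positive errors count more.
    weight = (1.0 - kappa) if delta > 0 else (1.0 + kappa)
    V[s] = V.get(s, 0.0) + alpha * weight * delta
    return V[s]

V = {}
td0_risk_update(V, "s0", "s1", reward=1.0)  # positive delta, upweighted
```

With kappa = 0 this reduces to the ordinary TD(0) backup, so the risk parameter can be tuned without changing the rest of the learner.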
The Hanabi Challenge: A New Frontier for AI Research
From the early days of computing, games have been important testbeds for
studying how well machines can do sophisticated decision making. In recent
years, machine learning has made dramatic advances with artificial agents
reaching superhuman performance in challenge domains like Go, Atari, and some
variants of poker. As with their predecessors of chess, checkers, and
backgammon, these game domains have driven research by providing sophisticated
yet well-defined challenges for artificial intelligence practitioners. We
continue this tradition by proposing the game of Hanabi as a new challenge
domain with novel problems that arise from its combination of purely
cooperative gameplay with two to five players and imperfect information. In
particular, we argue that Hanabi elevates reasoning about the beliefs and
intentions of other agents to the foreground. We believe developing novel
techniques for such theory of mind reasoning will not only be crucial for
success in Hanabi, but also in broader collaborative efforts, especially those
with human partners. To facilitate future research, we introduce the
open-source Hanabi Learning Environment, propose an experimental framework for
the research community to evaluate algorithmic advances, and assess the
performance of current state-of-the-art techniques.
Comment: 32 pages, 5 figures, in press (Artificial Intelligence).
Session 5: Development, Neuroscience and Evolutionary Psychology
Proceedings of the Pittsburgh Workshop in History and Philosophy of Biology, Center for Philosophy of Science, University of Pittsburgh, March 23-24, 2001.
Temporal Difference Learning in Complex Domains
This thesis adapts and improves on the methods of TD(λ) (Sutton, 1988) that were
successfully used for backgammon (Tesauro, 1994) and applies them to other complex
games that are less amenable to simple pattern-matching approaches. The games
investigated are chess and shogi, both of which (unlike backgammon) require
significant amounts of computational effort to be expended on search in order to
achieve expert play. The improved methods are also tested in a non-game domain.
In the chess domain, the adapted TD(λ) method is shown to successfully learn the
relative values of the pieces, and matches using these learnt piece values indicate that
they perform at least as well as piece values widely quoted in elementary chess books.
The adapted TD(λ) method is also shown to work well in shogi, considered by many
researchers to be the next challenge for computer game-playing, and for which there
is no standardised set of piece values.
An original method to automatically set and adjust the major control parameters used
by TD(λ) is presented. The main performance advantage comes from the learning
rate adjustment, which is based on a new concept called temporal coherence.
Experiments in both chess and a random-walk domain show that the temporal
coherence algorithm produces both faster learning and more stable values than both
human-chosen parameters and an earlier method for learning rate adjustment.
The methods presented in this thesis allow programs to learn with as little input of
external knowledge as possible, exploring the domain on their own rather than by
being taught. Further experiments show that the method is capable of handling many
hundreds of weights, and that it is not necessary to perform deep searches during the
learning phase in order to learn effective weights.
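The temporal-coherence idea in the abstract can be sketched concretely: for each weight, track the signed sum and the absolute sum of its recent updates; their ratio stays near one while updates consistently push in one direction and decays toward zero as they oscillate, and it can serve directly as a per-weight learning rate. A minimal sketch under that reading (a sketch of the idea, not necessarily the thesis's exact algorithm; names are illustrative):

```python
class TemporalCoherenceRate:
    """Per-weight adaptive learning rate based on temporal coherence:
    alpha_i = |sum of updates| / sum of |updates|, which is 1.0 while
    updates for weight i consistently agree in sign and falls toward
    0.0 as they oscillate."""

    def __init__(self, n_weights):
        self.net = [0.0] * n_weights   # signed sum of proposed updates
        self.tot = [0.0] * n_weights   # absolute sum of proposed updates

    def rate(self, i):
        # Full learning rate until we have seen any update for weight i.
        return abs(self.net[i]) / self.tot[i] if self.tot[i] > 0 else 1.0

    def apply(self, weights, i, delta):
        """Scale the raw update delta by the coherence-based rate."""
        weights[i] += self.rate(i) * delta
        self.net[i] += delta
        self.tot[i] += abs(delta)
        return weights[i]
```

Oscillating updates (e.g. +1, -1, +1, ...) drive the rate down automatically, which is one plausible source of the stability the abstract reports relative to hand-chosen fixed rates.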
Policy Improvement in Cribbage
Cribbage is a card game involving multiple methods of scoring, each of which receives varying emphasis over the course of a typical game. Reinforcement learning is a machine learning approach in which an agent learns to accomplish a task via direct experience, collecting rewards based on performance. In this thesis, reinforcement learning is applied to the game of cribbage, improving an agent’s policy of combining multiple basic strategies according to the needs of the dynamic state of the game. From inspection, the agent learns a reasonable policy over the course of a million games, but an increase in performance was not demonstrated.
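The abstract does not give the learning setup; one plausible shape for "combining multiple basic strategies according to the state of the game" is a tabular value learner whose actions are the basic strategies, indexed by a coarse state. A hypothetical sketch (the strategy names and state abstraction are invented for illustration, not taken from the thesis):

```python
import random
from collections import defaultdict

# Hypothetical sketch: learn which basic cribbage strategy to follow in each
# coarse game state, from per-hand rewards. Strategy names are invented.
STRATEGIES = ["maximize_hand", "maximize_crib", "defensive_pegging"]

class StrategySelector:
    def __init__(self, epsilon=0.1, alpha=0.05):
        self.q = defaultdict(float)          # (state, strategy) -> value
        self.epsilon, self.alpha = epsilon, alpha

    def choose(self, state):
        # Epsilon-greedy over the fixed menu of basic strategies.
        if random.random() < self.epsilon:
            return random.choice(STRATEGIES)
        return max(STRATEGIES, key=lambda s: self.q[(state, s)])

    def update(self, state, strategy, reward):
        # Incremental move of the estimate toward the observed reward.
        key = (state, strategy)
        self.q[key] += self.alpha * (reward - self.q[key])
```

Training then amounts to playing hands, mapping each to a coarse state (e.g. score margin and game phase), and feeding the per-hand score differential back through update().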
M2ICAL: A technique for analyzing imperfect comparison algorithms using Markov chains
PhD thesis.
- …