The phenomenon of Decision Oscillation: a new consequence of pathology in Game Trees
Random minimaxing studies the consequences of using a random number for scoring
the leaf nodes of a full width game tree and then computing the best move using the
standard minimax procedure. Experiments in Chess showed that the strength of play
increases as the depth of the lookahead is increased. Previous research by the authors
provided a partial explanation of why random minimaxing can strengthen play by showing
that, when one move dominates another move, then the dominating move is more likely
to be chosen by minimax. This paper examines a special case of determining the move
probability when domination does not occur. Specifically, we show that, under a uniform
branching game tree model, whether the probability that one move is chosen rather than
another depends not only on the branching factors of the moves involved, but also on
whether the number of ply searched is odd or even. This is a new type of game tree
pathology, where the minimax procedure will change its mind as to which move is best,
independently of the true value of the game, and oscillate between moves as the depth of
lookahead alternates between odd and even
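The random-minimaxing setup the abstract describes can be sketched as follows. This is a minimal illustration under the paper's uniform-branching model, with random leaf scores and a standard minimax backup; the function names and the [0, 1) score range are illustrative choices, not the paper's.

```python
import random

def minimax(depth, branching, maximizing, rng):
    """Score a uniform-branching subtree whose leaves receive random values."""
    if depth == 0:
        return rng.random()  # random leaf score in [0, 1)
    children = (minimax(depth - 1, branching, not maximizing, rng)
                for _ in range(branching))
    return max(children) if maximizing else min(children)

def best_move(branching_factors, depth, seed=0):
    """Pick the root move with the highest random-minimax score.

    branching_factors[i] is the branching factor of the subtree reached
    by move i, so moves may differ in branching, as in the paper's setting.
    """
    rng = random.Random(seed)
    scores = [minimax(depth - 1, b, False, rng) for b in branching_factors]
    return max(range(len(scores)), key=scores.__getitem__)
```

Running `best_move` for the same position at successive depths is the experiment the abstract alludes to: with domination absent, the preferred move can flip as the lookahead depth alternates between odd and even ply.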
Recommended from our members
The Expected-Outcome Model of Two-Player Games
This paper introduces a new, crisp definition of two-player evaluation functions. These functions calculate a node's expected-outcome value, or the probability that a randomly chosen leaf beneath it will represent a win. The utility of these values to game programs will be assessed by a series of experiments that compare the performance of expected-outcome functions with that of some popular, previously studied evaluators. To help demonstrate the domain-independence of these new functions, the experiments will be run on variants of several games, including tic-tac-toe, Othello, and chess. In addition, the paper outlines a new probabilistic model of game trees which involves rethinking many long-accepted assumptions in light of the newly defined expected-outcome functions.
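The definition above — the probability that a randomly chosen leaf beneath a node is a win — can be computed exactly on a small tree by enumerating leaves. This is a minimal sketch of the definition, not the paper's estimation procedure; the tree representation and names are illustrative.

```python
def count_leaves(node, children, is_win):
    """Return (winning leaves, total leaves) beneath `node`."""
    kids = children(node)
    if not kids:
        return (1 if is_win(node) else 0, 1)
    wins = total = 0
    for k in kids:
        w, t = count_leaves(k, children, is_win)
        wins += w
        total += t
    return wins, total

def expected_outcome(node, children, is_win):
    """Fraction of leaves beneath `node` that are wins
    (full enumeration, so only practical for small trees)."""
    wins, total = count_leaves(node, children, is_win)
    return wins / total
```

For example, with a toy tree `{'root': ['x', 'y'], 'x': ['w1', 'l1'], 'y': ['w2']}` where leaves starting with `w` are wins, the root's expected-outcome value is 2/3. In real games the paper's experiments necessarily estimate this quantity rather than enumerate it.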
Intelligent strategy for two-person non-random perfect information zero-sum game.
Tong Kwong-Bun. Thesis submitted in December 2002. Thesis (M.Phil.), Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 77-[80]). Abstracts in English and Chinese.

Chapter 1: Introduction
1.1 An Overview
1.2 Tree Search
1.2.1 Minimax Algorithm
1.2.2 The Alpha-Beta Algorithm
1.2.3 Alpha-Beta Enhancements
1.2.4 Selective Search
1.3 Construction of Evaluation Function
1.4 Contribution of the Thesis
1.5 Structure of the Thesis
Chapter 2: The Probabilistic Forward Pruning Framework
2.1 Introduction
2.2 The Generalized Probabilistic Forward Cuts Heuristic
2.3 The GPC Framework
2.3.1 The Alpha-Beta Algorithm
2.3.2 The NegaScout Algorithm
2.3.3 The Memory-enhanced Test Algorithm
2.4 Summary
Chapter 3: The Fast Probabilistic Forward Pruning Framework
3.1 Introduction
3.2 The Fast GPC Heuristic
3.2.1 The Alpha-Beta Algorithm
3.2.2 The NegaScout Algorithm
3.2.3 The Memory-enhanced Test Algorithm
3.3 Performance Evaluation
3.3.1 Determination of the Parameters
3.3.2 Result of Experiments
3.4 Summary
Chapter 4: The Node-Cutting Heuristic
4.1 Introduction
4.2 Move Ordering
4.2.1 Quality of Move Ordering
4.3 Node-Cutting Heuristic
4.4 Performance Evaluation
4.4.1 Determination of the Parameters
4.4.2 Result of Experiments
4.5 Summary
Chapter 5: The Integrated Strategy
5.1 Introduction
5.2 Combination of GPC, FGPC and Node-Cutting Heuristic
5.3 Performance Evaluation
5.4 Summary
Chapter 6: Conclusions and Future Works
6.1 Conclusions
6.2 Future Works
Chapter A: Examples
Chapter B: The Rules of Chinese Checkers
Chapter C: Application to Chinese Checkers
Bibliography
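The alpha-beta algorithm that anchors the thesis's frameworks, combined with a probabilistic forward cut, can be sketched as below. This is only an illustration of the general idea of probabilistic forward pruning (skipping some children outright, at a controlled risk of error), not the thesis's exact GPC formulation; all names and the single `prune_prob` parameter are illustrative assumptions.

```python
import random

def alpha_beta(node, depth, alpha, beta, evaluate, children, maximizing,
               prune_prob=0.0, rng=None):
    """Alpha-beta search with an optional probabilistic forward cut:
    each non-first child is skipped with probability `prune_prob`."""
    rng = rng or random.Random(0)
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float('-inf')
        for i, child in enumerate(kids):
            if i > 0 and rng.random() < prune_prob:
                continue  # forward cut: drop this child without searching it
            value = max(value, alpha_beta(child, depth - 1, alpha, beta,
                                          evaluate, children, False,
                                          prune_prob, rng))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # standard beta cutoff
        return value
    else:
        value = float('inf')
        for i, child in enumerate(kids):
            if i > 0 and rng.random() < prune_prob:
                continue  # forward cut on a minimizing node
            value = min(value, alpha_beta(child, depth - 1, alpha, beta,
                                          evaluate, children, True,
                                          prune_prob, rng))
            beta = min(beta, value)
            if alpha >= beta:
                break  # standard alpha cutoff
        return value
```

With `prune_prob=0.0` this reduces to plain alpha-beta; raising it trades search completeness for speed, which is the balance the thesis's GPC and FGPC heuristics tune.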
Temporal Difference Learning in Complex Domains
This thesis adapts and improves on the methods of TD(λ) (Sutton 1988) that were
successfully used for backgammon (Tesauro 1994) and applies them to other complex
games that are less amenable to simple pattern-matching approaches. The games
investigated are chess and shogi, both of which (unlike backgammon) require
significant amounts of computational effort to be expended on search in order to
achieve expert play. The improved methods are also tested in a non-game domain.
In the chess domain, the adapted TD(λ) method is shown to successfully learn the
relative values of the pieces, and matches using these learnt piece values indicate that
they perform at least as well as piece values widely quoted in elementary chess books.
The adapted TD(λ) method is also shown to work well in shogi, considered by many
researchers to be the next challenge for computer game-playing, and for which there
is no standardised set of piece values.
An original method to automatically set and adjust the major control parameters used
by TD(λ) is presented. The main performance advantage comes from the learning
rate adjustment, which is based on a new concept called temporal coherence.
Experiments in both chess and a random-walk domain show that the temporal
coherence algorithm produces both faster learning and more stable values than both
human-chosen parameters and an earlier method for learning rate adjustment.
The methods presented in this thesis allow programs to learn with as little input of
external knowledge as possible, exploring the domain on their own rather than by
being taught. Further experiments show that the method is capable of handling many
hundreds of weights, and that it is not necessary to perform deep searches during the
learning phase in order to learn effective weights.
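The combination of a TD(λ) weight update with temporal-coherence learning rates can be sketched for a linear evaluation function as below. This is a hedged reconstruction, not the thesis's code: the per-weight rate |net accumulated change| / total absolute change is one common formulation of temporal coherence, and all names are illustrative.

```python
def td_lambda_step(weights, features, td_error, traces, lam, net, abs_acc):
    """One TD(λ) update of a linear evaluation v(s) = w · f(s), with
    per-weight temporal-coherence learning rates.

    net[i] accumulates signed updates and abs_acc[i] their magnitudes,
    so a weight whose updates keep pointing the same way keeps a high
    rate, while one whose updates cancel out settles down.
    """
    for i, f in enumerate(features):
        traces[i] = lam * traces[i] + f       # decay and bump eligibility
        delta = td_error * traces[i]          # raw gradient-style update
        net[i] += delta
        abs_acc[i] += abs(delta)
        rate = abs(net[i]) / abs_acc[i] if abs_acc[i] else 1.0
        weights[i] += rate * delta
    return weights
```

Because each rate is derived from the update history itself, no hand-tuned global learning rate is needed, which matches the thesis's goal of learning with as little external input as possible.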
Temporal Difference Learning in Complex Domains
Submitted to the University of London for the Degree of Doctor of Philosophy in Computer Science.