Search versus Knowledge: An Empirical Study of Minimax on KRK
This article presents the results of an empirical experiment designed to gain insight into the effect of the minimax algorithm on the evaluation function. The simulations were performed on the KRK chess endgame. Our results show that dependencies between the evaluations of sibling nodes in a game tree, together with the abundance of opportunities to blunder in the KRK endgame, are not sufficient to explain the success of the minimax principle in practical game-playing, as was previously believed. The article shows that minimax in combination with a noisy evaluation function introduces a bias into the backed-up evaluations and argues that this bias is what masked the effectiveness of minimax in previous studies.
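One simple way such a bias can arise is visible at a single max node: backing up the largest of several noisy evaluations gives a value that sits, on average, above the true maximum. The sketch below illustrates only that effect and is not the article's KRK experiment; the Gaussian noise model, the function names, and the example values are all assumptions made for illustration.

```python
import random
import statistics


def backed_up_max(true_values, noise_sd, rng):
    """Back up the maximum of noisy evaluations of sibling nodes.

    Each sibling's true value is perturbed by independent Gaussian noise
    (an assumed noise model, chosen only for illustration).
    """
    return max(v + rng.gauss(0.0, noise_sd) for v in true_values)


def estimate_bias(true_values, noise_sd, trials=20000, seed=0):
    """Estimate (mean backed-up value) - (true maximum) by simulation."""
    rng = random.Random(seed)
    samples = [backed_up_max(true_values, noise_sd, rng) for _ in range(trials)]
    return statistics.fmean(samples) - max(true_values)


# Hypothetical example: three siblings that all have true value 0.
bias_three_siblings = estimate_bias([0.0, 0.0, 0.0], noise_sd=1.0)

# With a single child there is nothing to maximise over, so the
# backed-up value is just the noisy evaluation and the bias is near zero.
bias_one_child = estimate_bias([0.0], noise_sd=1.0)
```

Under these assumptions `bias_three_siblings` comes out clearly positive while `bias_one_child` stays close to zero; at a min node the same argument gives a downward shift.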
The effect of simulation bias on action selection in Monte Carlo Tree Search
A dissertation submitted to the Faculty of Science, University of the Witwatersrand,
in fulfilment of the requirements for the degree of Master of Science. August 2016.
Monte Carlo Tree Search (MCTS) is a family of directed search algorithms that has gained widespread
attention in recent years. It combines a traditional tree-search approach with Monte Carlo
simulations, using the outcome of these simulations (also known as playouts or rollouts) to evaluate
states in a look-ahead tree. That MCTS does not require an evaluation function makes it particularly
well-suited to the game of Go — seen by many to be chess’s successor as a grand challenge of
artificial intelligence — with MCTS-based agents recently able to achieve expert-level play on
19×19 boards. Furthermore, its domain-independent nature also makes it a focus in a variety of
other fields, such as Bayesian reinforcement learning and general game-playing.
Despite the vast amount of research into MCTS, the dynamics of the algorithm are not yet
fully understood. In particular, the effect of using knowledge-heavy or biased simulations in
MCTS remains unknown, with interesting results indicating that better-informed rollouts do
not necessarily result in stronger agents. This research provides support for the notion that MCTS
is well-suited to a class of domains possessing a smoothness property. In these domains, biased
rollouts are more likely to produce strong agents. Conversely, any error due to incorrect bias
is compounded in non-smooth domains, and in particular for low-variance simulations. This is
demonstrated empirically in a number of single-agent domains.
Estimating the bias of a noisy coin
Optimal estimation of a coin's bias using noisy data is surprisingly
different from the same problem with noiseless data. We study this problem
using entropy risk to quantify estimators' accuracy. We generalize the "add
Beta" estimators that work well for noiseless coins, and we find that these
hedged maximum-likelihood (HML) estimators achieve a worst-case risk of
O(N^{-1/2}) on noisy coins, in contrast to O(1/N) in the noiseless case. We
demonstrate that this increased risk is unavoidable and intrinsic to noisy
coins, by constructing minimax estimators (numerically). However, minimax
estimators introduce extreme bias in return for slight improvements in the
worst-case risk. So we introduce a pointwise lower bound on the minimum
achievable risk as an alternative to the minimax criterion, and use this bound
to show that HML estimators are pretty good. We conclude with a survey of
scientific applications of the noisy coin model in social science, physical
science, and quantum information science.
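For orientation, the noiseless "add Beta" estimator can be written as (n + beta) / (N + 2 beta) for n heads in N flips. The sketch below pairs it with a naive noise correction. It assumes, as a labelled assumption rather than something stated in the abstract, that each observation is flipped independently with a known probability alpha, so that the observed heads probability is q = alpha + (1 - 2 alpha) p. The function names are hypothetical, and the clipped inversion shown here is not the HML estimator studied in the paper.

```python
def add_beta_estimate(heads, flips, beta=0.5):
    """'Add beta' estimate of a heads probability from noiseless counts."""
    return (heads + beta) / (flips + 2.0 * beta)


def invert_noise(q_hat, alpha):
    """Map an estimate of the observed probability q back to the coin bias p.

    Assumes each observation is flipped with known probability alpha < 0.5,
    so q = alpha + (1 - 2 * alpha) * p. The result is clipped to [0, 1]
    because the raw inversion can fall outside the valid range.
    """
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    p_hat = (q_hat - alpha) / (1.0 - 2.0 * alpha)
    return min(1.0, max(0.0, p_hat))


# Hypothetical data: 30 heads observed in 100 noisy flips, alpha = 0.1.
q_hat = add_beta_estimate(heads=30, flips=100, beta=0.5)
p_hat = invert_noise(q_hat, alpha=0.1)
```

The clipping step hints at why the noisy problem is harder: when the observed frequency falls below alpha or above 1 - alpha, the inverted estimate lands on the boundary of [0, 1].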
The phenomenon of Decision Oscillation: a new consequence of pathology in Game Trees
Random minimaxing studies the consequences of using a random number for scoring
the leaf nodes of a full width game tree and then computing the best move using the
standard minimax procedure. Experiments in Chess showed that the strength of play
increases as the depth of the lookahead is increased. Previous research by the authors
provided a partial explanation of why random minimaxing can strengthen play by showing
that, when one move dominates another move, then the dominating move is more likely
to be chosen by minimax. This paper examines a special case of determining the move
probability when domination does not occur. Specifically, we show that, under a uniform
branching game tree model, the probability that one move is chosen rather than
another depends not only on the branching factors of the moves involved, but also on
whether the number of ply searched is odd or even. This is a new type of game tree
pathology, where the minimax procedure will change its mind as to which move is best,
independently of the true value of the game, and oscillate between moves as the depth of
lookahead alternates between odd and even.
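Random minimaxing on a uniform tree is easy to simulate, which gives a way to explore how the chosen move depends on branching factor and search depth. The sketch below is a minimal illustration under assumed conventions, not the authors' model or analysis: the root has exactly two moves, each leading to a uniform subtree with its own branching factor, leaf scores are independent uniform random numbers, and all names and parameters are hypothetical.

```python
import random


def random_minimax(branching, depth, maximizing, rng):
    """Minimax value of a uniform tree whose leaves get random scores."""
    if depth == 0:
        return rng.random()  # random leaf evaluation in [0, 1)
    children = (
        random_minimax(branching, depth - 1, not maximizing, rng)
        for _ in range(branching)
    )
    return max(children) if maximizing else min(children)


def prob_first_move_chosen(b_first, b_second, ply, trials=2000, seed=0):
    """Estimate how often the root (max) player prefers the first move.

    After the root player's move the opponent is to move, so each
    subtree is rooted at a min node with ply - 1 plies remaining.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        v_first = random_minimax(b_first, ply - 1, False, rng)
        v_second = random_minimax(b_second, ply - 1, False, rng)
        wins += v_first > v_second
    return wins / trials


# Hypothetical comparison: branching factors 2 and 4 at several depths.
estimates = {ply: prob_first_move_chosen(2, 4, ply) for ply in (1, 2, 3, 4)}
```

In this setup a 2-ply search scores each move by the minimum of its replies, so the first move is chosen with probability b_second / (b_first + b_second); comparing the entries of `estimates` across odd and even `ply` shows how the preference can shift with depth.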