558 research outputs found

    Search versus Knowledge: An Empirical Study of Minimax on KRK

    Get PDF
    This article presents the results of an empirical experiment designed to gain insight into what is the effect of the minimax algorithm on the evaluation function. The experiment’s simulations were performed upon the KRK chess endgame. Our results show that dependencies between evaluations of sibling nodes in a game tree and an abundance of possibilities to commit blunders present in the KRK endgame are not sufficient to explain the success of the minimax principle in practical game-playing as was previously believed. The article shows that minimax in combination with a noisy evaluation function introduces a bias into the backed-up evaluations and argues that this bias is what masked the effectiveness of the minimax in previous studies

    The effect of simulation bias on action selection in Monte Carlo Tree Search

    Get PDF
    A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in fulfilment of the requirements for the degree of Master of Science. August 2016.Monte Carlo Tree Search (MCTS) is a family of directed search algorithms that has gained widespread attention in recent years. It combines a traditional tree-search approach with Monte Carlo simulations, using the outcome of these simulations (also known as playouts or rollouts) to evaluate states in a look-ahead tree. That MCTS does not require an evaluation function makes it particularly well-suited to the game of Go — seen by many to be chess’s successor as a grand challenge of artificial intelligence — with MCTS-based agents recently able to achieve expert-level play on 19×19 boards. Furthermore, its domain-independent nature also makes it a focus in a variety of other fields, such as Bayesian reinforcement learning and general game-playing. Despite the vast amount of research into MCTS, the dynamics of the algorithm are still not yet fully understood. In particular, the effect of using knowledge-heavy or biased simulations in MCTS still remains unknown, with interesting results indicating that better-informed rollouts do not necessarily result in stronger agents. This research provides support for the notion that MCTS is well-suited to a class of domain possessing a smoothness property. In these domains, biased rollouts are more likely to produce strong agents. Conversely, any error due to incorrect bias is compounded in non-smooth domains, and in particular for low-variance simulations. This is demonstrated empirically in a number of single-agent domains.LG201

    Estimating the bias of a noisy coin

    Full text link
    Optimal estimation of a coin's bias using noisy data is surprisingly different from the same problem with noiseless data. We study this problem using entropy risk to quantify estimators' accuracy. We generalize the "add Beta" estimators that work well for noiseless coins, and we find that these hedged maximum-likelihood (HML) estimators achieve a worst-case risk of O(N^{-1/2}) on noisy coins, in contrast to O(1/N) in the noiseless case. We demonstrate that this increased risk is unavoidable and intrinsic to noisy coins, by constructing minimax estimators (numerically). However, minimax estimators introduce extreme bias in return for slight improvements in the worst-case risk. So we introduce a pointwise lower bound on the minimum achievable risk as an alternative to the minimax criterion, and use this bound to show that HML estimators are pretty good. We conclude with a survey of scientific applications of the noisy coin model in social science, physical science, and quantum information science.Comment: 10 page

    The phenomenon of Decision Oscillation: a new consequence of pathology in Game Trees

    Get PDF
    Random minimaxing studies the consequences of using a random number for scoring the leaf nodes of a full width game tree and then computing the best move using the standard minimax procedure. Experiments in Chess showed that the strength of play increases as the depth of the lookahead is increased. Previous research by the authors provided a partial explanation of why random minimaxing can strengthen play by showing that, when one move dominates another move, then the dominating move is more likely to be chosen by minimax. This paper examines a special case of determining the move probability when domination does not occur. Specifically, we show that, under a uniform branching game tree model, whether the probability that one move is chosen rather than another depends not only on the branching factors of the moves involved, but also on whether the number of ply searched is odd or even. This is a new type of game tree pathology, where the minimax procedure will change its mind as to which move is best, independently of the true value of the game, and oscillate between moves as the depth of lookahead alternates between odd and even
    • …
    corecore