2,217 research outputs found

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impose some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
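
As a rough illustration of how tree search and random sampling combine, here is a minimal UCT-style sketch (selection by UCB1, expansion, random rollout, backpropagation). It is not taken from the survey: the toy game (a Nim variant where players remove 1 to 3 stones and whoever takes the last stone wins), the exploration constant, and every name below are assumptions chosen for brevity.

```python
import math
import random


class Node:
    """One position in the search tree (hypothetical helper class)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left; the player to move is implicit
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def ucb1(child, parent_visits, c=1.4):
    # Average reward plus an exploration bonus that shrinks with visits.
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))


def rollout(stones, rng):
    """Play uniformly random moves; True if the player to move here wins."""
    mover_is_start = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return mover_is_start
        mover_is_start = not mover_is_start
    return False  # no stones left: the previous player took the last one


def mcts(root_stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children,
                       key=lambda ch: ucb1(ch, parent.visits))
        # 2. Expansion: add one untried child, if any.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        to_move_wins = rollout(node.stones, rng)
        # 4. Backpropagation: flip the result at each level.
        result = 0.0 if to_move_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

For example, `mcts(5)` searches the position with five stones, where the game-theoretically best move is to take one stone and leave the opponent a multiple of four.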

    Price of Competition and Dueling Games

    We study competition in a general framework introduced by Immorlica et al. and answer their main open question. Immorlica et al. considered classic optimization problems in terms of competition and introduced a general class of games called dueling games. They model this competition as a zero-sum game, where two players are competing for a user's satisfaction. In their main and most natural game, the ranking duel, a user requests a webpage by submitting a query and players output an ordering over all possible webpages based on the submitted query. The user tends to choose the ordering which displays her requested webpage in a higher rank. The goal of each player is to maximize the probability that her ordering beats that of her opponent and gets the user's attention. Immorlica et al. show this game directs both players to provide suboptimal search results. However, they leave the following as their main open question: "does competition between algorithms improve or degrade expected performance?" In this paper, we resolve this question for the ranking duel and a more general class of dueling games. More precisely, we study the quality of orderings in a competition between two players. This game is a zero-sum game, and thus any Nash equilibrium of the game can be described by minimax strategies. Let the value of the user for an ordering be a function of the position of her requested item in the corresponding ordering, and the social welfare for an ordering be the expected value of the corresponding ordering for the user. We propose the price of competition, which is the ratio of the social welfare for the worst minimax strategy to the social welfare obtained by a social planner. We use this criterion for analyzing the quality of orderings in the ranking duel. We prove the quality of minimax results is surprisingly close to that of the optimum solution.
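
The verbal definition of the price of competition can be written out as a formula. The notation below is an assumption introduced only for illustration and may differ from the paper's: $p_i$ is the probability that the user requests item $i$, $\pi(i)$ is the position of item $i$ in ordering $\pi$, $v$ is the user's value as a function of position, and $\mathcal{M}$ is the set of minimax (possibly mixed) strategies of the ranking duel.

```latex
\[
  \mathrm{SW}(\pi) \;=\; \sum_{i} p_i \, v\bigl(\pi(i)\bigr),
  \qquad
  \mathrm{PoC} \;=\;
  \frac{\displaystyle \min_{\sigma \in \mathcal{M}}
        \; \mathbb{E}_{\pi \sim \sigma}\bigl[\mathrm{SW}(\pi)\bigr]}
       {\displaystyle \max_{\pi} \; \mathrm{SW}(\pi)}
\]
```

Under this reading, the numerator is the welfare of the worst minimax strategy and the denominator is the welfare a social planner can reach, so a ratio close to 1 corresponds to the abstract's claim that minimax results are close to the optimum.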

    On Monte-Carlo tree search for deterministic games with alternate moves and complete information

    We consider a deterministic game with alternate moves and complete information, whose outcome is always the victory of one of the two opponents. We assume that this game is the realization of a random model enjoying some independence properties. We consider algorithms in the spirit of Monte-Carlo Tree Search to estimate the minimax value of a given position as well as possible: this consists in simulating, successively, n well-chosen matches, starting from this position. We build an algorithm that is optimal step by step in the following sense: once the first n matches are simulated, the algorithm decides from the statistics furnished by these n matches (and the a priori we have on the game) how to simulate the (n+1)-th match in such a way that the increase of information concerning the minimax value of the position under study is maximal. This algorithm is remarkably quick. We prove that our step-by-step optimal algorithm is not globally optimal and that it always converges in a finite number of steps, even if the a priori we have on the game is completely irrelevant. We finally test our algorithm against MCTS on Pearl's game and, with a very simple and universal a priori, on the game Connect Four and some variants. The numerical results are rather disappointing. We do, however, exhibit some situations in which our algorithm seems efficient.

    Analysis of Dialogical Argumentation via Finite State Machines

    Dialogical argumentation is an important cognitive activity by which agents exchange arguments and counterarguments as part of some process such as discussion, debate, persuasion and negotiation. Whilst numerous formal systems have been proposed, there is a lack of frameworks for implementing and evaluating these proposals. First-order executable logic has been proposed as a general framework for specifying and analysing dialogical argumentation. In this paper, we investigate how we can implement systems for dialogical argumentation using propositional executable logic. Our approach is to present and evaluate an algorithm that generates a finite state machine that reflects a propositional executable logic specification for a dialogical argumentation together with an initial state. We also consider how the finite state machines can be analysed, with the minimax strategy being used as an illustration of the kinds of empirical analysis that can be undertaken. Comment: 10 page

    Free Energy and the Generalized Optimality Equations for Sequential Decision Making

    The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision rules such as Expectimax, Minimax and Expectiminimax. We show how these decision rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments. Comment: 10 pages, 2 figure
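
One way to see how a single per-node parameter can recover maximization, expectation, and minimization is the certainty-equivalent form (1/β) log Σ p(x) exp(β U(x)). The abstract does not state this formula, so its use here is an assumption, as are the function names and the tuple encoding of tree nodes. In this sketch a large positive β approaches a max node, β near 0 an expectation (chance) node, and a large negative β a min node.

```python
import math


def free_energy_value(values, probs, beta):
    """Certainty equivalent (1/beta) * log(sum_x p(x) * exp(beta * U(x)))."""
    if beta == 0.0:
        # Limit beta -> 0: plain expectation under the prior probabilities.
        return sum(p * u for p, u in zip(probs, values))
    # Log-sum-exp with the largest term factored out for numerical stability.
    terms = [math.log(p) + beta * u for p, u in zip(probs, values) if p > 0]
    m = max(terms)
    return (m + math.log(sum(math.exp(t - m) for t in terms))) / beta


def evaluate(node):
    """Evaluate a tree whose inner nodes are (beta, probs, children) tuples."""
    if isinstance(node, (int, float)):
        return float(node)  # leaf utility
    beta, probs, children = node
    return free_energy_value([evaluate(ch) for ch in children], probs, beta)


# A depth-2 tree: the root behaves like a max node (beta = 200), its children
# like min nodes (beta = -200), so the value is close to max(min(3, 5), min(2, 9)).
tree = (200.0, [0.5, 0.5], [
    (-200.0, [0.5, 0.5], [3, 5]),
    (-200.0, [0.5, 0.5], [2, 9]),
])
approx_minimax = evaluate(tree)
```

Replacing the children's β with 0.0 turns them into chance nodes, which gives an Expectimax-style evaluation of the same tree.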