56 research outputs found

    Monte-Carlo tree search enhancements for one-player and two-player domains


    MCTS-minimax hybrids with state evaluations

    Monte-Carlo Tree Search (MCTS) has been found to show weaker play than minimax-based search in some tactical game domains. In order to combine the tactical strength of minimax and the strategic strength of MCTS, MCTS-minimax hybrids have been proposed in prior work. …
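    One common form such a hybrid takes is replacing MCTS's random rollouts with a shallow, depth-limited minimax search over a static state evaluation. The sketch below illustrates that idea only; the game-interface hooks (`evaluate`, `legal_moves`, `apply_move`, `is_terminal`) and the fixed depth are illustrative assumptions, not the paper's exact method.

```python
def shallow_minimax(state, depth, maximizing,
                    evaluate, legal_moves, apply_move, is_terminal):
    """Depth-limited minimax over a static evaluation function, as might be
    used in place of a random MCTS rollout. All game hooks are passed in,
    so the function is game-agnostic."""
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    # Recurse one ply deeper, alternating between max and min players.
    values = [shallow_minimax(apply_move(state, m), depth - 1, not maximizing,
                              evaluate, legal_moves, apply_move, is_terminal)
              for m in legal_moves(state)]
    return max(values) if maximizing else min(values)
```

    For example, on a toy game where states are integers, each move `m` in `{0, 1}` maps state `s` to `2*s + m`, and the evaluation is the state itself, a depth-2 search from state 1 backs up the value 6.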

    Explainable search

    Search-based AI agents are state of the art in many challenging sequential decision-making domains. However, contemporary approaches lack the ability to explain, summarize, or visualize their plans and decisions, and how they are derived from traversing complex spaces of possible futures, contingencies, and eventualities, spanned by the available actions of the agent. This limits human trust in high-stakes scenarios, as well as effective human-AI collaboration. …

    Novelty and MCTS

    Novelty search has become a popular technique in different fields such as evolutionary computing, classical AI planning, and deep reinforcement learning. Searching for novelty instead of, or in addition to, directly maximizing the search objective, aims at avoiding dead ends and local minima, and overall improving exploration. We propose and test the integration of novelty into Monte Carlo Tree Search (MCTS), a state-of-the-art framework for online RL planning, by linearly combining value estimates …
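    The linear combination mentioned in the abstract can be sketched as follows. The mixing weight `beta` and the inverse-visit-count novelty measure are illustrative assumptions; the paper's actual novelty definition may differ.

```python
import math
from collections import defaultdict

# How often each (possibly abstracted) state has been encountered so far.
visit_counts = defaultdict(int)

def novelty(state_key):
    """Inverse-count novelty: rarely seen states score closer to 1.0."""
    return 1.0 / (1.0 + visit_counts[state_key])

def combined_value(q_value, state_key, beta=0.5):
    """Linear combination of the action-value estimate and a novelty bonus."""
    return (1.0 - beta) * q_value + beta * novelty(state_key)

def uct_score(q_value, state_key, child_visits, parent_visits, c=1.414):
    """UCB1-style selection score using the novelty-augmented value."""
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return combined_value(q_value, state_key) + exploration
```

    With `beta=0` this reduces to plain UCT; raising `beta` shifts selection toward states the search has rarely visited.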

    ME-MCTS: Online generalization by combining multiple value estimators

    This paper addresses the challenge of online generalization in tree search. We propose Multiple Estimator Monte Carlo Tree Search (ME-MCTS), with a two-fold contribution: first, we introduce a formalization of online generalization that can represent existing techniques such as “history heuristics”, “RAVE”, or “OMA” – contextual action value estimators or abstractors that generalize across specific contexts. Second, we incorporate recent advances in estimator averaging that enable guiding search by combining the online action value estimates of any number of such abstractors or similar types of action value estimators. Unlike previous work, which usually proposed a single abstractor for either the selection or the rollout phase of MCTS simulations, our approach focuses on the combination of multiple estimators and applies them to all move choices in MCTS simulations. As the MCTS tree itself is just another value estimator – unbiased, but without abstraction – this blurs the traditional distinction between action choices inside and outside of the MCTS tree. Experiments with three abstractors in four board games show significant improvements of ME-MCTS over MCTS using only a single abstractor, both for MCTS with random rollouts as well as for MCTS with static evaluation functions. While we used deterministic, fully observable games, ME-MCTS naturally extends to more challenging settings.
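    Combining several online action-value estimators can be illustrated with a simple sample-size-weighted average, in the spirit of the abstract. The weighting rule here is an illustrative assumption, not the paper's actual averaging scheme.

```python
def combine_estimates(estimates):
    """Combine action-value estimates from multiple sources.

    estimates: list of (value, n_samples) pairs, e.g. from the MCTS tree
    statistics, a history heuristic, and a RAVE-style abstractor.
    Returns a sample-size-weighted average; estimators with no samples
    contribute nothing.
    """
    total_n = sum(n for _, n in estimates if n > 0)
    if total_n == 0:
        return 0.0  # no estimator has any information yet
    return sum(v * n for v, n in estimates if n > 0) / total_n
```

    For instance, an estimate of 0.8 backed by 10 samples combined with an estimate of 0.5 backed by 30 samples yields a weighted average of 0.575, pulled toward the better-supported estimator.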

    Guiding multiplayer MCTS by focusing on yourself

    In n-player sequential move games, the second root-player move appears at tree depth n + 1. Depending on n and the available search time, tree search techniques can struggle to expand the game tree deeply enough to find multiple-move plans of the root player, which is often more important for strategic play than considering every possible opponent move in between. The minimax-based Paranoid search and BRS+ algorithms currently achieve state-of-the-art performance, especially at short time settings, by using a generally incorrect opponent model.
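    The depth arithmetic in the first sentence generalizes directly: with one ply per player between the root player's turns, the root player's k-th move sits at depth (k − 1)·n + 1. A one-line helper (the name is illustrative) makes the pattern explicit:

```python
def root_move_depth(n_players, k):
    """Tree depth at which the root player's k-th move appears,
    counting the root player's first move as depth 1 and assuming
    strictly alternating turns among n_players players."""
    return (k - 1) * n_players + 1
```

    For n = 4 players, the second root-player move appears at depth 5 (i.e. n + 1), so a three-move plan already requires searching to depth 9.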

    Towards explainable MCTS

    Monte-Carlo Tree Search (MCTS) is a family of sampling-based search algorithms widely used for online planning in sequential decision-making domains, and at the heart of many recent breakthroughs in AI. Understanding the behavior of MCTS agents is non-trivial for developers and users, as it results from often large and complex search trees, consisting of many simulated possible futures, their evaluations, and relationships to each other. …