82 research outputs found

    Preference-Based Monte Carlo Tree Search

    Full text link
    Monte Carlo tree search (MCTS) is a popular choice for solving sequential anytime problems. However, it depends on a numeric feedback signal, which can be difficult to define. Real-time MCTS is a variant which may only rarely encounter states with an explicit, extrinsic reward. To deal with such cases, the experimenter has to supply an additional numeric feedback signal in the form of a heuristic, which intrinsically guides the agent. Recent work has shown evidence that in different areas the underlying structure is ordinal and not numerical. Hence erroneous and biased heuristics are inevitable, especially in such domains. In this paper, we propose an MCTS variant which only depends on qualitative feedback, and therefore opens up new applications for MCTS. We also find indications that translating absolute into ordinal feedback may be beneficial. Using a puzzle domain, we show that our preference-based MCTS variant, which only receives qualitative feedback, is able to reach a performance level comparable to a regular MCTS baseline, which obtains quantitative feedback. Comment: To be publishe

    In situ compatibilisation of alkenyl-terminated polymer blends using cross metathesis

    No full text
    Several compatibilised polyolefin-based blends have been obtained via rather simple and robust chemistry: olefin cross metathesis using Grubbs' second-generation catalyst (G2) of alkenyl-terminated macromolecules of different nature. The viability of the concept was first demonstrated for low molecular weight polyolefin macromolecules before being extended to higher molecular weight polymers, including polar ones such as poly(ε-caprolactone) (PCL), poly(pentadecalactone) (PPDL) and poly(methylmethacrylate) (PMMA). When taking all the possible cross metathesis reactions into account, a statistical distribution of homopolymers and diblock copolymers is likely to be formed. While clear macrophase separation is visible in the uncompatibilised blends of macromolecules, it is absent for the in situ compatibilised products, as was confirmed by optical microscopy. It was demonstrated that even small amounts of diblock copolymers can effectively compatibilise the two phases. All materials were analysed by HT SEC, DSC, HT HPLC and optical microscopy. Such a proof of principle indicates that using cross metathesis on a large library of macromolecules might be a versatile "synthetic handle" to reach a variety of in situ compatibilised blends

    Strokes services gespiegeld

    No full text

    Minimizing Simple and Cumulative Regret in Monte-Carlo Tree Search

    No full text
    Regret minimization is important in both the Multi-Armed Bandit problem and Monte-Carlo Tree Search (MCTS). Recently, simple regret, i.e., the regret of not recommending the best action, has been proposed as an alternative to cumulative regret in MCTS, i.e., regret accumulated over time. Each type of regret is appropriate in different contexts. Although the majority of MCTS research applies the UCT selection policy for minimizing cumulative regret in the tree, this paper introduces a new MCTS variant, Hybrid MCTS (H-MCTS), which minimizes both types of regret in different parts of the tree. H-MCTS uses SHOT, a recursive version of Sequential Halving, to minimize simple regret near the root, and UCT to minimize cumulative regret when descending further down the tree. We discuss the motivation for this new search technique, and show the performance of H-MCTS in six distinc
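
    The two selection rules named in this abstract can be illustrated on a flat bandit. The sketch below is not the authors' H-MCTS or SHOT code: it shows plain (non-recursive) Sequential Halving, which targets simple regret, next to a UCB1 choice of the kind UCT applies at each node, which targets cumulative regret. The function names, the `rng` argument, the exploration constant, and the choice to reuse reward totals across rounds are all assumptions made for illustration.

```python
import math
import random


def sequential_halving(arms, budget, rng):
    """Recommend one arm index after spending roughly `budget` pulls.

    `arms` is a list of callables; each takes `rng` and returns a reward.
    Assumption: reward totals are kept across rounds instead of being reset.
    """
    survivors = list(range(len(arms)))
    totals = [0.0] * len(arms)
    counts = [0] * len(arms)
    rounds = max(1, math.ceil(math.log2(len(arms))))
    while len(survivors) > 1:
        # Split the per-round budget evenly over the surviving arms.
        pulls = max(1, budget // (len(survivors) * rounds))
        for i in survivors:
            for _ in range(pulls):
                totals[i] += arms[i](rng)
                counts[i] += 1
        # Keep the better half (rounded up), ranked by empirical mean.
        survivors.sort(key=lambda i: totals[i] / counts[i], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


def ucb1_select(totals, counts, c=math.sqrt(2)):
    """Pick the next arm to sample with UCB1, trying unvisited arms first."""
    total_visits = sum(counts)
    best_index, best_score = 0, float("-inf")
    for i, n in enumerate(counts):
        if n == 0:
            return i
        score = totals[i] / n + c * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical Bernoulli arms with different success probabilities.
    probs = [0.2, 0.5, 0.8, 0.4]
    arms = [lambda r, p=p: 1.0 if r.random() < p else 0.0 for p in probs]
    print("recommended arm:", sequential_halving(arms, 400, rng))
```

    Sequential Halving spends its budget uniformly among the surviving arms and only commits to a recommendation at the end, whereas UCB1 shifts samples toward arms that already look good. That difference is one way to read the abstract's split between simple regret near the root and cumulative regret deeper in the tree.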