34,348 research outputs found

    Parallel Monte Carlo Tree Search with Batched Rigid-body Simulations for Speeding up Long-Horizon Episodic Robot Planning

    We propose a novel Parallel Monte Carlo tree search with Batched Simulations (PMBS) algorithm for accelerating long-horizon, episodic robotic planning tasks. Monte Carlo tree search (MCTS) is an effective heuristic search algorithm for solving episodic decision-making problems whose underlying search spaces are expansive. Leveraging a GPU-based large-scale simulator, PMBS introduces massive parallelism into MCTS for solving planning tasks through the batched execution of a large number of concurrent simulations, which allows for more efficient and accurate evaluations of the expected cost-to-go over large action spaces. When applied to the challenging manipulation tasks of object retrieval from clutter, PMBS achieves a speedup of over 30× with an improved solution quality, in comparison to a serial MCTS implementation. We show that PMBS can be directly applied to real robot hardware with negligible sim-to-real differences. Supplementary material, including video, can be found at https://github.com/arc-l/pmbs. Comment: Accepted for IROS 202
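The batching idea in the abstract above can be illustrated with a minimal sketch. This is not the PMBS algorithm: the 1-D state, the noise model, and the names `simulate_batch` and `estimate_cost_to_go` are all hypothetical stand-ins, and a plain Python list takes the place of a GPU batch. It only shows how many rollouts for many candidate actions can be packed into one simulator call and then averaged into per-action cost-to-go estimates.

```python
import random


def simulate_batch(states, actions, rng):
    """Hypothetical stand-in for a batched simulator.

    One call advances every (state, action) pair by one noisy step.
    A real batched simulator would do this in parallel on a GPU.
    """
    return [s + a + rng.gauss(0.0, 0.1) for s, a in zip(states, actions)]


def estimate_cost_to_go(state, candidate_actions, goal, rollouts_per_action, rng):
    """Estimate the expected cost of each action from one batched call."""
    # Flatten all rollouts for all actions into a single batch.
    batch_actions = [
        a for a in candidate_actions for _ in range(rollouts_per_action)
    ]
    batch_states = [state] * len(batch_actions)
    next_states = simulate_batch(batch_states, batch_actions, rng)

    # Average the toy cost (distance to goal) over each action's rollouts.
    costs = {}
    for i, action in enumerate(candidate_actions):
        start = i * rollouts_per_action
        chunk = next_states[start:start + rollouts_per_action]
        costs[action] = sum(abs(goal - s) for s in chunk) / rollouts_per_action
    return costs


if __name__ == "__main__":
    rng = random.Random(0)
    costs = estimate_cost_to_go(0.0, [-1.0, 0.0, 1.0], 1.0, 200, rng)
    best = min(costs, key=costs.get)
    print("estimated costs:", costs)
    print("best action:", best)
```

Averaging over more rollouts per action reduces the variance of each estimate, which is one plausible reading of why batching can improve both speed and evaluation accuracy.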

    HyP-DESPOT: A Hybrid Parallel Algorithm for Online Planning under Uncertainty

    Planning under uncertainty is critical for robust robot performance in uncertain, dynamic environments, but it incurs high computational cost. State-of-the-art online search algorithms, such as DESPOT, have vastly improved the computational efficiency of planning under uncertainty and made it a valuable tool for robotics in practice. This work takes one step further by leveraging both CPU and GPU parallelization in order to achieve near real-time online planning performance for complex tasks with large state, action, and observation spaces. Specifically, we propose Hybrid Parallel DESPOT (HyP-DESPOT), a massively parallel online planning algorithm that integrates CPU and GPU parallelism in a multi-level scheme. It performs parallel DESPOT tree search by simultaneously traversing multiple independent paths using multi-core CPUs and performs parallel Monte-Carlo simulations at the leaf nodes of the search tree using GPUs. Experimental results show that HyP-DESPOT speeds up online planning by up to several hundred times, compared with the original DESPOT algorithm, in several challenging robotic tasks in simulation.

    A Parallel Monte-Carlo Tree Search-Based Metaheuristic For Optimal Fleet Composition Considering Vehicle Routing Using Branch & Bound

    In this paper, a Monte-Carlo Tree Search (MCTS)-based metaheuristic is developed that guides a Branch & Bound (B&B) algorithm to find the globally optimal solution to the heterogeneous fleet composition problem while considering vehicle routing, i.e., the Fleet Size and Mix Vehicle Routing Problem with Time Windows (FSMVRPTW). The metaheuristic and exact algorithms are implemented in a parallel hybrid optimization algorithm where the metaheuristic rapidly finds feasible solutions that provide candidate upper bounds for the B&B algorithm, which runs simultaneously. The MCTS additionally provides a candidate fleet composition to initiate the B&B search. Experiments show that the proposed approach results in significant improvements in computation time and convergence to the optimal solution. Comment: Submitted to the IEEE Intelligent Vehicles Symposium 202

    Scaling Monte Carlo Tree Search on Intel Xeon Phi

    Many algorithms have been parallelized successfully on the Intel Xeon Phi coprocessor, especially those with regular, balanced, and predictable data access patterns and instruction flows. Irregular and unbalanced algorithms are harder to parallelize efficiently. They are, for instance, present in artificial intelligence search algorithms such as Monte Carlo Tree Search (MCTS). In this paper we study the scaling behavior of MCTS, on a highly optimized real-world application, on real hardware. The Intel Xeon Phi allows shared memory scaling studies up to 61 cores and 244 hardware threads. We compare work-stealing (Cilk Plus and TBB) and work-sharing (FIFO scheduling) approaches. Interestingly, we find that a straightforward thread pool with a work-sharing FIFO queue shows the best performance. A crucial element for this high performance is controlling the grain size, an approach that we call Grain Size Controlled Parallel MCTS. Our subsequent comparison with Xeon CPUs shows an even clearer distinction in performance between different threading libraries. We achieve, to the best of our knowledge, the fastest implementation of a parallel MCTS on the 61 core Intel Xeon Phi using a real application (a speedup of 47 relative to a sequential run). Comment: 8 pages, 9 figure
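The role of grain size in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: it parallelizes independent toy playouts rather than a shared search tree, the coin-flip playout and the names `run_playouts` and `parallel_playouts` are hypothetical, and it relies on Python's `ThreadPoolExecutor`, which in CPython hands out tasks from an internal FIFO queue. The grain size is the number of playouts bundled into one task.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_playouts(seed, grain_size):
    """Run `grain_size` toy playouts (coin flips) and return (wins, playouts)."""
    rng = random.Random(seed)
    wins = sum(1 for _ in range(grain_size) if rng.random() < 0.5)
    return wins, grain_size


def parallel_playouts(total_playouts, grain_size, workers=4):
    """Split a playout budget into tasks of `grain_size` playouts each.

    A small grain size yields many cheap tasks (more scheduling overhead);
    a large grain size yields few tasks (less overhead, coarser balancing).
    """
    sizes = [grain_size] * (total_playouts // grain_size)
    remainder = total_playouts % grain_size
    if remainder:
        sizes.append(remainder)  # the last task takes the leftover playouts

    # Tasks are queued in submission order and picked up by idle workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_playouts, range(len(sizes)), sizes))

    wins = sum(w for w, _ in results)
    done = sum(n for _, n in results)
    return wins, done, len(sizes)


if __name__ == "__main__":
    wins, done, tasks = parallel_playouts(1000, 64)
    print(f"{done} playouts in {tasks} tasks, {wins} wins")
```

Because of CPython's global interpreter lock, this sketch shows the task-splitting structure only and is not expected to reproduce the scaling behavior reported for native threading libraries.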