11,081 research outputs found

    Scaling Monte Carlo Tree Search on Intel Xeon Phi

    Full text link
    Many algorithms have been parallelized successfully on the Intel Xeon Phi coprocessor, especially those with regular, balanced, and predictable data access patterns and instruction flows. Irregular and unbalanced algorithms are harder to parallelize efficiently. They are, for instance, present in artificial intelligence search algorithms such as Monte Carlo Tree Search (MCTS). In this paper we study the scaling behavior of MCTS, on a highly optimized real-world application, on real hardware. The Intel Xeon Phi allows shared memory scaling studies up to 61 cores and 244 hardware threads. We compare work-stealing (Cilk Plus and TBB) and work-sharing (FIFO scheduling) approaches. Interestingly, we find that a straightforward thread pool with a work-sharing FIFO queue shows the best performance. A crucial element for this high performance is the controlling of the grain size, an approach that we call Grain Size Controlled Parallel MCTS. Our subsequent comparing with the Xeon CPUs shows an even more comprehensible distinction in performance between different threading libraries. We achieve, to the best of our knowledge, the fastest implementation of a parallel MCTS on the 61 core Intel Xeon Phi using a real application (47 relative to a sequential run).Comment: 8 pages, 9 figure

    Process algebra with strategic interleaving

    Get PDF
    In process algebras such as ACP (Algebra of Communicating Processes), parallel processes are considered to be interleaved in an arbitrary way. In the case of multi-threading as found in contemporary programming languages, parallel processes are actually interleaved according to some interleaving strategy. An interleaving strategy is what is called a process-scheduling policy in the field of operating systems. In many systems, for instance hardware/software systems, we have to do with both parallel processes that may best be considered to be interleaved in an arbitrary way and parallel processes that may best be considered to be interleaved according to some interleaving strategy. Therefore, we extend ACP in this paper with the latter form of interleaving. The established properties of the extension concerned include an elimination property, a conservative extension property, and a unique expansion property.Comment: 19 pages, this version is a revision of the published versio

    A thread calculus with molecular dynamics

    Get PDF
    We present a theory of threads, interleaving of threads, and interaction between threads and services with features of molecular dynamics, a model of computation that bears on computations in which dynamic data structures are involved. Threads can interact with services of which the states consist of structured data objects and computations take place by means of actions which may change the structure of the data objects. The features introduced include restriction of the scope of names used in threads to refer to data objects. Because that feature makes it troublesome to provide a model based on structural operational semantics and bisimulation, we construct a projective limit model for the theory.Comment: 47 pages; examples and results added, phrasing improved, references replace

    Solution of the Skyrme-Hartree-Fock-Bogolyubov equations in the Cartesian deformed harmonic-oscillator basis. (VII) HFODD (v2.49t): a new version of the program

    Full text link
    We describe the new version (v2.49t) of the code HFODD which solves the nuclear Skyrme Hartree-Fock (HF) or Skyrme Hartree-Fock-Bogolyubov (HFB) problem by using the Cartesian deformed harmonic-oscillator basis. In the new version, we have implemented the following physics features: (i) the isospin mixing and projection, (ii) the finite temperature formalism for the HFB and HF+BCS methods, (iii) the Lipkin translational energy correction method, (iv) the calculation of the shell correction. A number of specific numerical methods have also been implemented in order to deal with large-scale multi-constraint calculations and hardware limitations: (i) the two-basis method for the HFB method, (ii) the Augmented Lagrangian Method (ALM) for multi-constraint calculations, (iii) the linear constraint method based on the approximation of the RPA matrix for multi-constraint calculations, (iv) an interface with the axial and parity-conserving Skyrme-HFB code HFBTHO, (v) the mixing of the HF or HFB matrix elements instead of the HF fields. Special care has been paid to using the code on massively parallel leadership class computers. For this purpose, the following features are now available with this version: (i) the Message Passing Interface (MPI) framework, (ii) scalable input data routines, (iii) multi-threading via OpenMP pragmas, (iv) parallel diagonalization of the HFB matrix in the simplex breaking case using the ScaLAPACK library. Finally, several little significant errors of the previous published version were corrected.Comment: Accepted for publication to Computer Physics Communications. Program files re-submitted to Comp. Phys. Comm. Program Library after correction of several minor bug
    • …
    corecore