Scaling Monte Carlo Tree Search on Intel Xeon Phi
Many algorithms have been parallelized successfully on the Intel Xeon Phi
coprocessor, especially those with regular, balanced, and predictable data
access patterns and instruction flows. Irregular and unbalanced algorithms are
harder to parallelize efficiently. They are, for instance, present in
artificial intelligence search algorithms such as Monte Carlo Tree Search
(MCTS). In this paper we study the scaling behavior of MCTS, on a highly
optimized real-world application, on real hardware. The Intel Xeon Phi allows
shared memory scaling studies up to 61 cores and 244 hardware threads. We
compare work-stealing (Cilk Plus and TBB) and work-sharing (FIFO scheduling)
approaches. Interestingly, we find that a straightforward thread pool with a
work-sharing FIFO queue shows the best performance. A crucial element of this
high performance is control of the grain size, an approach that we call
Grain Size Controlled Parallel MCTS. Our subsequent comparison with Xeon
CPUs shows an even clearer distinction in performance between
different threading libraries. We achieve, to the best of our knowledge, the
fastest implementation of a parallel MCTS on the 61 core Intel Xeon Phi using a
real application (a speedup of 47 relative to a sequential run).
Comment: 8 pages, 9 figure
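The grain-size idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `run_playouts`, the coin-flip stand-in for a playout, and the per-task seeding are all assumptions made for illustration. It shows only how a fixed number of playouts is grouped into tasks of `grain_size` and handed to a pool of workers through one shared FIFO queue.

```python
import queue
import random
import threading


def run_playouts(n_playouts, grain_size, n_workers=4, seed=0):
    """Group n_playouts dummy playouts into tasks of grain_size and run them
    on a thread pool fed by a single shared FIFO queue (work sharing)."""
    tasks = queue.Queue()  # FIFO queue shared by all workers
    n_tasks = 0
    for start in range(0, n_playouts, grain_size):
        count = min(grain_size, n_playouts - start)  # last task may be smaller
        tasks.put((seed + start, count))
        n_tasks += 1

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task_seed, count = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            rng = random.Random(task_seed)  # per-task seed keeps results reproducible
            # Hypothetical stand-in for a playout: a fair coin flip per playout.
            wins = sum(rng.random() < 0.5 for _ in range(count))
            with lock:
                results.append((count, wins))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total = sum(c for c, _ in results)
    total_wins = sum(w for _, w in results)
    return n_tasks, total, total_wins
```

A small grain size produces many short tasks and more queue traffic; a large one produces few tasks and risks load imbalance. Because of CPython's global interpreter lock, this sketch demonstrates the task structure only, not the speedups reported for the Xeon Phi.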
Process algebra with strategic interleaving
In process algebras such as ACP (Algebra of Communicating Processes),
parallel processes are considered to be interleaved in an arbitrary way. In the
case of multi-threading as found in contemporary programming languages,
parallel processes are actually interleaved according to some interleaving
strategy. An interleaving strategy is what is called a process-scheduling
policy in the field of operating systems. In many systems, for instance
hardware/software systems, we have to deal with both parallel processes that may
best be considered to be interleaved in an arbitrary way and parallel processes
that may best be considered to be interleaved according to some interleaving
strategy. Therefore, we extend ACP in this paper with the latter form of
interleaving. The established properties of the extension concerned include an
elimination property, a conservative extension property, and a unique expansion
property.
Comment: 19 pages, this version is a revision of the published versio
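As a concrete illustration of what an interleaving strategy does, the sketch below applies one simple strategy, round-robin (cyclic) interleaving, to a set of action sequences. The abstract does not say which strategies the extension covers, so the choice of strategy, the function name `cyclic_interleave`, and the representation of a process as a plain list of action names are assumptions made for illustration only.

```python
from collections import deque


def cyclic_interleave(threads):
    """Interleave action sequences round-robin: each turn, the thread at the
    front performs one action and then moves to the back of the line."""
    pending = deque(deque(t) for t in threads if t)  # skip empty threads
    trace = []
    while pending:
        current = pending.popleft()
        trace.append(current.popleft())  # perform exactly one action
        if current:
            pending.append(current)  # unfinished thread rotates to the back
    return trace
```

Under arbitrary interleaving, any order-preserving merge of the sequences is a possible behaviour; a strategy such as this one selects a single merge deterministically, which is the role a scheduling policy plays in an operating system.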
A thread calculus with molecular dynamics
We present a theory of threads, interleaving of threads, and interaction
between threads and services with features of molecular dynamics, a model of
computation that bears on computations in which dynamic data structures are
involved. Threads can interact with services whose states consist of
structured data objects, and computations take place by means of actions that
may change the structure of the data objects. The features introduced include
restriction of the scope of names used in threads to refer to data objects.
Because that feature makes it troublesome to provide a model based on
structural operational semantics and bisimulation, we construct a projective
limit model for the theory.
Comment: 47 pages; examples and results added, phrasing improved, references replace
Solution of the Skyrme-Hartree-Fock-Bogolyubov equations in the Cartesian deformed harmonic-oscillator basis. (VII) HFODD (v2.49t): a new version of the program
We describe the new version (v2.49t) of the code HFODD which solves the
nuclear Skyrme Hartree-Fock (HF) or Skyrme Hartree-Fock-Bogolyubov (HFB)
problem by using the Cartesian deformed harmonic-oscillator basis. In the new
version, we have implemented the following physics features: (i) the isospin
mixing and projection, (ii) the finite temperature formalism for the HFB and
HF+BCS methods, (iii) the Lipkin translational energy correction method, (iv)
the calculation of the shell correction. A number of specific numerical methods
have also been implemented in order to deal with large-scale multi-constraint
calculations and hardware limitations: (i) the two-basis method for the HFB
method, (ii) the Augmented Lagrangian Method (ALM) for multi-constraint
calculations, (iii) the linear constraint method based on the approximation of
the RPA matrix for multi-constraint calculations, (iv) an interface with the
axial and parity-conserving Skyrme-HFB code HFBTHO, (v) the mixing of the HF or
HFB matrix elements instead of the HF fields. Special attention has been paid to
using the code on massively parallel leadership-class computers. For this
purpose, the following features are now available with this version: (i) the
Message Passing Interface (MPI) framework, (ii) scalable input data routines,
(iii) multi-threading via OpenMP pragmas, (iv) parallel diagonalization of the
HFB matrix in the simplex breaking case using the ScaLAPACK library. Finally,
several minor errors in the previously published version were
corrected.
Comment: Accepted for publication to Computer Physics Communications. Program
files re-submitted to Comp. Phys. Comm. Program Library after correction of
several minor bug