Parallel Excluded Volume Tempering for Polymer Melts
We have developed a technique to accelerate the acquisition of effectively
uncorrelated configurations for off-lattice models of dense polymer melts
that makes use of both parallel tempering and large-scale Monte Carlo moves.
The method is based upon simulating a set of systems in parallel, each of
which has a slightly different repulsive core potential, such that a
thermodynamic path from full excluded volume to an ideal gas of random walks
is generated. While each system is run with standard stochastic dynamics,
resulting in an NVT ensemble, we implement the parallel tempering through
stochastic swaps between the configurations of adjacent potentials, and the
large-scale Monte Carlo moves through attempted pivot and translation moves,
which reach a realistic acceptance probability as the limit of the ideal gas
of random walks is approached. Compared to pure stochastic dynamics, this
results in increased
efficiency even for a system of short chains, although at this chain length
the large-scale Monte Carlo moves were ineffective. For longer chains the
speedup becomes substantial, as observed from preliminary data.
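To make the tempering scheme concrete, the sketch below shows the standard Metropolis criterion for swapping configurations between adjacent rungs of the repulsive-core ladder, together with a pivot move of the kind that becomes cheap near the random-walk limit. This is a minimal illustration under assumed interfaces: the energy function U(x, lam), the ladder parameter lam, and the chain representation are hypothetical, not the authors' code.

```python
import numpy as np

# Assumed interface: U(x, lam) is the repulsive-core energy of configuration
# x at ladder parameter lam (lam = 1: full excluded volume, lam = 0: ideal
# random walks); beta = 1/kT. Names are illustrative only.

def attempt_swap(x_i, x_j, lam_i, lam_j, U, beta, rng):
    """Metropolis swap of configurations between adjacent potentials."""
    delta = (U(x_j, lam_i) + U(x_i, lam_j)) - (U(x_i, lam_i) + U(x_j, lam_j))
    if rng.random() < np.exp(-beta * max(delta, 0.0)):  # min(1, e^{-beta*delta})
        return x_j, x_i  # configurations exchanged between the two rungs
    return x_i, x_j

def pivot_move(chain, lam, U, beta, rng):
    """Rotate the chain tail about a random monomer (Rodrigues formula)."""
    k = rng.integers(1, len(chain) - 1)
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    theta = rng.uniform(0.0, 2.0 * np.pi)
    tail = chain[k + 1:] - chain[k]
    rotated = (tail * np.cos(theta)
               + np.cross(axis, tail) * np.sin(theta)
               + np.outer(tail @ axis, axis) * (1.0 - np.cos(theta)))
    trial = chain.copy()
    trial[k + 1:] = chain[k] + rotated
    delta = U(trial, lam) - U(chain, lam)
    if rng.random() < np.exp(-beta * max(delta, 0.0)):
        return trial  # easily accepted as lam -> 0, rarely at full volume
    return chain
```

A translation move would follow the same acceptance pattern, with a rigid displacement of a whole chain in place of the tail rotation.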
GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP
Full detector simulation was among the largest CPU consumers in all CERN
experiment software stacks for the first two runs of the Large Hadron Collider
(LHC). In the early 2010s, projections indicated that simulation demands would
scale linearly with the luminosity increase, only partially compensated by
growth in computing resources. The extension of fast simulation approaches to
more use cases, covering a larger fraction of the simulation budget, is only
part of the solution due to intrinsic precision limitations. The remainder
must come from speeding up the simulation software by several factors, which
is out of reach through simple optimizations of the current code base. In this
context, the GeantV R&D project was launched, aiming to redesign the legacy
particle transport codes so that they benefit not only from fine-grained
parallelism features such as vectorization, but also from increased code and
data locality. This paper presents in detail the results and achievements of
this R&D, as well as the conclusions and lessons learnt from the beta
prototype.

Comment: 34 pages, 26 figures, 24 tables
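As a rough illustration of the data-locality idea, the sketch below contrasts a structure-of-arrays track container, where one propagation step applies to a whole basket of tracks at once, with the scalar per-track loop typical of a legacy layout. The TrackBasket class and its fields are invented for illustration; GeantV's actual basketized transport is far more elaborate.

```python
import numpy as np

# Hypothetical structure-of-arrays (SoA) track container: one contiguous
# array per field, so a propagation step vectorizes over a whole basket.
class TrackBasket:
    def __init__(self, n: int, rng: np.random.Generator):
        self.pos = rng.normal(size=(n, 3))                       # positions
        d = rng.normal(size=(n, 3))
        self.dir = d / np.linalg.norm(d, axis=1, keepdims=True)  # unit directions
        self.step = rng.uniform(0.01, 0.1, size=n)               # step lengths

    def transport_vectorized(self):
        # One fused update streams through each field array once.
        self.pos += self.step[:, None] * self.dir

    def transport_scalar(self):
        # Equivalent per-track loop, as in a legacy array-of-structs
        # design: same result, but no SIMD and poor cache behavior.
        for i in range(len(self.step)):
            self.pos[i] += self.step[i] * self.dir[i]

basket = TrackBasket(1024, np.random.default_rng(0))
basket.transport_vectorized()
```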
A Fast Causal Profiler for Task Parallel Programs
This paper proposes TASKPROF, a profiler that identifies parallelism
bottlenecks in task parallel programs. It leverages the structure of a task
parallel execution to perform fine-grained attribution of work to various parts
of the program. TASKPROF uses hardware performance counters to perform
fine-grained measurements while minimizing perturbation, and its profile
execution runs in parallel on multicore machines. TASKPROF's causal profile
enables users to
estimate improvements in parallelism when a region of code is optimized even
when concrete optimizations are not yet known. We have used TASKPROF to
isolate parallelism bottlenecks in twenty-three applications that use the
Intel Threading Building Blocks library. Using TASKPROF, we have designed
parallelization techniques in five applications that increase parallelism by
an order of magnitude. Our user study indicates that developers are able to
isolate performance bottlenecks with ease using TASKPROF.

Comment: 11 pages
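The core of a causal profile of this kind can be sketched in a few lines: attribute measured work to nodes of a task tree, compute total work and critical-path length (span), then re-evaluate parallelism after scaling down the work of one region as a "what if" experiment. The Task class, the children-run-in-parallel model, and the optimize factor below are illustrative assumptions, not TASKPROF's actual representation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Task:
    work: float                       # measured work in this task body
    children: List["Task"] = field(default_factory=list)
    optimize: float = 1.0             # hypothetical "what if" speedup factor

def work_span(t: Task) -> Tuple[float, float]:
    """Total work and critical-path length; children execute in parallel."""
    own = t.work / t.optimize
    stats = [work_span(c) for c in t.children]
    work = own + sum(w for w, _ in stats)
    span = own + max((s for _, s in stats), default=0.0)
    return work, span

root = Task(10, [Task(100), Task(40, [Task(30), Task(30)])])
w, s = work_span(root)
print(f"parallelism before: {w / s:.2f}")   # 210 / 110 ~ 1.91

root.children[0].optimize = 2.0  # pretend the hot region is made 2x faster
w, s = work_span(root)
print(f"parallelism after:  {w / s:.2f}")   # 160 / 80 = 2.00
```

In this toy tree, halving the work of the heaviest child raises the work/span parallelism from about 1.9 to 2.0, which is the kind of estimate a causal profile provides before any optimization is actually implemented.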