4,935 research outputs found

    An adaptive hierarchical domain decomposition method for parallel contact dynamics simulations of granular materials

    Full text link
    A fully parallel version of the contact dynamics (CD) method is presented in this paper. For large enough systems, 100% efficiency has been demonstrated for up to 256 processors using a hierarchical domain decomposition with dynamic load balancing. The iterative scheme to calculate the contact forces is left domain-wise sequential, with data exchange after each iteration step, which ensures its stability. The number of additional iterations required for convergence by the partially parallel updates at the domain boundaries becomes negligible with increasing number of particles, which allows for an effective parallelization. Compared to the sequential implementation, we found no influence of the parallelization on simulation results.Comment: 19 pages, 15 figures, published in Journal of Computational Physics (2011

    Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS

    Full text link
    GROMACS is a widely used package for biomolecular simulation, and over the last two decades it has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world. Here, we describe some of the ways we have been able to realize this through the use of parallelization on all levels, combined with a constant focus on absolute performance. Release 4.6 of GROMACS uses SIMD acceleration on a wide range of architectures, GPU offloading acceleration, and both OpenMP and MPI parallelism within and between nodes, respectively. The recent work on acceleration made it necessary to revisit the fundamental algorithms of molecular simulation, including the concept of neighborsearching, and we discuss the present and future challenges we see for exascale simulation - in particular a very fine-grained task parallelism. We also discuss the software management, code peer review and continuous integration testing required for a project of this complexity.Comment: EASC 2014 conference proceedin

    Efficient Implementations of Molecular Dynamics Simulations for Lennard-Jones Systems

    Full text link
    Efficient implementations of the classical molecular dynamics (MD) method for Lennard-Jones particle systems are considered. Not only general algorithms but also techniques that are efficient for some specific CPU architectures are also explained. A simple spatial-decomposition-based strategy is adopted for parallelization. By utilizing the developed code, benchmark simulations are performed on a HITACHI SR16000/J2 system consisting of IBM POWER6 processors which are 4.7 GHz at the National Institute for Fusion Science (NIFS) and an SGI Altix ICE 8400EX system consisting of Intel Xeon processors which are 2.93 GHz at the Institute for Solid State Physics (ISSP), the University of Tokyo. The parallelization efficiency of the largest run, consisting of 4.1 billion particles with 8192 MPI processes, is about 73% relative to that of the smallest run with 128 MPI processes at NIFS, and it is about 66% relative to that of the smallest run with 4 MPI processes at ISSP. The factors causing the parallel overhead are investigated. It is found that fluctuations of the execution time of each process degrade the parallel efficiency. These fluctuations may be due to the interference of the operating system, which is known as OS Jitter.Comment: 33 pages, 19 figures, add references and figures are revise

    Scalable and fast heterogeneous molecular simulation with predictive parallelization schemes

    Full text link
    Multiscale and inhomogeneous molecular systems are challenging topics in the field of molecular simulation. In particular, modeling biological systems in the context of multiscale simulations and exploring material properties are driving a permanent development of new simulation methods and optimization algorithms. In computational terms, those methods require parallelization schemes that make a productive use of computational resources for each simulation and from its genesis. Here, we introduce the heterogeneous domain decomposition approach which is a combination of an heterogeneity sensitive spatial domain decomposition with an \textit{a priori} rearrangement of subdomain-walls. Within this approach, the theoretical modeling and scaling-laws for the force computation time are proposed and studied as a function of the number of particles and the spatial resolution ratio. We also show the new approach capabilities, by comparing it to both static domain decomposition algorithms and dynamic load balancing schemes. Specifically, two representative molecular systems have been simulated and compared to the heterogeneous domain decomposition proposed in this work. These two systems comprise an adaptive resolution simulation of a biomolecule solvated in water and a phase separated binary Lennard-Jones fluid.Comment: 14 pages, 12 figure
    • …
    corecore