
    Scaling of the GROMACS 4.6 molecular dynamics code on SuperMUC.

    Here we report on the performance of GROMACS 4.6 on the SuperMUC cluster at the Leibniz Rechenzentrum in Garching. We carried out benchmarks with three biomolecular systems, ranging from eighty thousand to twelve million atoms, each in a strong scaling test. The twelve-million-atom system reached a performance of 49 nanoseconds per day on 32,768 cores.
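    The abstract quotes absolute throughput only; a strong-scaling result is easier to interpret as an efficiency relative to a smaller reference run. The sketch below shows that arithmetic under stated assumptions: the 49 ns/day at 32,768 cores is taken from the abstract, while the reference run is a hypothetical placeholder, not a measured value.

```python
# Minimal sketch: strong-scaling efficiency from two throughput measurements.
# The 32,768-core / 49 ns/day point comes from the abstract; the reference
# run below is a HYPOTHETICAL placeholder, not a measured value.

def strong_scaling_efficiency(ref_cores, ref_ns_per_day, cores, ns_per_day):
    """Achieved speedup divided by ideal speedup (1.0 = perfect scaling)."""
    return (ns_per_day / ref_ns_per_day) / (cores / ref_cores)

ref_cores, ref_perf = 4_096, 10.0   # hypothetical reference run
cores, perf = 32_768, 49.0          # reported in the abstract
eff = strong_scaling_efficiency(ref_cores, ref_perf, cores, perf)
print(f"strong-scaling efficiency at {cores} cores: {eff:.0%}")
```

    An efficiency of 1.0 would mean the throughput grew in exact proportion to the core count.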

    TweTriS: Twenty trillion-atom simulation

    Significant improvements are presented for the molecular dynamics code ls1 mardyn, a linked cell-based code for simulating large numbers of small, rigid molecules with application areas in chemical engineering. The changes consist of a redesign of the SIMD vectorization via wrappers, MPI improvements, and a software redesign that allows memory-efficient execution with the production trunk and increases portability and extensibility. Two novel, memory-efficient OpenMP schemes for the linked cell-based force calculation are presented, which retain the Newton's third law optimization. Comparisons to well-optimized Verlet list-based codes, such as LAMMPS and GROMACS, demonstrate the viability of the linked cell-based approach. The present version of ls1 mardyn is used to run simulations on entire supercomputers, maximizing the number of sampled atoms. Compared to the preceding version of ls1 mardyn on the entire set of 9216 nodes of SuperMUC Phase 1, 27% more atoms are simulated. Weak scaling performance is increased by up to 40% and strong scaling performance by more than 220%. On Hazel Hen, a strong scaling efficiency of up to 81% and 189 billion molecule updates per second are attained when scaling from 8 to 7168 nodes. Moreover, a total of 20 trillion atoms is simulated at up to 88% weak scaling efficiency, running at up to 1.33 PFLOPS. This represents a fivefold increase in the number of atoms simulated to date.
    Funding: BMBF, 01IH16008, joint project TaLPas (Task-based Load Balancing and Auto-Tuning in Particle Simulation).
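    The key data-structure choice above is the linked-cell traversal that retains Newton's third law, i.e. each particle pair is evaluated once and the resulting force is applied to both partners. The following is a minimal serial sketch of that idea, not ls1 mardyn code: the Lennard-Jones parameters, the cubic box, and the half neighbour stencil are illustrative assumptions, and the paper's memory-efficient layouts and OpenMP colouring schemes are not reproduced.

```python
# Minimal serial sketch of a linked-cell pair traversal that retains
# Newton's third law: each pair is evaluated once, the force applied twice.
# Not ls1 mardyn code; cutoff, box and LJ parameters are illustrative.
import itertools
import numpy as np

def build_cells(pos, box, rc):
    """Sort particle indices into cubic cells with edge length >= rc."""
    n = int(box // rc)   # assumes box >= 3 * rc, so no cell pair is visited twice
    cells = {}
    for i, r in enumerate(pos):
        key = tuple((r // (box / n)).astype(int) % n)
        cells.setdefault(key, []).append(i)
    return cells, n

def lj_force(rij, eps=1.0, sigma=1.0):
    """Lennard-Jones force on particle i due to j, with rij = pos_i - pos_j."""
    r2 = np.dot(rij, rij)
    sr6 = (sigma * sigma / r2) ** 3
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r2 * rij

def accumulate(pos, f, i, j, box, rc):
    rij = pos[i] - pos[j]
    rij -= box * np.round(rij / box)   # minimum-image convention
    if np.dot(rij, rij) < rc * rc:
        fij = lj_force(rij)
        f[i] += fij                    # Newton's third law:
        f[j] -= fij                    # one evaluation, two force updates

def forces(pos, box, rc):
    cells, n = build_cells(pos, box, rc)
    f = np.zeros_like(pos)
    # Half of the 26-neighbour stencil, so each cell pair is visited once.
    half = [d for d in itertools.product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]
    for key, members in cells.items():
        for a, i in enumerate(members):            # pairs inside the cell
            for j in members[a + 1:]:
                accumulate(pos, f, i, j, box, rc)
        for d in half:                             # pairs with half the neighbours
            nb = tuple((k + dk) % n for k, dk in zip(key, d))
            for i in members:
                for j in cells.get(nb, ()):
                    accumulate(pos, f, i, j, box, rc)
    return f

rng = np.random.default_rng(0)
positions = rng.random((500, 3)) * 10.0   # 500 particles in a 10x10x10 box
print(forces(positions, box=10.0, rc=2.5).shape)
```

    In production codes this same traversal is vectorized with SIMD wrappers and parallelized with OpenMP, which is exactly where the schemes described above come in.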

    Algorithmic and Code Optimizations of Molecular Dynamics Simulations for Process Engineering

    The focus of this work lies on implementation improvements and, in particular, node-level performance optimization of the simulation software ls1-mardyn. Through data structure improvements, SIMD vectorization and, especially, OpenMP parallelization, the world's first simulation of 2×10^13 molecules at over 1 PFLOP/s was enabled. To allow for long-range interactions, the Fast Multipole Method was introduced to ls1-mardyn. The algorithm was optimized for sequential, shared-memory, and distributed-memory execution on up to 32,768 MPI processes.
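    To see why node-level memory efficiency dominates at this scale, a back-of-envelope estimate of the particle-data footprint for 2×10^13 molecules is useful. The per-molecule byte count below is an assumption (single-precision position and velocity of a single-site molecule), and the 32,768 ranks are reused purely as an illustrative process count; neither figure reflects the thesis' actual data layout.

```python
# Back-of-envelope sketch: aggregate particle-data footprint at 2e13 molecules.
# bytes_per_molecule is an ASSUMPTION (6 single-precision floats: position and
# velocity of a single-site molecule); the actual layout may differ, and the
# rank count is reused here purely for illustration.

n_molecules = 2e13           # from the abstract
n_ranks = 32_768             # MPI process count mentioned in the abstract
bytes_per_molecule = 6 * 4   # assumed: x, y, z, vx, vy, vz in single precision

total_bytes = n_molecules * bytes_per_molecule
print(f"aggregate particle data: {total_bytes / 1024**4:,.0f} TiB")
print(f"per MPI rank:            {total_bytes / n_ranks / 1024**3:,.1f} GiB")
```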

    GPU fast multipole method with lambda-dynamics features

    One of the most computationally demanding parts of molecular dynamics simulations is the calculation of long-range electrostatic interactions. Such interactions can be evaluated directly by the naïve pairwise summation algorithm, which is a ubiquitous showcase example for the compute power of graphics processing units (GPUs). However, the pairwise summation has O(N^2) computational complexity for N interacting particles; thus, an approximation method with better scaling is required. Today, the prevalent method for such approximation in the field is particle mesh Ewald (PME). PME takes advantage of fast Fourier transforms (FFTs) to approximate the solution efficiently. However, as the underlying FFTs require all-to-all communication between ranks, PME runs into a communication bottleneck. This communication overhead is negligible only for moderate parallelization. With increasing parallelization, as needed for high-performance applications, the use of PME becomes unprofitable. Another PME drawback is its inability to perform constant-pH simulations efficiently. In such simulations, the protonation states of a protein are allowed to change dynamically during the simulation. The description of this process requires a separate evaluation of the energies for each protonation state. This cannot be calculated efficiently with PME, as the algorithm requires a repeated FFT for each state, which leads to a linear overhead with respect to the number of states. For a fast approximation of pairwise Coulombic interactions that does not suffer from these PME drawbacks, the Fast Multipole Method (FMM) has been implemented and fully parallelized with CUDA. To ensure optimal FMM performance for diverse MD systems, multiple parallelization strategies have been developed. The algorithm has been efficiently incorporated into GROMACS, allowing for out-of-the-box electrostatic calculations, and subsequently tested to determine the optimal FMM parameter set for MD simulations. The performance of the single-GPU FMM implementation, tested in GROMACS 2019, achieves about a third of the highly optimized CUDA PME performance when simulating systems with uniform particle distributions. However, the FMM is expected to outperform PME at high parallelization because the FMM's global communication overhead is minimal compared to that of PME. Further, the FMM has been enhanced to provide the energies of an arbitrary number of titratable sites, as needed in the constant-pH method. The extension is not fully optimized yet, but the first results show the strength of the FMM for constant-pH simulations. For a relatively large system with half a million particles and more than a hundred titratable sites, a straightforward approach to computing alternative energies requires repeating the simulation for each state of the sites. The FMM calculates all energy terms only a factor of 1.5 slower than a single simulation step. Further improvements of the GPU implementation are expected to yield even more speedup compared to the current implementation.
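    The reference point for both PME and the FMM is the naïve pairwise Coulomb sum mentioned at the start of the abstract. A minimal NumPy sketch of that O(N^2) baseline is shown below; the random positions, unit charges, and reduced units (Coulomb constant set to 1) are illustrative assumptions, not GROMACS conventions.

```python
# Minimal NumPy sketch of the naive O(N^2) pairwise Coulomb sum that both PME
# and the FMM approximate. Reduced units (Coulomb constant = 1) and random
# unit charges are illustrative choices, not GROMACS conventions.
import numpy as np

def direct_coulomb_energy(pos, q):
    """Total electrostatic energy, summing every unique pair exactly once."""
    energy = 0.0
    for i in range(len(q) - 1):
        rij = pos[i + 1:] - pos[i]                 # vectors to all later particles
        dist = np.linalg.norm(rij, axis=1)
        energy += q[i] * np.sum(q[i + 1:] / dist)  # k_e = 1 in reduced units
    return energy

rng = np.random.default_rng(0)
positions = rng.random((1_000, 3))          # 1000 particles in a unit box
charges = rng.choice([-1.0, 1.0], 1_000)    # illustrative unit charges
print(direct_coulomb_energy(positions, charges))
```

    Because the pair count grows quadratically, this direct sum is only practical for small systems, which is what motivates the PME and FMM approximations discussed above.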

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG), presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer's series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA's first funding phase, and provides an overview of SPPEXA's contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

    Multi-Scale Simulations of Collagen Failure and Mechanoradicals

    Collagen, the most abundant protein in the human body, must withstand high mechanical loads due to its structural role in tendons, skin, bones, and other connective tissue. It was recently found that tensed collagen creates mechanoradicals by homolytic bond scission in the sub-failure regime. The locations and types of the initial rupture sites critically determine both the mechanical and chemical impact of these micro-ruptures on the tissue, but have yet to be explored. Here we employ hybrid scale-bridging simulations to determine these first breakage points in collagen, combining existing and newly developed methods tailored towards collagen's hierarchical structure. We improved our Kinetic Monte Carlo/Molecular Dynamics scheme to simulate bond scissions at the all-atom level, and also developed a mesoscopic, ultra-coarse-grained description of a collagen fibril. We find collagen crosslinks to rupture first, and identify individual sacrificial bonds in trivalent crosslinks that break preferentially without compromising structural integrity. Collagen's weak bonds funnel ruptures such that the potentially harmful mechanoradicals are readily stabilized. Our simulations further suggest that the length of the helices between pairs of crosslinks determines the trade-off between overall strength and breakage specificity. The combined results suggest that this unique failure mode of collagen is tailored towards combating an early onset of macroscopic failure and material ageing.
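    The bond-scission step in a scheme of this kind is typically driven by kinetic Monte Carlo: given per-bond rupture rates, one scission event and its waiting time are drawn at random. The sketch below is a generic, textbook Gillespie-style selection step, not the paper's hybrid KMC/MD method, and the rates are hypothetical placeholders.

```python
# Generic Gillespie-style kinetic Monte Carlo step for bond scission: given
# per-bond rupture rates, draw which bond breaks and the waiting time until
# it does. A textbook KMC sketch, not the paper's hybrid KMC/MD scheme; the
# rates are hypothetical placeholders.
import numpy as np

def kmc_step(rates, rng):
    """Return (index of the bond that breaks, waiting time) for one event."""
    total = rates.sum()
    waiting_time = rng.exponential(1.0 / total)        # time to next scission
    chosen = rng.choice(len(rates), p=rates / total)   # rate-weighted choice
    return chosen, waiting_time

rng = np.random.default_rng(1)
rates = np.array([1e-3, 5e-2, 2e-4])   # hypothetical rupture rates (1/ns)
bond, dt = kmc_step(rates, rng)
print(f"bond {bond} breaks after {dt:.1f} ns")
```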

    FabSim3: An automation toolkit for verified simulations using high performance computing

    A common feature of computational modelling and simulation research is the need to perform many tasks in complex sequences to achieve a usable result. This will typically involve tasks such as preparing input data, pre-processing, running simulations on a local or remote machine, post-processing, and performing coupling communications, validations and/or optimisations. Tasks like these can involve manual steps which are time- and effort-intensive, especially when they involve the management of large ensemble runs. Additionally, human errors become more likely and more numerous as the research work becomes more complex, increasing the risk of damaging the credibility of simulation results. Automation tools can help ensure the credibility of simulation results by reducing the manual time and effort required to perform these research tasks, by making more rigorous procedures tractable, and by reducing the probability of human error due to a reduced number of manual actions. In addition, efficiency gained through automation can help researchers to perform more research within the budget and effort constraints imposed by their projects. This paper presents the main software release of FabSim3, and explains how our automation toolkit can improve and simplify a range of tasks for researchers and application developers. FabSim3 helps to prepare, submit, execute, retrieve, and analyze simulation workflows. By providing a suitable level of abstraction, FabSim3 reduces the complexity of setting up and managing a large-scale simulation scenario, while still providing transparent access to the underlying layers for effective debugging. The tool also facilitates job submission and management (including staging and curation of files and environments) for a range of different supercomputing environments. Although FabSim3 itself is application-agnostic, it supports a provably extensible plugin system with which users automate simulation and analysis workflows for their own application domains. To highlight this, we briefly describe a selection of these plugins and demonstrate the efficiency of the toolkit in handling large ensemble workflows.
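    The first half of the ensemble pattern described above (prepare inputs for each parameter set, then submit each run) can be illustrated with a short, self-contained sketch. The code below is not FabSim3 and does not use its API; the directory layout, template format, and stubbed scheduler call are hypothetical placeholders that merely mirror the kind of repetitive bookkeeping the toolkit automates.

```python
# Generic illustration of an ensemble workflow: render one input directory per
# parameter set, then hand each run to a scheduler. This is NOT FabSim3 code
# or its API; every name and path here is a hypothetical placeholder.
from pathlib import Path

def prepare_inputs(template: str, params: dict, run_dir: Path) -> Path:
    """Render the input template for one ensemble member."""
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "input.cfg").write_text(template.format(**params))
    return run_dir

def submit(run_dir: Path) -> None:
    """Stub submission step; a real workflow would call SLURM/PBS etc. here."""
    print(f"[stub] would submit {run_dir} to the scheduler")

def run_ensemble(template: str, sweep: list, base: Path) -> None:
    """Prepare and submit every member of a parameter sweep."""
    for i, params in enumerate(sweep):
        submit(prepare_inputs(template, params, base / f"run_{i:03d}"))

if __name__ == "__main__":
    template = "temperature = {temperature}\nsteps = {steps}\n"
    sweep = [{"temperature": t, "steps": 10_000} for t in (280, 300, 320)]
    run_ensemble(template, sweep, Path("ensemble_demo"))
```

    Retrieval and post-processing of results would follow the same repetitive pattern, which is exactly the manual, error-prone bookkeeping the paper argues should be automated.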