570 research outputs found

    Acceleration of Coarse Grain Molecular Dynamics on GPU Architectures

    Coarse grain (CG) molecular models have been proposed to simulate complex systems with lower computational overhead and over longer timescales than atomistic-level models. However, their acceleration on parallel architectures such as Graphics Processing Units (GPUs) presents original challenges that must be carefully evaluated. The objective of this work is to characterize the impact of CG model features on parallel simulation performance. To achieve this, we implemented a GPU-accelerated version of a CG molecular dynamics simulator, to which we applied specific optimizations for CG models, such as dedicated data structures to handle different bead type interactions, obtaining a maximum speed-up of 14 on the NVIDIA GTX480 GPU with Fermi architecture. We provide a complete characterization and evaluation of algorithmic and simulated system features of CG models impacting the achievable speed-up and accuracy of results, using three different GPU architectures as case studies.
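
The abstract mentions dedicated data structures for interactions between different bead types without describing them. One common way to organize such data, shown here as a minimal sketch and not as the authors' implementation, is a dense table of pair parameters indexed by the two bead types. The function names and the choice of a Lennard-Jones form are assumptions made for illustration only.

```python
def build_pair_table(n_types, params):
    """Flatten symmetric per-type-pair (epsilon, sigma) parameters into a
    dense row-major list, so a lookup is a single index computation.

    `params` maps (type_a, type_b) -> (epsilon, sigma); hypothetical layout.
    """
    table = [None] * (n_types * n_types)
    for (a, b), (eps, sigma) in params.items():
        table[a * n_types + b] = (eps, sigma)
        table[b * n_types + a] = (eps, sigma)  # interactions are symmetric
    if any(entry is None for entry in table):
        raise ValueError("missing parameters for at least one type pair")
    return table


def lj_energy(r, type_a, type_b, table, n_types):
    """Lennard-Jones energy at distance r for a pair of bead types."""
    eps, sigma = table[type_a * n_types + type_b]
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)
```

A flat table like this keeps the lookup branch-free, which is the kind of property that tends to matter when the same code path is executed by many GPU threads at once.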

    Multi-Architecture Monte-Carlo (MC) Simulation of Soft Coarse-Grained Polymeric Materials: SOft coarse grained Monte-carlo Acceleration (SOMA)

    Multi-component polymer systems are important for the development of new materials because of their ability to phase-separate or self-assemble into nano-structures. The Single-Chain-in-Mean-Field (SCMF) algorithm in conjunction with a soft, coarse-grained polymer model is an established technique to investigate these soft-matter systems. Here we present an implementation of this method: SOft coarse grained Monte-carlo Acceleration (SOMA). It is suitable to simulate large system sizes with up to billions of particles, yet versatile enough to study properties of different kinds of molecular architectures and interactions. We achieve efficient simulations by employing accelerators such as GPUs on both workstations and supercomputers. The implementation remains flexible and maintainable because it is written in a scientific programming language enhanced by OpenACC pragmas for the accelerators. We present implementation details and features of the program package, investigate the scalability of our implementation SOMA, and discuss two applications, which cover system sizes that are difficult to reach with other, common particle-based simulation methods.

    More Bang for Your Buck: Improved use of GPU Nodes for GROMACS 2018

    We identify hardware that is optimal to produce molecular dynamics trajectories on Linux compute clusters with the GROMACS 2018 simulation package. To this end, we benchmark the GROMACS performance on a diverse set of compute nodes and relate it to the costs of the nodes, which may include their lifetime costs for energy and cooling. In agreement with our earlier investigation using GROMACS 4.6 on hardware of 2014, the performance to price ratio of consumer GPU nodes is considerably higher than that of CPU nodes. However, with GROMACS 2018, the optimal CPU to GPU processing power balance has shifted even more towards the GPU. Hence, nodes optimized for GROMACS 2018 and later versions enable a significantly higher performance to price ratio than nodes optimized for older GROMACS versions. Moreover, the shift towards GPU processing allows old nodes to be upgraded cheaply with recent GPUs, yielding essentially the same performance as comparable brand-new hardware.
    Comment: 41 pages, 13 figures, 4 tables. This updated version includes the following improvements: most notably, added benchmarks for two coarse grain MARTINI systems, VES and BIG, resulting in a new Figure 13; fixed typos; made text clearer in some places; added two more benchmarks for MEM and RIB systems (E3-1240v6 + RTX 2080 / 2080Ti

    NBSymple, a double parallel, symplectic N-body code running on Graphic Processing Units

    We present and discuss the characteristics and performance, both in terms of computational speed and precision, of a code which numerically integrates the equations of motion of N 'particles' interacting via Newtonian gravitation and moving in an external smooth galactic field. The force evaluation on every particle is done by means of direct summation of the contributions of all the other particles in the system, avoiding truncation error. The time integration is done with second-order and sixth-order symplectic schemes. The code, NBSymple, has been parallelized twice: by means of the Compute Unified Device Architecture (CUDA), to make the all-pair force evaluation as fast as possible on high-performance NVIDIA TESLA C 1060 Graphics Processing Units, while the O(N) computations are distributed over various CPUs by means of the OpenMP application programming interface. The code works in either single-precision floating-point arithmetic or double precision. The use of single precision makes the best use of GPU performance but, of course, limits the precision of the simulation in some critical situations. We find a good compromise in using a software reconstruction of double precision for those variables that are most critical for the overall precision of the code. The code is available on the web site astrowww.phys.uniroma1.it/dolcetta/nbsymple.html
    Comment: Paper composed of 29 pages, including 9 figures. Submitted to New Astronomy
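
The two ingredients named in this abstract, all-pairs direct summation and a second-order symplectic integrator, can be illustrated with a minimal serial sketch. This is not NBSymple's code: it assumes code units with G = 1, uses the kick-drift-kick leapfrog as the second-order scheme, and omits the external galactic field, the sixth-order scheme, and all GPU and OpenMP parallelism. All function names are hypothetical.

```python
import math

G = 1.0  # gravitational constant in code units (assumption)


def accelerations(pos, mass):
    """All-pairs (direct summation) gravitational accelerations, O(N^2)."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2]
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # Newton's third law: equal and opposite contributions.
                acc[i][k] += G * mass[j] * d[k] * inv_r3
                acc[j][k] -= G * mass[i] * d[k] * inv_r3
    return acc


def leapfrog_step(pos, vel, mass, dt):
    """One kick-drift-kick leapfrog step (second-order, symplectic).

    Updates pos and vel in place.
    """
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
            pos[i][k] += dt * vel[i][k]        # full drift
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick


def total_energy(pos, vel, mass):
    """Kinetic plus pairwise potential energy, used to monitor drift."""
    n = len(pos)
    kinetic = sum(0.5 * mass[i] * sum(v * v for v in vel[i]) for i in range(n))
    potential = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            potential -= G * mass[i] * mass[j] / math.dist(pos[i], pos[j])
    return kinetic + potential
```

Calling `leapfrog_step` repeatedly and comparing `total_energy` before and after gives a quick check of the integrator: for a symplectic scheme the energy error stays bounded instead of growing steadily with the number of steps.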

    Graphics Processing Unit Accelerated Coarse-Grained Protein-Protein Docking

    Graphics processing unit (GPU) architectures are increasingly used for general purpose computing, providing the means to migrate algorithms from the SISD paradigm, synonymous with CPU architectures, to the SIMD paradigm. Generally programmable commodity multi-core hardware can result in significant speed-ups for migrated codes. Because of their computational complexity, molecular simulations in particular stand to benefit from GPU acceleration. Coarse-grained molecular models provide reduced complexity when compared to the traditional, computationally expensive, all-atom models. However, while coarse-grained models are much less computationally expensive than the all-atom approach, the pairwise energy calculations required at each iteration of the algorithm continue to cause a computational bottleneck for a serial implementation. In this work, we describe a GPU implementation of the Kim-Hummer coarse-grained model for protein docking simulations, using a Replica Exchange Monte-Carlo (REMC) method. Our highly parallel implementation vastly increases the size and time scales accessible to molecular simulation. We describe in detail the complex process of migrating the algorithm to a GPU as well as the effect of various GPU approaches and optimisations on algorithm speed-up. Our benchmarking and profiling show that the GPU implementation scales very favourably compared to a CPU implementation. Small reference simulations benefit from a modest speed-up of between 4 and 10 times. However, large simulations, containing many thousands of residues, benefit from asynchronous GPU acceleration to a far greater degree and exhibit speed-ups of up to 1400 times. We demonstrate the utility of our system on some model problems. We investigate the effects of macromolecular crowding, using a repulsive crowder model, finding our results to agree with those predicted by scaled particle theory. We also perform initial studies into the simulation of viral capsid assembly, demonstrating the crude assembly of capsid pieces into a small fragment. This is the first implementation of REMC docking on a GPU, and the resulting speed-ups alter the tractability of large scale simulations: simulations that otherwise require months or years can be performed in days or weeks using a GPU.
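
The replica-exchange step at the heart of REMC can be sketched as follows. This is the textbook Metropolis swap criterion for neighbouring temperatures, not the GPU implementation described in the abstract; reduced units with k_B = 1 and all function and parameter names are assumptions made for illustration.

```python
import math
import random


def swap_probability(e_i, e_j, t_i, t_j, k_b=1.0):
    """Metropolis acceptance probability for exchanging the configurations
    held at temperatures t_i and t_j with energies e_i and e_j:
    min(1, exp((beta_i - beta_j) * (e_i - e_j)))."""
    beta_i = 1.0 / (k_b * t_i)
    beta_j = 1.0 / (k_b * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swaps(energies, temperatures, replica_ids, rng, offset=0):
    """Attempt exchanges between neighbouring temperature slots
    (offset, offset + 1), (offset + 2, offset + 3), ...

    `energies` and `replica_ids` are updated in place; returns the number
    of accepted swaps. Alternating offset between 0 and 1 on successive
    calls lets every neighbouring pair be tried.
    """
    accepted = 0
    for i in range(offset, len(temperatures) - 1, 2):
        j = i + 1
        p = swap_probability(energies[i], energies[j],
                             temperatures[i], temperatures[j])
        if rng.random() < p:
            energies[i], energies[j] = energies[j], energies[i]
            replica_ids[i], replica_ids[j] = replica_ids[j], replica_ids[i]
            accepted += 1
    return accepted
```

In a full simulation each replica would run ordinary Monte Carlo moves at its own temperature between swap attempts. Those per-replica moves, and the pairwise energy evaluations inside them, are the part the abstract identifies as the bottleneck for a serial implementation.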

    Soft Computing Techniques for the Protein Folding Problem on High Performance Computing Architectures

    The protein-folding problem has been extensively studied during the last fifty years. Understanding the dynamics of the global shape of a protein and its influence on biological function can help us to discover new and more effective drugs to deal with diseases of pharmacological relevance. Different computational approaches have been developed by different researchers in order to predict the three-dimensional arrangement of the atoms of proteins from their sequences. However, the computational complexity of this problem makes it mandatory to search for new models, novel algorithmic strategies and hardware platforms that provide solutions in a reasonable time frame. In this review we present past and recent trends in protein folding simulations from both perspectives: hardware and software. Of particular interest to us are both the use of inexact solutions to this computationally hard problem and the hardware platforms that have been used for running this kind of Soft Computing technique. This work is jointly supported by the Fundación Séneca (Agencia Regional de Ciencia y Tecnología, Región de Murcia) under grants 15290/PI/2010 and 18946/JLI/13, by the Spanish MEC and European Commission FEDER under grants TEC2012-37945-C02-02 and TIN2012-31345, and by the Nils Coordinated Mobility under grant 012-ABEL-CM-2014A, in part financed by the European Regional Development Fund (ERDF). We also thank NVIDIA for hardware donation within UCAM GPU educational and research centers.