On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective

Author: Bussmann Michael
Choi Jong Youl
Huebl Axel
Klasky Scott
Matthes Alexander
Podhorszki Norbert
Schmitt Felix
Widera Rene
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2017

We implement and benchmark parallel I/O methods for the fully-manycore driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction. Consequently, we propose, implement and verify multi-threaded data-transformations for the I/O library ADIOS as a feasible way to trade underutilized host-side compute potential on heterogeneous systems for reduced I/O latency. Comment: 15 pages, 5 figures, accepted for DRBSD-1 in conjunction with ISC'1

arXiv.org e-Print Archive

Crossref

Dynamical heterogeneities as fingerprints of a backbone structure in Potts models

Author: D. P. Landau
E. E. Ferrero
F. Romá
P. M. Gleiser
S. Bustingorry
Publication venue: 'American Physical Society (APS)'
Publication date: 12/09/2012

We investigate slow non-equilibrium dynamical processes in the two-dimensional $q$-state Potts model with both ferromagnetic and $\pm J$ couplings. Dynamical properties are characterized by means of the mean-flipping time distribution. This quantity is known for clearly unveiling dynamical heterogeneities. Using a two-times protocol we characterize the different time scales observed and relate them to growth processes occurring in the system. In particular we target the possible relation between the different time scales and the spatial heterogeneities originating in the ground state topology, which are associated with the presence of a backbone structure. We perform numerical simulations using an approach based on graphics processing units (GPUs), which makes it possible to reach large system sizes. We present evidence supporting both the idea of a growing process in the preasymptotic regime of the glassy phases and the existence of a backbone structure behind these processes. Comment: 9 pages, 7 figures, Accepted for publication in PR

arXiv.org e-Print Archive

Crossref

Topology-aware GPU scheduling for learning workloads in cloud environments

Author: Amaral Marcelo
Carrera David
Polo Bardés Jordà
Seelam Seetharami
Steinder Malgorzata
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2017

Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments. This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems. This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on the earlier drafts of this paper. Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Acceleration of Coarse Grain Molecular Dynamics on GPU Architectures

Author: Anderson
Bauer
Berendsen
Brown
Brown
Colberg
Dullweber
Friedrichs
Ganesan
Gay
Harvey
Högberg
Liu
Liu
MacCallum
Mourtisen
Müller
Nguyen
Orsi
Orsi
Orsi
Orsi
Orsi
Orsi
Orsi
Orsi
Orsi
Phillips
Plimpton
Rapaport
Rapaport
Schmid
Stone
Stone
Stone
Sunarso
van Meel
Wang
Wohlert
Zhmurov
Publication venue: 'John Wiley & Sons Limited'
Publication date: 01/01/2013

Coarse grain (CG) molecular models have been proposed to simulate complex systems with lower computational overheads and longer timescales with respect to atomistic level models. However, their acceleration on parallel architectures such as Graphic Processing Units (GPU) presents original challenges that must be carefully evaluated. The objective of this work is to characterize the impact of CG model features on parallel simulation performance. To achieve this, we implemented a GPU-accelerated version of a CG molecular dynamics simulator, to which we applied specific optimizations for CG models, such as dedicated data structures to handle different bead type interactions, obtaining a maximum speed-up of 14 on the NVIDIA GTX480 GPU with Fermi architecture. We provide a complete characterization and evaluation of algorithmic and simulated system features of CG models impacting the achievable speed-up and accuracy of results, using three different GPU architectures as case studies.

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

PORTO Publications Open Repository TOrino

Monte Carlo algorithm based on internal bridging moves for the atomistic simulation of thiophene oligomers and polymers

Author: Mavrantzas Vlasis G.
Peristeras Loukas D.
Peroukidis Stavros D.
Tsourtou Flora D.
Publication venue: 'American Chemical Society (ACS)'
Publication date: 04/07/2018

We introduce a powerful Monte Carlo (MC) algorithm for the atomistic simulation of bulk models of oligo- and poly-thiophenes by redesigning MC moves originally developed for considerably simpler polymer structures and architectures, such as linear and branched polyethylene, to account for the ring structure of the thiophene monomer. Elementary MC moves implemented include bias reptation of an end thiophene ring, flip of an internal thiophene ring, rotation of an end thiophene ring, concerted rotation of three thiophene rings, rigid translation of an entire molecule, rotation of an entire molecule and volume fluctuation. In the implementation of all moves we assume that thiophene ring atoms remain rigid and strictly co-planar; on the other hand, inter-ring torsion and bond bending angles remain fully flexible subject to suitable potential energy functions. Test simulations with the new algorithm of an important thiophene oligomer, α-sexithiophene (α-6T), at a high enough temperature (above its isotropic-to-nematic phase transition) using a new united atom model specifically developed for the purpose of this work provide predictions for the volumetric, conformational and structural properties that are remarkably close to those obtained from detailed atomistic Molecular Dynamics (MD) simulations using an all-atom model. The new algorithm is particularly promising for exploring the rich (and largely unexplored) phase behavior and nanoscale ordering of very long (also more complex) thiophene-based polymers which cannot be addressed by conventional MD methods due to the extremely long relaxation times characterizing chain dynamics in these systems.

arXiv.org e-Print Archive

FigShare