
777 research outputs found

SPH-EXA: Enhancing the Scalability of SPH codes Via an Exascale-Ready SPH Mini-App

Author: Cabezón Rubén M.
Cavelan Aurélien
Ciorba Florina M.
Guerrera Danilo
Imbert David
Mayer Lucio
Mohammed Ali
Piccinali Jean-Guillaume
Reed Darren
Publication date: 01/01/2019

Numerical simulations of fluids in astrophysics and computational fluid dynamics (CFD) are among the most computationally demanding calculations in terms of sustained floating-point operations per second (FLOP/s). These simulations are expected to benefit significantly from future Exascale computing infrastructures, which will perform 10^18 FLOP/s. The performance of smoothed particle hydrodynamics (SPH) codes is, in general, adversely impacted by several factors, such as multiple time-stepping, long-range interactions, and/or boundary conditions. In this work, an extensive study of three SPH implementations, SPHYNX, ChaNGa, and XXX, is performed to gain insights and to expose the limitations and characteristics of the codes. These codes are the starting point of an interdisciplinary co-design project, SPH-EXA, for the development of an Exascale-ready SPH mini-app. We implemented a rotating square patch as a joint test simulation for the three SPH codes and analyzed their performance on a modern HPC system, Piz Daint. The performance profiling and scalability analysis conducted on the three parent codes exposed their performance issues, such as load imbalance, in both MPI and OpenMP. Two-level load balancing has been successfully applied to SPHYNX to overcome its load imbalance. The performance analysis shapes and drives the design of the SPH-EXA mini-app towards the use of efficient parallelization methods, fault-tolerance mechanisms, and load balancing approaches.

Comment: arXiv admin note: substantial text overlap with arXiv:1809.0801

arXiv.org e-Print Archive

edoc

Simulation of a flowing snow avalanche using molecular dynamics

Author: Güçer D.
Özgüç H.B.
Publication venue: 'The Scientific and Technological Research Council of Turkey'
Publication date: 01/01/2014

This paper presents an approach for the modeling and simulation of a flowing snow avalanche, which is formed of dry and liquefied snow that slides down a slope, using molecular dynamics and the discrete element method. A particle system is used as the base method for the simulation, and marching cubes with real-time shaders are employed for rendering. A uniform grid-based neighbor search algorithm is used for collision detection in interparticle and particle-terrain interactions. A mass-spring model of collision resolution is employed to mimic the compressibility of the snow, and particle attraction forces are applied between the particles and the terrain surface. To achieve greater performance, a general-purpose GPU language and multithreaded programming are used for collision detection and resolution. The results are displayed with different combinations of rendering methods for a realistic representation of the flowing avalanche. © TÜBİTAK
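The uniform grid-based neighbor search named in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (the abstract describes a GPU and multithreaded version); the function names `build_grid` and `neighbor_pairs` are hypothetical, and the sketch assumes 3D points with the cell size set equal to the interaction radius, so that only the 27 surrounding cells need to be checked.

```python
from collections import defaultdict
from itertools import product
import math


def build_grid(positions, cell_size):
    """Bin each particle index into the integer grid cell that contains it."""
    grid = defaultdict(list)
    for i, p in enumerate(positions):
        key = tuple(math.floor(c / cell_size) for c in p)
        grid[key].append(i)
    return grid


def neighbor_pairs(positions, radius):
    """Return sorted index pairs (i, j), i < j, closer than `radius`.

    Assumption: cell size equals `radius`, so any two particles within
    `radius` of each other lie in the same or in adjacent cells.
    """
    grid = build_grid(positions, radius)
    offsets = list(product((-1, 0, 1), repeat=3))
    pairs = []
    for (cx, cy, cz), members in grid.items():
        for dx, dy, dz in offsets:
            # .get avoids creating empty cells while iterating over the grid
            other = grid.get((cx + dx, cy + dy, cz + dz))
            if not other:
                continue
            for i in members:
                for j in other:
                    # i < j reports each pair exactly once
                    if i < j and math.dist(positions[i], positions[j]) < radius:
                        pairs.append((i, j))
    return sorted(pairs)


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 3.0, 3.0),
           (3.2, 3.0, 3.0), (-0.4, 0.0, 0.0)]
    print(neighbor_pairs(pts, 1.0))
```

The grid limits distance checks to nearby particles instead of all pairs, which is what makes this kind of search attractive for collision detection in large particle systems.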

Bilkent University Institutional Repository

Parallel cloth simulation using OpenMP and CUDA

Author: Sims Gillian David
Publication venue: LSU Digital Commons
Publication date: 01/01/2009

The widespread availability of parallel computing architectures has led to research regarding algorithms and techniques that best exploit available parallelism. In addition to the available CPU parallelism, the GPU has emerged as a parallel computational device. The goal of this study was to explore the combined use of CPU and GPU parallelism by developing a hybrid parallel CPU/GPU cloth simulation application. In order to evaluate the benefits of the hybrid approach, the application was first developed in sequential CPU form, followed by a parallel CPU form. The application uses Backward Euler implicit time integration to solve the differential equations of motion associated with the physical system. The Conjugate Gradient (CG) algorithm is used to determine the solution vector for the system of equations formed by the Backward Euler approach. The matrix/vector, vector/vector, and vector/scalar operations required by CG are handled by calls to BLAS level 1 and level 2 functions. In the sequential CPU and parallel CPU versions, the Intel Math Kernel Library implementation of BLAS is used. In the hybrid parallel CPU/GPU version, the Nvidia CUDA-based BLAS implementation (CUBLAS) is used. In the parallel CPU and hybrid implementations, OpenMP directives are used to parallelize the force application loop that traverses the list of forces acting on the system. Runtimes were collected for each version of the application while simulating cloth meshes with particle resolutions of 20x20, 40x40, and 60x60. The performance of each version was compared at each mesh resolution. The level of performance degradation experienced when transitioning to the larger mesh sizes was also determined. The hybrid parallel CPU/GPU implementation yielded the highest frame rate for the 40x40 and 60x60 meshes. The parallel CPU implementation yielded the highest frame rate for the 20x20 mesh. The performance of the hybrid parallel CPU/GPU implementation degraded the least as it transitioned to the two larger mesh sizes. The results of this study will potentially lead to further research regarding the use of GPUs to perform the matrix/vector operations associated with the CG algorithm under more complex cloth simulation scenarios.
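To make the combination of Backward Euler integration and Conjugate Gradient concrete, the following is a minimal pure-Python sketch. It is not the thesis code, which relies on BLAS routines from Intel MKL and CUBLAS. It assumes a simplified linear spring system with force f = -K x, no damping, and a uniform particle mass, so the implicit step reduces to solving (M + h^2 K) dv = -h K (x + h v) for the velocity change dv. All function names are hypothetical.

```python
def dot(a, b):
    """Vector/vector product (the role of a BLAS level 1 dot)."""
    return sum(x * y for x, y in zip(a, b))


def matvec(A, x):
    """Dense matrix/vector product (the role of a BLAS level 2 gemv)."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)   # residual b - A x for the initial guess x = 0
    p = list(r)   # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def backward_euler_step(x, v, K, mass, h):
    """One implicit step for mass * a = -K x (assumed linear model)."""
    n = len(x)
    # System matrix A = M + h^2 K, with M = mass * I
    A = [[(mass if i == j else 0.0) + h * h * K[i][j] for j in range(n)]
         for i in range(n)]
    # Right-hand side b = -h K (x + h v)
    x_pred = [xi + h * vi for xi, vi in zip(x, v)]
    b = [-h * f for f in matvec(K, x_pred)]
    dv = conjugate_gradient(A, b)
    v_new = [vi + d for vi, d in zip(v, dv)]
    x_new = [xi + h * vi for xi, vi in zip(x, v_new)]
    return x_new, v_new


if __name__ == "__main__":
    K = [[2.0, -1.0], [-1.0, 2.0]]  # stiffness of a two-particle chain
    print(backward_euler_step([1.0, 0.0], [0.0, 0.0], K, mass=1.0, h=0.1))
```

A real cloth solver would use a sparse matrix and nonlinear forces with their Jacobians; the dense lists here only keep the sketch self-contained.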

Louisiana State University

Parallel packing code for propellant microstructure analysis

Author: Baietta Alessandro
Maggi Filippo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

In recent years, packing codes have become a successful alternative to experimental data collection for microstructure investigation of heterogeneous materials. Composite solid rocket propellants are interesting representatives of this category, consisting of a mix of fuel and oxidizer powders embedded in a polymeric binder. Their macroscopic properties depend strictly on their particular microstructure, which influences mechanical, combustion, and physical features. This work addresses algorithm development, validation, and scalability of POLIPack, a parallel packing code based on the Lubachevsky–Stillinger algorithm, developed at the Space Propulsion Laboratory (SPLab) of Politecnico di Milano. The application can reproduce the organization of spheres of any diameter inside a cube with periodic boundaries. In addition to the general code description, the paper identifies a collision condition not addressed by the original Lubachevsky algorithm (here called back impact), introduces a novel post-impact handling that grants a minimum separation velocity between particles, and presents a parallelization approach based on the OpenMP shared-memory paradigm. Monomodal and bimodal packs have been compared to experimental data through statistical descriptors and packing maps.
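Event-driven packing in the style of Lubachevsky–Stillinger depends on predicting when two growing spheres will next touch. The Python sketch below shows that prediction step only, under stated assumptions: radii grow linearly at non-negative rates, the spheres do not overlap at time zero, and the periodic cube is handled with the minimum-image convention. It is a generic illustration, not POLIPack code, and it does not model the back-impact condition or the post-impact handling described in the abstract. The name `collision_time` and its parameters are hypothetical.

```python
import math


def collision_time(c1, v1, r1, g1, c2, v2, r2, g2, box):
    """Earliest t >= 0 at which two growing spheres touch, or None.

    c*, v*: center and velocity (3-tuples); r*: radius at t = 0;
    g*: radius growth rate (assumed >= 0); box: side of the periodic cube.
    Solves |dr + dv t| = (r1 + r2) + (g1 + g2) t for t.
    """
    # Minimum-image separation in the periodic cube
    dr = [b - a for a, b in zip(c1, c2)]
    dr = [d - box * round(d / box) for d in dr]
    dv = [b - a for a, b in zip(v1, v2)]
    S = r1 + r2
    G = g1 + g2
    # Quadratic a t^2 + 2 b t + c = 0
    a = sum(d * d for d in dv) - G * G
    b = sum(x * y for x, y in zip(dr, dv)) - S * G
    c = sum(d * d for d in dr) - S * S
    if a == 0.0:
        # Linear case: the spheres approach only if b < 0
        return -c / (2.0 * b) if b < 0.0 else None
    disc = b * b - a * c
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    roots = [(-b - s) / a, (-b + s) / a]
    future = [t for t in roots if t >= 0.0]
    return min(future) if future else None


if __name__ == "__main__":
    # One sphere moving at unit speed toward a static one, no growth
    print(collision_time((0, 0, 0), (1, 0, 0), 0.5, 0.0,
                         (3, 0, 0), (0, 0, 0), 0.5, 0.0, box=100.0))
```

In an event-driven scheme, times like this would typically be computed for neighboring sphere pairs and processed in chronological order.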

Archivio istituzionale della ricerca - Politecnico di Milano

A pipeline virtual environment architecture for multicore processor systems

Author: Alan Liu
B. Thomaszewski
C. Augonnet
C.-K. Luk
C.M. Wittenbrink
E. Acosta
E. Hermann
Eric Acosta
F. Wieland
G. Humphreys
G. Voß
H.T. Vo
J. Allard
J. Huang
J.P. Schulze
J.R. Wernsing
K. Montgomery
L. Deligiannidis
L. Jerabkova
M. Agus
M. Figueiredo
M. Frigo
M. Kicherer
M.C. Cavusoglu
N.K. Govindaraju
S. Eilemann
S. Molnar
S. Muraki
T. Gautier
T.S.M.C. Farias
W. Gropp
W. Huagen
W. Zhao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Simple and Robust Boolean Operations for Triangulated Surfaces

Author: Mei Gang
Tipper John C.
Publication date: 25/08/2013

Boolean operations on geometric models are an essential issue in computational geometry. In this paper, we develop a simple and robust approach to perform Boolean operations on closed and open triangulated surfaces. Our method has two main stages: (1) we first find candidate pairs of intersecting triangles using an octree and then compute the intersection lines for all triangle pairs with a parallel algorithm; (2) we form closed or open intersection loops, sub-surfaces, and sub-blocks robustly, relying only on the cleaned and updated mesh topology, without coordinate computations for geometric entities. A novel technique that replaces inside/outside classification is also proposed to distinguish the resulting union, subtraction, and intersection. Several examples are given to illustrate the effectiveness of our approach.

Comment: Novel method for determining Union, Subtraction and Intersectio

arXiv.org e-Print Archive