584 research outputs found

Achieving Efficient Strong Scaling with PETSc using Hybrid MPI/OpenMP Optimisation

Author: G. Goumas
G. Schubert
G. Wellein
M. Butler
M.D. Piggott
N. Bell
P. Balaji
S. Williams
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The increasing number of processing elements and decreasing memory-to-core ratio in modern high-performance platforms make efficient strong scaling a key requirement for numerical algorithms. In order to achieve efficient scalability on massively parallel systems, scientific software must evolve across the entire stack to exploit the multiple levels of parallelism exposed in modern architectures. In this paper we demonstrate the use of hybrid MPI/OpenMP parallelisation to optimise parallel sparse matrix-vector multiplication in PETSc, a widely used scientific library for the scalable solution of partial differential equations. Using large matrices generated by Fluidity, an open source CFD application code which uses PETSc as its linear solver engine, we evaluate the effect of explicit communication overlap using task-based parallelism and show how to further improve performance by explicitly load balancing threads within MPI processes. We demonstrate a significant speedup over the pure-MPI mode and efficient strong scaling of sparse matrix-vector multiplication on Fujitsu PRIMEHPC FX10 and Cray XE6 systems.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes

Author: Bosilca George
Faverge Mathieu
Lacoste Xavier
Ramet Pierre
Thibault Samuel
Publication date: 06/01/2014

The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of computing resources. The pressure to maintain reasonable levels of performance and portability forces application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. In this paper, we study the benefits and limits of replacing the highly specialized internal scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and StarPU. The task graph of the factorization step is made available to the two runtimes, providing them the opportunity to process and optimize its traversal in order to maximize the algorithm's efficiency for the targeted hardware platform. A comparative study of the performance of the PaStiX solver on top of its native internal scheduler, PaRSEC, and StarPU frameworks, on different execution environments, is performed. The analysis highlights that these generic task-based runtimes achieve comparable results to the application-optimized embedded scheduler on homogeneous platforms. Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the programmer. Comment: Heterogeneity in Computing Workshop (2014)

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Head-On collisions of different initial data

Author: Bruegmann Bernd
Gonzalez Jose
Hannam Mark
Husa Sascha
Sperhake Ulrich
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 14/05/2007

We discuss possible origins for discrepancies observed in the radiated energies in head-on collisions of non-spinning binaries starting from Brill-Lindquist and superposed Kerr-Schild data. For this purpose, we discuss the impact of different choices of gauge parameters and a small initial boost of the black holes. Comment: Proceedings of the Eleventh Marcel Grossmann Meeting; 3 pages (limit

arXiv.org e-Print Archive

Crossref

Online Research @ Cardiff

Cactus: Issues for Sustainable Simulation Software

Author: Allen Gabrielle
Brandt Steven R.
Löffler Frank
Schnetter Erik
Publication date: 15/09/2013

The Cactus Framework is an open-source, modular, portable programming environment for the collaborative development and deployment of scientific applications using high-performance computing. Its roots reach back to 1996 at the National Center for Supercomputing Applications and the Albert Einstein Institute in Germany, where its development jumpstarted. Since then, the Cactus framework has witnessed major changes in hardware infrastructure as well as its own community. This paper describes its endurance through these past changes and, drawing upon lessons from its past, also discusses future Comment: submitted to the Workshop on Sustainable Software for Science: Practice and Experiences 201

arXiv.org e-Print Archive

Directory of Open Access Journals

On the Easy Use of Scientific Computing Services for Large Scale Linear Algebra and Parallel Decision Making with the P-Grade Portal

Author: A Gabrielle
A Hurault
Aurelie Hurault
E Caron
H Astsatryan
HA Vorst van der
Harutyun Terzyan
Hrachya Astsatryan
J Czyzyk
Levon Hovhannisyan
M Arioli
M Daydé
Michel Daydé
P Kacsuk
Ronan Guivarch
S Blackford
S Filippone
S Tomov
V Ardizzone
Vladimir Sahakyan
Yuri Shoukouryan
Z Farkas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2013

Scientific research is becoming increasingly dependent on the large-scale analysis of data using distributed computing infrastructures (Grid, cloud, GPU, etc.). Scientific computing (Petitet et al. 1999) aims at constructing mathematical models and numerical solution techniques for solving problems arising in science and engineering. In this paper, we describe the services of an integrated portal based on the P-Grade (Parallel Grid Run-time and Application Development Environment) portal (http://www.p-grade.hu) that enables the solution of large-scale linear systems of equations using direct solvers, eases the use of parallel block iterative algorithms, and provides an interface for parallel decision making algorithms. The ultimate goal is to develop a single sign-on, integrated multi-service environment providing easy access to different kinds of mathematical calculations and algorithms to be performed on hybrid distributed computing infrastructures, combining the benefits of large clusters, Grid, or cloud when needed.

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte