1,773 research outputs found

Task-based adaptive multiresolution for time-space multi-scale reaction-diffusion systems on multi-core architectures

Author: Descombes Stéphane
Duarte Max
Dumont Thierry
Guillet Thomas
Louvet Violaine
Massot Marc
Publication venue: 'Cellule MathDoc/CEDRAM'
Publication date: 14/10/2016

A new solver featuring time-space adaptation and error control has recently been introduced to tackle the numerical solution of stiff reaction-diffusion systems. Based on operator splitting, finite volume adaptive multiresolution and high order time integrators with specific stability properties for each operator, this strategy yields high computational efficiency for large multidimensional computations on standard architectures such as powerful workstations. However, the data structure of the original implementation, based on trees of pointers, provides limited opportunities for efficiency enhancements, while posing serious challenges in terms of parallel programming and load balancing. The present contribution proposes a new implementation of the whole set of numerical methods, including Radau5 and ROCK4, relying on an entirely different data structure together with the use of a specific library, TBB, for shared-memory, task-based parallelism with work-stealing. The performance of our implementation is assessed in a series of test cases of increasing difficulty in two and three dimensions on multi-core and many-core architectures, demonstrating high scalability.
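To make the operator-splitting idea in this abstract concrete, here is a minimal sketch of one Strang splitting step for a 1-D reaction-diffusion problem. It is not the authors' solver: the linear decay reaction, the periodic grid, and the explicit diffusion update are assumptions chosen for brevity and stand in for the Radau5 and ROCK4 integrators named above. The names `react`, `diffuse`, `strang_step`, `k`, and `r` are hypothetical.

```python
import math

def react(u, k, dt):
    """Exact solution of the assumed reaction u' = -k*u over a time dt."""
    factor = math.exp(-k * dt)
    return [factor * x for x in u]

def diffuse(u, r):
    """One explicit diffusion step on a periodic 1-D grid.

    r = D*dt/dx**2 and should satisfy r <= 0.5 for stability.
    """
    n = len(u)
    return [u[i] + r * (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) for i in range(n)]

def strang_step(u, k, dt, r):
    """Strang splitting: half reaction step, full diffusion step, half reaction step."""
    u = react(u, k, 0.5 * dt)
    u = diffuse(u, r)
    u = react(u, k, 0.5 * dt)
    return u

u0 = [0.0, 0.0, 1.0, 0.0, 0.0]
u1 = strang_step(u0, k=2.0, dt=0.1, r=0.25)
```

Because each operator is advanced separately, each sub-step can use the integrator best suited to it, which is the property the abstract relies on.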

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL-UJM

The SMAI journal of computational mathematics

Numérisation de Documents Anciens Mathématiques

Hal-Diderot

HAL-Polytechnique

HAL-Rennes 1

From Piz Daint to the Stars: Simulation of Stellar Mergers using High-Level Abstractions

Author: Amini Parsa
Biddiscombe John
Daiß Gregor
Diehl Patrick
Frank Juhan
Huck Kevin
Kaiser Hartmut
Marcello Dominic
Pfander David
Pflüger Dirk
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/08/2019

We study the simulation of stellar mergers, which requires complex simulations with high computational demands. We have developed Octo-Tiger, a finite volume grid-based hydrodynamics simulation code with Adaptive Mesh Refinement which is unique in conserving both linear and angular momentum to machine precision. To face the challenge of increasingly complex, diverse, and heterogeneous HPC systems, Octo-Tiger relies on high-level programming abstractions. We use HPX with its futurization capabilities to ensure scalability both between nodes and within, and present first results replacing MPI with libfabric achieving up to a 2.8x speedup. We extend Octo-Tiger to heterogeneous GPU-accelerated supercomputers, demonstrating node-level performance and portability. We show scalability up to full system runs on Piz Daint. For the scenario's maximum resolution, the compute-critical parts (hydrodynamics and gravity) achieve 68.1% parallel efficiency at 2048 nodes. Comment: Accepted at SC1

arXiv.org e-Print Archive

Crossref

Louisiana State University

Evaluation of the 3-D finite difference implementation of the acoustic diffusion equation model on massively parallel architectures

Author: Cebrián Juan Manuel
Cecilia Canales José María
García Carrasco José Manuel
Hernández Mario
Imbernón Tudela Baldomero
Navarro Juan Miguel
Publication venue: Elsevier Ltd.
Publication date: 01/01/2015

The diffusion equation model is a popular tool in room acoustics modeling. The 3-D Finite Difference (3D-FD) implementation predicts the energy decay function and the sound pressure level in closed environments. This simulation is computationally expensive, as its cost depends on the resolution used to model the room. With such high computational requirements, a high-level programming language (e.g., Matlab) cannot deal with real-life scenario simulations. Thus, it becomes mandatory to use our computational resources more efficiently. Manycore architectures, such as NVIDIA GPUs or Intel Xeon Phi, offer new opportunities to enhance scientific computations, increasing the performance per watt, but require shifting to a different programming model. This paper shows the roadmap to use massively parallel architectures in a 3D-FD simulation. We evaluate the latest generation of NVIDIA and Intel architectures. Our experimental results reveal that NVIDIA architectures outperform the Intel Xeon Phi co-processor by a wide margin while dissipating approximately 50 W less (25%) for large-scale input problems. Ingeniería, Industria y Construcció
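As a rough illustration of what a 3-D finite difference update for a diffusion equation looks like, the sketch below applies one explicit time step with the standard 7-point stencil. It is a generic example, not the paper's model: the absorption and source terms of the acoustic diffusion equation are omitted, the boundary values are simply held fixed, and the names `ftcs_step` and `r` are hypothetical.

```python
def ftcs_step(w, r):
    """One explicit step of w_t = D * laplacian(w) on a cubic grid.

    r = D*dt/dx**2 and should satisfy r <= 1/6 for stability.
    Boundary cells are copied unchanged (an assumed, simplified boundary condition).
    """
    n = len(w)
    new = [[[w[i][j][k] for k in range(n)] for j in range(n)] for i in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                # 7-point discrete Laplacian (without the 1/dx**2 factor, folded into r)
                lap = (w[i + 1][j][k] + w[i - 1][j][k]
                       + w[i][j + 1][k] + w[i][j - 1][k]
                       + w[i][j][k + 1] + w[i][j][k - 1]
                       - 6.0 * w[i][j][k])
                new[i][j][k] = w[i][j][k] + r * lap
    return new

n = 5
w0 = [[[0.0] * n for _ in range(n)] for _ in range(n)]
w0[2][2][2] = 1.0  # unit of energy density placed at the grid centre
w1 = ftcs_step(w0, r=0.1)
```

Every interior cell is updated independently from the previous time level, which is why this kind of stencil maps naturally onto the massively parallel hardware the abstract evaluates.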

Institutional Repository UCAM

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Summary of research in applied mathematics, numerical analysis and computer science at the Institute for Computer Applications in Science and Engineering

Research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis and computer science during the period October 1, 1983 through March 31, 1984 is summarized.

NASA Technical Reports Server

ParMooN - a modernized program package based on mapped finite elements

Author: Ahmed Naveed
Alia Najib
Anker Felix
Bartsch Clemens
Blank Laura
Caiazzo Alfonso
Ganesan Sashikumaar
Giere Swetlana
John Volker
Matthies Gunar
Meesala Raviteja
Shamim Abdus
Venkatesan Jagannath
Wilbrandt Ulrich
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

ParMooN is a program package for the numerical solution of elliptic and parabolic partial differential equations. It inherits the distinct features of its predecessor MooNMD [JM04]: strict decoupling of geometry and finite element spaces, implementation of mapped finite elements as their definition can be found in textbooks, and a geometric multigrid preconditioner with the option to use different finite element spaces on different levels of the multigrid hierarchy. After having presented some thoughts about in-house research codes, this paper focuses on aspects of the parallelization for a distributed memory environment, which is the main novelty of ParMooN. Numerical studies, performed on compute servers, assess the efficiency of the parallelized geometric multigrid preconditioner in comparison with some parallel solvers that are available in the library PETSc. The results of these studies give a first indication whether the cumbersome implementation of the parallelized geometric multigrid method was worthwhile or not. Comment: partly supported by European Union (EU), Horizon 2020, Marie Skłodowska-Curie Innovative Training Networks (ITN-EID), MIMESIS, grant number 67571

arXiv.org e-Print Archive

Publications Server of the Weierstrass Institute for Applied Analysis and Stochastics

Open Access Repository of IISc Research Publications

Repositorium für Naturwissenschaften und Technik

Designing a scalable dynamic load-balancing algorithm for pipelined single program multiple data applications on a non-dedicated heterogeneous network of workstations

Author: Osman Ashraf
Publication venue: The Research Repository @ WVU
Publication date: 01/12/2003

Dynamic load balancing strategies have been shown to be the most critical part of an efficient implementation of various applications on large distributed computing systems. The need for dynamic load balancing strategies increases when the underlying hardware is a non-dedicated heterogeneous network of workstations (HNOW). This research focuses on the single program multiple data (SPMD) programming model, as it has been extensively used in parallel programming for its simplicity and scalability in terms of computational power and memory size. This dissertation formally defines and addresses the problem of designing a scalable dynamic load-balancing algorithm for pipelined SPMD applications on non-dedicated HNOW. During this process, the HNOW parameters, SPMD application characteristics, and load-balancing performance parameters are identified. The dissertation presents a taxonomy that categorizes general load balancing algorithms and a methodology that facilitates creating new algorithms that can harness the HNOW computing power and still preserve the scalability of the SPMD application. The dissertation devises a new algorithm, DLAH (Dynamic Load-balancing Algorithm for HNOW). DLAH is based on a modified diffusion technique, which incorporates the HNOW parameters. An analytical performance bound for the worst-case scenario of the diffusion technique has been derived. The dissertation develops and utilizes an HNOW simulation model to conduct extensive simulations. These simulations were used to validate DLAH and compare its performance to related dynamic algorithms. The simulation results show that the DLAH algorithm is scalable and performs well for both homogeneous and heterogeneous networks. A detailed sensitivity analysis was conducted to study the effects of key parameters on performance.
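For readers unfamiliar with diffusion-based load balancing, the following sketch shows the classic first-order diffusion scheme, in which each node repeatedly exchanges a fraction of its load difference with each neighbour. This is the textbook technique only, not DLAH: the abstract does not describe how DLAH modifies it or how the HNOW parameters enter. The names `diffusion_step`, `balance`, `alpha`, and the four-node ring topology are assumptions made for illustration.

```python
def diffusion_step(loads, neighbors, alpha):
    """One synchronous step of first-order diffusion load balancing.

    Node i moves alpha * (loads[j] - loads[i]) of load across each link (i, j).
    With symmetric links the total load is conserved.
    """
    new_loads = list(loads)
    for i, nbrs in neighbors.items():
        for j in nbrs:
            new_loads[i] += alpha * (loads[j] - loads[i])
    return new_loads

def balance(loads, neighbors, alpha=0.25, tol=1e-6, max_steps=10_000):
    """Iterate until the spread between the heaviest and lightest node is within tol."""
    for step in range(max_steps):
        if max(loads) - min(loads) <= tol:
            return loads, step
        loads = diffusion_step(loads, neighbors, alpha)
    return loads, max_steps

# Assumed topology: four workstations connected in a ring, all load initially on node 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final, steps = balance([100.0, 0.0, 0.0, 0.0], ring)
```

In this homogeneous sketch every node converges to the same share; a heterogeneous variant would presumably weight the exchange by node capacity, but that detail is not given in the abstract.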

The Research Repository @ WVU (West Virginia University)

Semiannual final report, 1 October 1991 - 31 March 1992

A summary of research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis, and computer science during the period 1 Oct. 1991 through 31 Mar. 1992 is presented.

NASA Technical Reports Server

Solution of partial differential equations on vector and parallel computers

Author: Ortega J. M.
Voigt R. G.

The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

NASA Technical Reports Server

Index to 1984 NASA Tech Briefs, volume 9, numbers 1-4

Publication date: 01/02/1987

Short announcements of new technology derived from the R&D activities of NASA are presented. These briefs emphasize information considered likely to be transferable across industrial, regional, or disciplinary lines and are issued to encourage commercial application. This index for 1984 Tech Briefs contains abstracts and four indexes: subject, personal author, originating center, and Tech Brief Number. The following areas are covered: electronic components and circuits, electronic systems, physical sciences, materials, life sciences, mechanics, machinery, fabrication technology, and mathematics and information sciences.

NASA Technical Reports Server