
442 research outputs found

Optimization Techniques for Stencil Data Parallel Programs: Methodologies and Applications

Author: Lottarini Andrea
Publication venue: 'Pisa University Press'
Publication date: 18/07/2012

The optimization of data parallel programs is a challenging open problem. We analyzed in detail the optimization techniques for stencil computations, which are a subset of data parallel computations. Drawing from previous research, we developed a structured model to describe the program transformations. We used this model to compare the different optimizations presented in the literature and to study the interactions between them.

Electronic Thesis and Dissertation Archive - Università di Pisa

A multi-GPU shallow-water simulation with transport of contaminants

Author: Amor Margarita
Arenaz Silva Manuel
Castro M.J.
Doallo Ramón
Fraguela Basilio B.
García J.A.
Lobeiras Blanco Jacobo
Viñas Buceta Moisés
Publication venue: 'Wiley'
Publication date: 01/01/2012

This work presents cost-effective multi-graphics processing unit (GPU) parallel implementations of a finite-volume numerical scheme for solving pollutant transport problems in two-dimensional domains. The fluid is modeled by the 2D shallow-water equations, whereas the transport of the pollutant is modeled by a transport equation. The 2D domain is discretized using a first-order Roe finite-volume scheme. Specifically, this paper presents multi-GPU implementations of both a solution that exploits recomputation on the GPU and an optimized solution that is based on a ghost cell decoupling approach. Our multi-GPU implementations have been optimized using nonblocking communications, overlapping of communications and computations, and the application of ghost cell expansion to minimize communications. The fastest one reached a speedup of 78× using four GPUs on an InfiniBand network with respect to a parallel execution on a multicore CPU with six cores and two-way hyperthreading per core. Such performance, measured using a realistic problem, enabled the calculation of solutions not only in real time but also orders of magnitude faster than the simulated time. Copyright © 2012 John Wiley & Sons, Ltd.

Repositorio da Universidade da Coruña
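
The ghost cell expansion mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a 1D three-point averaging stencil instead of the 2D Roe scheme, two subdomains held in plain Python lists instead of GPUs, and hypothetical function names (`sweep`, `run_single`, `run_two_domains`). The idea it shows is that widening the ghost region to `halo` cells lets each subdomain run `halo` sweeps between exchanges, at the price of some redundant computation on the ghost cells.

```python
def sweep(u):
    """One 3-point averaging sweep; the first and last cells are copied unchanged."""
    inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]


def run_single(u, steps):
    """Reference run: the whole domain is updated in one piece."""
    for _ in range(steps):
        u = sweep(u)
    return u


def run_two_domains(u, steps, halo):
    """Two subdomains with `halo` ghost cells each, exchanging once per `halo` sweeps.

    Returns the reassembled field and the number of exchanges performed.
    """
    mid = len(u) // 2
    assert 1 <= halo <= mid, "halo must fit inside a subdomain"
    left, right = u[:mid], u[mid:]
    exchanges = 0
    done = 0
    while done < steps:
        k = min(halo, steps - done)
        # Exchange: each side receives `halo` boundary cells from its neighbour.
        left_ext = left + right[:halo]
        right_ext = left[-halo:] + right
        exchanges += 1
        # Each sweep invalidates one more ghost cell from the outer edge inward,
        # so after k <= halo sweeps the owned cells are still exact.
        for _ in range(k):
            left_ext = sweep(left_ext)
            right_ext = sweep(right_ext)
        left = left_ext[:mid]       # drop the ghost cells again
        right = right_ext[halo:]
        done += k
    return left + right, exchanges
```

With 7 sweeps, `halo=1` needs 7 exchanges while `halo=3` needs 3, and both reproduce `run_single`. In a real multi-GPU code the best width depends on the balance between message latency and the cost of the redundant ghost-cell updates.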

JAX-DIPS: Neural bootstrapping of finite discretization methods and application to elliptic problems with discontinuities

Author: Gibou Frederic
Ilango Rajesh
Mistani Pouria
Pakravan Samira
Publication date: 13/08/2023

We present a scalable strategy for the development of mesh-free hybrid neuro-symbolic partial differential equation solvers based on existing mesh-based numerical discretization methods. In particular, this strategy can be used to efficiently train neural network surrogate models of partial differential equations by (i) leveraging the accuracy and convergence properties of advanced numerical methods, solvers, and preconditioners, and (ii) achieving better scalability to higher-order PDEs by strictly limiting optimization to first-order automatic differentiation. The presented neural bootstrapping method (hereby dubbed NBM) is based on evaluation of the finite discretization residuals of the PDE system obtained on implicit Cartesian cells centered on a set of random collocation points with respect to trainable parameters of the neural network. Importantly, the conservation laws and symmetries present in the bootstrapped finite discretization equations inform the neural network about solution regularities within local neighborhoods of training points. We apply NBM to the important class of elliptic problems with jump conditions across irregular interfaces in three spatial dimensions. We show that the method is convergent, in the sense that model accuracy improves as the number of collocation points in the domain is increased and the residuals are preconditioned. We show that NBM is competitive in terms of memory and training speed with other PINN-type frameworks. The algorithms presented here are implemented using JAX in a software package named JAX-DIPS (https://github.com/JAX-DIPS/JAX-DIPS), standing for differentiable interfacial PDE solver. We open sourced JAX-DIPS to facilitate research into the use of differentiable algorithms for developing hybrid PDE solvers.

arXiv.org e-Print Archive

Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory

Author: Hager Georg
Wellein Gerhard
Wittmann Markus
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/12/2009

New algorithms and optimization techniques are needed to balance the accelerating trend towards bandwidth-starved multicore chips. It is well known that the performance of stencil codes can be improved by temporal blocking, lessening the pressure on the memory interface. We introduce a new pipelined approach that makes explicit use of shared caches in multicore environments and minimizes synchronization and boundary overhead. For clusters of shared-memory nodes we demonstrate how temporal blocking can be employed successfully in a hybrid shared/distributed-memory environment. Comment: 9 pages, 6 figures

arXiv.org e-Print Archive

Crossref

Proceedings for the ICASE Workshop on Heterogeneous Boundary Conditions

Author: Perkins A. Louise
Scroggs Jeffrey S.

Domain decomposition is a complex problem with many interesting aspects. The choice of decomposition can be made based on many different criteria, and the choices of interface or internal boundary conditions are numerous. The various regions under study may have different dynamical balances, indicating that different physical processes are dominating the flow in these regions. This conference was called in recognition of the need to more clearly define the nature of these complex problems. These proceedings are a collection of the presentations and the discussion groups.

NASA Technical Reports Server

Facilitating the development of stencil applications using the Heterogeneous Programming Library

Author: Andrade Diego
Doallo Ramón
Fraguela Basilio B.
Viñas Buceta Moisés
Publication venue: 'Wiley'
Publication date: 01/01/2017

Stencil computations are very common in scientific codes. Heterogeneous systems achieve good results solving these problems, but their programming is complex because of the ghost regions required in multi-device implementations and the difficulty of properly exploiting their hardware. The Heterogeneous Programming Library (HPL) is a recent framework that improves the programmability of heterogeneous devices. This paper describes two extensions of HPL focused on stencil computations. The first one automatically updates the ghost regions these computations involve. The second one automates the implementation of the computational kernels of these algorithms. In our evaluation, the first mechanism reduces on average the number of lines of code and the Halstead programming effort of the host code of comparable HPL baselines by 34% and 64.2%, respectively, while the second contribution reduces these metrics by 72% and 79% in the computational kernels, respectively. Also, the first technique has negligible performance overheads, while the second one matches the performance of manually developed kernels. As an added benefit, the easier development of these codes thanks to these techniques helps programmers experiment with optimizations suited to these applications, such as the ghost cell expansion technique, which provides speedups of up to 13% in our experiments. Ministerio de Economía y Competitividad de España; TIN2013-42148-P. Ministerio de Economía y Competitividad de España; TIN2016-75845-P. Xunta de Galicia; ED431G/0

Repositorio da Universidade da Coruña

Multiscale Modeling with Differential Equations

Author: Astuto Clarissa
Russo Giovanni
Publication date: 02/09/2023

Many physical systems are governed by ordinary or partial differential equations (see, for example, Chapter "Differential equations", "System of Differential Equations"). Typically the solutions of such systems are functions of time or of a single space variable (in the case of ODEs), or they depend on multidimensional space coordinates or on space and time (in the case of PDEs). In some cases, the solutions may depend on several time or space scales. Examples include the damped harmonic oscillator, governed by ODEs, in the two extreme cases of very small or very large damping; the cardiovascular system, where the thickness of the arteries and veins varies from centimeters to microns; the shallow water equations, which are valid when the water depth is small compared to the typical wavelength of surface waves; and sorption kinetics, in which the range of interaction of a surfactant with an air bubble is much smaller than the size of the bubble itself. In all such cases a detailed simulation of the models that resolves all space or time scales is often inefficient or intractable, and usually even unnecessary to provide a reasonable description of the behavior of the system. In the Chapter "Multiscale modeling with differential equations" we present examples of systems described by ODEs and PDEs which are intrinsically multiscale, and illustrate how suitable modeling provides an effective way to capture the essential behavior of the solutions of such systems without resolving the small scales. Comment: 40 pages, 20 figures, to be published as a book chapter in a SIAM book

arXiv.org e-Print Archive
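
The damped harmonic oscillator example named in the abstract above can be made concrete with a short sketch. It is not taken from the chapter: it assumes the standard form m*x'' + c*x' + k*x = 0 in the strongly overdamped case, and the function names `decay_rates` and `reduced_rate` are hypothetical. It compares the two exact decay rates with the rate of the reduced first-order model c*x' = -k*x, which drops the inertial term and so never resolves the fast scale.

```python
import math


def decay_rates(m, c, k):
    """Return (slow, fast) decay rates of m*x'' + c*x' + k*x = 0 when overdamped."""
    disc = c * c - 4.0 * m * k
    if disc <= 0.0:
        raise ValueError("not overdamped: need c*c > 4*m*k")
    root = math.sqrt(disc)
    fast = (c + root) / (2.0 * m)
    # Algebraically equal to (c - root) / (2*m), but free of cancellation.
    slow = 2.0 * k / (c + root)
    return slow, fast


def reduced_rate(c, k):
    """Decay rate of the reduced model c*x' = -k*x (inertia neglected)."""
    return k / c
```

For m = 1, c = 100, k = 1 the two exact rates differ by roughly four orders of magnitude, and the reduced model reproduces the slow rate to within about 0.01%, so a time step chosen for the slow scale suffices once the short initial transient is ignored.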