    Cost models for shared memory architectures

    We address the gap between structured parallel programming and parallel architectures by formalizing a cost model for shared memory architectures. The cost model captures most architectural details (processors, memory hierarchy, interconnection network, etc.) to evaluate the under-load shared-memory access latency. Analytical and numerical resolution techniques are provided and compared: the former are based on queueing theory, the latter on Markov chains constructed by means of the stochastic process algebra PEPA.
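
    The abstract does not reproduce the model itself; as a rough illustration of the queueing-theory side, the sketch below estimates under-load memory access latency with a single M/M/1 queue. The arrival rate, service rate, and processor count are hypothetical parameters chosen for illustration, not values or equations taken from the paper.

```python
# Illustrative sketch only: a single-queue M/M/1 approximation of
# under-load memory access latency. The paper's actual model covers the
# full memory hierarchy and interconnect; these parameters are made up.

def mm1_latency(arrival_rate, service_rate):
    """Mean response time (waiting + service) of an M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)

# Hypothetical example: p processors each issuing memory requests at
# rate lam, served by one memory module with service rate mu.
p, lam, mu = 8, 0.05, 1.0            # requests per cycle
latency = mm1_latency(p * lam, mu)   # under-load latency in cycles
print(f"estimated under-load access latency: {latency:.2f} cycles")
```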

    Devito: Towards a generic Finite Difference DSL using Symbolic Python

    Domain-specific languages (DSLs) have been used in a variety of fields to express complex scientific problems in a concise manner and to provide automated performance optimization for a range of computational architectures. As such, DSLs provide a powerful mechanism to speed up scientific Python computation that goes beyond traditional vectorization and pre-compilation approaches, while allowing domain scientists to build applications within the comforts of the Python software ecosystem. In this paper we present Devito, a new finite difference DSL that provides optimized stencil computation from high-level problem specifications based on symbolic Python expressions. We demonstrate Devito's symbolic API and performance advantages over traditional Python acceleration methods before highlighting its use in the scientific context of seismic inversion problems.
    Comment: pyHPC 2016 conference submission
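
    As a flavour of what a symbolic problem specification looks like, the sketch below builds a 2D diffusion stencil with Devito's publicly documented API (Grid, TimeFunction, Eq, Operator, solve). It is a minimal sketch against the released API; exact names, defaults, and behaviour may differ from the version of Devito described in the paper, and it is not an example taken from it.

```python
# Minimal sketch of a Devito-style workflow: a 2D diffusion (heat) stencil
# generated from a symbolic problem specification. Uses the publicly
# documented Devito API; details may differ from the paper's version.
from devito import Grid, TimeFunction, Eq, Operator, solve

grid = Grid(shape=(128, 128), extent=(1.0, 1.0))
u = TimeFunction(name='u', grid=grid, space_order=2)
u.data[0, 60:68, 60:68] = 1.0          # hypothetical initial hot spot

# Symbolic PDE: du/dt = 0.5 * laplacian(u)
pde = Eq(u.dt, 0.5 * u.laplace)

# Rearrange for the next time level and generate an optimized stencil kernel.
update = Eq(u.forward, solve(pde, u.forward))
op = Operator(update)

# Run 200 explicit time steps with a chosen time step size.
op.apply(time=200, dt=1e-5)
```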

    Automatic Differentiation for Adjoint Stencil Loops

    Stencil loops are a common motif in computations including convolutional neural networks, structured-mesh solvers for partial differential equations, and image processing. Stencil loops are easy to parallelise, and their fast execution is aided by compilers, libraries, and domain-specific languages. Reverse-mode automatic differentiation, also known as algorithmic differentiation, autodiff, adjoint differentiation, or back-propagation, is sometimes used to obtain gradients of programs that contain stencil loops. Unfortunately, conventional automatic differentiation results in a memory access pattern that is not stencil-like and not easily parallelisable. In this paper we present a novel combination of automatic differentiation and loop transformations that preserves the structure and memory access pattern of stencil loops, while computing fully consistent derivatives. The generated loops can be parallelised and optimised for performance in the same way and using the same tools as the original computation. We have implemented this new technique in the Python tool PerforAD, which we release with this paper along with test cases derived from seismic imaging and computational fluid dynamics applications.
    Comment: ICPP 2019
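
    To make the scatter-versus-gather point concrete, the sketch below hand-derives the adjoint of a 1D three-point stencil in plain NumPy. It is not PerforAD's generated code, only an illustration of the kind of loop transformation the abstract describes: the conventional reverse-mode adjoint scatters into the input gradient, while the transformed version gathers over the output adjoint and is itself a stencil loop.

```python
# Illustration (not PerforAD output): adjoint of a 1D three-point stencil.
# Forward: y[i] = a*x[i-1] + b*x[i] + c*x[i+1] for interior points i.
# Conventional reverse-mode AD scatters into xbar (not stencil-like);
# the same adjoint can be rewritten as a gather over ybar, which is
# again a stencil and parallelises like the original loop.
import numpy as np

a, b, c = 0.25, 0.5, 0.25
n = 16
ybar = np.random.rand(n)          # incoming adjoint of the output

# Scatter form, as produced by conventional reverse-mode AD.
xbar_scatter = np.zeros(n)
for i in range(1, n - 1):
    xbar_scatter[i - 1] += a * ybar[i]
    xbar_scatter[i]     += b * ybar[i]
    xbar_scatter[i + 1] += c * ybar[i]

# Gather form: each xbar[i] reads a stencil of ybar values, guarded so
# only contributions from interior output points are included.
xbar_gather = np.zeros(n)
for i in range(n):
    left   = c * ybar[i - 1] if 1 <= i - 1 <= n - 2 else 0.0
    centre = b * ybar[i]     if 1 <= i     <= n - 2 else 0.0
    right  = a * ybar[i + 1] if 1 <= i + 1 <= n - 2 else 0.0
    xbar_gather[i] = left + centre + right

assert np.allclose(xbar_scatter, xbar_gather)
```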