25 research outputs found

    A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

    Full text link
    In recent years, the field of Deep Learning has seen many disruptive and impactful advancements. Given the increasing complexity of deep neural networks, the need for efficient hardware accelerators has become increasingly pressing in the design of heterogeneous HPC platforms. The design of Deep Learning accelerators requires a multidisciplinary approach, combining expertise from several areas, spanning from computer architecture to approximate computing, computational models, and machine learning algorithms. Several methodologies and tools have been proposed to design accelerators for Deep Learning, including hardware-software co-design approaches, high-level synthesis methods, customized compilers, and methodologies for design space exploration, modeling, and simulation. These methodologies aim to maximize the exploitable parallelism and minimize data movement to achieve high performance and energy efficiency. This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years to implement Deep Learning accelerators, offering the reader a wide perspective on this rapidly evolving field. In particular, this work complements the previous survey by the same authors in [203], which focuses on Deep Learning hardware accelerators for heterogeneous HPC platforms.

    TEXTAROSSA: Towards EXtreme scale Technologies and Accelerators for euROhpc hw/Sw Supercomputing Applications for exascale

    Get PDF
    To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; and methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models, and tools derived from European research.

    Massively Parallel Processing Approach To Fractal Image Compression

    No full text
    In recent years, fractal image compression (IFS) techniques have gained increasing interest because of their ability to achieve high compression ratios while maintaining very good quality in the reconstructed image. The main drawback of such techniques is the very high computing time needed to determine the compressed code. In this paper, after a brief description of IFS theory, we discuss its parallel implementation by comparing the different levels of exploitable parallelism. We show that Massively Parallel Processing on SIMD machines is the best way to exploit the large-granularity parallelism present in this problem. Finally, we give some results achieved by implementing the IFS compression technique on the MPP APE100/Quadrics machine. 1. INTRODUCTION Fractal image compression techniques were introduced by Barnsley [Bar 88]. The image is represented through a piecewise linear contractive function F and is reconstructed by iteratively applying F to a randomly chosen st..
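
    As a rough illustration of the iterative decoding just described, the following is a minimal sketch of applying a contractive map F to an arbitrary starting image until it converges to the encoded attractor. It assumes a partitioned IFS in which each range block is produced from a downsampled, affinely transformed domain block; the codebook layout (ry, rx, dy, dx, s, o), the block and image sizes, and the NumPy formulation are illustrative assumptions, not details taken from the paper.

    # Hypothetical sketch of IFS decoding: each 4x4 range block is rebuilt from
    # a downsampled 8x8 domain block via an affine map r = s * d + o.
    # Codebook format and block sizes are assumed for illustration only.
    import numpy as np

    def decode_ifs(codebook, image_size=64, block=4, iterations=10):
        """Reconstruct an image by repeatedly applying the contractive map F."""
        img = np.random.rand(image_size, image_size)       # arbitrary starting image
        for _ in range(iterations):
            new = np.zeros_like(img)
            for (ry, rx, dy, dx, s, o) in codebook:
                # fetch the 2*block x 2*block domain block and shrink it by 2
                dom = img[dy:dy + 2 * block, dx:dx + 2 * block]
                dom = dom.reshape(block, 2, block, 2).mean(axis=(1, 3))
                # contractive affine mapping onto the range block
                new[ry:ry + block, rx:rx + block] = s * dom + o
            img = new                                       # one application of F
        return img

    Because F is contractive, the result is essentially independent of the starting image, which is why a random start suffices.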

    Hyper-Systolic Implementation of BLAS-3 Routines on the APE100/Quadrics Machine

    No full text
    Basic Linear Algebra Subroutines (BLAS-3) [1] are building blocks for solving many numerical problems (Cholesky factorization, Gram-Schmidt orthonormalization, LU decomposition, ...). Their efficient implementation on a given parallel machine is a key issue for the maximal exploitation of the system's computational power. In this work we consider a massively parallel SIMD machine (the APE100/Quadrics [2]) and the adoption of the hyper-systolic method [3, 6, 4] to efficiently implement BLAS-3 on such a machine. The results we achieved (nearly 60-70% of peak performance for large matrices) demonstrate the validity of the proposed approach. The work is structured as follows: section 1 reviews BLAS-3, section 2 recalls the hyper-systolic method, section 3 describes the target machine, section 4 presents the HS implementation, and section 5 gives some experimental results. Keywords: BLAS-3, hyper-systolic, massively p..
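
    For readers unfamiliar with level-3 BLAS, the following is a minimal blocked matrix-multiply (GEMM-like) sketch of the kind of kernel such routines provide. It only illustrates the block decomposition that BLAS-3 operations are built on; it does not reproduce the hyper-systolic data movement used on the APE100/Quadrics machine. The block size nb and the NumPy formulation are assumptions for illustration.

    # Blocked GEMM: C = alpha * A @ B + beta * C, the canonical BLAS-3 kernel.
    # Purely illustrative; no hyper-systolic communication scheme is modeled.
    import numpy as np

    def blocked_gemm(A, B, C, alpha=1.0, beta=1.0, nb=32):
        n, k = A.shape
        k2, m = B.shape
        assert k == k2 and C.shape == (n, m)
        C *= beta
        for i in range(0, n, nb):
            for j in range(0, m, nb):
                for p in range(0, k, nb):
                    # rank-nb update of the (i, j) block of C
                    C[i:i+nb, j:j+nb] += alpha * A[i:i+nb, p:p+nb] @ B[p:p+nb, j:j+nb]
        return C

    Operating on nb x nb blocks keeps data resident in fast local memory while it is reused, which is what makes level-3 routines the natural target for high sustained performance on parallel machines.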