2 research outputs found

Transformations of High-Level Synthesis Codes for High-Performance Computing

Authors: Maciej Besta, Torsten Hoefler, Johannes de Fine Licht, Simon Meierhans
Publication date: 29/10/2019

Specialized hardware architectures promise a major step in performance and energy efficiency over the traditional load/store devices currently employed in large-scale computing systems. The adoption of high-level synthesis (HLS) from languages such as C/C++ and OpenCL has greatly increased programmer productivity when designing for such platforms. While this has enabled a wider audience to target specialized hardware, the optimization principles known from traditional software design are no longer sufficient to implement high-performance codes. Fast and efficient codes for reconfigurable platforms are thus still challenging to design. To alleviate this, we present a set of optimizing transformations for HLS, targeting scalable and efficient architectures for high-performance computing (HPC) applications. Our work provides a toolbox for developers, where we systematically identify classes of transformations, the characteristics of their effect on the HLS code and the resulting hardware (e.g., increases data reuse or resource consumption), and the objectives that each transformation can target (e.g., resolve interface contention, or increase parallelism). We show how these can be used to efficiently exploit pipelining, on-chip distributed fast memory, and on-chip streaming dataflow, allowing for massively parallel architectures. To quantify the effect of our transformations, we use them to optimize a set of throughput-oriented FPGA kernels, demonstrating that our enhancements are sufficient to scale up parallelism within the hardware constraints. With the transformations covered, we hope to establish a common framework for performance engineers, compiler developers, and hardware developers, to tap into the performance potential offered by specialized hardware architectures using HLS.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Parallel framework for earthquake induced response computation of the SDOF structure

Authors: A. B. M. Saiful Islam, Raja Rizwan Hussain, Sarfraz Munir
Publication venue: 'Vilnius Gediminas Technical University'
Publication date: 01/03/2014

Parallel computing reduces computation time through the simultaneous use of multiple computing resources. In this research, parallel computing techniques were developed to parallelize a program that computes the response of a single-degree-of-freedom (SDOF) structure under earthquake loading. The study uses a Distributed Memory Processors (DMP) hardware architecture and Message Passing Interface (MPI) compiler directives to parallelize the program. The program is parallelized by domain decomposition. Concurrency is created by dividing the program into two parts that run on different computers, calculating the forced response and free response of the first half and the second half. The parallel framework successfully creates concurrency and obtains the structural responses in significantly less time than sequential programs.
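
One possible reading of the two-part decomposition in this abstract relies on linear superposition: for a linear SDOF system, the response in the second half of the record equals the free response from the mid-record state plus the forced response started from rest. The sketch below is an illustration under that assumption, not the authors' program. The function names, the semi-implicit Euler integrator, and the parameters `omega` (natural frequency) and `zeta` (damping ratio) are hypothetical, and MPI is omitted; the calls marked as independent are the ones that could be placed on separate processes.

```python
def integrate(x0, v0, accel, dt, omega, zeta):
    """Semi-implicit Euler for x'' + 2*zeta*omega*x' + omega**2*x = -a_g(t).

    Returns a list of (displacement, velocity) pairs, one per time level,
    starting with the initial state (x0, v0).
    """
    x, v = x0, v0
    states = [(x, v)]
    for a_g in accel:
        v += dt * (-a_g - 2.0 * zeta * omega * v - omega**2 * x)
        x += dt * v
        states.append((x, v))
    return states


def split_response(accel, dt, omega, zeta):
    """Compute the full response from two halves of the ground-motion record."""
    half = len(accel) // 2
    first, second = accel[:half], accel[half:]
    quiet = [0.0] * len(second)  # no ground motion: free vibration only

    # Independent tasks (assumption: these could run on separate processes).
    part1 = integrate(0.0, 0.0, first, dt, omega, zeta)    # first half
    forced2 = integrate(0.0, 0.0, second, dt, omega, zeta)  # second half, from rest
    free_x = integrate(1.0, 0.0, quiet, dt, omega, zeta)    # unit initial displacement
    free_v = integrate(0.0, 1.0, quiet, dt, omega, zeta)    # unit initial velocity

    # Cheap combination step: scale the unit free responses by the mid-record
    # state and add the forced response (valid because the system is linear).
    x_mid, v_mid = part1[-1]
    part2 = [
        (fx[0] * x_mid + fv[0] * v_mid + p[0],
         fx[1] * x_mid + fv[1] * v_mid + p[1])
        for fx, fv, p in zip(free_x, free_v, forced2)
    ]
    return part1 + part2[1:]  # part2[0] repeats the mid-record state
```

Calling `split_response(accel, dt, omega, zeta)` on a list of ground-acceleration samples should agree with `integrate(0.0, 0.0, accel, dt, omega, zeta)` up to floating-point rounding, since the update rule is linear in the state and the input.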

Directory of Open Access Journals

VGTU Journals (Vilnius Gediminas Technical University - Vilnius Tech)