Efficiency analysis methodology of FPGAs based on lost frequencies, area and cycles

Author: Braeken An
Cornelis Jan G.
da Silva Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

We propose a methodology to study and quantify efficiency and the impact of overheads on runtime performance. Most work on High-Performance Computing (HPC) for FPGAs only studies runtime performance or cost, while we are interested in how far we are from peak performance and, more importantly, why. The efficiency of runtime performance is defined with respect to the ideal computational runtime in the absence of inefficiencies. Analyzing the difference between actual and ideal runtime reveals the overheads and bottlenecks. A formal approach is proposed to decompose the efficiency into three components: frequency, area and cycles. After the efficiencies are quantified, a detailed analysis has to reveal the reasons for the lost frequencies, lost area and lost cycles. We propose a taxonomy of possible causes and practical methods to identify and quantify the overheads. The methodology is applied to a number of use cases, which show how the three components of efficiency interact and how bottlenecks are revealed.
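The abstract names the three components but does not give their formulas. As a rough illustration only, the sketch below assumes that each component is a simple ratio and that the overall efficiency is their product. The `Design` fields, the `efficiencies` function and the numbers are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Design:
    """Hypothetical measurements for one FPGA design (not from the paper)."""
    f_actual_mhz: float   # clock frequency achieved after place and route
    f_peak_mhz: float     # best-case frequency of the device
    area_used: float      # resources doing useful computation
    area_total: float     # resources available on the device
    cycles_ideal: float   # cycles needed without stalls or overheads
    cycles_actual: float  # cycles actually measured


def efficiencies(d: Design) -> dict:
    """Split overall efficiency into frequency, area and cycle ratios.

    ASSUMPTION: the multiplicative form below is an illustrative guess;
    the abstract does not state how the components combine.
    """
    e_f = d.f_actual_mhz / d.f_peak_mhz
    e_a = d.area_used / d.area_total
    e_c = d.cycles_ideal / d.cycles_actual
    return {"frequency": e_f, "area": e_a, "cycles": e_c,
            "overall": e_f * e_a * e_c}


# Made-up example values, chosen only to keep the arithmetic easy to follow.
example = Design(200.0, 400.0, 50.0, 100.0, 1000.0, 2000.0)
for name, value in efficiencies(example).items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, each ratio below 1 points at one class of loss (lost frequency, lost area or lost cycles), which is where the detailed analysis described in the abstract would start.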

Ghent University Academic Bibliography

Performance and toolchain of a combined GPU/FPGA desktop

Author: Braeken An
Cornelis Jan G.
D'Hollander Erik
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/01/2013

Low-power, high-performance computing nowadays relies on accelerator cards to speed up the calculations. Combining the power of GPUs with the flexibility of FPGAs enlarges the scope of problems that can be accelerated. We describe the performance analysis of a desktop equipped with a Tesla 2050 GPU and a Virtex-6 LX240T FPGA. The balance between the I/O and the raw peak performance is analyzed using the roofline model. A well-tuned accelerator-based codesign, which identifies the parallelism, the computation and the data patterns of different classes of algorithms, makes it possible to maximize the performance of the combined GPU/FPGA system.
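The roofline model mentioned in this abstract bounds attainable performance by the lower of two limits: the peak compute rate, and the memory or I/O bandwidth multiplied by the arithmetic intensity of the kernel. A minimal sketch follows. The function names and the device figures are hypothetical placeholders, not values measured in the paper.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float,
                      intensity_flop_per_byte: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical accelerator: 1000 GFLOP/s peak, 100 GB/s bandwidth.
PEAK, BANDWIDTH = 1000.0, 100.0
for intensity in (0.5, 10.0, 50.0):
    bound = attainable_gflops(PEAK, BANDWIDTH, intensity)
    regime = "bandwidth-bound" if intensity < ridge_point(PEAK, BANDWIDTH) else "compute-bound"
    print(f"intensity {intensity:5.1f} flop/byte -> {bound:7.1f} GFLOP/s ({regime})")
```

Comparing a kernel's arithmetic intensity with the ridge point of each accelerator is one way to judge whether I/O or raw compute limits it on that device.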

Ghent University Academic Bibliography

Study of combining GPU/FPGA accelerators for high-performance computing

Author: Braeken An
Cornelis Jan G.
D'Hollander Erik
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: HiPEAC
Publication date: 01/01/2013

This contribution presents the performance modeling of a super desktop with GPU and FPGA accelerators, using OpenCL for the GPU and a high-level synthesis compiler for the FPGAs. The performance model is used to evaluate the different high-level synthesis optimizations, taking into account the resource usage, and to compare the compute power of the FPGA with the GP

Ghent University Academic Bibliography

A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis

Author: Braeken An
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

Ghent University Academic Bibliography