4 research outputs found

Exploiting FPGA-aware merging of custom instructions for runtime reconfiguration

Authors: Clarke C.T.; Lam S.-K.; Srikanthan T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/08/2014

Runtime reconfiguration is a promising solution for reducing hardware cost in embedded systems without compromising performance. We present a framework that aims to increase the performance benefits of reconfigurable processors that support full or partial runtime reconfiguration. The proposed framework achieves this by: (1) providing a means for choosing suitable custom instruction selection heuristics, (2) leveraging FPGA-aware merging of custom instructions to maximize the reconfigurable logic block utilization in each configuration, and (3) incorporating a hierarchical loop partitioning strategy to reduce runtime reconfiguration overhead. We show that the performance gain can be improved by employing suitable custom instruction selection heuristics that, in turn, depend on the reconfigurable resource constraints and the merging factor (the extent to which the selected custom instructions can be merged). The hierarchical loop partitioning strategy leads to average performance gains of over 31% and 46% for full and partial runtime reconfiguration, respectively. Exploiting FPGA-aware merging of custom instructions increases these gains further, to over 52% and 70% for full and partial runtime reconfiguration, respectively.

OPUS

Crossref

FPGA-aware techniques for rapid generation of profitable custom instructions

Authors: Clarke C.T.; Lam S.-K.; Prakash A.; Srikanthan T.
Publication venue: 'Elsevier BV'
Publication date: 01/05/2013

OPUS

Crossref

Rapid evaluation of custom instruction selection approaches with FPGA estimation

Authors: Clarke Christopher T.; Lam Siew Kei; Srikanthan Thambipillai
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/03/2014

The main aim of this article is to demonstrate that a fast and accurate FPGA estimation engine is indispensable in design flows for custom instruction (template) selection. The need for an FPGA estimation engine stems from the difficulty in predicting the FPGA performance measures of selected custom instructions. We present an FPGA estimation technique that partitions the high-level representation of custom instructions into clusters based on the structural organization of the target FPGA, while taking into account general logic synthesis principles adopted by FPGA tools. In this work, we have evaluated a widely used graph covering algorithm with various heuristics for custom instruction selection. In addition, we present an algorithm called Refined Largest Fit First (RLFF) that relies on a graph covering heuristic to select non-overlapping superset templates, which typically incorporate frequently used basic templates. The initial solution is further refined by considering overlapping templates that were previously ignored, to see whether introducing them could lead to higher performance. While RLFF provides the most efficient cover compared to the ILP method and other graph covering heuristics, FPGA estimation results reveal that RLFF leads to the worst performance in certain applications. It is therefore worthwhile to equip design flows with accurate FPGA estimation in order to rapidly determine the most profitable custom instruction selection approach for a given application.

OPUS

Crossref

Methoden zur applikationsspezifischen Effizienzsteigerung adaptiver Prozessorplattformen (Methods for application-specific efficiency improvement of adaptive processor platforms)

Author: Tradowsky Carsten
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2016

General-purpose processors are optimized for the average use case, which means that available resources are not used efficiently. This thesis investigates to what extent a general-purpose processor can be adapted to individual applications in order to increase efficiency. The adaptation can be performed at runtime by the processor or the runtime system, based on the respective system parameters, to achieve an efficiency gain.

KITopen