88 research outputs found

Performance and resource modeling for FPGAs using high-level synthesis tools

Author: Braeken An
D'Hollander Erik
da Silva Gomes Bruno
Touhafi Abdellah
Publication venue: 'IOS Press'
Publication date: 01/01/2014

High-performance computing with FPGAs is gaining momentum with the advent of sophisticated High-Level Synthesis (HLS) tools. The performance of a design is impacted by the input-output bandwidth, the code optimizations and the resource consumption, making the performance estimation a challenge. This paper proposes a performance model which extends the roofline model to take into account the resource consumption and the parameters used in the HLS tools. A strategy is developed which maximizes the performance and the resource utilization within the area of the FPGA. The model is used to optimize the design exploration of a class of window-based image processing application
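For orientation, the classic roofline bound that this abstract builds on can be sketched as below. This shows only the standard roofline model (attainable performance is the minimum of the compute roof and the memory roof); it is not the paper's extended model, which additionally accounts for FPGA resource consumption and HLS parameters. The function names and the numbers in the usage lines are hypothetical, chosen only for illustration.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Classic roofline bound (not the paper's extended model).

    peak_gflops: compute roof of the device, in GFLOP/s
    bandwidth_gbs: input-output (memory) bandwidth, in GB/s
    intensity_flops_per_byte: operations performed per byte transferred
    """
    memory_roof = bandwidth_gbs * intensity_flops_per_byte
    return min(peak_gflops, memory_roof)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical device: 100 GFLOP/s compute roof, 10 GB/s bandwidth.
low = attainable_gflops(100.0, 10.0, 2.0)    # bandwidth-bound kernel
high = attainable_gflops(100.0, 10.0, 50.0)  # compute-bound kernel
ridge = ridge_point(100.0, 10.0)
```

With these made-up numbers, a kernel below the ridge point is limited by bandwidth, while a kernel above it is limited by the compute roof.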

Ghent University Academic Bibliography

Software Implementation vs. Hardware Implementation: The Avionic Test System Case-Study

Author: Afonso George
Ben Atitallah Rabie
Dekeyser Jean-Luc
Publication venue: HAL CCSD
Publication date: 03/03/2012

This paper presents a development methodology that helps designers efficiently map applications onto a heterogeneous CPU/FPGA system. An industrial case study is presented that aims to meet performance and real-time requirements with the help of our architecture capabilities and avionic model parallelization. Different avionic model implementations are presented to explain how to find the best trade-off between performance and design time.

HAL - Lille 3

INRIA a CCSD electronic archive server

A combined GPGPU-FPGA high-performance desktop

Author: Braeken An
Cornelis Jan
D'Hollander Erik
da Silva Gomes Bruno
Enescu Valentin
Lemeire Jan
Touhafi Abdellah
Publication date: 01/01/2012

Computation-intensive interactive software applications on R&D desktops require a versatile, high-performance hardware and software environment. Present-day solutions focus on one technology, e.g. GPUs, grids, multi-cores, clusters, … To leverage the power of different technologies, a hybrid solution is presented, combining the power of General-Purpose Graphical Processing Units (GPGPUs) and Field Programmable Gate Arrays (FPGAs).

Ghent University Academic Bibliography

GCC-Plugin for Automated Accelerator Generation and Integration on Hybrid FPGA-SoCs

Author: Castrillon Jeronimo
Hempel Gerald
Hochberger Christian
Vogt Markus
Publication date: 01/09/2015

In recent years, architectures combining a reconfigurable fabric and a general purpose processor on a single chip became increasingly popular. Such hybrid architectures allow extending embedded software with application specific hardware accelerators to improve performance and/or energy efficiency. Aiding system designers and programmers at handling the complexity of the required process of hardware/software (HW/SW) partitioning is an important issue. Current methods are often restricted, either to bare-metal systems, to subsets of mainstream programming languages, or require special coding guidelines, e.g., via annotations. These restrictions still represent a high entry barrier for the wider community of programmers that new hybrid architectures are intended for. In this paper we revisit HW/SW partitioning and present a seamless programming flow for unrestricted, legacy C code. It consists of a retargetable GCC plugin that automatically identifies code sections for hardware acceleration and generates code accordingly. The proposed workflow was evaluated on the Xilinx Zynq platform using unmodified code from an embedded benchmark suite. Comment: Presented at Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320)

arXiv.org e-Print Archive

TUbiblio

Dataflow Computing with Polymorphic Registers

Author: Ciobanu Catalin
Gaydadjiev Georgi N.
Pilato Christian
Sciuto Donatella
Publication date: 01/01/2013

Heterogeneous systems are becoming increasingly popular for data processing. They improve the performance of simple kernels applied to large amounts of data. However, sequential data loads may have a negative impact. Data-parallel solutions such as Polymorphic Register Files (PRFs) can potentially accelerate applications by facilitating high-speed, parallel access to performance-critical data. Furthermore, by PRF customization, specific data path features are exposed to the programmer in a very convenient way. PRFs allow additional control over the register dimensions and the number of elements that can be simultaneously accessed by computational units. This paper shows how PRFs can be integrated in dataflow computational platforms. In particular, starting from annotated source code, we present a compiler-based methodology that automatically generates the customized PRFs and the enhanced computational kernels that efficiently exploit them.

Archivio istituzionale della ricerca - Politecnico di Milano

Chalmers Research

Chalmers Publication Library

AMC: Advanced Multi-accelerator Controller

Author: Ayguadé Parra Eduard
Gursal Shakaib A.
Haider Amna
Hussain Tassadaq
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

The rapid advancement, use of diverse architectural features and introduction of High Level Synthesis (HLS) tools in FPGA technology have enhanced the capacity of data-level parallelism on a chip. A generic FPGA-based HLS multi-accelerator system requires a microprocessor (master core) that manages memory and schedules accelerators. In a real environment, such HLS multi-accelerator systems do not reach ideal performance due to memory bandwidth issues. Thus, a system demands a memory manager and a scheduler that improve performance by managing and scheduling the multi-accelerator’s memory access patterns efficiently. In this article, we propose the integration of an intelligent memory system and efficient scheduler in the HLS-based multi-accelerator environment, called Advanced Multi-accelerator Controller (AMC). The AMC system is evaluated with memory-intensive accelerators and High Performance Computing (HPC) applications, and is implemented and tested on a Xilinx Virtex-5 ML505 evaluation FPGA board. The performance of the system is compared against microprocessor-based systems that have been integrated with the operating system. Results show that the AMC-based HLS multi-accelerator system achieves speedups of 10.4x and 7x compared to the MicroBlaze-based and Intel Core-based HLS multi-accelerator systems, respectively. Peer Reviewed. Postprint (author’s final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

A Survey and Evaluation of FPGA High-Level Synthesis Tools

Author: Anderson Jason
Bertels Koen
Brown Stephen
Canis Andrew
Chen Yu Ting
Choi Jongsok
Ferrandi Fabrizio
Fort Blair
Hsiao Hsuan
Nane Razvan
Pilato Christian
Sima Vlad-Mihai
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

High-level synthesis (HLS) is increasingly popular for the design of high-performance and energy-efficient heterogeneous systems, shortening time-to-market and addressing today's system complexity. HLS allows designers to work at a higher level of abstraction by using a software program to specify the hardware functionality. Additionally, HLS is particularly interesting for designing field-programmable gate array circuits, where hardware implementations can be easily refined and replaced in the target device. Recent years have seen much activity in the HLS research community, with a plethora of HLS tool offerings from both industry and academia. All these tools may have different input languages, perform different internal optimizations, and produce results of different quality, even for the very same input description. Hence, it is challenging to compare their performance and understand which is the best for the hardware to be implemented. We present a comprehensive analysis of recent HLS tools, as well as an overview of the areas of active interest in the HLS research community. We also present a first-published methodology to evaluate different HLS tools. We use our methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and the use of resources.

Archivio istituzionale della ricerca - Politecnico di Milano