Search CORE

2,097 research outputs found

Efficient FPGA Cost-Performance Space Exploration Using Type-driven Program Transformations

Author: Nabi Syed Waqar
Urlea Cristian
Vanderbauwhede Wim
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/02/2020
Field of study

Many numerical simulation applications from the scientific, financial and machine-learning domains require large amounts of compute capacity. They can often be implemented with a streaming data-flow architecture. Field Programmable Gate Arrays (FPGA) are particularly power-efficient hardware architectures suitable for streaming data-flow applications. Although numerous programming languages and frameworks target FPGAs, expert knowledge is still required to optimise the throughput of such applications for each target FPGA device. The process of selecting which optimising transformations to apply, and where to apply them is dubbed Design Space Exploration (DSE). We contribute an elegant and efficient compiler based DSE strategy for FPGAs by merging information sourced from the compiled application's semantic structure, an accurate cost-performance model and a description of hardware resource limits for particular FPGAs. Our work leverages developments in functional programming and dependent type theory to bring performance portability to the realm of High-Level Synthesis (HLS) tools targeting FPGAs. We showcase our approach by presenting achievable speedups for three example applications. Results indicate considerable improvements in throughput of up to 58× in one example. These results are obtained by traversing a minute fraction of the total Design Space

Crossref

Enlighten

Type-driven automated program transformations and cost modelling for optimising streaming programs on FPGAs

Author: Nabi Syed Waqar
Urlea Cristian
Vanderbauwhede Wim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/04/2018
Field of study

In this paper we present a novel approach to program optimisation based on compiler-based type-driven program transformations and a fast and accurate cost/performance model for the target architecture. We target streaming programs for the problem domain of scientific computing, such as numerical weather prediction. We present our theoretical framework for type-driven program transformation, our target high-level language and intermediate representation languages and the cost model and demonstrate the effectiveness of our approach by comparison with a commercial toolchain

Crossref

Enlighten

Towards Automatic Learning of Heuristics for Mechanical Transformations of Procedural Code

Author: Carro Manuel
Mariño Julio
Tamarit Salvador
Vigueras Guillermo
Publication venue
Publication date: 09/03/2016
Field of study

The current trend in next-generation exascale systems goes towards integrating a wide range of specialized (co-)processors into traditional supercomputers. However, the integration of different specialized devices increases the degree of heterogeneity and the complexity in programming such type of systems. Due to the efficiency of heterogeneous systems in terms of Watt and FLOPS per surface unit, opening the access of heterogeneous platforms to a wider range of users is an important problem to be tackled. In order to bridge the gap between heterogeneous systems and programmers, in this paper we propose a machine learning-based approach to learn heuristics for defining transformation strategies of a program transformation system. Our approach proposes a novel combination of reinforcement learning and classification methods to efficiently tackle the problems inherent to this type of systems. Preliminary results demonstrate the suitability of the approach for easing the programmability of heterogeneous systems.Comment: Part of the Program Transformation for Programmability in Heterogeneous Architectures (PROHA) workshop, Barcelona, Spain, 12th March 2016, 9 pages, LaTe

arXiv.org e-Print Archive

Directory of Open Access Journals

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Author: Bouganis Christos-Savvas
Kouris Alexandros
Venieris Stylianos I.
Publication venue
Publication date: 19/02/2018
Field of study

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

Recommended from our members

Memory-Based High-Level Synthesis Optimizations Security Exploration on the Power Side-Channel

Author: Blackstone Jeremy
Hu Wei
Kastner Ryan
Mu Dejun
Tai Yu
Zhang Lu
Publication venue: eScholarship, University of California
Publication date: 01/10/2020
Field of study

High-level synthesis (HLS) allows hardware designers to think algorithmically and not worry about low-level, cycle-by-cycle details. This provides the ability to quickly explore the architectural design space and tradeoffs between resource utilization and performance. Unfortunately, security evaluation is not a standard part of the HLS design flow. In this article, we aim to understand the effects of memory-based HLS optimizations on power side-channel leakage. We use Xilinx Vivado HLS to develop different cryptographic cores, implement them on a Spartan-6 FPGA, and collect power traces. We evaluate the designs with respect to resource utilization, performance, and information leakage through power consumption. We have two important observations and contributions. First, the choice of resource optimization directive results in different levels of side-channel vulnerabilities. Second, the partitioning optimization directive can greatly compromise the hardware cryptographic system through power side-channel leakage due to the deployment of memory control logic. We describe an evaluation procedure for power side-channel leakage and use it to make best-effort recommendations about how to design more secure architectures in the cryptographic domain

eScholarship - University of California