646 research outputs found

Rewriting History: Repurposing Domain-Specific CGRAs

Author: Ainsworth Sam, Brauckmann Alexander, Cummins Chris, Koehler Thomas, O'Boyle Michael F. P., Woodruff Jackson
Publication date: 16/09/2023

Coarse-grained reconfigurable arrays (CGRAs) are domain-specific devices promising both the flexibility of FPGAs and the performance of ASICs. However, with restricted domains comes a danger: designing chips that cannot accelerate enough current and future software to justify the hardware cost. We introduce FlexC, the first flexible CGRA compiler, which allows CGRAs to be adapted to operations they do not natively support. FlexC uses dataflow rewriting, replacing unsupported regions of code with equivalent operations that are supported by the CGRA. We use equality saturation, a technique enabling efficient exploration of a large space of rewrite rules, to effectively search through the program-space for supported programs. We applied FlexC to over 2,000 loop kernels, compiling to four different research CGRAs and 300 generated CGRAs and demonstrate a 2.2× increase in the number of loop kernels accelerated leading to 3× speedup compared to an Arm A5 CPU on kernels that would otherwise be unsupported by the accelerator.
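The dataflow-rewriting idea in this abstract can be illustrated with a minimal sketch. It is not FlexC's implementation and it does not use equality saturation: it is a plain breadth-first search over rewritten expressions, swapped in for brevity. The supported operation set, the two rewrite rules, and all function names are hypothetical and chosen only for illustration.

```python
from collections import deque

# Expressions are nested tuples ("op", arg, ...) or leaves (names, constants).
# Hypothetical set of operations the target array supports natively.
SUPPORTED = {"add", "shl", "neg"}

def rewrites(e):
    """Yield every expression reachable from e by one rewrite at any position."""
    if not isinstance(e, tuple):
        return
    op, *args = e
    # Hypothetical rule 1: x * 2  ->  x << 1
    if op == "mul" and len(args) == 2 and args[1] == 2:
        yield ("shl", args[0], 1)
    # Hypothetical rule 2: a - b  ->  a + (-b)
    if op == "sub" and len(args) == 2:
        yield ("add", args[0], ("neg", args[1]))
    # Apply the same rules inside each argument.
    for i, a in enumerate(args):
        for r in rewrites(a):
            yield (op, *args[:i], r, *args[i + 1:])

def is_supported(e):
    """True if every operation in e belongs to SUPPORTED."""
    if not isinstance(e, tuple):
        return True
    return e[0] in SUPPORTED and all(is_supported(a) for a in e[1:])

def find_supported(e, limit=1000):
    """Breadth-first search for an equivalent expression using only supported ops."""
    seen = {e}
    queue = deque([e])
    while queue and len(seen) <= limit:
        cur = queue.popleft()
        if is_supported(cur):
            return cur
        for nxt in rewrites(cur):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # no supported form found within the search limit

kernel = ("sub", ("mul", "x", 2), "y")  # x * 2 - y uses unsupported mul and sub
result = find_supported(kernel)
```

Here `result` is an equivalent expression built only from `add`, `shl`, and `neg`. An equality-saturation engine would instead store all equivalent forms compactly in an e-graph and extract the best one, which scales to far larger rule sets than this enumeration does.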

arXiv.org e-Print Archive

Compiling Geometric Algebra Computations into Reconfigurable Hardware Accelerators

Author: Hildenbrand Dietmar, Huthmann Jens, Koch Andreas, Stock Florian
Publication venue: Dagstuhl Seminar Proceedings. 10281 - Dynamically Reconfigurable Architectures
Publication date: 01/01/2010

Geometric Algebra (GA), a generalization of quaternions and complex numbers, is a very powerful framework for intuitively expressing and manipulating the complex geometric relationships common to engineering problems. However, actual processing of GA expressions is very compute intensive, and acceleration is generally required for practical use. GPUs and FPGAs offer such acceleration, while requiring only low power per operation. In this paper, we present key components of a proof-of-concept compile flow combining symbolic and hardware optimization techniques to automatically generate hardware accelerators from the abstract GA descriptions that are suitable for high-performance embedded computing.

Dagstuhl Research Online Publication Server

Automatic synthesis of reconfigurable instruction set accelerators

Author: Kastrup B.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2001

Repository TU/e

Pure OAI Repository

The PARSE Programming Paradigm. Part I: Software Development Methodology. Part II: Software Development Support Tools

Author: Casavant T. L., Dietz Henry G., Sheu P. C.-Y., Siegel H. J.
Publication venue: Purdue University (bepress)
Publication date: 01/06/1987

The programming methodology of PARSE (parallel software environment), a software environment being developed for reconfigurable non-shared memory parallel computers, is described. This environment will consist of an integrated collection of language interfaces, automatic and semi-automatic debugging and analysis tools, and an operating system, all of which are made more flexible by the use of a knowledge-based implementation for the tools that make up PARSE. The programming paradigm lets the user choose freely among three basic approaches/abstractions for programming a parallel machine: logic-based descriptive, sequential-control procedural, and parallel-control procedural programming. All of these result in efficient parallel execution. The current work discusses the methodology underlying PARSE, whereas the companion paper, “The PARSE Programming Paradigm - II: Software Development Support Tools,” details each of the component tools.

Purdue E-Pubs

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Author: Bouganis Christos-Savvas, Kouris Alexandros, Venieris Stylianos I.
Publication date: 19/02/2018

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows. Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

FOS: A Modular FPGA Operating System for Dynamic Workloads

Author: Koch Dirk, Pham Khoa, Powell Joseph, Vaishnav Anuj
Publication venue: Association for Computing Machinery (ACM)
Publication date: 26/01/2020

With FPGAs now being deployed in the cloud and at the edge, there is a need for scalable design methods which can incorporate the heterogeneity present in the hardware and software components of FPGA systems. Moreover, these FPGA systems need to be maintainable and adaptable to changing workloads while improving accessibility for application developers. However, current FPGA systems fail to achieve modularity and support for multi-tenancy due to dependencies between system components and a lack of standardised abstraction layers. To solve this, we introduce a modular FPGA operating system, FOS, which adopts a modular FPGA development flow to allow each system component to be changed and to be agnostic to the heterogeneity of EDA tool versions, hardware and software layers. Further, to dynamically maximise utilisation transparently to users, FOS employs resource-elastic scheduling to arbitrate FPGA resources in both the time and spatial domains for any type of accelerator. Our evaluation on different FPGA boards shows that FOS can provide performance improvements in both single-tenant and multi-tenant environments while substantially reducing development time and, at the same time, improving flexibility.

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository