2 research outputs found

FUSE: Front-End User Framework for O/S Abstraction of Hardware Accelerators

Authors: Aws Ismail, Lesley Shannon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Abstract: SoCs can be implemented on a single FPGA, offering designers a unique opportunity for Embedded Systems. Instead of defining a fixed architecture early in the design process, the reconfigurable platform allows architectural redesign to meet the system’s specific needs. However, the ability to instantiate new modules in the reconfigurable hardware provides a unique set of challenges for integration, particularly to the software (SW) designer. Specifically, the Operating System (OS) cannot automatically abstract these platform changes without redesign. In this paper, we present FUSE, a framework for HW accelerator abstraction that provides: 1) transparency to the SW designer at the application level; and 2) OS support for easy HW accelerator integration. We illustrate FUSE as an API for an embedded Linux OS with POSIX threads on Xilinx’s MicroBlaze on a Virtex5. For three different applications and HW accelerators, we achieve performance speedups ranging from 6.4x to 37x.

CiteSeerX

Crossref

High-performance architectures for accelerating sparse LU computation

Author: Cunningham Kevin
Publication venue: Drexel University
Publication date: 01/06/2011

Sparse Lower-Upper (LU) Triangular Decomposition is important to many different applications, including power system analysis. High-performance sparse linear algebra software packages, executing on general-purpose processors, experience lower performance when processing power system matrices. This observation motivated previous work on the design of custom hardware, implemented in FPGA, to improve performance of sparse LU. While improved performance was obtained, significant effort was required to design and implement the hardware. This thesis investigates the combination of general-purpose architectures and a hardware accelerator, for a crucial component of sparse LU, to achieve similar performance results without the design overhead. One architecture, combining a general-purpose processor with a hardware accelerator, achieves a 1.29X speedup over software for a 26K-bus power system. The second architecture, a modification of the Data Pump Architecture, provides a 2.27X speedup over software on the 26K-bus power system. These results show that speedup for sparse LU is possible, without designing a complete custom hardware solution, using a small hardware accelerator, provided a tightly coupled architecture is available to feed data to the accelerator. M.S., Computer Engineering -- Drexel University, 201

Drexel Libraries E-Repository and Archives