17,940 research outputs found

    High performance computing with FPGAs

    Field-programmable gate arrays represent an army of logical units which can be organized in a highly parallel or pipelined fashion to implement an algorithm in hardware. The flexibility of this new medium creates new challenges in finding the right processing paradigm, one which takes into account the natural constraints of FPGAs: clock frequency, memory footprint and communication bandwidth. In this paper, the use of FPGAs as a multiprocessor on a chip is first compared with their use as a highly functional coprocessor, and the programming tools for hardware/software codesign are discussed. Next, a number of techniques are presented to maximize the parallelism and optimize the data locality in nested loops. These include unimodular transformations, data-locality-improving loop transformations and the use of smart buffers. Finally, the use of these techniques is demonstrated on a number of examples. The results in the paper and in the literature show that, with the proper programming tool set, FPGAs can speed up computation kernels significantly with respect to traditional processors.
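    To make the loop-transformation discussion concrete, the following sketch (plain C, not tied to any particular HLS tool; the matrix size N and tile size T are illustrative assumptions) shows how tiling a matrix-multiplication loop nest improves data locality, so that each working tile can be kept in on-chip memory or a smart buffer instead of being re-fetched from external memory.

```c
#include <stddef.h>

#define N 512          /* illustrative matrix dimension */
#define T 32           /* illustrative tile size, chosen to fit on-chip memory */

/* Naive loop nest: poor temporal locality; every iteration of the inner
 * loop streams a full column of B from external memory. */
void matmul_naive(const float A[N][N], const float B[N][N], float C[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < N; k++)
                acc += A[i][k] * B[k][j];
            C[i][j] = acc;
        }
}

/* Tiled (blocked) loop nest: each T x T tile of A, B and C is reused many
 * times while it is resident in fast local memory, reducing the external
 * bandwidth an FPGA implementation would need. */
void matmul_tiled(const float A[N][N], const float B[N][N], float C[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            C[i][j] = 0.0f;

    for (size_t ii = 0; ii < N; ii += T)
        for (size_t kk = 0; kk < N; kk += T)
            for (size_t jj = 0; jj < N; jj += T)
                for (size_t i = ii; i < ii + T; i++)
                    for (size_t k = kk; k < kk + T; k++)
                        for (size_t j = jj; j < jj + T; j++)
                            C[i][j] += A[i][k] * B[k][j];
}
```

    The same nest could subsequently be unrolled or pipelined across the tile loops; the point here is only the locality-improving reordering of iterations.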

    ParaFPGA : parallel computing with flexible hardware

    ParaFPGA 2009 is a Mini-Symposium on parallel computing with field programmable gate arrays (FPGAs), held in conjunction with the ParCo conference on parallel computing. FPGAs make it possible to map an algorithm directly onto the hardware, optimize the architecture for parallel execution, and dynamically reconfigure the system between different phases of the computation. Compared to, e.g., Cell processors, GPGPUs (general-purpose GPUs) and other high-performance devices, FPGAs are considered flexible hardware in the sense that the building blocks of one or more FPGAs can be interconnected freely to build a highly parallel system. In this Mini-Symposium the following topics are addressed: clustering FPGAs, evolvable hardware using FPGAs, and fast dynamic reconfiguration.

    Performance evaluation of FPGA implementations of high-speed addition algorithms

    Driven by the excellent properties of FPGAs and the need for high-performance and flexible computing machines, interest in FPGA-based computing machines has increased dramatically. Fixed-point adders are essential building blocks of any computing system. In this work, various high-speed addition algorithms are implemented in FPGA devices and their performance is evaluated, with the objective of finding and developing the most appropriate addition algorithms for implementation in FPGAs and laying the groundwork for evaluating and constructing FPGA-based computing machines. The results demonstrate that the performance of adders built with the FPGA's dedicated carry logic combined with some other addition algorithms can be greatly improved, especially for larger adders.
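    As a reference for one of the algorithm families typically evaluated in such studies, the sketch below is a software model (in C; the 16-bit width and the test values are illustrative assumptions) of the carry-lookahead generate/propagate recurrence that high-speed adders, and an FPGA's dedicated carry chains, evaluate in parallel in hardware.

```c
#include <stdint.h>
#include <stdio.h>
#include <assert.h>

/* Software model of a 16-bit carry-lookahead adder.  On an FPGA the carries
 * would be produced in parallel by the lookahead (or dedicated carry-chain)
 * logic; here the same generate/propagate recurrence is evaluated bit by bit
 * so the result can be checked against ordinary integer addition. */
static uint16_t cla_add16(uint16_t a, uint16_t b, int cin, int *cout)
{
    uint16_t g = a & b;       /* generate:  g_i = a_i AND b_i  */
    uint16_t p = a ^ b;       /* propagate: p_i = a_i XOR b_i  */
    int c = cin & 1;          /* c_0 = carry-in                */
    uint16_t sum = 0;

    for (int i = 0; i < 16; i++) {
        sum |= (uint16_t)((((p >> i) & 1) ^ c) << i);
        /* c_{i+1} = g_i OR (p_i AND c_i) -- the lookahead recurrence */
        c = ((g >> i) & 1) | (((p >> i) & 1) & c);
    }
    *cout = c;
    return sum;
}

int main(void)
{
    int cout;
    uint16_t s = cla_add16(0xFFFF, 0x0001, 0, &cout);
    assert(s == 0x0000 && cout == 1);   /* wraps around with carry-out set */
    printf("sum = 0x%04X, carry-out = %d\n", s, cout);
    return 0;
}
```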

    High Performance Biological Pairwise Sequence Alignment: FPGA versus GPU versus Cell BE versus GPP

    This paper explores the pros and cons of reconfigurable computing in the form of FPGAs for high-performance, efficient computing. In particular, the paper presents the results of a comparative study between three different acceleration technologies, namely Field Programmable Gate Arrays (FPGAs), Graphics Processor Units (GPUs), and IBM’s Cell Broadband Engine (Cell BE), in the design and implementation of the widely used Smith-Waterman pairwise sequence alignment algorithm, with general-purpose processors as a base reference implementation. Comparison criteria include speed, energy consumption, and purchase and development costs. The study shows that FPGAs largely outperform all other implementation platforms on the performance-per-watt criterion and perform better than all other platforms on the performance-per-dollar criterion, although by a much smaller margin. Cell BE and GPU come second and third, respectively, on both the performance-per-watt and performance-per-dollar criteria. In general, in order to outperform other technologies on the performance-per-dollar criterion (using currently available hardware and development tools), FPGAs need to achieve at least two orders of magnitude speed-up compared to general-purpose processors and one order of magnitude speed-up compared to domain-specific technologies such as GPUs.
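    For readers unfamiliar with the kernel being accelerated, the following minimal C sketch computes the Smith-Waterman local-alignment score with a linear gap penalty; the scoring parameters (match +2, mismatch -1, gap -2) and the linear rather than affine gap model are simplifying assumptions, not the exact configuration used in the study.

```c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Minimal Smith-Waterman local-alignment score with a linear gap penalty.
 * Only two rows of the dynamic-programming matrix are kept, which mirrors
 * the streaming order hardware implementations exploit. */
static int sw_score(const char *a, const char *b)
{
    const int MATCH = 2, MISMATCH = -1, GAP = -2;   /* illustrative scores */
    size_t n = strlen(a), m = strlen(b);
    int *prev = calloc(m + 1, sizeof(int));
    int *curr = calloc(m + 1, sizeof(int));
    int best = 0;

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? MATCH : MISMATCH);
            int up   = prev[j] + GAP;
            int left = curr[j - 1] + GAP;
            int h = diag;
            if (up > h)   h = up;
            if (left > h) h = left;
            if (h < 0)    h = 0;          /* local alignment: never negative */
            curr[j] = h;
            if (h > best) best = h;
        }
        int *tmp = prev; prev = curr; curr = tmp;
        memset(curr, 0, (m + 1) * sizeof(int));
    }
    free(prev);
    free(curr);
    return best;
}

int main(void)
{
    printf("score = %d\n", sw_score("ACACACTA", "AGCACACA"));
    return 0;
}
```

    The anti-dependence pattern of the recurrence is what allows FPGAs and GPUs to compute all cells on an anti-diagonal in parallel, which is the source of the speed-ups compared in the study.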

    ParaFPGA 2013: Harnessing Programs, Power and Performance in Parallel FPGA applications

    Future computing systems will require dedicated accelerators to achieve high performance. The mini-symposium ParaFPGA explores parallel computing with FPGAs as an interesting avenue to reduce the gap between the architecture and the application. Topics discussed are the power of functional and dataflow languages, the performance of high-level synthesis tools, the automatic creation of hardware multi-cores using C-slow retiming, dynamic power management to control energy consumption, real-time reconfiguration of streaming image processing filters, and memory-optimized event image segmentation.

    ParaFPGA 2011 : high performance computing with multiple FPGAs : design, methodology and applications

    ParaFPGA 2011 marks the third mini-symposium devoted to the methodology, design and implementation of parallel applications using FPGAs. The focus of the contributions is mainly on organizing parallel applications across multiple FPGAs. This includes experiences from building a supercomputer with FPGAs, automatic and dedicated balancing of different tasks on heterogeneous FPGA constellations, and designing optimal interconnects between collaborating FPGAs.

    High Performance Reconfigurable Computing for Linear Algebra: Design and Performance Analysis

    Field Programmable Gate Arrays (FPGAs) enable powerful performance acceleration for scientific computations because of their intrinsic parallelism, pipelining capability, and flexible architecture. This dissertation explores the computational power of FPGAs for an important scientific application: linear algebra. First, optimized linear algebra subroutines are presented, based on enhancements to both algorithms and hardware architectures. Compared to microprocessors, these routines achieve significant speedup. Second, computing with mixed-precision data on FPGAs is proposed for higher performance. Experimental analysis shows that mixed-precision algorithms on FPGAs can achieve the high performance of using lower-precision data while keeping higher-precision accuracy when solving systems of linear equations. Third, an execution time model is built for reconfigurable computers (RC), which plays an important role in performance analysis and optimal resource utilization of FPGAs. The accuracy and efficiency of parallel computing performance models often depend on mean-maximum computations. Despite significant prior work, there have been no adequate mathematical tools for this important calculation. This work presents an Effective Mean Maximum Approximation method which is more general, accurate, and efficient than previous methods. Together, these research results help address how to make linear algebra applications perform better on high performance reconfigurable computing architectures.
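    The mixed-precision idea can be illustrated with the classic iterative-refinement pattern: solve in low precision, then compute residuals and accumulate corrections in high precision. The C sketch below (a 3x3 diagonally dominant system solved by unpivoted Gaussian elimination, all values illustrative) is a minimal illustration of that pattern, not the dissertation's actual FPGA implementation.

```c
#include <stdio.h>
#include <math.h>

#define N 3   /* illustrative system size */

/* Solve A*x = b in single precision by Gaussian elimination without
 * pivoting.  The test matrix below is diagonally dominant, so skipping
 * pivoting is safe for this illustration. */
static void solve_single(float A[N][N], const float b[N], float x[N])
{
    float M[N][N], v[N];
    for (int i = 0; i < N; i++) {
        v[i] = b[i];
        for (int j = 0; j < N; j++) M[i][j] = A[i][j];
    }
    for (int k = 0; k < N; k++) {              /* forward elimination */
        for (int i = k + 1; i < N; i++) {
            float f = M[i][k] / M[k][k];
            for (int j = k; j < N; j++) M[i][j] -= f * M[k][j];
            v[i] -= f * v[k];
        }
    }
    for (int i = N - 1; i >= 0; i--) {         /* back substitution */
        float s = v[i];
        for (int j = i + 1; j < N; j++) s -= M[i][j] * x[j];
        x[i] = s / M[i][i];
    }
}

int main(void)
{
    /* Double-precision "truth" data; the low-precision solver only ever
     * sees a float copy of it. */
    double A[N][N] = {{4, 1, 0}, {1, 5, 2}, {0, 2, 6}};
    double b[N]    = {1, 2, 3};
    double x[N]    = {0, 0, 0};

    float Af[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) Af[i][j] = (float)A[i][j];

    for (int it = 0; it < 5; it++) {
        /* Residual r = b - A*x, accumulated in double precision. */
        double r[N], rnorm = 0.0;
        for (int i = 0; i < N; i++) {
            r[i] = b[i];
            for (int j = 0; j < N; j++) r[i] -= A[i][j] * x[j];
            rnorm += r[i] * r[i];
        }
        /* Correction d solved in cheap single precision. */
        float rf[N], df[N];
        for (int i = 0; i < N; i++) rf[i] = (float)r[i];
        solve_single(Af, rf, df);
        /* Update the solution in double precision. */
        for (int i = 0; i < N; i++) x[i] += (double)df[i];
        printf("iter %d  residual norm %.3e\n", it, sqrt(rnorm));
    }
    return 0;
}
```

    The low-precision solve does the bulk of the arithmetic and maps well onto narrow, fast FPGA datapaths, while the high-precision residual step restores the accuracy of a fully double-precision solution.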

    A Study of Reconfigurable Accelerators for Cloud Computing

    Due to the exponential increase in network traffic in data centers, thousands of servers interconnected with high-bandwidth switches are required. Field Programmable Gate Arrays (FPGAs) in the cloud ecosystem offer high performance and energy efficiency, making them active resources that are easy to program and reconfigure. This paper looks at FPGAs as reconfigurable accelerators for cloud computing and presents the main hardware accelerators that have been proposed for widely used cloud computing applications such as MapReduce, Spark, Memcached, and databases.