30 research outputs found

    Customizing floating-point units for FPGAs: Area-performance-standard trade-offs

    The high integration density of current nanometer technologies allows the implementation of complex floating-point applications in a single FPGA. In this work the intrinsic complexity of floating-point operators is addressed for configurable devices, making design decisions that provide the most suitable trade-offs between performance and standard compliance. A set of floating-point libraries comprising adder/subtracter, multiplier, divider, square root, exponential, logarithm and power-function operators is presented. Each library has been designed taking into account the special characteristics of current FPGAs; to this end we have adapted the IEEE floating-point standard (software-oriented) to a custom FPGA-oriented format. Extensive experimental results validate the design decisions made and prove the usefulness of reducing the format complexity.
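    The abstract does not spell out the encoding of the FPGA-oriented format, but the idea of trading field widths for area can be sketched in software. The Python snippet below is only an illustration under assumed conventions (configurable exponent/fraction widths, no denormal support), not the encoding defined in the paper's libraries.

```python
import math

# Sketch of packing a value into a parameterizable floating-point word.
# The field widths and the lack of denormal support are illustrative
# assumptions, not the format defined in the paper.
def encode_custom_float(value, exp_bits=8, frac_bits=23):
    """Pack a float into sign | biased exponent | fraction fields of custom width."""
    if value == 0.0:
        return 0
    sign = 1 if value < 0 else 0
    mantissa, exponent = math.frexp(abs(value))   # abs(value) = mantissa * 2**exponent, mantissa in [0.5, 1)
    bias = (1 << (exp_bits - 1)) - 1
    biased_exp = (exponent - 1) + bias            # renormalize so the significand lies in [1, 2)
    if not 0 < biased_exp < (1 << exp_bits) - 1:
        raise ValueError("value not representable without denormals in this format")
    frac = int((mantissa * 2.0 - 1.0) * (1 << frac_bits))  # drop the implicit leading 1
    return (sign << (exp_bits + frac_bits)) | (biased_exp << frac_bits) | frac

# A narrower, area-saving format: 6-bit exponent, 17-bit fraction (24 bits total).
print(hex(encode_custom_float(3.14159, exp_bits=6, frac_bits=17)))
```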

    Preliminary investigation of advanced electrostatics in molecular dynamics on reconfigurable computers

    No full text
    Scientific computing is marked by applications with very high performance demands. As technology has improved, reconfigurable hardware has become a viable platform for application acceleration, even for floating-point-intensive scientific applications. Now, reconfigurable computers (machines combining general-purpose microprocessors, reconfigurable hardware, memory, and a high-performance interconnect) are emerging as platforms that allow complete applications to be partitioned into parts that execute in software and parts that are accelerated in hardware. In this paper, we study molecular dynamics simulation. Specifically, we study the use of the smooth particle mesh Ewald technique in a molecular dynamics simulation program that takes advantage of the hardware acceleration capabilities of a reconfigurable computer. We demonstrate a 2.7–2.9× speed-up over the corresponding software-only simulation program. Along the way, we note design issues and techniques related to the use of reconfigurable computers for scientific computing in general.
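    Because the 2.7–2.9× figure is a whole-application speedup, it is bounded by how much of each time step is actually offloaded to hardware. A minimal Amdahl-style estimate of that bound is sketched below; the offloaded fraction and kernel speedup are illustrative assumptions, not measurements from the paper.

```python
# Amdahl-style estimate of whole-application speedup when only part of an
# MD time step is accelerated. The numbers below are illustrative assumptions.
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Speedup of the full run when a fraction of the runtime is accelerated
    by kernel_speedup and the rest stays in software."""
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / kernel_speedup)

# E.g. if ~75% of a time step is offloaded and runs 10x faster in hardware,
# the whole simulation speeds up by only about 3.1x.
print(f"{overall_speedup(0.75, 10.0):.2f}x")
```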

    A library of parameterizable floating-point cores for FPGAs and their application to scientific computing

    No full text
    Advances in field-programmable gate arrays (FPGAs), which are the platform of choice for reconfigurable computing, have made it possible to use FPGAs in increasingly many areas of computing, including complex scientific applications. These applications demand high-performance, high-precision floating-point arithmetic. Until now, most research has not focused on compliance with IEEE standard 754, focusing instead on custom formats and bitwidths. In this paper, we present double-precision floating-point cores that are parameterized by their degree of pipelining and by the features of IEEE standard 754 that they implement. We then analyze the effects of supporting the standard when these cores are used in an FPGA-based accelerator for the Lennard-Jones force and potential calculations that are part of molecular dynamics (MD) simulations.
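    For context, the kernel being accelerated is the standard Lennard-Jones pair interaction. The sketch below is a plain software reference of that textbook formulation, not the authors' pipelined hardware design; the epsilon and sigma parameters are placeholders.

```python
# Reference software version of the Lennard-Jones pair potential and force,
# the kernel offloaded to double-precision FPGA cores in the paper.
# This is the generic textbook form, not the authors' hardware pipeline.
def lj_pair(dx, dy, dz, epsilon=1.0, sigma=1.0):
    """Return (potential, fx, fy, fz) for one pair separated by (dx, dy, dz)."""
    r2 = dx * dx + dy * dy + dz * dz
    inv_r2 = 1.0 / r2
    sr6 = (sigma * sigma * inv_r2) ** 3            # (sigma/r)^6
    sr12 = sr6 * sr6                               # (sigma/r)^12
    potential = 4.0 * epsilon * (sr12 - sr6)
    # Force on particle i along the separation vector:
    # F = 24*eps*(2*(sigma/r)^12 - (sigma/r)^6) / r^2 * r_vec
    fscale = 24.0 * epsilon * (2.0 * sr12 - sr6) * inv_r2
    return potential, fscale * dx, fscale * dy, fscale * dz

print(lj_pair(1.1, 0.0, 0.0))
```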

    General Terms

    No full text
    In this paper, we present techniques for energy-efficient design at the algorithm level using FPGAs. We then use these techniques to create energy-efficient designs for two signal processing kernel applications: the fast Fourier transform (FFT) and matrix multiplication. We evaluate the performance, in terms of both latency and energy efficiency, of FPGAs in performing these tasks. Using a Xilinx Virtex-II as the target FPGA, we compare the performance of our designs to those from the Xilinx library as well as to conventional algorithms run on the PowerPC core embedded in the Virtex-II Pro and on the Texas Instruments TMS320C6415. Our evaluations are done both through estimation based on energy and latency equations and through low-level simulation. For FFT, our designs dissipated on average 60% less energy than the design from the Xilinx library and 56% less than the DSP. Our designs showed a factor-of-10 improvement over the embedded processor. These results provide concrete evidence to substantiate the idea that FPGAs can outperform DSPs and embedded processors in signal processing. Further, they show that FPGAs can achieve this performance while still dissipating less energy than the other two types of devices.
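    The energy comparisons rest on the simple relation energy = average power × latency. The sketch below shows that arithmetic; the power and latency figures are placeholders for illustration, not the paper's measured Virtex-II or DSP values.

```python
# Energy = average power x latency, the estimation style behind the comparison.
# All numbers below are placeholders, not the paper's measurements.
def energy_uj(avg_power_mw, latency_us):
    """Energy per kernel invocation in microjoules (mW x us = nJ; /1000 -> uJ)."""
    return avg_power_mw * latency_us / 1000.0

custom_fpga = energy_uj(avg_power_mw=350.0, latency_us=10.0)    # hypothetical design
library_core = energy_uj(avg_power_mw=450.0, latency_us=15.0)   # hypothetical baseline
savings = 100.0 * (library_core - custom_fpga) / library_core
print(f"custom: {custom_fpga:.1f} uJ, library: {library_core:.1f} uJ, "
      f"savings: {savings:.0f}%")
```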

    Area-Efficient Arithmetic Expression Evaluation Using Deeply Pipelined Floating-Point Cores

    No full text