64 research outputs found

    HDL IMPLEMENTATION AND ANALYSIS OF A RESIDUAL REGISTER FOR A FLOATING-POINT ARITHMETIC UNIT

    Get PDF
    Processors used in lower-end scientific applications like graphic cards and video game consoles have IEEE single precision floating-point hardware [23]. Double precision offers higher precision at higher implementation cost and lower performance. The need for high precision computations in these applications is not enough to justify the use double precision hardware and the extra hardware complexity needed [23]. Native-pair arithmetic offers an interesting and feasible solution to this problem. This technique invented by T. J. Dekker uses single-length floating-point numbers to represent higher precision floating-point numbers [3]. Native-pair arithmetic has been proposed by Dr. William R. Dieter and Dr. Henry G. Dietz to achieve better accuracy using standard IEEE single precision floating point hardware [1]. Native-pair arithmetic results in better accuracy however it decreases the performance by 11x and 17x for addition and multiplication respectively [2]. The proposed implementation uses a residual register to store the error residual term [2]. This addition is not only cost efficient but also results in acceptable accuracy with 10 times the performance of 64-bit hardware. This thesis demonstrates the implementation of a 32-bit floating-point unit with residual register and estimates the hardware cost and performance

    Floating-Point Matrix Product on FPGA

    Get PDF
    This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.---- Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE

    Architectural Improvements in IEEE-Compliant Floating-Point Multiplication

    Get PDF
    Multiplication has long been an important part of any computer architecture. It has usually been a common case for most computer architecture decisions to include in any microarchitecture. However, the difficulty in creating hardware for multiplication because of its inherent shifting of the radix point has been a cogent reason for the need for floating-point hardware in scientific applications. The IEEE 754 floating-point standard was originally ratified in 1985 and later amended in 2008 to make floating-point multiplication easier for users to implement applications. Although floating-point arithmetic creates a mechanism to make things easier for using multiplication, it is complicated both algorithmically and practically for hardware implementations.This dissertation discusses possible architectural improvements in IEEE-compliant floating-point multiplication for Machine Learning/Deep Learning applications. First, a combined IEEE half and single precision floating-point multipliers is proposed to reduce power dissipation for Deep Learning applications. Second, a novel rounding scheme is proposed that is simpler but comparable with the state-of-the-art rounding schemes. Third, an optimized design is proposed that can handle both denormal and normal numbers. Finally, a hybrid precision design is proposed, aiming to improve the power consumption of Machine Learning/Deep Learning applications. Proposed designs are targeted to Machine Learning/Deep Learning applications-specific processors to improve the latency and power consumption. All designs are implemented in RTL-level Verilog, verified for correctness against open-source TestFloat generated test vectors, and synthesized using an ARM 32nm CMOS library for Global Foundries (GF) cmos32soi technology for estimated power, area and delay analysis.Electrical Engineerin

    Customizing floating-point units for FPGAs: Area-performance-standard trade-offs

    Get PDF
    The high integration density of current nanometer technologies allows the implementation of complex floating-point applications in a single FPGA. In this work the intrinsic complexity of floating-point operators is addressed targeting configurable devices and making design decisions providing the most suitable performance-standard compliance trade-offs. A set of floating-point libraries composed of adder/subtracter, multiplier, divisor, square root, exponential, logarithm and power function are presented. Each library has been designed taking into account special characteristics of current FPGAs, and with this purpose we have adapted the IEEE floating-point standard (software-oriented) to a custom FPGA-oriented format. Extended experimental results validate the design decisions made and prove the usefulness of reducing the format complexit

    An FPGA Implementation of the Powering Function with Single Precision Floating-Point Arithm

    Get PDF
    n this work we present an FPGA implementation of a single-precision °oating-point arith- metic powering unit. Our powering unit is based on an indirect method that transforms xy into a chain of operations involving a logarithm, a multiplication, an exponential function and dedicated logic for the case of a negative base. This approach allows to use the full input range for the base and exponent without limiting the range of the exponent as in direct methods. A tailored hardware implementation is exploited to increase the accuracy of the unit reducing the relative errors of the operations while high performance is obtained taking advantage of the FPGA capabilities for parallel architectures. A careful design of the pipeline stages of the involved operators allows a clock cycle of 201.3 MHz on a Xilinx Virtex-4 FPG

    Design of efficient reversible floating-point arithmetic unit on field programmable gate array platform and its performance analysis

    Get PDF
    The reversible logic gates are used to improve the power dissipation in modern computer applications. The floating-point numbers with reversible features are added advantage to performing complex algorithms with high-performance computations. This manuscript implements an efficient reversible floating-point arithmetic (RFPA) unit, and its performance metrics are realized in detail. The RFP adder/subtractor (A/S), RFP multiplier, and RFP divider units are designed as a part of the RFP arithmetic unit. The RFPA unit is designed by considering basic reversible gates. The mantissa part of the RFP multiplier is created using a 24x24 Wallace tree multiplier. In contrast, the reciprocal unit of the RFP divider is designed using Newton Raphson’s method. The RFPA unit and its submodules are executed in parallel by utilizing one clock cycle individually. The RFPA unit and its submodules are synthesized separately on the Vivado IDE environment and obtained the implementation results on Artix-7 field programmable gate array (FPGA). The RFPA unit utilizes only 18.44% slice look-up tables (LUTs) by consuming the 0.891 W total power on Artix-7 FPGA. The RFPA unit sub-models are compared with existing approaches with better performance metrics and chip resource utilization improvements

    Study of the posit number system: a practical approach

    Get PDF
    The IEEE Standard for Floating-Point Arithmetic (IEEE 754) has been for decades the standard for floating-point arithmetic and is implemented in a vast majority of modern computer systems. Recently, a new number representation format called posit (Type III unum) introduced by John L. Gustafson – who claims this new format can provide higher accuracy using equal or less number of bits and simpler hardware than current standard – is proposed as an alternative to the now omnipresent IEEE 754 arithmetic. In this Bachelor dissertation, the novel posit number format, its characteristics and properties – presented in literature – are analyzed and compared with the standard for floating-point numbers (floats). Based on the literature assertions, we focus on determining whether posits would be a good “drop-in replacement” for floats. With the help of Wolfram Mathematica and Python, different environments are created to compare the performance of IEEE 754 floating-point standard with Type III unum: posits. In order to get a more practical approach, first, we propose different numerical problems to compare the accuracy of both formats, including algebraic problems and numerical methods. Then, we focus on the possible use of posits in Deep Learning problems, such as training artificial Neural Networks or preforming low-precision inference on Convolutional Neural Networks. To conclude this work, we propose a low-level design for posit arithmetic multiplier using the FloPoCo tool to generate synthesizable VHDL code

    A monte-carlo floating-point unit for self-validating arithmetic

    Full text link
    Monte-Carlo arithmetic is a form of self-validating arith-metic that accounts for the effect of rounding errors. We have implemented a floating point unit that can perform ei-ther IEEE 754 or Monte-Carlo floating point computation, allowing hardware accelerated validation of results during execution. Experiments show that our approach has a mod-est hardware overhead and allows the propagation of round-ing error to be accurately estimated

    Architecture and Design of Generic IEEE-754 Based Floating Point Adder, Subtractor and Multiplier

    Get PDF
    The Floating point numbers are being widely used in various fields because of their great dynamic range, high precision and easy operation rules. In this paper, architecture of generic floating point unit is proposed and discussed. This generic unit is compatible with all three IEEE-754 binary formats. Further based on this architecture, floating point adder, subtractor and multiplier modules are designed and functionally verified for Virtex-4 FPGA. The design is working properly and giving accurate result up to the last point. DOI: 10.17762/ijritcc2321-8169.15054
    corecore