HDL IMPLEMENTATION AND ANALYSIS OF A RESIDUAL REGISTER FOR A FLOATING-POINT ARITHMETIC UNIT

Author: Kaveti Akil
Publication venue: UKnowledge
Publication date: 01/01/2008

Processors used in lower-end scientific applications, such as graphics cards and video game consoles, have IEEE single-precision floating-point hardware [23]. Double precision offers higher precision at a higher implementation cost and lower performance. The need for high-precision computation in these applications is not enough to justify the use of double-precision hardware and the extra hardware complexity it requires [23]. Native-pair arithmetic offers an interesting and feasible solution to this problem. This technique, invented by T. J. Dekker, uses pairs of single-length floating-point numbers to represent higher-precision floating-point numbers [3]. Native-pair arithmetic has been proposed by Dr. William R. Dieter and Dr. Henry G. Dietz to achieve better accuracy using standard IEEE single-precision floating-point hardware [1]. Native-pair arithmetic yields better accuracy; however, it decreases performance by 11x and 17x for addition and multiplication, respectively [2]. The proposed implementation uses a residual register to store the error residual term [2]. This addition is not only cost-efficient but also results in acceptable accuracy with 10 times the performance of 64-bit hardware. This thesis demonstrates the implementation of a 32-bit floating-point unit with a residual register and estimates the hardware cost and performance.
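As an illustration of the native-pair idea described in this abstract, the following is a minimal software sketch, not the thesis's HDL implementation. It assumes single precision can be emulated in Python by rounding every intermediate result to binary32 with the `struct` module, and it uses the well-known error-free TwoSum transformation to recover the rounding error of an addition. The names `f32`, `add32`, `sub32`, `two_sum`, and `pair_add` are hypothetical helpers introduced here for illustration.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest IEEE binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add32(a: float, b: float) -> float:
    """Emulated single-precision addition."""
    return f32(a + b)


def sub32(a: float, b: float) -> float:
    """Emulated single-precision subtraction."""
    return f32(a - b)


def two_sum(a: float, b: float):
    """TwoSum: return (s, e) with s = fl(a + b) and a + b = s + e exactly.

    Note the cost: six single-precision operations for one addition.
    """
    s = add32(a, b)
    b_virtual = sub32(s, a)
    a_virtual = sub32(s, b_virtual)
    e = add32(sub32(a, a_virtual), sub32(b, b_virtual))
    return s, e


def pair_add(x, y):
    """Add two native pairs (hi, lo); a simple, non-rigorous variant."""
    s, e = two_sum(x[0], y[0])
    e = add32(e, add32(x[1], y[1]))
    hi = add32(s, e)
    lo = sub32(e, sub32(hi, s))  # renormalize so that hi carries the leading bits
    return hi, lo


if __name__ == "__main__":
    tiny = 2.0 ** -30          # exactly representable in binary32
    plain = 1.0
    pair = (1.0, 0.0)
    for _ in range(1024):
        plain = add32(plain, tiny)            # each increment is rounded away
        pair = pair_add(pair, (tiny, 0.0))    # increments accumulate in lo
    print(plain, pair)
```

In this sketch the plain single-precision accumulator never moves from 1.0, while the pair retains the accumulated increments. The operation count in `two_sum` and `pair_add` also suggests where the slowdown reported for software native-pair arithmetic comes from.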

University of Kentucky

Low-Cost Microarchitectural Support for Improved Floating-Point Accuracy

Authors: Henry G. Dietz, William R. Dieter

Some of the potentially fastest processors that could be used for scientific computing do not have efficient floating-point hardware support for precisions higher than 32 bits. This is true of the CELL processor, all current commodity Graphics Processing Units (GPUs), various Digital Signal Processing (DSP) chips, etc. Acceptably high accuracy can be obtained without extra hardware by using pairs of native floating-point numbers to represent the base result and a residual error term, but an order-of-magnitude slowdown dramatically reduces the price/performance advantage of these systems. By adding a few simple microarchitectural features, acceptable accuracy can be obtained with relatively little performance penalty. To reduce the cost of native-pair arithmetic, a residual register is used to hold information that would normally have been discarded after each floating-point computation. The residual register dramatically simplifies the code, providing both lower latency and better instruction-level parallelism. To support speculative use of faster, lower-precision arithmetic, a peak exponent monitor and an absorption counter are added to measure potential loss of accuracy.
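To make the role of the residual register concrete, here is a toy software model, offered as a sketch under stated assumptions rather than a description of the authors' microarchitecture. It assumes the register holds the exact rounding error of the most recent single-precision addition, and it computes that error with exact rational arithmetic. `ToyFPU`, its `residual` attribute, and `pair_add` are hypothetical names; the peak exponent monitor and absorption counter are not modeled.

```python
import struct
from fractions import Fraction


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest IEEE binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


class ToyFPU:
    """Hypothetical model of a single-precision adder with a residual register."""

    def __init__(self):
        self.residual = 0.0  # assumed to hold the error of the latest add

    def add(self, a: float, b: float) -> float:
        """Return fl(a + b) and latch the discarded part in `residual`.

        Operands are assumed to already be binary32 values.
        """
        exact = Fraction(a) + Fraction(b)   # exact rational sum
        result = f32(a + b)                 # rounded single-precision result
        # The rounding error of an addition is itself representable,
        # so this conversion back to float loses nothing.
        self.residual = float(exact - Fraction(result))
        return result


def pair_add(fpu: ToyFPU, x, y):
    """Native-pair addition using the register instead of extra operations."""
    s = fpu.add(x[0], y[0])
    e = fpu.residual                        # error term read, not recomputed
    e = fpu.add(e, fpu.add(x[1], y[1]))
    hi = fpu.add(s, e)
    return hi, fpu.residual                 # the latched error becomes the new lo


if __name__ == "__main__":
    fpu = ToyFPU()
    print(fpu.add(1.0, 2.0 ** -30), fpu.residual)
    print(pair_add(fpu, (1.0, 0.0), (2.0 ** -30, 0.0)))
```

Under these assumptions a pair addition needs four additions and two register reads, with no sequence of dependent subtractions to reconstruct the error, which is one way to read the abstract's claim of simpler code and lower latency.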

CiteSeerX