73,694 research outputs found

    HDL IMPLEMENTATION AND ANALYSIS OF A RESIDUAL REGISTER FOR A FLOATING-POINT ARITHMETIC UNIT

    Get PDF
    Processors used in lower-end scientific applications like graphic cards and video game consoles have IEEE single precision floating-point hardware [23]. Double precision offers higher precision at higher implementation cost and lower performance. The need for high precision computations in these applications is not enough to justify the use double precision hardware and the extra hardware complexity needed [23]. Native-pair arithmetic offers an interesting and feasible solution to this problem. This technique invented by T. J. Dekker uses single-length floating-point numbers to represent higher precision floating-point numbers [3]. Native-pair arithmetic has been proposed by Dr. William R. Dieter and Dr. Henry G. Dietz to achieve better accuracy using standard IEEE single precision floating point hardware [1]. Native-pair arithmetic results in better accuracy however it decreases the performance by 11x and 17x for addition and multiplication respectively [2]. The proposed implementation uses a residual register to store the error residual term [2]. This addition is not only cost efficient but also results in acceptable accuracy with 10 times the performance of 64-bit hardware. This thesis demonstrates the implementation of a 32-bit floating-point unit with residual register and estimates the hardware cost and performance

    Type classes for efficient exact real arithmetic in Coq

    Get PDF
    Floating point operations are fast, but require continuous effort on the part of the user in order to ensure that the results are correct. This burden can be shifted away from the user by providing a library of exact analysis in which the computer handles the error estimates. Previously, we [Krebbers/Spitters 2011] provided a fast implementation of the exact real numbers in the Coq proof assistant. Our implementation improved on an earlier implementation by O'Connor by using type classes to describe an abstract specification of the underlying dense set from which the real numbers are built. In particular, we used dyadic rationals built from Coq's machine integers to obtain a 100 times speed up of the basic operations already. This article is a substantially expanded version of [Krebbers/Spitters 2011] in which the implementation is extended in the various ways. First, we implement and verify the sine and cosine function. Secondly, we create an additional implementation of the dense set based on Coq's fast rational numbers. Thirdly, we extend the hierarchy to capture order on undecidable structures, while it was limited to decidable structures before. This hierarchy, based on type classes, allows us to share theory on the naturals, integers, rationals, dyadics, and reals in a convenient way. Finally, we obtain another dramatic speed-up by avoiding evaluation of termination proofs at runtime.Comment: arXiv admin note: text overlap with arXiv:1105.275

    Acceleration of generalized hypergeometric functions through precise remainder asymptotics

    Full text link
    We express the asymptotics of the remainders of the partial sums {s_n} of the generalized hypergeometric function q+1_F_q through an inverse power series z^n n^l \sum_k c_k/n^k, where the exponent l and the asymptotic coefficients {c_k} may be recursively computed to any desired order from the hypergeometric parameters and argument. From this we derive a new series acceleration technique that can be applied to any such function, even with complex parameters and at the branch point z=1. For moderate parameters (up to approximately ten) a C implementation at fixed precision is very effective at computing these functions; for larger parameters an implementation in higher than machine precision would be needed. Even for larger parameters, however, our C implementation is able to correctly determine whether or not it has converged; and when it converges, its estimate of its error is accurate.Comment: 36 pages, 6 figures, LaTeX2e. Fixed sign error in Eq. (2.28), added several references, added comparison to other methods, and added discussion of recursion stabilit
    corecore