181 research outputs found

    Formal proof for delayed finite field arithmetic using floating point operators

    Formal proof checkers such as Coq are capable of validating proofs of correctness of algorithms for finite field arithmetic, but they require extensive training from potential users. The delayed solution of a triangular system over a finite field mixes operations on integers with operations on floating-point numbers. In this report we focus on verifying proof obligations stating that no round-off error occurred in any of the floating-point operations. We use a tool named Gappa, which can be learned in a matter of minutes, to generate proofs related to floating-point arithmetic and to hide the technicalities of formal proof checkers. We found that three facilities are missing from existing tools. The first is the ability to supply Gappa with new lemmas that cannot easily be expressed as rewriting rules. We coined the second "variable interchange", as it would be required to validate loop interchanges. The third facility handles massive loop unrolling and argument instantiation by generating traces of execution for a large number of cases. We hope that these facilities may some day be integrated into mainstream code validation.
    Comment: 8th Conference on Real Numbers and Computers, Santiago de Compostela, Spain (2008)
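    As a concrete illustration of the kind of code involved (a hypothetical C sketch, not taken from the report), the fragment below accumulates a dot product over a prime field in a double and performs the modular reduction only once at the end. Every floating-point operation is exact as long as the running sum stays below 2^53, which is exactly the sort of obligation Gappa is used to discharge.

```c
#include <stdint.h>
#include <stdio.h>

/* 2^53: the largest magnitude up to which every integer is exactly
 * representable in an IEEE-754 binary64 (double). */
#define EXACT_LIMIT 9007199254740992.0

/* Accumulate sum(a[i]*b[i]) in a double and reduce modulo p only once at
 * the end (the "delayed" reduction).  All operations are exact provided
 * n * (p-1)^2 < 2^53, a condition the caller is assumed to have checked. */
uint64_t dot_mod_p(const uint32_t *a, const uint32_t *b, int n, uint32_t p)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += (double)a[i] * (double)b[i];   /* exact while acc < 2^53 */
    return (uint64_t)acc % p;                 /* single delayed reduction */
}

int main(void)
{
    uint32_t a[3] = {11, 12, 13}, b[3] = {5, 6, 7};
    printf("%llu\n", (unsigned long long)dot_mod_p(a, b, 3, 97));
    return 0;
}
```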

    Computing Correctly Rounded Integer Powers in Floating-Point Arithmetic

    We introduce several algorithms for accurately raising a floating-point number to a positive integer power, assuming a fused multiply-add (fma) instruction is available. We aim at always obtaining correctly rounded results in round-to-nearest mode, that is, our algorithms return the floating-point number that is nearest to the exact value.
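    As background (a generic sketch, not the paper's algorithms), the fma instruction gives access to the exact rounding error of a product: with p = RN(a*b) and e = fma(a, b, -p), one has a*b = p + e exactly. Compensated evaluation of x^n builds on this error-free transformation, as in the following C illustration; obtaining a correctly rounded result additionally requires the careful error analysis developed in the paper.

```c
#include <math.h>
#include <stdio.h>

/* Error-free transformation of a product using fma: p + e == a*b exactly. */
static void two_prod_fma(double a, double b, double *p, double *e)
{
    *p = a * b;             /* rounded product */
    *e = fma(a, b, -*p);    /* exact rounding error of that product */
}

/* Compensated x^n: keep the rounded product and a running estimate of the
 * accumulated rounding error, then apply the correction once at the end. */
double pow_int_compensated(double x, unsigned n)
{
    double r = 1.0, err = 0.0;
    while (n--) {
        double p, e;
        two_prod_fma(r, x, &p, &e);
        err = err * x + e;   /* propagate previous error, add the new one */
        r = p;
    }
    return r + err;
}

int main(void)
{
    printf("%.17g\n", pow_int_compensated(1.1, 10));
    return 0;
}
```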

    Hardware-independent proofs of numerical programs

    On recent architectures, a numerical program may give different answers depending on the execution hardware and on the compilation. Our goal is to formally prove properties of numerical programs that hold for multiple architectures and compilers. We propose an approach that states the rounding error of each floating-point computation whatever the environment. This approach is implemented in the Frama-C platform for static analysis of C code. Small case studies using this approach are proved entirely and automatically.
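    As a rough illustration of what a hardware-independent specification can look like (an assumed example, not one of the paper's case studies), one can annotate C code with an ACSL contract whose postcondition bounds the distance between the computed result and the exact real-valued expression; whether such a bound is discharged automatically depends on the provers available to Frama-C.

```c
/* A minimal sketch: the ensures clause compares the double result against
 * the exact real computation, so the property does not depend on the
 * hardware running the code.  The bound 1e-15 is illustrative only. */
/*@ requires 0.0 <= x <= 1.0;
    ensures \abs(\result - (x * (1.0 - x))) <= 1e-15;
*/
double parabola(double x)
{
    return x * (1.0 - x);
}
```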

    Optimized linear, quadratic and cubic interpolators for elementary function hardware implementations

    This paper presents a method for designing linear, quadratic and cubic interpolators that compute elementary functions using truncated multipliers, squarers and cubers. Initial coefficient values are obtained using a Chebyshev series approximation. A direct search algorithm is then used to optimize the quantized coefficient values to meet a user-specified error constraint. The algorithm minimizes coefficient lengths to reduce lookup table requirements, maximizes the number of truncated columns to reduce the area, delay and power of the arithmetic units, and minimizes the maximum absolute error of the interpolator output. The method can be used to design interpolators that approximate any function to a user-specified accuracy, up to and beyond 53 bits of precision (e.g., the IEEE double-precision significand). Linear, quadratic and cubic interpolator designs that approximate reciprocal, square root, reciprocal square root and sine are presented and analyzed. Area, delay and power estimates are given for 16, 24 and 32-bit interpolators that compute the reciprocal function, targeting a 65 nm CMOS technology from IBM. Results indicate the proposed method uses smaller arithmetic units and has reduced lookup table sizes compared to previously proposed methods. The method can be used to optimize coefficients in other systems while accounting for coefficient quantization as well as truncation and rounding effects of multiple arithmetic units.
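    The overall structure of such an interpolator can be sketched as follows (a hypothetical C illustration; the segment count, coefficient values and word lengths are placeholders, not the optimized fixed-point values produced by the described search algorithm): the input is split into a table index and a residual offset, per-segment coefficients are read from a lookup table, and the polynomial is evaluated with Horner's rule.

```c
#include <stdio.h>

#define SEGMENTS 64

typedef struct { double c0, c1, c2; } coeffs_t;

static coeffs_t table[SEGMENTS];

/* Fill the table for f(x) = 1/x on [1, 2) from a truncated Taylor expansion
 * around each segment midpoint (a stand-in for the Chebyshev-derived,
 * search-optimized coefficients of the paper). */
static void build_table(void)
{
    for (int i = 0; i < SEGMENTS; i++) {
        double m = 1.0 + (i + 0.5) / SEGMENTS;      /* segment midpoint */
        table[i] = (coeffs_t){ 1.0/m, -1.0/(m*m), 1.0/(m*m*m) };
    }
}

/* Quadratic interpolator approximating 1/x for x in [1, 2). */
static double recip_approx(double x)
{
    int i = (int)((x - 1.0) * SEGMENTS);            /* table index */
    double d = x - (1.0 + (i + 0.5) / SEGMENTS);    /* offset from midpoint */
    coeffs_t c = table[i];
    return c.c0 + d * (c.c1 + d * c.c2);            /* Horner evaluation */
}

int main(void)
{
    build_table();
    double x = 1.37;
    printf("approx=%.10f exact=%.10f\n", recip_approx(x), 1.0 / x);
    return 0;
}
```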

    Floating-point arithmetic in the Coq system

    The process of proving some mathematical theorems can be greatly reduced by relying on numerically-intensive computations with a certified arithmetic. This article presents a formalization of floating-point arithmetic that makes it possible to efficiently compute inside the proofs of the Coq system. This certified library is a multi-radix and multi-precision implementation free from underflow and overflow. It provides the basic arithmetic operators and a few elementary functions.

    Proving Tight Bounds on Univariate Expressions with Elementary Functions in Coq

    The verification of floating-point mathematical libraries requires computing numerical bounds on approximation errors. Due to the tightness of these bounds and the peculiar structure of approximation errors, such a verification is out of the reach of generic tools such as computer algebra systems. In fact, the inherent difficulty of computing such bounds often mandates a formal proof of them. In this paper, we present a tactic for the Coq proof assistant that is designed to automatically and formally prove bounds on univariate expressions. It is based on a formalization of floating-point and interval arithmetic, associated with an on-the-fly computation of Taylor expansions. All the computations are performed inside Coq's logic, in a reflexive setting. This paper also compares our tactic with various existing tools on a large set of examples.
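    The interval-arithmetic kernel underlying such bound computations can be illustrated in plain C with directed rounding (a generic sketch, not the Coq formalization or the tactic itself): each operation returns an enclosure guaranteed to contain the exact result, so composing operations yields a verified bound on the whole expression.

```c
#include <fenv.h>
#include <stdio.h>

#pragma STDC FENV_ACCESS ON

typedef struct { double lo, hi; } interval;

/* Round the lower bound down and the upper bound up, so the exact sum is
 * always contained in the returned interval. */
static interval iv_add(interval a, interval b)
{
    interval r;
    fesetround(FE_DOWNWARD); r.lo = a.lo + b.lo;
    fesetround(FE_UPWARD);   r.hi = a.hi + b.hi;
    fesetround(FE_TONEAREST);
    return r;
}

static interval iv_mul_pos(interval a, interval b)   /* assumes a, b >= 0 */
{
    interval r;
    fesetround(FE_DOWNWARD); r.lo = a.lo * b.lo;
    fesetround(FE_UPWARD);   r.hi = a.hi * b.hi;
    fesetround(FE_TONEAREST);
    return r;
}

int main(void)
{
    /* Enclose x*x + x for x in [1, 1 + 2^-20]. */
    interval x = {1.0, 1.0 + 0x1p-20};
    interval y = iv_add(iv_mul_pos(x, x), x);
    printf("[%.17g, %.17g]\n", y.lo, y.hi);
    return 0;
}
```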

    An FPGA implementation of an investigative many-core processor, Fynbos: in support of a Fortran autoparallelising software pipeline

    In light of the power, memory, ILP, and utilisation walls facing the computing industry, this work examines the hypothetical many-core approach to finding greater compute performance and efficiency. In order to achieve greater efficiency in an environment in which Moore's law continues but TDP has been capped, a means of deriving performance from dark and dim silicon is needed. The many-core hypothesis is one approach to exploiting these available transistors efficiently. As understood in this work, it involves trading hardware control complexity for hundreds to thousands of parallel simple processing elements, operating at a clock speed sufficiently low to allow the efficiency gains of near-threshold-voltage operation. Performance is therefore dependent on exploiting a new degree of fine-grained parallelism, such as is currently only found in GPGPUs, but in a manner that is not as restrictive in its range of application domains. While removing the complex control hardware of traditional CPUs provides space for more arithmetic hardware, a basic level of control is still required. For a number of reasons, this work chooses to replace this control largely with static scheduling, pushing the burden of control primarily onto the software, and specifically the compiler, rather than onto the programmer or an application-specific means of control simplification. An existing legacy tool chain capable of autoparallelising sequential Fortran code to the degree of parallelism necessary for many-core exists, and this work implements a many-core architecture to match it. By prototyping the design on an FPGA, it is possible to examine the real-world performance of the compiler-architecture system to a greater degree than simulation alone would allow. Comparing theoretical peak performance and real performance in a case-study application, the system is found to be more efficient than any other reviewed, but also to significantly underperform relative to current competing architectures. This failing is attributed to taking the need for simple hardware too far, and to an inability to implement tactics that mitigate the limitations of static scheduling, owing to lack of support for such in the compiler.

    Eighth Biennial Report: April 2005 – March 2007
