638 research outputs found

    Are IEEE 754 32-bit and 64-bit Binary Floating-point Accurate Enough?

    This paper investigates the accuracy of floating-point values and attempts to reveal their actual accuracy. The methods used are assigning values, assigning the values of arithmetic expressions, and printing the results with a floating-point output format that helps reveal the accuracy. The programming tools used are Visual C# 9, Visual C++ 9, Java 5, and Visual Basic 9, all running on Intel 80x86 hardware. The results show that values of the form 1×10^-x cannot be represented accurately, and that the approximate accuracy ranges only from 7 to 16 decimal digits.
    Keywords: accuracy, binary, floating-point, IEEE 75
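
The representation claim in this abstract can be checked directly. The sketch below is an illustration in Python, not the paper's own test code (the paper used C#, C++, Java, and Visual Basic). It assumes that Python's `float` is IEEE 754 binary64, which holds on common platforms, and the helper `as_float32` is a hypothetical name introduced here to round a value to binary32.

```python
import struct
from decimal import Decimal


def as_float32(x: float) -> float:
    """Round a binary64 float to the nearest binary32 value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Decimal(float) converts the stored binary value exactly, with no rounding,
# so it exposes what the literal 0.1 really becomes in each format.
exact64 = Decimal(0.1)
exact32 = Decimal(as_float32(0.1))

print("binary64 0.1 =", exact64)
print("binary32 0.1 =", exact32)

# Neither stored value equals the decimal fraction 1/10.
print(exact64 == Decimal("0.1"))  # False
print(exact32 == Decimal("0.1"))  # False

# A familiar consequence of the representation error.
print(0.1 + 0.2 == 0.3)  # False
```

The binary32 value departs from 0.1 in its ninth significant digit and the binary64 value in its eighteenth, which is in line with the 7 to 16 digit range reported above.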

    Approximation and Compression Techniques to Enhance Performance of Graphics Processing Units

    A key challenge in modern computing systems is to access data fast enough to fully utilize the computing elements in the chip. In Graphics Processing Units (GPUs), performance is often constrained by register file size, memory bandwidth, and the capacity of the main memory. One important technique for alleviating this challenge is data compression. By reducing the amount of data that needs to be communicated or stored, memory resources crucial for performance can be utilized efficiently.

    This thesis provides a set of approximation and compression techniques for GPUs, with the goal of efficiently utilizing the computational fabric and thereby increasing performance. The thesis shows that these techniques can substantially lower the amount of information the system has to process, and are thus important tools for meeting challenges in memory utilization.

    This thesis makes contributions within three areas: controlled floating-point precision reduction, lossless and lossy memory compression, and distributed training of neural networks. In the first area, the thesis shows that through automated and controlled floating-point approximation, the register file can be utilized more efficiently. This is achieved through a framework that establishes a cross-layer connection between the application and the microarchitecture layer, and a novel register file organization capable of leveraging low-precision floating-point values and narrow integers for increased capacity and performance.

    Within the area of compression, this thesis aims to increase the effective bandwidth of GPUs by presenting a lossless and lossy memory compression algorithm that reduces the amount of transferred data. In contrast to state-of-the-art compression techniques such as Base-Delta-Immediate and Bitplane Compression, which use intra-block bases for compression, the proposed algorithm leverages multiple global base values to reach a higher compression ratio. The algorithm includes an optional approximation step for floating-point values, which offers a higher compression ratio at a given, low error rate.

    Finally, within the area of distributed training of neural networks, this thesis proposes a subgraph approximation scheme for graph data that mitigates accuracy loss in a distributed setting. The scheme allows neural network models that use graphs as inputs to converge at single-machine accuracy, while minimizing synchronization overhead between the machines.
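
To make the "intra-block base" idea concrete, here is a minimal sketch of base-plus-delta encoding for one block of integers. It is a simplified, single-base illustration of the baseline family the abstract contrasts against, not the thesis's multi-global-base algorithm and not a full Base-Delta-Immediate implementation. The function names, the choice of the first element as the base, and the 1-byte delta width are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def base_delta_compress(
    block: List[int], delta_bytes: int = 1
) -> Optional[Tuple[int, List[int]]]:
    """Encode a block as (base, deltas) if every delta fits in delta_bytes.

    The base is simply the first element of the block (an assumption).
    Returns None when the block is not compressible at this delta width.
    """
    base = block[0]
    lo = -(1 << (8 * delta_bytes - 1))      # smallest signed delta
    hi = (1 << (8 * delta_bytes - 1)) - 1   # largest signed delta
    deltas = [value - base for value in block]
    if all(lo <= d <= hi for d in deltas):
        return base, deltas
    return None


def base_delta_decompress(base: int, deltas: List[int]) -> List[int]:
    """Rebuild the original block from its base and deltas."""
    return [base + d for d in deltas]


# Eight 4-byte values that are close together, e.g. neighbouring addresses.
block = [0x10000000 + 4 * i for i in range(8)]
encoded = base_delta_compress(block)

if encoded is not None:
    base, deltas = encoded
    original_size = 4 * len(block)          # 32 bytes
    compressed_size = 4 + 1 * len(deltas)   # 4-byte base + 1-byte deltas
    print(original_size, compressed_size)
```

Blocks whose values are spread too far apart fail the range check and would be stored uncompressed, which is where a scheme with several bases can help.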

    On Matrix Multiplication and Polynomial Identity Testing

    We show that lower bounds on the border rank of matrix multiplication can be used to non-trivially derandomize polynomial identity testing for small algebraic circuits. Letting $\underline{R}(n)$ denote the border rank of $n \times n \times n$ matrix multiplication, we construct a hitting set generator with seed length $O(\sqrt{n} \cdot \underline{R}^{-1}(s))$ that hits $n$-variate circuits of multiplicative complexity $s$. If the matrix multiplication exponent $\omega$ is not 2, our generator has seed length $O(n^{1 - \varepsilon})$ and hits circuits of size $O(n^{1 + \delta})$ for sufficiently small $\varepsilon, \delta > 0$. Surprisingly, the fact that $\underline{R}(n) \ge n^2$ already yields new, non-trivial hitting set generators for circuits of sublinear multiplicative complexity.
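
For context, the sketch below shows the standard randomized approach to polynomial identity testing that derandomization results of this kind aim to replace: evaluate the polynomial at random points modulo a large prime and report "probably zero" if every evaluation vanishes. This is the textbook random-evaluation test, not the paper's hitting set generator. The function names, the choice of prime, and the trial count are assumptions made for the example.

```python
import random
from typing import Callable, Optional, Sequence

P = 2**61 - 1  # a Mersenne prime, used as the field size (an assumption)


def probably_zero(
    poly: Callable[[Sequence[int]], int],
    num_vars: int,
    trials: int = 20,
    rng: Optional[random.Random] = None,
) -> bool:
    """Randomized identity test: True if poly vanishes at every sampled point.

    A nonzero low-degree polynomial vanishes at a random point only with
    small probability, so a True answer is correct with high probability.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the example is reproducible
    for _ in range(trials):
        point = [rng.randrange(P) for _ in range(num_vars)]
        if poly(point) % P != 0:
            return False  # a nonzero evaluation is a certificate
    return True


def expands_correctly(v: Sequence[int]) -> int:
    x, y = v
    return (x + y) ** 2 - (x * x + 2 * x * y + y * y)  # identically zero


def missing_cross_term(v: Sequence[int]) -> int:
    x, y = v
    return (x + y) ** 2 - (x * x + y * y)  # equals 2xy, not zero


print(probably_zero(expands_correctly, 2))
print(probably_zero(missing_cross_term, 2))
```

A hitting set generator replaces the random points with a small, deterministically generated set of points that is guaranteed to contain a nonzero evaluation for every nonzero circuit in the class.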

    Logical relations for coherence of effect subtyping

    A coercion semantics of a programming language with subtyping is typically defined on typing derivations rather than on typing judgments. To avoid semantic ambiguity, such a semantics is expected to be coherent, i.e., independent of the typing derivation for a given typing judgment. In this article we present heterogeneous, biorthogonal, step-indexed logical relations for establishing the coherence of coercion semantics of programming languages with subtyping. To illustrate the effectiveness of the proof method, we develop a proof of coherence of a type-directed, selective CPS translation from a typed call-by-value lambda calculus with delimited continuations and control-effect subtyping. The article is accompanied by a Coq formalization that relies on a novel shallow embedding of a logic for reasoning about step-indexing

    An Introduction to Mechanized Reasoning

    Mechanized reasoning uses computers to verify proofs and to help discover new theorems. Computer scientists have applied mechanized reasoning to economic problems but -- to date -- this work has not yet been properly presented in economics journals. We introduce mechanized reasoning to economists in three ways. First, we introduce mechanized reasoning in general, describing both the techniques and their successful applications. Second, we explain how mechanized reasoning has been applied to economic problems, concentrating on the two domains that have attracted the most attention: social choice theory and auction theory. Finally, we present a detailed example of mechanized reasoning in practice by means of a proof of Vickrey's familiar theorem on second-price auctions
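
Vickrey's theorem states that in a second-price auction, bidding one's true value is a weakly dominant strategy. The sketch below checks that statement exhaustively on a small grid of integer values and bids. It is a finite brute-force check for intuition only, not a mechanized proof and not the paper's formalization. The function names, the grid, and the rule that ties go against the bidder are assumptions made for the example.

```python
from itertools import product
from typing import Iterable, Sequence


def second_price_utility(
    my_bid: int, my_value: int, other_bids: Sequence[int]
) -> int:
    """Utility of one bidder in a sealed-bid second-price auction.

    The bidder wins only with a strictly highest bid (ties are lost, an
    assumption) and then pays the highest competing bid.
    """
    highest_other = max(other_bids)
    if my_bid > highest_other:
        return my_value - highest_other
    return 0


def truthful_is_weakly_dominant(grid: Iterable[int], num_others: int = 2) -> bool:
    """Check, for every value, deviation, and rival profile on the grid,
    that bidding the true value is never worse than the deviation."""
    points = list(grid)
    for value, bid in product(points, repeat=2):
        for others in product(points, repeat=num_others):
            truthful = second_price_utility(value, value, others)
            deviating = second_price_utility(bid, value, others)
            if truthful < deviating:
                return False
    return True


print(truthful_is_weakly_dominant(range(6)))
```

A proof assistant establishes the same property for all values and bids at once, rather than for one finite grid.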