Search CORE

638 research outputs found

Are Ieee 754 32-bit and 64-bit Binary Floating-point Accurate Enough?

Author: Hariadi M. (Mochamad)
Hutabarat B. (Bernaridho)
Purnama I. K. (I)
Purnomo M. H. (Mauridhi)
Publication venue: 'Universitas Islam Indonesia (Islamic University of Indonesia)'
Publication date: 01/04/2011

This paper describes research into the accuracy of floating-point values and an effort to reveal their real accuracy. The methods used are assignment of values, assignment of the values of arithmetic expressions, and output of the values using a floating-point format that helps reveal the accuracy. The programming tools used are Visual C# 9, Visual C++ 9, Java 5, and Visual BASIC 9. These tools run on Intel 80x86 hardware. The results show that 1×10^-x cannot be accurately represented, and that the approximate accuracy ranges only from 7 to 16 decimal digits.
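The representation claim in this abstract can be checked with a short sketch. Python is used here for brevity; it is not one of the tools the paper lists, and the helper names `to_float32` and `exact_decimal` are illustrative assumptions. The sketch assumes the platform stores Python floats as IEEE 754 binary64, which holds on common hardware.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a binary64 Python float to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def exact_decimal(x: float) -> Decimal:
    """Exact decimal expansion of the binary value actually stored in x."""
    return Decimal(x)


if __name__ == "__main__":
    # Negative powers of ten have no finite binary expansion, so the
    # stored value differs from the literal in both formats.
    for exponent in range(1, 4):
        literal = f"1e-{exponent}"
        stored64 = float(literal)
        stored32 = to_float32(stored64)
        print(literal)
        print("  binary64 stores", exact_decimal(stored64))
        print("  binary32 stores", exact_decimal(stored32))
        # Printing 17 (binary64) or 9 (binary32) significant digits is
        # enough to expose the rounding in the stored value.
        print("  binary64, 17 digits:", f"{stored64:.17g}")
        print("  binary32,  9 digits:", f"{stored32:.9g}")
```

Values such as 0.5 that are exact sums of powers of two round-trip without error, which is why the effect is specific to decimal fractions like 1×10^-x.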

Neliti

ARE IEEE 754 32-BIT AND 64-BIT BINARY FLOATING-POINT ACCURATE ENOUGH?

Author: Bernaridho Hutabarat
I Ketut Eddy Purnama
Mauridhi Hery Purnomo
Mochamad Hariadi
Publication venue: Directorate of Research and Community Engagement, Universitas Indonesia
Publication date: 22/09/2011

MAKARA Journal of Technology

Approximation and Compression Techniques to Enhance Performance of Graphics Processing Units

Author: Angerd Alexandra
Publication date: 01/01/2020

A key challenge in modern computing systems is to access data fast enough to fully utilize the computing elements in the chip. In Graphics Processing Units (GPUs), the performance is often constrained by register file size, memory bandwidth, and the capacity of the main memory. One important technique towards alleviating this challenge is data compression. By reducing the amount of data that needs to be communicated or stored, memory resources crucial for performance can be efficiently utilized.

This thesis provides a set of approximation and compression techniques for GPUs, with the goal of efficiently utilizing the computational fabric and thereby increasing performance. The thesis shows that these techniques can substantially lower the amount of information the system has to process, and are thus important tools in the process of meeting challenges in memory utilization.

This thesis makes contributions within three areas: controlled floating-point precision reduction, lossless and lossy memory compression, and distributed training of neural networks. In the first area, the thesis shows that through automated and controlled floating-point approximation, the register file can be more efficiently utilized. This is achieved through a framework which establishes a cross-layer connection between the application and the microarchitecture layer, and a novel register file organization capable of leveraging low-precision floating-point values and narrow integers for increased capacity and performance.

Within the area of compression, this thesis aims at increasing the effective bandwidth of GPUs by presenting a lossless and lossy memory compression algorithm to reduce the amount of transferred data. In contrast to state-of-the-art compression techniques such as Base-Delta-Immediate and Bitplane Compression, which use intra-block bases for compression, the proposed algorithm leverages multiple global base values to reach a higher compression ratio. The algorithm includes an optional approximation step for floating-point values which offers a higher compression ratio at a given, low, error rate.

Finally, within the area of distributed training of neural networks, this thesis proposes a subgraph approximation scheme for graph data which mitigates accuracy loss in a distributed setting. The scheme allows neural network models that use graphs as inputs to converge at single-machine accuracy, while minimizing synchronization overhead between the machines.
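To make the idea of compressing against global base values concrete, here is a toy sketch. It is not the thesis's algorithm: the 8-bit delta width, the first-fit base selection, the raw-word fallback, and every name in the code (`compress`, `decompress`, `compressed_bits`) are assumptions chosen only for illustration.

```python
from typing import List, Sequence, Tuple

DELTA_BITS = 8  # assumed width of a stored signed delta
DELTA_MIN = -(1 << (DELTA_BITS - 1))
DELTA_MAX = (1 << (DELTA_BITS - 1)) - 1
RAW = -1  # marker: word is stored uncompressed

Encoded = Tuple[int, int]  # (base index or RAW, delta or raw word)


def compress(words: Sequence[int], bases: Sequence[int]) -> List[Encoded]:
    """Encode each word as a small delta from the first global base that fits."""
    encoded: List[Encoded] = []
    for word in words:
        entry: Encoded = (RAW, word)
        for index, base in enumerate(bases):
            delta = word - base
            if DELTA_MIN <= delta <= DELTA_MAX:
                entry = (index, delta)
                break
        encoded.append(entry)
    return encoded


def decompress(encoded: Sequence[Encoded], bases: Sequence[int]) -> List[int]:
    """Invert compress(); the scheme is lossless."""
    return [value if index == RAW else bases[index] + value
            for index, value in encoded]


def compressed_bits(encoded: Sequence[Encoded], num_bases: int,
                    word_bits: int = 32) -> int:
    """Payload size in bits, counting a tag that selects a base or RAW."""
    tag_bits = num_bases.bit_length()  # enough for num_bases + 1 symbols
    return sum(tag_bits + (word_bits if index == RAW else DELTA_BITS)
               for index, _ in encoded)
```

Because the bases are shared across all blocks rather than chosen per block, words that cluster around a few values anywhere in memory can reuse the same base; in this toy version the saving depends entirely on how well the assumed bases match the data.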

Chalmers Research

On Matrix Multiplication and Polynomial Identity Testing

Author: Andrews Robert
Publication date: 01/08/2022

We show that lower bounds on the border rank of matrix multiplication can be used to non-trivially derandomize polynomial identity testing for small algebraic circuits. Letting $\underline{R}(n)$ denote the border rank of $n \times n \times n$ matrix multiplication, we construct a hitting set generator with seed length $O(\sqrt{n} \cdot \underline{R}^{-1}(s))$ that hits $n$-variate circuits of multiplicative complexity $s$. If the matrix multiplication exponent $\omega$ is not 2, our generator has seed length $O(n^{1 - \varepsilon})$ and hits circuits of size $O(n^{1 + \delta})$ for sufficiently small $\varepsilon, \delta > 0$. Surprisingly, the fact that $\underline{R}(n) \ge n^2$ already yields new, non-trivial hitting set generators for circuits of sublinear multiplicative complexity.

arXiv.org e-Print Archive

Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration

Author: Amarnath Aporva
Beaumont Jonathan
Blaauw David
Chakrabarti Chaitali
Cole Murray
Dreslinski Ronald
Feng Siying
He Xin
Kaszyk Kuba
Kim Hun-Seok
Kim Sung
May Kyle
Morton John Magnus
Mudge Trevor
O'Boyle Michael
Pal Subhankar
Park Dong-hyeon
Sun Jiawen
Xiong Yan
Yang Chi-Sheng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/09/2020

Crossref

Edinburgh Research Explorer

Logical relations for coherence of effect subtyping

Author: Biernacki Dariusz
Polesiuk Piotr
Publication date: 01/01/2018

A coercion semantics of a programming language with subtyping is typically defined on typing derivations rather than on typing judgments. To avoid semantic ambiguity, such a semantics is expected to be coherent, i.e., independent of the typing derivation for a given typing judgment. In this article we present heterogeneous, biorthogonal, step-indexed logical relations for establishing the coherence of coercion semantics of programming languages with subtyping. To illustrate the effectiveness of the proof method, we develop a proof of coherence of a type-directed, selective CPS translation from a typed call-by-value lambda calculus with delimited continuations and control-effect subtyping. The article is accompanied by a Coq formalization that relies on a novel shallow embedding of a logic for reasoning about step-indexing

arXiv.org e-Print Archive

Episciences.org

Directory of Open Access Journals

An Introduction to Mechanized Reasoning

Author: Kerber Manfred
Lange Christoph
Rowat Colin
Publication venue: 'Elsevier BV'
Publication date: 10/08/2016

Mechanized reasoning uses computers to verify proofs and to help discover new theorems. Computer scientists have applied mechanized reasoning to economic problems but -- to date -- this work has not yet been properly presented in economics journals. We introduce mechanized reasoning to economists in three ways. First, we introduce mechanized reasoning in general, describing both the techniques and their successful applications. Second, we explain how mechanized reasoning has been applied to economic problems, concentrating on the two domains that have attracted the most attention: social choice theory and auction theory. Finally, we present a detailed example of mechanized reasoning in practice by means of a proof of Vickrey's familiar theorem on second-price auctions
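Vickrey's theorem states that in a second-price auction, bidding one's true valuation is a weakly dominant strategy. The sketch below is not a proof and is unrelated to the paper's formalization; it only checks the claim exhaustively on a small integer grid. The function names and the tie-breaking rule (a tied bidder loses) are assumptions made for illustration.

```python
from typing import Sequence


def utility(bid: int, value: int, other_bids: Sequence[int]) -> int:
    """Payoff to one bidder in a sealed-bid second-price auction.

    The bidder wins only with a strictly highest bid (assumed tie rule)
    and then pays the highest competing bid.
    """
    highest_other = max(other_bids)
    if bid > highest_other:
        return value - highest_other
    return 0


def truthful_is_weakly_dominant(max_amount: int) -> bool:
    """Check, for every value, bid and rival bid up to max_amount, that
    bidding the true value is never worse than any other bid.

    Only the highest competing bid affects the payoff, so a single
    rival is enough to cover all cases.
    """
    amounts = range(max_amount + 1)
    return all(
        utility(value, value, [other]) >= utility(bid, value, [other])
        for value in amounts
        for bid in amounts
        for other in amounts
    )
```

For instance, a bidder who values the item at 5 and bids 8 against a rival bid of 7 wins but pays 7, ending with a payoff of -2, whereas the truthful bid of 5 loses and yields 0.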

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

University of Birmingham Research Portal

Fraunhofer-ePrints