Search CORE

8,206 research outputs found

Radix Conversion for IEEE754-2008 Mixed Radix Floating-Point Arithmetic

Author: Kupriianova O.
Lauter Ch.
Muller J. -M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/11/2013
Field of study

Conversion between binary and decimal floating-point representations is ubiquitous. Floating-point radix conversion means converting both the exponent and the mantissa. We develop an atomic operation for FP radix conversion with simple straight-line algorithm, suitable for hardware design. Exponent conversion is performed with a small multiplication and a lookup table. It yields the correct result without error. Mantissa conversion uses a few multiplications and a small lookup table that is shared amongst all types of conversions. The accuracy changes by adjusting the computing precision

arXiv.org e-Print Archive

HAL-ENS-LYON

Crossref

Portail HAL Nantes Université

INRIA a CCSD electronic archive server

HAL Descartes

HAL: Hyper Article en Ligne

Hal-Diderot

Evaluating critical bits in arithmetic operations due to timing violations

Author: Bahar R. Iris
Moreshet Tali
Papagiannopoulou Dimitra
Rachford Tymani
Whang Sungseob
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2017
Field of study

Various error models are being used in simulation of voltage-scaled arithmetic units to examine application-level tolerance of timing violations. The selection of an error model needs further consideration, as differences in error models drastically affect the performance of the application. Specifically, floating point arithmetic units (FPUs) have architectural characteristics that characterize its behavior. We examine the architecture of FPUs and design a new error model, which we call Critical Bit. We run selected benchmark applications with Critical Bit and other widely used error injection models to demonstrate the differences

Crossref

Boston University Institutional Repository (OpenBU)

Representing numeric data in 32 bits while preserving 64-bit precision

Author: Neal Radford M.
Publication venue
Publication date: 11/04/2015
Field of study

Data files often consist of numbers having only a few significant decimal digits, whose information content would allow storage in only 32 bits. However, we may require that arithmetic operations involving these numbers be done with 64-bit floating-point precision, which precludes simply representing the data as 32-bit floating-point values. Decimal floating point gives a compact and exact representation, but requires conversion with a slow division operation before it can be used. Here, I show that interesting subsets of 64-bit floating-point values can be compactly and exactly represented by the 32 bits consisting of the sign, exponent, and high-order part of the mantissa, with the lower-order 32 bits of the mantissa filled in by table lookup, indexed by bits from the part of the mantissa retained, and possibly from the exponent. For example, decimal data with 4 or fewer digits to the left of the decimal point and 2 or fewer digits to the right of the decimal point can be represented in this way using the lower-order 5 bits of the retained part of the mantissa as the index. Data consisting of 6 decimal digits with the decimal point in any of the 7 positions before or after one of the digits can also be represented this way, and decoded using 19 bits from the mantissa and exponent as the index. Encoding with such a scheme is a simple copy of half the 64-bit value, followed if necessary by verification that the value can be represented, by checking that it decodes correctly. Decoding requires only extraction of index bits and a table lookup. Lookup in a small table will usually reference cache; even with larger tables, decoding is still faster than conversion from decimal floating point with a division operation. I discuss how such schemes perform on recent computer systems, and how they might be used to automatically compress large arrays in interpretive languages such as R

arXiv.org e-Print Archive

CiteSeerX

Fast HUB Floating-point Adder for FPGA

Author: Gonzalez-Navarro Sonia
Hormigo-Aguilar Javier
Villalba-Moreno Julio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/10/2018
Field of study

Several previous publications have shown the area and delay reduction when implementing real number computation using HUB formats for both floating-point and fixed-point. In this paper, we present a HUB floating-point adder for FPGA which greatly improves the speed of previous proposed HUB designs for these devices. Our architecture is based on the double path technique which reduces the execution time since each path works in parallel. We also deal with the implementation of unbiased rounding in the proposed adder. Experimental results are presented showing the goodness of the new HUB adder for FPGA.TIN2016- 80920-R, JA2012 P12-TIC-1692, JA2012 P12-TIC-147

Crossref

Repositorio Institucional Universidad de Málaga

Outdeconstructing Deconstruction in John Fowles’s Mantissa

Author: Dobrogoszcz Tomasz
Publication venue: Wydawnictwo Uniwersytetu Łódzkiego
Publication date: 01/01/2003
Field of study

Zadanie pt. „Digitalizacja i udostępnienie w Cyfrowym Repozytorium Uniwersytetu Łódzkiego kolekcji czasopism naukowych wydawanych przez Uniwersytet Łódzki” nr 885/P-DUN/2014 dofinansowane zostało ze środków MNiSW w ramach działalności upowszechniającej nauk

Biblioteka Nauki - repozytorium artykuÅÃ³w

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)