
155 research outputs found

A new binary floating-point division algorithm and its software implementation on the ST231 processor

Author: Jeannerod Claude-Pierre
Knochel Hervé
Monat Christophe
Revy Guillaume
Villard Gilles
Publication venue: HAL CCSD
Publication date: 01/01/2008

This paper deals with the design and implementation of low-latency software for binary floating-point division with correct rounding to nearest. The approach we present here targets a VLIW integer processor of the ST200 family, and is based on fast and accurate programs for evaluating some particular bivariate polynomials. We start by giving approximation and evaluation error conditions that are sufficient to ensure correct rounding. Then we describe the heuristics used to generate such evaluation programs, as well as those used to automatically validate their accuracy. Finally, we propose, for the binary32 format, a complete C implementation of the resulting division algorithm. With the ST200 compiler, our approach achieves a speed-up of almost 1.8 over previous implementations.

HAL-ENS-LYON

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Proceedings of the 7th Conference on Real Numbers and Computers (RNC'7)

Author: Hanrot Guillaume
Zimmermann Paul
Publication venue: None.
Publication date: 01/01/2006

These are the proceedings of RNC'7.

INRIA a CCSD electronic archive server

HAL-Rennes 1

Verified compilation and optimization of floating-point kernels

Author: Becker Heiko
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2022

When verifying safety-critical code on the level of source code, we trust the compiler to produce machine code that preserves the behavior of the source code. Trusting a verified compiler is easy: a rigorous machine-checked proof shows that the compiler correctly translates source code into machine code. Modern verified compilers (e.g. CompCert and CakeML) have rich input languages, but only rudimentary support for floating-point arithmetic. In fact, state-of-the-art verified compilers only implement and verify an inflexible one-to-one translation from floating-point source code to machine code. This translation completely ignores that floating-point arithmetic is actually a discrete representation of the continuous real numbers. This thesis presents two extensions improving floating-point arithmetic in CakeML. First, the thesis demonstrates verified compilation of elementary functions to floating-point code in: Dandelion, an automatic verifier for polynomial approximations of elementary functions; and libmGen, a proof-producing compiler relating floating-point machine code to the implemented real-numbered elementary function. Second, the thesis demonstrates verified optimization of floating-point code in: Icing, a floating-point language extending standard floating-point arithmetic with optimizations similar to those used by unverified compilers, like GCC and LLVM; and RealCake, an extension of CakeML with Icing into the first fully verified optimizing compiler for floating-point arithmetic.

Universaar

Acronym

Solving Constraints on the Intermediate Result of Decimal Floating-Point Operations

Author: Abraham Ziv
Merav Aharoni
Ron Maharik
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

The draft revision of the IEEE Standard for Floating-Point Arithmetic (IEEE P754) includes a definition for decimal floating-point (FP) in addition to the widely used binary FP specification. The decimal standard raises new concerns with regard to the verification of hardware- and software-based designs. The verification process normally emphasizes intricate corner cases and uncommon events. The decimal format introduces several new classes of such events in addition to those characteristic of binary FP. Our work addresses the following problem: Given a decimal floating-point operation, a constraint on the intermediate result, and a constraint on the representation selected for the result, find random inputs for the operation that yield an intermediate result compatible with these specifications. The paper supplies efficient analytic solutions for addition and for some cases of multiplication and division. We provide probabilistic algorithms for the remaining cases. These algorithms prove to be efficient in the actual implementation.

CiteSeerX

Crossref

Glosarium Matematika

Author: Kerami Djati
Sitanggang Cormentyna
Publication venue: 'Pusat Bahasa IAIN Sultan Amai Gorontalo'
Publication date: 01/01/2008

273 p.; 24 cm

library.uny.ac.id

Repositori Institusi Kemendikbud

A low complexity scaling method for the Lanczos Kernel in fixed-point arithmetic

Author: Constantinides GA
Jerez JL
Kerrigan EC
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/07/2013

We consider the problem of enabling fixed-point implementation of linear algebra kernels on low-cost embedded systems, as well as motivating more efficient computational architectures for scientific applications. Fixed-point arithmetic presents additional design challenges compared to floating-point arithmetic, such as having to bound peak values of variables and control their dynamic ranges. Algorithms for solving linear equations or finding eigenvalues are typically nonlinear and iterative, making solving these design challenges a nontrivial task. For these types of algorithms, the bounding problem cannot be automated by current tools. We focus on the Lanczos iteration, the heart of well-known methods such as conjugate gradient and minimum residual. We show how one can modify the algorithm with a low-complexity scaling procedure to allow us to apply standard linear algebra to derive tight analytical bounds on all variables of the process, regardless of the properties of the original matrix. It is shown that the numerical behavior of fixed-point implementations of the modified problem can be chosen to be at least as good as a floating-point implementation, if necessary. The approach is evaluated on field-programmable gate array (FPGA) platforms, highlighting potential orders-of-magnitude performance and efficiency improvements from moving from floating-point to fixed-point computation.

Spiral - Imperial College Digital Repository

Glosarium Matematika

Author: Iswati Ellya
Kerami Djati
Publication venue: Pusat Pembinaan dan Pengembangan Bahasa
Publication date: 01/01/1993

Repositori Institusi Kemendikbud

Fast Ray Tracing Techniques

Author: Tsakok John
Publication venue: 'University of Waterloo'
Publication date: 01/01/2008

In the past, ray tracing has been used widely in offline rendering applications since it provided the ability to better capture high quality secondary effects such as reflection, refraction and shadows. Such effects are difficult to produce in a robust, high quality fashion with traditional, real-time rasterization algorithms. Motivated to bring the advantages of ray tracing to real-time applications, researchers have, within the past few years, developed better and more efficient algorithms that leverage the current generation of fast, parallel CPU hardware. This thesis provides the implementation and design details of a high performance ray tracing solution called "RTTest" for standard desktop CPUs. Background information on various algorithms and acceleration structures is discussed first, followed by an introduction to novel techniques used to better accelerate current core ray tracing techniques. Techniques such as Omni-Directional Packets, Cone Proxy Traversal and Multiple Frustum Traversal are proposed and benchmarked using standard ray tracing scenes. Also, a novel soft shadowing algorithm called Edge Width Soft Shadows is proposed, which achieves performance comparable to a single sampled hard shadow approach and is targeted at real-time applications such as games. Finally, additional information on the memory layout, rendering pipeline, shader system and code level optimizations of RTTest is also discussed.

University of Waterloo's Institutional Repository

MRRR-based Eigensolvers for Multi-core Processors and Supercomputers

Author: Petschow Matthias
Publication venue
Publication date: 01/01/2013

The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR or MR3 in short), introduced in the late 1990s, is among the fastest methods. To compute k eigenpairs of a real n-by-n tridiagonal T, MRRR only requires O(kn) arithmetic operations; in contrast, all the other practical methods require O(k^2 n) or O(n^3) operations in the worst case. This thesis centers around the performance and accuracy of MRRR. Comment: PhD thesis

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University