270 research outputs found
Implementation and performance evaluation of an extended precision floating-point arithmetic library for high-accuracy semidefinite programming
International audienceSemidefinite programming (SDP) is widely used in optimization problems with many applications, however , certain SDP instances are ill-posed and need more precision than the standard double-precision available. Moreover, these problems are large-scale and could benefit from parallelization on specialized architectures such as GPUs. In this article, we implement and evaluate the performance of a floating-point expansion-based arithmetic library (newFPLib) in the context of such numerically highly accurate SDP solvers. We plugged-in the newFPLib with the state-of-the-art SDPA solver for both CPU and GPU-tuned implementations. We compare and contrast both the numerical accuracy and performance of SDPA-GMP,-QD and-DD, which employ other multiple-precision arithmetic libraries against SDPA-newFPLib. We show that our newFPLib is a very good trade-off for accuracy and speed when solving ill-conditioned SDP problems
Accelerating 128-bit Floating-Point Matrix Multiplication on FPGAs
General Matrix Multiplication (GEMM) is a fundamental operation widely used
in scientific computations. Its performance and accuracy significantly impact
the performance and accuracy of applications that depend on it. One such
application is semidefinite programming (SDP), and it often requires binary128
or higher precision arithmetic to solve problems involving SDP stably. However,
only some processors support binary128 arithmetic, which makes SDP solvers
generally slow. In this study, we focused on accelerating GEMM with binary128
arithmetic on field-programmable gate arrays (FPGAs) to enable the flexible
design of accelerators for the desired computations. Our binary128 GEMM designs
on a recent high-performance FPGA achieved approximately 90GFlops, 147x faster
than the computation executed on a recent CPU with 20 threads for large
matrices. Using our binary128 GEMM design on the FPGA, we successfully
accelerated two numerical applications: LU decomposition and SDP problems, for
the first time.Comment: 12 pages, 8 figure
Certified Roundoff Error Bounds Using Semidefinite Programming.
Roundoff errors cannot be avoided when implementing numerical programs with finite precision. The ability to reason about rounding is especially important if one wants to explore a range of potential representations, for instance for FPGAs or custom hardware implementation. This problem becomes challenging when the program does not employ solely linear operations as non-linearities are inherent to many interesting computational problems in real-world applications. Existing solutions to reasoning are limited in the presence of nonlinear correlations between variables, leading to either imprecise bounds or high analysis time. Furthermore, while it is easy to implement a straightforward method such as interval arithmetic, sophisticated techniques are less straightforward to implement in a formal setting. Thus there is a need for methods which output certificates that can be formally validated inside a proof assistant. We present a framework to provide upper bounds on absolute roundoff errors. This framework is based on optimization techniques employing semidefinite programming and sums of squares certificates, which can be formally checked inside the Coq theorem prover. Our tool covers a wide range of nonlinear programs, including polynomials and transcendental operations as well as conditional statements. We illustrate the efficiency and precision of this tool on non-trivial programs coming from biology, optimization and space control. Our tool produces more precise error bounds for 37 percent of all programs and yields better performance in 73 percent of all programs
White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing
In numerical computations, precision of floating-point computations is a key
factor to determine the performance (speed and energy-efficiency) as well as
the reliability (accuracy and reproducibility). However, precision generally
plays a contrary role for both. Therefore, the ultimate concept for maximizing
both at the same time is the minimal-precision computing through
precision-tuning, which adjusts the optimal precision for each operation and
data. Several studies have been already conducted for it so far (e.g.
Precimoniuos and Verrou), but the scope of those studies is limited to the
precision-tuning alone. Hence, we aim to propose a broader concept of the
minimal-precision computing system with precision-tuning, involving both
hardware and software stack.
In 2019, we have started the Minimal-Precision Computing project to propose a
more broad concept of the minimal-precision computing system with
precision-tuning, involving both hardware and software stack. Specifically, our
system combines (1) a precision-tuning method based on Discrete Stochastic
Arithmetic (DSA), (2) arbitrary-precision arithmetic libraries, (3) fast and
accurate numerical libraries, and (4) Field-Programmable Gate Array (FPGA) with
High-Level Synthesis (HLS).
In this white paper, we aim to provide an overview of various technologies
related to minimal- and mixed-precision, to outline the future direction of the
project, as well as to discuss current challenges together with our project
members and guest speakers at the LSPANC 2020 workshop;
https://www.r-ccs.riken.jp/labs/lpnctrt/lspanc2020jan/
A Verified Certificate Checker for Finite-Precision Error Bounds in Coq and HOL4
Being able to soundly estimate roundoff errors of finite-precision
computations is important for many applications in embedded systems and
scientific computing. Due to the discrepancy between continuous reals and
discrete finite-precision values, automated static analysis tools are highly
valuable to estimate roundoff errors. The results, however, are only as correct
as the implementations of the static analysis tools. This paper presents a
formally verified and modular tool which fully automatically checks the
correctness of finite-precision roundoff error bounds encoded in a certificate.
We present implementations of certificate generation and checking for both Coq
and HOL4 and evaluate it on a number of examples from the literature. The
experiments use both in-logic evaluation of Coq and HOL4, and execution of
extracted code outside of the logics: we benchmark Coq extracted unverified
OCaml code and a CakeML-generated verified binary
On Sound Relative Error Bounds for Floating-Point Arithmetic
State-of-the-art static analysis tools for verifying finite-precision code
compute worst-case absolute error bounds on numerical errors. These are,
however, often not a good estimate of accuracy as they do not take into account
the magnitude of the computed values. Relative errors, which compute errors
relative to the value's magnitude, are thus preferable. While today's tools do
report relative error bounds, these are merely computed via absolute errors and
thus not necessarily tight or more informative. Furthermore, whenever the
computed value is close to zero on part of the domain, the tools do not report
any relative error estimate at all. Surprisingly, the quality of relative error
bounds computed by today's tools has not been systematically studied or
reported to date. In this paper, we investigate how state-of-the-art static
techniques for computing sound absolute error bounds can be used, extended and
combined for the computation of relative errors. Our experiments on a standard
benchmark set show that computing relative errors directly, as opposed to via
absolute errors, is often beneficial and can provide error estimates up to six
orders of magnitude tighter, i.e. more accurate. We also show that interval
subdivision, another commonly used technique to reduce over-approximations, has
less benefit when computing relative errors directly, but it can help to
alleviate the effects of the inherent issue of relative error estimates close
to zero
Automatic Estimation of Verified Floating-Point Round-Off Errors via Static Analysis
This paper introduces a static analysis technique for computing formally verified round-off error bounds of floating-point functional expressions. The technique is based on a denotational semantics that computes a symbolic estimation of floating-point round-o errors along with a proof certificate that ensures its correctness. The symbolic estimation can be evaluated on concrete inputs using rigorous enclosure methods to produce formally verified numerical error bounds. The proposed technique is implemented in the prototype research tool PRECiSA (Program Round-o Error Certifier via Static Analysis) and used in the verification of floating-point programs of interest to NASA
Verified compilation and optimization of floating-point kernels
When verifying safety-critical code on the level of source code, we trust the compiler to produce machine code that preserves the behavior of the source code. Trusting a verified compiler is easy. A rigorous machine-checked proof shows that the compiler correctly translates source code into machine code. Modern verified compilers (e.g. CompCert and CakeML) have rich input languages, but only rudimentary support for floating-point arithmetic. In fact, state-of-the-art verified compilers only implement and verify an inflexible one-to-one translation from floating-point source code to machine code. This translation completely ignores that floating-point arithmetic is actually a discrete representation of the continuous real numbers. This thesis presents two extensions improving floating-point arithmetic in CakeML. First, the thesis demonstrates verified compilation of elementary functions to floating-point code in: Dandelion, an automatic verifier for polynomial approximations of elementary functions; and libmGen, a proof-producing compiler relating floating-point machine code to the implemented real-numbered elementary function. Second, the thesis demonstrates verified optimization of floating-point code in: Icing, a floating-point language extending standard floating-point arithmetic with optimizations similar to those used by unverified compilers, like GCC and LLVM; and RealCake, an extension of CakeML with Icing into the first fully verified optimizing compiler for floating-point arithmetic.Bei der Verifizierung von sicherheitsrelevantem Quellcode vertrauen wir dem Compiler, dass er Maschinencode ausgibt, der sich wie der Quellcode verhĂ€lt. Man kann ohne weiteres einem verifizierten Compiler vertrauen. Ein rigoroser maschinen-ĂŒ}berprĂŒfter Beweis zeigt, dass der Compiler Quellcode in korrekten Maschinencode ĂŒbersetzt. Moderne verifizierte Compiler (z.B. CompCert und CakeML) haben komplizierte Eingabesprachen, aber unterstĂŒtzen Gleitkommaarithmetik nur rudimentĂ€r. De facto implementieren und verifizieren hochmoderne verifizierte Compiler fĂŒr Gleitkommaarithmetik nur eine starre eins-zu-eins Ăbersetzung von Quell- zu Maschinencode. Diese Ăbersetzung ignoriert vollstĂ€ndig, dass Gleitkommaarithmetik eigentlich eine diskrete ReprĂ€sentation der kontinuierlichen reellen Zahlen ist. Diese Dissertation prĂ€sentiert zwei Erweiterungen die Gleitkommaarithmetik in CakeML verbessern. Zuerst demonstriert die Dissertation verifizierte Ăbersetzung von elementaren Funktionen in Gleitkommacode mit: Dandelion, einem automatischen Verifizierer fĂŒr Polynomapproximierungen von elementaren Funktionen; und libmGen, einen Beweis-erzeugenden Compiler der Gleitkommacode in Relation mit der implementierten elementaren Funktion setzt. Dann demonstriert die Dissertation verifizierte Optimierung von Gleitkommacode mit: Icing, einer Gleitkommasprache die Gleitkommaarithmetik mit Optimierungen erweitert die Ă€hnlich zu denen in unverifizierten Compilern, wie GCC und LLVM, sind; und RealCake, eine Erweiterung von CakeML mit Icing als der erste vollverifizierte Compiler fĂŒr Gleitkommaarithmetik
Dandelion: Certified Approximations of Elementary Functions
Elementary function operations such as sin and exp cannot in general be computed exactly on today's digital computers, and thus have to be approximated. The standard approximations in library functions typically provide only a limited set of precisions, and are too inefficient for many applications. Polynomial approximations that are customized to a limited input domain and output accuracy can provide superior performance. In fact, the Remez algorithm computes the best possible approximation for a given polynomial degree, but has so far not been formally verified. This paper presents Dandelion, an automated certificate checker for polynomial approximations of elementary functions computed with Remez-like algorithms that is fully verified in the HOL4 theorem prover. Dandelion checks whether the difference between a polynomial approximation and its target reference elementary function remains below a given error bound for all inputs in a given constraint. By extracting a verified binary with the CakeML compiler, Dandelion can validate certificates within a reasonable time, fully automating previous manually verified approximations
- âŠ