10,423 research outputs found
Certifying floating-point implementations using Gappa
High confidence in floating-point programs requires proving numerical
properties of final and intermediate values. One may need to guarantee that a
value stays within some range, or that the error relative to some ideal value
is well bounded. Such work may require several lines of proof for each line of
code, and will usually be broken by the smallest change to the code (e.g. for
maintenance or optimization purpose). Certifying these programs by hand is
therefore very tedious and error-prone. This article discusses the use of the
Gappa proof assistant in this context. Gappa has two main advantages over
previous approaches: Its input format is very close to the actual C code to
validate, and it automates error evaluation and propagation using interval
arithmetic. Besides, it can be used to incrementally prove complex mathematical
properties pertaining to the C code. Yet it does not require any specific
knowledge about automatic theorem proving, and thus is accessible to a wide
community. Moreover, Gappa may generate a formal proof of the results that can
be checked independently by a lower-level proof assistant like Coq, hence
providing an even higher confidence in the certification of the numerical code.
The article demonstrates the use of this tool on a real-size example, an
elementary function with correctly rounded output
Square-rich fixed point polynomial evaluation on FPGAs
Polynomial evaluation is important across a wide range of application domains, so significant work has been done on accelerating its computation. The conventional algorithm, referred to as Horner's rule, involves the least number of steps but can lead to increased latency due to serial computation. Parallel evaluation algorithms such as Estrin's method have shorter latency than Horner's rule, but achieve this at the expense of large hardware overhead. This paper presents an efficient polynomial evaluation algorithm, which reforms the evaluation process to include an increased number of squaring steps. By using a squarer design that is more efficient than general multiplication, this can result in polynomial evaluation with a 57.9% latency reduction over Horner's rule and 14.6% over Estrin's method, while consuming less area than Horner's rule, when implemented on a Xilinx Virtex 6 FPGA. When applied in fixed point function evaluation, where precision requirements limit the rounding of operands, it still achieves a 52.4% performance gain compared to Horner's rule with only a 4% area overhead in evaluating 5th degree polynomials
The Voigt and complex error function: Huml\'i\v{c}ek's rational approximation generalized
Accurate yet efficient computation of the Voigt and complex error function is
a challenge since decades in astrophysics and other areas of physics. Rational
approximations have attracted considerable attention and are used in many
codes, often in combination with other techniques. The 12-term code "cpf12" of
Huml\'i\v{c}ek (1979) achieves an accuracy of five to six significant digits
throughout the entire complex plane. Here we generalize this algorithm to a
larger (even) number of terms. The approximation has a relative accuracy
better than for almost the entire complex plane except for very small
imaginary values of the argument even without the correction term required for
the cpf12 algorithm. With 20 terms the accuracy is better than . In
addition to the accuracy assessment we discuss methods for optimization and
propose a combination of the 16-term approximation with the asymptotic
approximation of Huml\'i\v{c}ek (1982) for high efficiency.Comment: 9 pages, 5 figure
A MDE-based optimisation process for Real-Time systems
The design and implementation of Real-Time Embedded Systems is now heavily relying on Model-Driven Engineering (MDE) as a central place to define and then analyze or implement a system. MDE toolchains are taking a key role as to gather most of functional and not functional properties in a central framework, and then exploit this information. Such toolchain is based on both 1) a modeling notation, and 2) companion tools to transform or analyse models. In this paper, we present a MDE-based process for system optimisation based on an architectural description. We first define a generic evaluation pipeline, define a library of elementary transformations and then shows how to use it through Domain-Specific Language to evaluate and then transform models. We illustrate this process on an AADL case study modeling a Generic Avionics Platform
Proving Tight Bounds on Univariate Expressions with Elementary Functions in Coq
International audienceThe verification of floating-point mathematical libraries requires computing numerical bounds on approximation errors. Due to the tightness of these bounds and the peculiar structure of approximation errors, such a verification is out of the reach of generic tools such as computer algebra systems. In fact, the inherent difficulty of computing such bounds often mandates a formal proof of them. In this paper, we present a tactic for the Coq proof assistant that is designed to automatically and formally prove bounds on univariate expressions. It is based on a formalization of floating-point and interval arithmetic, associated with an on-the-fly computation of Taylor expansions. All the computations are performed inside Coq's logic, in a reflexive setting. This paper also compares our tactic with various existing tools on a large set of examples
Verified compilation and optimization of floating-point kernels
When verifying safety-critical code on the level of source code, we trust the compiler to produce machine code that preserves the behavior of the source code. Trusting a verified compiler is easy. A rigorous machine-checked proof shows that the compiler correctly translates source code into machine code. Modern verified compilers (e.g. CompCert and CakeML) have rich input languages, but only rudimentary support for floating-point arithmetic. In fact, state-of-the-art verified compilers only implement and verify an inflexible one-to-one translation from floating-point source code to machine code. This translation completely ignores that floating-point arithmetic is actually a discrete representation of the continuous real numbers. This thesis presents two extensions improving floating-point arithmetic in CakeML. First, the thesis demonstrates verified compilation of elementary functions to floating-point code in: Dandelion, an automatic verifier for polynomial approximations of elementary functions; and libmGen, a proof-producing compiler relating floating-point machine code to the implemented real-numbered elementary function. Second, the thesis demonstrates verified optimization of floating-point code in: Icing, a floating-point language extending standard floating-point arithmetic with optimizations similar to those used by unverified compilers, like GCC and LLVM; and RealCake, an extension of CakeML with Icing into the first fully verified optimizing compiler for floating-point arithmetic.Bei der Verifizierung von sicherheitsrelevantem Quellcode vertrauen wir dem Compiler, dass er Maschinencode ausgibt, der sich wie der Quellcode verhĂ€lt. Man kann ohne weiteres einem verifizierten Compiler vertrauen. Ein rigoroser maschinen-ĂŒ}berprĂŒfter Beweis zeigt, dass der Compiler Quellcode in korrekten Maschinencode ĂŒbersetzt. Moderne verifizierte Compiler (z.B. CompCert und CakeML) haben komplizierte Eingabesprachen, aber unterstĂŒtzen Gleitkommaarithmetik nur rudimentĂ€r. De facto implementieren und verifizieren hochmoderne verifizierte Compiler fĂŒr Gleitkommaarithmetik nur eine starre eins-zu-eins Ăbersetzung von Quell- zu Maschinencode. Diese Ăbersetzung ignoriert vollstĂ€ndig, dass Gleitkommaarithmetik eigentlich eine diskrete ReprĂ€sentation der kontinuierlichen reellen Zahlen ist. Diese Dissertation prĂ€sentiert zwei Erweiterungen die Gleitkommaarithmetik in CakeML verbessern. Zuerst demonstriert die Dissertation verifizierte Ăbersetzung von elementaren Funktionen in Gleitkommacode mit: Dandelion, einem automatischen Verifizierer fĂŒr Polynomapproximierungen von elementaren Funktionen; und libmGen, einen Beweis-erzeugenden Compiler der Gleitkommacode in Relation mit der implementierten elementaren Funktion setzt. Dann demonstriert die Dissertation verifizierte Optimierung von Gleitkommacode mit: Icing, einer Gleitkommasprache die Gleitkommaarithmetik mit Optimierungen erweitert die Ă€hnlich zu denen in unverifizierten Compilern, wie GCC und LLVM, sind; und RealCake, eine Erweiterung von CakeML mit Icing als der erste vollverifizierte Compiler fĂŒr Gleitkommaarithmetik
Towards the AlexNet Moment for Homomorphic Encryption: HCNN, theFirst Homomorphic CNN on Encrypted Data with GPUs
Deep Learning as a Service (DLaaS) stands as a promising solution for
cloud-based inference applications. In this setting, the cloud has a
pre-learned model whereas the user has samples on which she wants to run the
model. The biggest concern with DLaaS is user privacy if the input samples are
sensitive data. We provide here an efficient privacy-preserving system by
employing high-end technologies such as Fully Homomorphic Encryption (FHE),
Convolutional Neural Networks (CNNs) and Graphics Processing Units (GPUs). FHE,
with its widely-known feature of computing on encrypted data, empowers a wide
range of privacy-concerned applications. This comes at high cost as it requires
enormous computing power. In this paper, we show how to accelerate the
performance of running CNNs on encrypted data with GPUs. We evaluated two CNNs
to classify homomorphically the MNIST and CIFAR-10 datasets. Our solution
achieved a sufficient security level (> 80 bit) and reasonable classification
accuracy (99%) and (77.55%) for MNIST and CIFAR-10, respectively. In terms of
latency, we could classify an image in 5.16 seconds and 304.43 seconds for
MNIST and CIFAR-10, respectively. Our system can also classify a batch of
images (> 8,000) without extra overhead
Automatic generation of polynomial-based hardware architectures for function evaluation
International audienceMany applications require the evaluation of some function through polynomial approximation. This article details an architecture generator for this class of problems that improves upon the literature in two aspects. Firstly, it benefits from recent advances related to constrained-coefficient polynomial approximation. Secondly, it refines the error analysis of polynomial evaluation to reduce the size of the multipliers used. As a result, architectures for evaluating arbitrary functions with precisions up to 64 bits, making efficient use of the resources of recent FPGAs, can be obtained in seconds. An open-source implementation is provided in the FloPoCo project
- âŠ