Search CORE

6,643 research outputs found

Formal Verification of an Iterative Low-Power x86 Floating-Point Multiplier with Redundant Feedback

Author: Anna Slobodová
Anna Slobodová
Christoph Berg
David Hardin
David M. Russinoff
David M. Russinoff
David M. Russinoff
David M. Russinoff
David M. Russinoff
Dimitri Tan
Erik Reeber
Julien Schmaltz
Mark D Aagaard
Matt Kaufmann
Matt Kaufmann
Peter-Michael Seidel
Roope Kaivola
Publication venue: 'Open Publishing Association'
Publication date: 01/10/2011
Field of study

We present the formal verification of a low-power x86 floating-point multiplier. The multiplier operates iteratively and feeds back intermediate results in redundant representation. It supports x87 and SSE instructions in various precisions and can block the issuing of new instructions. The design has been optimized for low-power operation and has not been constrained by the formal verification effort. Additional improvements for the implementation were identified through formal verification. The formal verification of the design also incorporates the implementation of clock-gating and control logic. The core of the verification effort was based on ACL2 theorem proving. Additionally, model checking has been used to verify some properties of the floating-point scheduler that are relevant for the correct operation of the unit.Comment: In Proceedings ACL2 2011, arXiv:1110.447

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Interval Slopes as Numerical Abstract Domain for Floating-Point Variables

Author: A. Chapoutot
A. Chapoutot
A. Miné
A. Miné
A. Simon
C.H. Bischof
D. Monniaux
E. Goubault
E. Goubault
F. Logozzo
J. Férêt
J.M. Muller
K. Ghorbal
L. Chen
L. Chen
L. Chen
M. Karr
M. Martel
M. Péron
N. Higham
P. Cousot
P. Cousot
P. Granger
P. Granger
R. Clarisó
R. Krawczyk
R. Moore
S. Rump
S. Sankaranarayanan
V. Laviron
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

The design of embedded control systems is mainly done with model-based tools such as Matlab/Simulink. Numerical simulation is the central technique of development and verification of such tools. Floating-point arithmetic, that is well-known to only provide approximated results, is omnipresent in this activity. In order to validate the behaviors of numerical simulations using abstract interpretation-based static analysis, we present, theoretically and with experiments, a new partially relational abstract domain dedicated to floating-point variables. It comes from interval expansion of non-linear functions using slopes and it is able to mimic all the behaviors of the floating-point arithmetic. Hence it is adapted to prove the absence of run-time errors or to analyze the numerical precision of embedded control systems

arXiv.org e-Print Archive

CiteSeerX

Crossref

Contract-Based General-Purpose GPU Programming

Author: Kolesnichenko Alexey
Meyer Bertrand
Nanz Sebastian
Poskitt Christopher M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/10/2014
Field of study

Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to widespread adoption, however, is the difficulty of programming them and the low-level control of the hardware required to achieve good performance. This paper suggests a programming library, SafeGPU, that aims at striking a balance between programmer productivity and performance, by making GPU data-parallel operations accessible from within a classical object-oriented programming language. The solution is integrated with the design-by-contract approach, which increases confidence in functional program correctness by embedding executable program specifications into the program text. We show that our library leads to modular and maintainable code that is accessible to GPGPU non-experts, while providing performance that is comparable with hand-written CUDA code. Furthermore, runtime contract checking turns out to be feasible, as the contracts can be executed on the GPU

arXiv.org e-Print Archive

CiteSeerX

Repository for Publications and Research Data

Crossref

Institutional Knowledge at Singapore Management University

Hardware-Independent Proofs of Numerical Programs

Author: Boldo Sylvie
Nguyen Thi Minh Tuyen
Publication venue
Publication date: 13/04/2010
Field of study

On recent architectures, a numerical program may give different answers depending on the execution hardware and the compilation. Our goal is to formally prove properties about numerical programs that are true for multiple architectures and compilers. We propose an approach that states the rounding error of each floating-point computation whatever the environment. This approach is implemented in the Frama-C platform for static analysis of C code. Small case studies using this approach are entirely and automatically prove

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

NASA Technical Reports Server

HAL-Rennes 1

An efficient IEEE754 compliant floating point unit using verilog

Author: Dev Ruby
Sahu Lipsa
Publication venue
Publication date: 14/05/2012
Field of study

A floating-point unit (FPU) colloquially is a math coprocessor, which is a part of a computer system specially designed to carry out operations on floating point numbers [1]. Typical operations that are handled by FPU are addition, subtraction, multiplication and division. The aim was to build an efficient FPU that performs basic as well as transcendental functions with reduced complexity of the logic used reduced or at least comparable time bounds as those of x87 family at similar clock speed and reduced the memory requirement as far as possible. The functions performed are handling of Floating Point data, converting data to IEEE754 format, perform any one of the following arithmetic operations like addition, subtraction, multiplication, division and shift operation and transcendental operations like square Root, sine of an angle and cosine of an angle. All the above algorithms have been clocked and evaluated under Spartan 3E Synthesis environment. All the functions are built by possible efficient algorithms with several changes incorporated at our end as far as the scope permitted. Consequently all of the unit functions are unique in certain aspects and given the right environment(in terms of higher memory or say clock speed or data width better than the FPGA Spartan 3E Synthesizing environment) these functions will tend to show comparable efficiency and speed ,and if pipelined then higher throughput

ethesis@nitr

Efficient arithmetic for high speed DSP implementation on FPGAs

Author: Alexander Steven Wilson
Publication venue
Publication date: 01/01/2007
Field of study

The author was sponsored by EnTegra Ltd, a company who develop hardware and software products and services for the real time implementation of DSP and RF systems. The field programmable gate array (FPGA) is being used increasingly in the field of DSP. This is due to the fact that the parallel computing power of such devices is ideal for today’s truly demanding DSP algorithms. Algorithms such as the QR-RLS update are computationally intensive and must be carried out at extremely high speeds (MHz). This means that the DSP processor is simply not an option. ASICs can be used but the expense of developing custom logic is prohibitive. The increased use of the FPGA in DSP means that there is a significant requirement for efficient arithmetic cores that utilises the resources on such devices. This thesis presents the research and development effort that was carried out to produce fixed point division and square root cores for use in a new Electronic Design Automation (EDA) tool for EnTegra, which is targeted at FPGA implementation of DSP systems. Further to this, a new technique for predicting the accuracy of CORDIC systems computing vector magnitudes and cosines/sines is presented. This work allows the most efficient CORDIC design for a specified level of accuracy to be found quickly and easily without the need to run lengthy simulations, as was the case before. The CORDIC algorithm is a technique using mainly shifts and additions to compute many arithmetic functions and is thus ideal for FPGA implementation

Glasgow Theses Service

OpenGrey Repository

Doctor of Philosophy

Author: Sun Xiaojun
Publication venue: University of Utah
Publication date: 01/01/2017
Field of study

dissertationFormal verification of hardware designs has become an essential component of the overall system design flow. The designs are generally modeled as finite state machines, on which property and equivalence checking problems are solved for verification. Reachability analysis forms the core of these techniques. However, increasing size and complexity of the circuits causes the state explosion problem. Abstraction is the key to tackling the scalability challenges. This dissertation presents new techniques for word-level abstraction with applications in sequential design verification. By bundling together k bit-level state-variables into one word-level constraint expression, the state-space is construed as solutions (variety) to a set of polynomial constraints (ideal), modeled over the finite (Galois) field of 2^k elements. Subsequently, techniques from algebraic geometry -- notably, Groebner basis theory and technology -- are researched to perform reachability analysis and verification of sequential circuits. This approach adds a "word-level dimension" to state-space abstraction and verification to make the process more efficient. While algebraic geometry provides powerful abstraction and reasoning capabilities, the algorithms exhibit high computational complexity. In the dissertation, we show that by analyzing the constraints, it is possible to obtain more insights about the polynomial ideals, which can be exploited to overcome the complexity. Using our algorithm design and implementations, we demonstrate how to perform reachability analysis of finite-state machines purely at the word level. Using this concept, we perform scalable verification of sequential arithmetic circuits. As contemporary approaches make use of resolution proofs and unsatisfiable cores for state-space abstraction, we introduce the algebraic geometry analog of unsatisfiable cores, and present algorithms to extract and refine unsatisfiable cores of polynomial ideals. Experiments are performed to demonstrate the efficacy of our approaches

The University of Utah: J. Willard Marriott Digital Library

Characterization and Implementation of a Real-World Target Tracking Algorithm on Field Programmable Gate Arrays with Kalman Filter Test Case

Author: Hancey Benjamin D.
Publication venue: AFIT Scholar
Publication date: 01/03/2008
Field of study

A one dimensional Kalman Filter algorithm provided in Matlab is used as the basis for a Very High Speed Integrated Circuit Hardware Description Language (VHDL) model. The JAVA programming language is used to create the VHDL code that describes the Kalman filter in hardware which allows for maximum flexibility. A one-dimensional behavioral model of the Kalman Filter is described, as well as a one-dimensional and synthesizable register transfer level (RTL) model with optimizations for speed, area, and power. These optimizations are achieved by a focus on parallelization as well as careful Kalman filter sub-module algorithm selection. Newton-Raphson reciprocal is the chosen algorithm for a fundamental aspect of the Kalman filter, which allows efficient high-speed computation of reciprocals within the overall system. The Newton-Raphson method is also expanded for use in calculating square-roots in an optimized and synthesizable two-dimensional VHDL implementation of the Kalman filter. The two-dimensional Kalman filter expands on the one-dimensional implementation allowing for the tracking of targets on a real-world Cartesian coordinate system

AFTI Scholar (Air Force Institute of Technology)