9 research outputs found

    Adaptive FPGA NoC-based Architecture for Multispectral Image Correlation

    Full text link
    An adaptive FPGA architecture based on the NoC (Network-on-Chip) approach is used for the multispectral image correlation. This architecture must contain several distance algorithms depending on the characteristics of spectral images and the precision of the authentication. The analysis of distance algorithms is required which bases on the algorithmic complexity, result precision, execution time and the adaptability of the implementation. This paper presents the comparison of these distance computation algorithms on one spectral database. The result of a RGB algorithm implementation was discussed

    An FPGA Implementation of the Powering Function with Single Precision Floating-Point Arithm

    Get PDF
    n this work we present an FPGA implementation of a single-precision °oating-point arith- metic powering unit. Our powering unit is based on an indirect method that transforms xy into a chain of operations involving a logarithm, a multiplication, an exponential function and dedicated logic for the case of a negative base. This approach allows to use the full input range for the base and exponent without limiting the range of the exponent as in direct methods. A tailored hardware implementation is exploited to increase the accuracy of the unit reducing the relative errors of the operations while high performance is obtained taking advantage of the FPGA capabilities for parallel architectures. A careful design of the pipeline stages of the involved operators allows a clock cycle of 201.3 MHz on a Xilinx Virtex-4 FPG

    FPGA Implementation of Floating Point Reciprocator Using Binomial Expansion Method

    Get PDF
    Floating-point support has become a mandatory feature of new micro processors due to the prevalence of business, technical, and recreational applications that use these operations.  In these operations Floating-point division is generally regarded as a low frequency, high latency operation in typical floating-point applications. So due to this not much development had taken place in this field. But nowadays floating point divider has become indispensable and increasingly important in many scientific and signal processing applications.In this paper we implement floating point reciprocator for both single precision and double precision floating point numbers using FPGA. Here the implementation is based on the binomial expansion method. By comparing with previous works the modules occupy less area with a higher performance and less latency. The designs trade off either 1 unit in last-place (ulp) or 2 ulp of accuracy (for double or single precision respectively), without rounding, to obtain a better implementation

    Customizing floating-point units for FPGAs: Area-performance-standard trade-offs

    Get PDF
    The high integration density of current nanometer technologies allows the implementation of complex floating-point applications in a single FPGA. In this work the intrinsic complexity of floating-point operators is addressed targeting configurable devices and making design decisions providing the most suitable performance-standard compliance trade-offs. A set of floating-point libraries composed of adder/subtracter, multiplier, divisor, square root, exponential, logarithm and power function are presented. Each library has been designed taking into account special characteristics of current FPGAs, and with this purpose we have adapted the IEEE floating-point standard (software-oriented) to a custom FPGA-oriented format. Extended experimental results validate the design decisions made and prove the usefulness of reducing the format complexit

    A type-safe arbitrary precision arithmetic portability layer for HLS tools

    Get PDF
    International audienceRecent studies have shown that High-Level Synthesis (HLS) is an efficient way to design operators for floating-point arithmetic, or for emerging alternative formats such as posits. However, HLS tools support different supersets of different subsets of the C language-for example, support for arbitrary-sized bit vectors may be provided through vendor-specific data-type libraries such as ac_int, ap_int, or int1 to int64, while others only support the standard C integer types. This is a problem when carefully tuning an operator's internal data-path, as there is no portable HLS standard for arbitrary width integers, and vendor libraries may introduce implicit casts and extensions that can hide subtle bugs. Each vendor also offers varying support for important operator-building primitives, such as platform-optimized leading-zero count. To address such problems, this work introduces Hint (hardware integer), a header-only compatibility layer offering a consistent and comprehensive interface to signed and unsigned arbitrary-sized integers. To avoid bugs Hint is strongly typed, requiring exact matching of expression widths and types-this type-checking is performed statically using the C++ template system, and adds no overhead at synthesis time. The current implementation wraps ac_int and ap_int with no performance or resource overhead when synthesized on Xilinx or Intel FPGAs. It also offers a Boost::multiprecision backend for fast simulation. Hint is open-source and extensible, and aims to provide an optimized superset of existing library primitives. This work is evaluated with arithmetic operators useful when implementing floating-point and posit operators (shifter, leading zero counter, fused shifter+sticky) deployed using two mainstream HLS tools (Xilinx VivadoHLS, and IntelHLS). A complete posit adder operator has also been written using Hint, showing no overhead when compared to the original operator written for Xilinx FPGAs

    Optimisations arithmétiques et synthèse de haut niveau

    Get PDF
    High-level synthesis (HLS) tools offer increased productivity regarding FPGA programming.However, due to their relatively young nature, they still lack many arithmetic optimizations.This thesis proposes safe arithmetic optimizations that should always be applied.These optimizations are simple operator specializations, following the C semantic.Other require to a lift the semantic embedded in high-level input program languages, which are inherited from software programming, for an improved accuracy/cost/performance ratio.To demonstrate this claim, the sum-of-product of floating-point numbers is used as a case study. The sum is performed on a fixed-point format, which is tailored to the application, according to the context in which the operator is instantiated.In some cases, there is not enough information about the input data to tailor the fixed-point accumulator.The fall-back strategy used in this thesis is to generate an accumulator covering the entire floating-point range.This thesis explores different strategies for implementing such a large accumulator, including new ones.The use of a 2's complement representation instead of a sign+magnitude is demonstrated to save resources and to reduce the accumulation loop delay.Based on a tapered precision scheme and an exact accumulator, the posit number systems claims to be a candidate to replace the IEEE floating-point format.A throughout analysis of posit operators is performed, using the same level of hardware optimization as state-of-the-art floating-point operators.Their cost remains much higher that their floating-point counterparts in terms of resource usage and performance. Finally, this thesis presents a compatibility layer for HLS tools that allows one code to be deployed on multiple tools.This library implements a strongly typed custom size integer type along side a set of optimized custom operators.À cause de la nature relativement jeune des outils de synthèse de haut-niveau (HLS), de nombreuses optimisations arithmétiques n'y sont pas encore implémentées. Cette thèse propose des optimisations arithmétiques se servant du contexte spécifique dans lequel les opérateurs sont instanciés.Certaines optimisations sont de simples spécialisations d'opérateurs, respectant la sémantique du C.D'autres nécéssitent de s'éloigner de cette sémantique pour améliorer le compromis précision/coût/performance.Cette proposition est démontré sur des sommes de produits de nombres flottants.La somme est réalisée dans un format en virgule-fixe défini par son contexte.Quand trop peu d’informations sont disponibles pour définir ce format en virgule-fixe, une stratégie est de générer un accumulateur couvrant l'intégralité du format flottant.Cette thèse explore plusieurs implémentations d'un tel accumulateur.L'utilisation d'une représentation en complément à deux permet de réduire le chemin critique de la boucle d'accumulation, ainsi que la quantité de ressources utilisées. Un format alternatif aux nombres flottants, appelé posit, propose d'utiliser un encodage à précision variable.De plus, ce format est augmenté par un accumulateur exact.Pour évaluer précisément le coût matériel de ce format, cette thèse présente des architectures d'opérateurs posits, implémentés avec le même degré d'optimisation que celui de l'état de l'art des opérateurs flottants.Une analyse détaillée montre que le coût des opérateurs posits est malgré tout bien plus élevé que celui de leurs équivalents flottants.Enfin, cette thèse présente une couche de compatibilité entre outils de HLS, permettant de viser plusieurs outils avec un seul code. Cette bibliothèque implémente un type d'entiers de taille variable, avec de plus une sémantique strictement typée, ainsi qu'un ensemble d'opérateurs ad-hoc optimisés

    Diseño e implementación de una Unidad Aritmética de Coma Flotante (FPU) genérica y fexible

    Get PDF
    Proyecto de Graduación (Licenciatura en Ingeniería en Electrónica) Instituto Tecnológico de Costa Rica. Escuela de Ingeniería Electrónica, 2016.A methodology that measures the DSP performance of a ASP with low-power target applications is presented. Additionally, the timing, area and power synthesis results for an adder, a multiplier and a CORDIC oating point units in Artix 7 FPGA family and 0.13 m technology for single and double precision in various system frequencies, are presented. The pipelined adder achieves a maximum frequency of 350MHz, the multiplier (with a simple Karatsuba signi cand multiplication) reaches 243MHz, and lastly, the standalone CORDIC oating point operator reaches 537MHz

    Precision analysis for hardware acceleration of numerical algorithms

    No full text
    The precision used in an algorithm affects the error and performance of individual computations, the memory usage, and the potential parallelism for a fixed hardware budget. However, when migrating an algorithm onto hardware, the potential improvements that can be obtained by tuning the precision throughout an algorithm to meet a range or error specification are often overlooked; the major reason is that it is hard to choose a number system which can guarantee any such specification can be met. Instead, the problem is mitigated by opting to use IEEE standard double precision arithmetic so as to be ‘no worse’ than a software implementation. However, the flexibility in the number representation is one of the key factors that can be exploited on reconfigurable hardware such as FPGAs, and hence ignoring this potential significantly limits the performance achievable. In order to optimise the performance of hardware reliably, we require a method that can tractably calculate tight bounds for the error or range of any variable within an algorithm, but currently only a handful of methods to calculate such bounds exist, and these either sacrifice tightness or tractability, whilst simulation-based methods cannot guarantee the given error estimate. This thesis presents a new method to calculate these bounds, taking into account both input ranges and finite precision effects, which we show to be, in general, tighter in comparison to existing methods; this in turn can be used to tune the hardware to the algorithm specifications. We demonstrate the use of this software to optimise hardware for various algorithms to accelerate the solution of a system of linear equations, which forms the basis of many problems in engineering and science, and show that significant performance gains can be obtained by using this new approach in conjunction with more traditional hardware optimisations

    Advanced Components in the Variable Precision Floating-Point Library

    No full text
    corecore