4 research outputs found

    FPGA Implementation of Floating Point Reciprocator Using Binomial Expansion Method

    Get PDF
    Floating-point support has become a mandatory feature of new micro processors due to the prevalence of business, technical, and recreational applications that use these operations.聽 In these operations Floating-point division is generally regarded as a low frequency, high latency operation in typical floating-point applications. So due to this not much development had taken place in this field. But nowadays floating point divider has become indispensable and increasingly important in many scienti铿乧 and signal processing applications.In this paper we implement floating point reciprocator for both single precision and double precision floating point numbers using FPGA. Here the implementation is based on the binomial expansion method. By comparing with previous works the modules occupy less area with a higher performance and less latency. The designs trade off either 1 unit in last-place (ulp) or 2 ulp of accuracy (for double or single precision respectively), without rounding, to obtain a better implementation

    Series Expansion based Efficient Architectures for Double Precision Floating Point Division

    Get PDF
    postprin

    Profile-directed specialisation of custom floating-point hardware

    No full text
    We present a methodology for generating floating-point arithmetic hardware designs which are, for suitable applications, much reduced in size, while still retaining performance and IEEE-754 compliance. Our system uses three key parts: a profiling tool, a set of customisable floating-point units and a selection of system integration methods. We use a profiling tool for floating-point behaviour to identify arithmetic operations where fundamental elements of IEEE-754 floating-point may be compromised, without generating erroneous results in the common case. In the uncommon case, we use simple detection logic to determine when operands lie outside the range of capabilities of the optimised hardware. Out-of-range operations are handled by a separate, fully capable, floatingpoint implementation, either on-chip or by returning calculations to a host processor. We present methods of system integration to achieve this errorcorrection. Thus the system suffers no compromise in IEEE-754 compliance, even when the synthesised hardware would generate erroneous results. In particular, we identify from input operands the shift amounts required for input operand alignment and post-operation normalisation. For operations where these are small, we synthesise hardware with reduced-size barrel-shifters. We also propose optimisations to take advantage of other profile-exposed behaviours, including removing the hardware required to swap operands in a floating-point adder or subtractor, and reducing the exponent range to fit observed values. We present profiling results for a range of applications, including a selection of computational science programs, Spec FP 95 benchmarks and the FFMPEG media processing tool, indicating which would be amenable to our method. Selected applications which demonstrate potential for optimisation are then taken through to a hardware implementation. We show up to a 45% decrease in hardware size for a floating-point datapath, with a correctable error-rate of less then 3%, even with non-profiled datasets

    Dise帽o e implementaci贸n de una Unidad Aritm茅tica de Coma Flotante (FPU) gen茅rica y fexible

    Get PDF
    Proyecto de Graduaci贸n (Licenciatura en Ingenier铆a en Electr贸nica) Instituto Tecnol贸gico de Costa Rica. Escuela de Ingenier铆a Electr贸nica, 2016.A methodology that measures the DSP performance of a ASP with low-power target applications is presented. Additionally, the timing, area and power synthesis results for an adder, a multiplier and a CORDIC oating point units in Artix 7 FPGA family and 0.13 m technology for single and double precision in various system frequencies, are presented. The pipelined adder achieves a maximum frequency of 350MHz, the multiplier (with a simple Karatsuba signi cand multiplication) reaches 243MHz, and lastly, the standalone CORDIC oating point operator reaches 537MHz
    corecore