27,483 research outputs found

    Resource Efficient Single Precision Floating Point Multiplier Using Karatsuba Algorithm

    Get PDF
    In floating point arithmetic operations, multiplication is the most required operation for many signal processing and scientific applications. 24-bit length mantissa multiplication is involved to obtain the floating point multiplication final result for two given single precision floating point numbers. This mantissa multiplication plays the major role in the performance evaluation in respect of occupied area and propagation delay. This paper presents the design and analysis of single precision floating point multiplication using karatsuba algorithm with vedic multiplier with the considering of modified 2x1 multiplexers and modified 4:2 compressors in order to overcome the drawbacks in the existing techniques. Further, the performance analysis of single precision floating point multiplier is analyzed in terms of area and delay using Karatsuba Algorithm with different existing techniques such as 4x1 multiplexers and 3:2 compressors and modified techniques such as 2x1 multiplexers, 4:2 compressors. From the simulation results, it is observed that single precision floating point multiplication with karatsuba algorithm using modified 4:2 compressor with XOR-MUX logic provides better performance with efficient usage of resources such as area and delay than that of existing techniques. All the blocks involved for floating point multiplication are coded with Verilog and synthesized using Xilinx ISE Simulator

    Design of a Single Precision Floating Point Divider and Multiplier with Pipelined Architecture

    Get PDF
    High speed computation is the need of today’s generation of Processors. To accomplish this major task, many functions are implemented inside the hardware of the processor rather than having software computing the same task. Majority of the operations which the processor executes are Arithmetic operations which are widely used in many applications that require heavy mathematical operations such as scientific calculations, image and signal processing. Especially in the field of signal processing, multiplication division operation is widely used in many applications. The major issue with these operations in hardware is that many iteration’s are required which results in slow operation while fast algorithms require complex computations within each cycle. The result of a Division operation results in a either in Quotient and Remainder or a Floating point number which is the major reason to make it more complex than Multiplication operation. The work described in this paper includes design and verification of a floating point divider and multiplier. The inputs of both the Multiplier and Divider and also the output are designed using the single precision IEEE Standard for floating point numbers

    A general framework for efficient FPGA implementation of matrix product

    Get PDF
    Original article can be found at: http://www.medjcn.com/ Copyright Softmotor LimitedHigh performance systems are required by the developers for fast processing of computationally intensive applications. Reconfigurable hardware devices in the form of Filed-Programmable Gate Arrays (FPGAs) have been proposed as viable system building blocks in the construction of high performance systems at an economical price. Given the importance and the use of matrix algorithms in scientific computing applications, they seem ideal candidates to harness and exploit the advantages offered by FPGAs. In this paper, a system for matrix algorithm cores generation is described. The system provides a catalog of efficient user-customizable cores, designed for FPGA implementation, ranging in three different matrix algorithm categories: (i) matrix operations, (ii) matrix transforms and (iii) matrix decomposition. The generated core can be either a general purpose or a specific application core. The methodology used in the design and implementation of two specific image processing application cores is presented. The first core is a fully pipelined matrix multiplier for colour space conversion based on distributed arithmetic principles while the second one is a parallel floating-point matrix multiplier designed for 3D affine transformations.Peer reviewe

    Optimistic Parallelization of Floating-Point Accumulation

    Get PDF
    Floating-point arithmetic is notoriously non-associative due to the limited precision representation which demands intermediate values be rounded to fit in the available precision. The resulting cyclic dependency in floating-point accumulation inhibits parallelization of the computation, including efficient use of pipelining. In practice, however, we observe that floating-point operations are "mostly" associative. This observation can be exploited to parallelize floating-point accumulation using a form of optimistic concurrency. In this scheme, we first compute an optimistic associative approximation to the sum and then relax the computation by iteratively propagating errors until the correct sum is obtained. We map this computation to a network of 16 statically-scheduled, pipelined, double-precision floating-point adders on the Virtex-4 LX160 (-12) device where each floating-point adder runs at 296 MHz and has a pipeline depth of 10. On this 16 PE design, we demonstrate an average speedup of 6Ă— with randomly generated data and 3-7Ă— with summations extracted from Conjugate Gradient benchmarks
    • …
    corecore