33 research outputs found

    RADIX-10 PARALLEL DECIMAL MULTIPLIER

    Get PDF
    This paper introduces novel architecture for Radix-10 decimal multiplier. The new generation of highperformance decimal floating-point units (DFUs) is demanding efficient implementations of parallel decimal multiplier. The parallel generation of partial products is performed using signed-digit radix-10 recoding of the multiplier and a simplified set of multiplicand multiples. The reduction of partial products is implemented in a tree structure based on a new algorithm decimal multioperand carry-save addition that uses a unconventional decimal-coded number systems. We further detail these techniques and it significantly improves the area and latency of the previous design, which include: optimized digit recoders, decimal carry-save adders (CSA’s) combining different decimal-coded operands, and carry free adders implemented by special designed bit counters

    Improved Memoryless RNS Forward Converter Based on the Periodicity of Residues

    Get PDF
    The residue number system (RNS) is suitable for DSP architectures because of its ability to perform fast carry-free arithmetic. However, this advantage is over-shadowed by the complexity involved in the conversion of numbers between binary and RNS representations. Although the reverse conversion (RNS to binary) is more complex, the forward transformation is not simple either. Most forward converters make use of look-up tables (memory). Recently, a memoryless forward converter architecture for arbitrary moduli sets was proposed by Premkumar in 2002. In this paper, we present an extension to that architecture which results in 44% less hardware for parallel conversion and achieves 43% improvement in speed for serial conversions. It makes use of the periodicity properties of residues obtained using modular exponentiation

    Multi-operand Decimal Adder Trees for FPGAs

    Get PDF
    The research and development of hardware designs for decimal arithmetic is currently going under an intense activity. For most part, the methods proposed to implement fixed and floating point operations are intended for ASIC designs. Thus, a direct mapping or adaptation of these techniques into a FPGA could be far from an optimal solution. Only a few studies have considered new methods more suitable for FPGA implementations. A basic operation that has not received enough attention in this context is multi-operand BCD addition. For example, it is of interest for low latency implementations of decimal fixed and floating point multipliers and decimal fused multiply-add units. We have explored the most representative proposals for multi-operand BCD addition and found that the resultant implementations in FPGAs are still very inefficient in terms of both area and latency when compared to their binary counterparts. In this paper we present a new method for fast and efficient implementation of multi-operand BCD addition in current FPGA devices. In particular, our proposal maps quite well into the slice structure of the Xilinx Virtex-5/Virtex-6 families and it is highly pipelineable. The synthesis results for a Virtex-6 device indicate that our implementations halve the area and latency of previous proposals, presenting area and delay figures close to those of optimal binary adder trees.La recherche sur l'implantation en matériel de l'arithmétique décimale est actuellement très active, la plupart des travaux portant sur des opérateurs pour les processeurs, en virgule fixe ou flottante. Mais les techniques développées pour un circuit intégré n'aboutissent pas forcément à une implémentation optimale dans un FPGA. Il n'y a que peu d'études ciblant explicitement les FPGA. Cet article s'intéresse dans ce contexte, à l'addition BCD multi-opérande, au cœur de multiplieurs et de multiplieurs-accumulateurs à faible latence. Nous étudions les architectures proposées pour cette opération décimale, et nous observons que, sur FPGA, leur performance (surface et latence) est très inférieure à celle des opérations binaire à précision comparable. Nous présentons donc dans cet article une nouvelle technique d'addition BCD multi-opérandes qui s'avère plus efficace que les propositions précédentes sur les FPGA actuels. Elle s'adapte particulièrement bien à la structure fine des FPGA Xilinx Virtex-5/Virtex-6, et se prête bien au pipeline. Les résultats de synthèse montrent que notre implémentation divise par deux la surface et la latence par rapport aux propositions précédentes, les ramenant à des valeurs comparables à celles des meilleurs additionneurs multi-opérandes binaires

    A New Family of High.Performance Parallel Decimal Multipliers

    Full text link

    A Novel VLSI Design On CSKA Of Binary Tree Adder With Compaq Area And High Throughput

    Get PDF
    Addition is one of the most basic operations performed in all computing units, including microprocessors and digital signal processors. It is also a basic unit utilized in various complicated algorithms of multiplication and division. Efficient implementation of an adder circuit usually revolves around reducing the cost to propagate the carry between successive bit positions. Multi-operand adders are important arithmetic design blocks especially in the addition of partial products of hardware multipliers. The multi-operand adders (MOAs) are widely used in the modern low-power and high-speed portable very-large-scale integration systems for image and signal processing applications such as digital filters, transforms, convolution neural network architecture. Hence, a new high-speed and area efficient adder architecture is proposed using pre-compute bitwise addition followed by carry prefix computation logic to perform the three-operand binary addition that consumes substantially less area, low power and drastically reduces the adder delay. Further, this project is enhanced by using Modified carry bypass adder to further reduce more density and latency constraints. Modified carry skip adder introduces simple and low complex carry skip logic to reduce parameters constraints. In this proposal work, designed binary tree adder (BTA) is analyzed to find the possibilities for area minimization. Based on the analysis, critical path of carry is taken into the new logic implementation and the corresponding design of CSKP are proposed for the BTA with AOI, OAI

    EFFICIENT VLSI IMPLEMENTATION OF REDUNDANT BINARY ENCODING FOR DECIMAL MULTIPLICATION

    Get PDF
    This paper introduces two novel architectures for parallel decimal multipliers. Our multipliers are based on a new algorithm for decimal carry–save multioperand addition that uses a novel TCSD recoding for decimal digits. It significantly improves the area and latency of the partial product reduction tree with respect to previous proposals. We also present three schemes for fast and efficient generation of partial products in parallel. The recoding of the TCSD multiplier operand into minimally redundant signed–digit radix–10, radix–4 and radix–5 representations using new recoders reduces the complexity of partial product generation. In addition, SD radix–4 and radix–5 recodings allow the reuse of a conventional parallel binary radix–4 multiplier to perform combined binary/decimal multiplications

    REDUCING LATENCY AND RECEIVING HIGH-RECITAL IN ANALOGOUS MULTIPLIERS

    Get PDF
    Partial goods are generated in parallel utilizing a signed-digit radix-10 recoding from the BCD multiplier using the digit set [-5, 5], and some positive multiplicand multiples (0X, 1X, 2X, 3X, 4X, 5X) created in XS-3. This encoding has lots of advantages. We present the formula and architecture of the BCD parallel multiplier that exploits some qualities of two different redundant BCD codes to hurry up its computation: the redundant BCD excess-3 code (XS-3), and also the overloaded BCD representation (ODDS). Additionally, new techniques are designed to reduce considerably the latency and section of previous representative high end implementations. First, it's a self-complementing code, to ensure that an adverse multiplicand multiple could be acquired just by inverting the items of the related positive one. Also, the accessible redundancy enables a easy and quick generation of multiplicand multiples inside a carry-free way. Finally, the partial products could be recoded towards the ODDS representation just by adding a continuing factor in to the partial product reduction tree. To exhibit the benefits of our architecture, we've synthesized a RTL model for 16 x 16-digit and 34 x 34-digit multiplications and performed a comparative survey from the previous most representative designs. Because the ODDS utilize a similar 4-bit binary encoding as non-redundant BCD, conventional binary VLSI circuit techniques, for example binary carry-save adders and compressor trees, could be adapted efficiently to do decimal operations

    A HIGH PERFORMANCE RADIX10 MULTIPLICATION ARCHITECTURE BASED ON REDUNDANT BCD CODES

    Get PDF
    The decimal multiplication is one of the most important decimal arithmetic operations which have a growing demand in the area of commercial, financial, and scientific computing. It has been revived in recent years due to the large amount of data in commercial applications. In this paper, we propose a parallel decimal multiplication algorithm with three components, which are a partial product generation, a partial product reduction, and a final digit-set conversion. First, a redundant number system is applied to recode not only the multiplier, but also multiples of the multiplicand in signed-digit (SD) numbers. Furthermore, we present a multi operand SD addition algorithm to reduce the partial product array. We consider the problem of multi operand parallel decimal addition with an approach that uses binary arithmetic, suggested by the adoption of binary-coded decimal (BCD) numbers. This involves corrections in order to obtain the BCD result or a binary-to-decimal (BD) conversion. The BD conversion moreover allows an easy alignment of the sums of adjacent columns. We treat the design of BCD digit adders using fast carry-free adders and the conversion problem through a known parallel scheme using elementary conversion cells. Spread sheets have been developed for adding several BCD digits and for simulating the BD conversion as a design tool. In this project Xilinx-ISE tool is used for simulation, logical verification, and further synthesizing

    Fast Multi Operand Decimal Adders using Digit Compressors with Decimal Carry Generation

    Get PDF

    EXPERIMENTAL STUDIES ON MULTI-OPERAND ADDERS

    Get PDF
    corecore