Search CORE

533 research outputs found

High-speed radix-10 multiplication using partial shifter adder tree-based convertor

Author: Malviya Utsav Kumar
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/04/2021
Field of study

A radix-10 multiplication is the foremost frequent operations employed by several monetary business and user-oriented applications, decimal multiplier using in state of art digital systems are significantly good but can be upgraded with time delay and area optimization. This work is proposed a more area and time delay optimized new design of overloaded decimal digit set (ODDS) architecture-based radix-10 multiplier for signed numbers. Binary coded decimal (BCD) to binary followed by binary multiplication and finally binary to BCD conversion are 3 major modules employed in radix-10 multiplication. This paperwork presents a replacement technique for binary coded decimal (BCD) to binary and vice-versa convertors in radix-10 multiplication. A novel addition tree structure called as partial shifter adder (PSA) tree-based approach has been developed for BCD to binary conversion, and it is used to add partially generated products. To meet our major concern i.e. speed, we need particular high-speed multiplication, hence the proposed PSA based radix-10 multiplier is using vertical cross binary multiplication and concurrent shifter-based addition method. The design has been tested on 45nm technology-based Zynq-7 field programmable gate array (FPGA) devices with a 6-input lookup table (LUTs). A combinational implementation maps quite well into the slice structure of the Xilinx Zynq-7 families field programmable gate array. The synthesis results for a Zynq-7 device indicate that our design outperforms in terms of the area and time delay

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

RADIX-10 PARALLEL DECIMAL MULTIPLIER

Author: INGLE MRUNALINI E.
PANSE TEJASWINI
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 30/07/2020
Field of study

This paper introduces novel architecture for Radix-10 decimal multiplier. The new generation of highperformance decimal floating-point units (DFUs) is demanding efficient implementations of parallel decimal multiplier. The parallel generation of partial products is performed using signed-digit radix-10 recoding of the multiplier and a simplified set of multiplicand multiples. The reduction of partial products is implemented in a tree structure based on a new algorithm decimal multioperand carry-save addition that uses a unconventional decimal-coded number systems. We further detail these techniques and it significantly improves the area and latency of the previous design, which include: optimized digit recoders, decimal carry-save adders (CSA’s) combining different decimal-coded operands, and carry free adders implemented by special designed bit counters

Interscience Research Network

BCD Multiplier

Author: Abulinain Ahmed Kamal Saad Fathallah
Publication venue: 'Whiting & Birch, Ltd.'
Publication date: 01/01/2015
Field of study

BCD multipliers are the basis of accurate decimal multiplication used in banking systems, scientific calculations, etc. Fractions convert poorly into binary numbers giving rise to conversion error. Therefore, banking industry have been using Binary Coded Decimal numbering system for their banking business transaction to circumvent the error between decimal fraction number to binary. Here we will explore some single-digit Binary Coded Decimal Multiplication units that perform multiplication in hardware for the purpose of future implementation

UTPedia

HIGH-SPEED CO-PROCESSORS BASED ON REDUNDANT NUMBER SYSTEMS

Author: Kaivani Amir
Publication venue: 'University of Saskatchewan Library'
Publication date
Field of study

There is a growing demand for high-speed arithmetic co-processors for use in applications with computationally intensive tasks. For instance, Fast Fourier Transform (FFT) co-processors are used in real-time multimedia services and financial applications use decimal co-processors to perform large amounts of decimal computations. Using redundant number systems to eliminate word-wide carry propagation within interim operations is a well-known technique to increase the speed of arithmetic hardware units. Redundant number systems are mostly useful in applications where many consecutive arithmetic operations are performed prior to the final result, making it advantageous for arithmetic co-processors. This thesis discusses the implementation of two popular arithmetic co-processors based on redundant number systems: namely, the binary FFT co-processor and the decimal arithmetic co-processor. FFT co-processors consist of several consecutive multipliers and adders over complex numbers. FFT architectures are implemented based on fixed-point and floating-point arithmetic. The main advantage of floating-point over fixed-point arithmetic is the wide dynamic range it introduces. Moreover, it avoids numerical issues such as scaling and overflow/underflow concerns at the expense of higher cost. Furthermore, floating-point implementation allows for an FFT co-processor to collaborate with general purpose processors. This offloads computationally intensive tasks from the primary processor. The first part of this thesis, which is devoted to FFT co-processors, proposes a new FFT architecture that uses a new Binary-Signed Digit (BSD) carry-limited adder, a new floating-point BSD multiplier and a new floating-point BSD three-operand adder. Finally, a new unit labeled as Fused-Dot-Product-Add (FDPA) is designed to compute AB+CD+E over floating-point BSD operands. The second part of the thesis discusses decimal arithmetic operations implemented in hardware using redundant number systems. These operations are popularly used in decimal floating-point co-processors. A new signed-digit decimal adder is proposed along with a sequential decimal multiplier that uses redundant number systems to increase the operational frequency of the multiplier. New redundant decimal division and square-root units are also proposed. The architectures proposed in this thesis were all implemented using Hardware-Description-Language (Verilog) and synthesized using Synopsys Design Compiler. The evaluation results prove the speed improvement of the new arithmetic units over previous pertinent works. Consequently, the FFT and decimal co-processors designed in this thesis work with at least 10% higher speed than that of previous works. These architectures are meant to fulfill the demand for the high-speed co-processors required in various applications such as multimedia services and financial computations

eCommons@USASK

University of Saskatchewan Research Archive

A New Family of High.Performance Parallel Decimal Multipliers

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

EARLY ESTIMATION OF DELAY IN BINARY TO BCD CONVERTOR

Author: NARSIMHAM D. JOTHSNA
SAROJINI P. LAKSHMI
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 05/09/2020
Field of study

A novel high speed architecture for fixed bit binary to BCD conversion which is better in terms of delay is presented in this paper. In recent years, decimal data processing applications have grown and thus there is a need to have hardware support for decimal arithmetic. Decimal digit multipliers are having Binary to BCD conversion as the basic building block. This decimal multiplication in turn is an integral part of commercial, internet and financial based applications

Interscience Research Network

Algorithms and architectures for decimal transcendental function computation

Author: Chen Dongdong
Publication venue: 'University of Saskatchewan Library'
Publication date: 01/01/2011
Field of study

Nowadays, there are many commercial demands for decimal floating-point (DFP) arithmetic operations such as financial analysis, tax calculation, currency conversion, Internet based applications, and e-commerce. This trend gives rise to further development on DFP arithmetic units which can perform accurate computations with exact decimal operands. Due to the significance of DFP arithmetic, the IEEE 754-2008 standard for floating-point arithmetic includes it in its specifications. The basic decimal arithmetic unit, such as decimal adder, subtracter, multiplier, divider or square-root unit, as a main part of a decimal microprocessor, is attracting more and more researchers' attentions. Recently, the decimal-encoded formats and DFP arithmetic units have been implemented in IBM's system z900, POWER6, and z10 microprocessors. Increasing chip densities and transistor count provide more room for designers to add more essential functions on application domains into upcoming microprocessors. Decimal transcendental functions, such as DFP logarithm, antilogarithm, exponential, reciprocal and trigonometric, etc, as useful arithmetic operations in many areas of science and engineering, has been specified as the recommended arithmetic in the IEEE 754-2008 standard. Thus, virtually all the computing systems that are compliant with the IEEE 754-2008 standard could include a DFP mathematical library providing transcendental function computation. Based on the development of basic decimal arithmetic units, more complex DFP transcendental arithmetic will be the next building blocks in microprocessors. In this dissertation, we researched and developed several new decimal algorithms and architectures for the DFP transcendental function computation. These designs are composed of several different methods: 1) the decimal transcendental function computation based on the table-based first-order polynomial approximation method; 2) DFP logarithmic and antilogarithmic converters based on the decimal digit-recurrence algorithm with selection by rounding; 3) a decimal reciprocal unit using the efficient table look-up based on Newton-Raphson iterations; and 4) a first radix-100 division unit based on the non-restoring algorithm with pre-scaling method. Most decimal algorithms and architectures for the DFP transcendental function computation developed in this dissertation have been the first attempt to analyze and implement the DFP transcendental arithmetic in order to achieve faithful results of DFP operands, specified in IEEE 754-2008. To help researchers evaluate the hardware performance of DFP transcendental arithmetic units, the proposed architectures based on the different methods are modeled, verified and synthesized using FPGAs or with CMOS standard cells libraries in ASIC. Some of implementation results are compared with those of the binary radix-16 logarithmic and exponential converters; recent developed high performance decimal CORDIC based architecture; and Intel's DFP transcendental function computation software library. The comparison results show that the proposed architectures have significant speed-up in contrast to the above designs in terms of the latency. The algorithms and architectures developed in this dissertation provide a useful starting point for future hardware-oriented DFP transcendental function computation researches

eCommons@USASK

University of Saskatchewan Research Archive

Fast decimal floating-point division

Author: Lim C.
Nikmehr H.
Phillips B.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

A new implementation for decimal floating-point (DFP) division is introduced. The algorithm is based on high-radix SRT division The SRT division algorithm is named after D. Sweeney, J. E. Robertson, and T. D. Tocher. with the recurrence in a new decimal signed-digit format. Quotient digits are selected using comparison multiples, where the magnitude of the quotient digit is calculated by comparing the truncated partial remainder with limited precision multiples of the divisor. The sign is determined concurrently by investigating the polarity of the truncated partial remainder. A timing evaluation using a logic synthesis shows a significant decrease in the division execution time in contrast with one of the fastest DFP dividers reported in the open literatureHooman Nikmehr, Braden Phillips and Cheng-Chew Li

Crossref

Adelaide Research & Scholarship

Multi-operand Decimal Adder Trees for FPGAs

Author: de Dinechin Florent
Vazquez Alvaro
Publication venue: HAL CCSD
Publication date: 14/10/2010
Field of study

The research and development of hardware designs for decimal arithmetic is currently going under an intense activity. For most part, the methods proposed to implement fixed and floating point operations are intended for ASIC designs. Thus, a direct mapping or adaptation of these techniques into a FPGA could be far from an optimal solution. Only a few studies have considered new methods more suitable for FPGA implementations. A basic operation that has not received enough attention in this context is multi-operand BCD addition. For example, it is of interest for low latency implementations of decimal fixed and floating point multipliers and decimal fused multiply-add units. We have explored the most representative proposals for multi-operand BCD addition and found that the resultant implementations in FPGAs are still very inefficient in terms of both area and latency when compared to their binary counterparts. In this paper we present a new method for fast and efficient implementation of multi-operand BCD addition in current FPGA devices. In particular, our proposal maps quite well into the slice structure of the Xilinx Virtex-5/Virtex-6 families and it is highly pipelineable. The synthesis results for a Virtex-6 device indicate that our implementations halve the area and latency of previous proposals, presenting area and delay figures close to those of optimal binary adder trees.La recherche sur l'implantation en matériel de l'arithmétique décimale est actuellement très active, la plupart des travaux portant sur des opérateurs pour les processeurs, en virgule fixe ou flottante. Mais les techniques développées pour un circuit intégré n'aboutissent pas forcément à une implémentation optimale dans un FPGA. Il n'y a que peu d'études ciblant explicitement les FPGA. Cet article s'intéresse dans ce contexte, à l'addition BCD multi-opérande, au cœur de multiplieurs et de multiplieurs-accumulateurs à faible latence. Nous étudions les architectures proposées pour cette opération décimale, et nous observons que, sur FPGA, leur performance (surface et latence) est très inférieure à celle des opérations binaire à précision comparable. Nous présentons donc dans cet article une nouvelle technique d'addition BCD multi-opérandes qui s'avère plus efficace que les propositions précédentes sur les FPGA actuels. Elle s'adapte particulièrement bien à la structure fine des FPGA Xilinx Virtex-5/Virtex-6, et se prête bien au pipeline. Les résultats de synthèse montrent que notre implémentation divise par deux la surface et la latence par rapport aux propositions précédentes, les ramenant à des valeurs comparables à celles des meilleurs additionneurs multi-opérandes binaires

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

FPGA Implementation of Double Precision Floating Point Multiplier

Author: Abdullah Mohd.
Chourasia Bharti
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2022
Field of study

High speed computation is the need of today’s generation of Processors.  To accomplish this major task,  many functions  are implemented  inside the hardware  of the processor rather than  having  software  computing  the  same  task. Majority of the operations which the processor executes are Arithmetic operations which are widely used in many applications that require heavy mathematical operations such as scientific calculations, image and signal processing. Especially in the field of signal processing, multiplication division operation is widely used in many applications. The major issue with these operations in hardware is that much iteration is required which results in slow operation while fast algorithms require complex computations within each cycle. The result of a Division operation results in a either  in Quotient  and  Remainder  or a Floating  point  number  which is the  major reason  to  make it  more complex than  Multiplication  operation

International Journal on Recent and Innovation Trends in Computing and Communication