196 research outputs found

Single-Precision and Double-Precision Merged Floating-Point Multiplication and Addition Units on FPGA

Author: Zhang Hao
Publication venue: 'University of Saskatchewan Library'
Publication date: 11/02/2020

Floating-point (FP) operations defined in the IEEE 754-2008 Standard for Floating-Point Arithmetic provide a wider dynamic range and higher precision than fixed-point operations. Many scientific computations and multimedia applications adopt FP operations, and among them addition and multiplication are the most frequent. In this thesis, single-precision (SP) and double-precision (DP) merged FP multiplier and FP adder architectures are proposed. The proposed iterative FP multiplier is based on the Karatsuba algorithm and implemented with a pipelined architecture. It can accomplish two parallel SP multiplications in one iteration with a latency of 6 clock cycles, or one DP multiplication in two iterations with a latency of 9 clock cycles. Implemented on a Xilinx Virtex-5 (xc5vlx155ff1760-3) FPGA device, the proposed multiplier runs at 348 MHz using 6 DSP48E blocks, 1117 LUTs, and 1370 FFs. Compared to a previous FPGA-based multiple-precision FP multiplier, the proposed design runs at a 4% faster clock frequency with 33% fewer DSP blocks, 17% lower latency for SP multiplication, and 28% lower latency for DP multiplication. The proposed high-performance FP adder is based on the two-path FP addition algorithm. With a fully pipelined architecture, the proposed adder can accomplish one DP or two parallel SP addition/subtraction operations in 6 clock cycles. The adder architecture is implemented on both Altera and Xilinx 65 nm process FPGA devices. On the Xilinx Virtex-5 (xc5vlx155ff1760-3) FPGA device, the adder runs at up to 336 MHz with 1694 FFs and 1420 LUTs. Compared to the combination of one DP and two SP adders built with the Xilinx FP operator, the proposed adder has an 11.3% faster clock frequency. On the Altera Stratix-III (EP3SL340F1760C2) FPGA device, the maximum clock frequency of the proposed adder reaches 358 MHz, occupying 1686 ALUTs and 1556 registers. The proposed adder is 11.6% faster than the combination of one DP and two SP adders built with the Altera FP megafunction. For the reference of other researchers, the implementation results of the proposed FP multiplier and FP adder on the latest Xilinx Virtex-7 device and Altera Arria 10 device are also provided.

University of Saskatchewan Research Archive
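
The abstract above names the Karatsuba algorithm as the basis of the merged SP/DP multiplier. As a rough software illustration only, the sketch below shows how Karatsuba replaces four half-width products with three. It is not the thesis's pipelined hardware design; the function name `karatsuba_mul`, the `bits` and `leaf_bits` parameters, and the 16-bit leaf size (standing in for a small hardware multiplier block) are all assumptions made for this example.

```python
def karatsuba_mul(a: int, b: int, bits: int = 64, leaf_bits: int = 16) -> int:
    """Multiply two non-negative integers using Karatsuba splitting.

    `bits` is the nominal operand width; `leaf_bits` is the width at which
    the recursion stops and a plain multiply is used (hypothetical stand-in
    for a small hardware multiplier block).
    """
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    if leaf_bits < 4:
        raise ValueError("leaf_bits must be at least 4 so the recursion shrinks")
    if bits <= leaf_bits:
        return a * b

    half = bits // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    # Three sub-products instead of four.
    low = karatsuba_mul(a_lo, b_lo, half, leaf_bits)
    high = karatsuba_mul(a_hi, b_hi, bits - half, leaf_bits)
    # The sums can carry, so they need one extra bit of width.
    mid = karatsuba_mul(a_lo + a_hi, b_lo + b_hi,
                        max(half, bits - half) + 1, leaf_bits)

    # (a_lo + a_hi)(b_lo + b_hi) - low - high = a_lo*b_hi + a_hi*b_lo
    cross = mid - low - high
    return (high << (2 * half)) + (cross << half) + low
```

For instance, `karatsuba_mul(x, y, bits=53)` multiplies two 53-bit values, the width of a DP significand including the hidden bit. The trade-off is fewer multiplier blocks at the cost of extra additions and subtractions.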

Fast HUB Floating-point Adder for FPGA

Author: Gonzalez-Navarro Sonia
Hormigo-Aguilar Javier
Villalba-Moreno Julio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/10/2018

Several previous publications have shown the area and delay reductions obtained when real-number computation is implemented using HUB formats, for both floating-point and fixed-point. In this paper, we present a HUB floating-point adder for FPGA which greatly improves on the speed of previously proposed HUB designs for these devices. Our architecture is based on the double-path technique, which reduces the execution time since the two paths work in parallel. We also deal with the implementation of unbiased rounding in the proposed adder. Experimental results are presented showing the merits of the new HUB adder for FPGA. TIN2016-80920-R, JA2012 P12-TIC-1692, JA2012 P12-TIC-147

Crossref

Repositorio Institucional Universidad de Málaga

Serial-data computation in VLSI

Author: Smith Stewart Gresty
Publication venue: The University of Edinburgh
Publication date: 01/01/1987

Edinburgh Research Archive

VLSI ARCHITECTURE OF PARALLEL MULTIPLIER-ACCUMULATOR BASED ON RADIX-2 MODIFIED BOOTH ALGORITHM

Author: Sailaja Mrs.
Sathish Mr.M.V.
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 28/08/2020

A new architecture of multiplier-and-accumulator (MAC) for high-speed arithmetic is proposed. By combining multiplication with accumulation and devising a hybrid type of carry-save adder (CSA), the performance was improved. Since the accumulator, which has the largest delay in the MAC, was merged into the CSA, the overall performance was elevated. The proposed CSA tree uses a 1's-complement-based radix-2 modified Booth's algorithm (MBA) and has a modified array for the sign extension in order to increase the bit density of the operands. The proposed MAC showed superior properties to the standard design in many ways, and twice the performance of the previous research at a similar clock frequency. We expect that the proposed MAC can be adapted to various fields requiring high performance, such as signal processing.

Interscience Research Network
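
The abstract above combines Booth-recoded multiplication with accumulation. The following is a minimal sketch of plain radix-2 Booth recoding feeding an accumulator, written only to illustrate the recoding step. It does not model the paper's carry-save adder tree, its 1's-complement handling, or its sign-extension array; the function name `booth_mac` and its parameters are hypothetical.

```python
def booth_mac(acc: int, a: int, b: int, n: int = 8) -> int:
    """Return acc + a * b, forming a * b by radix-2 Booth recoding of b.

    `b` is treated as an n-bit two's-complement multiplier. Each Booth
    digit is d_i = b_(i-1) - b_i (with b_(-1) = 0), so d_i is -1, 0, or +1
    and the product is the sum of d_i * a * 2**i.
    """
    if not -(1 << (n - 1)) <= b < (1 << (n - 1)):
        raise ValueError("b does not fit in n-bit two's complement")

    b_bits = b & ((1 << n) - 1)  # two's-complement bit pattern of b
    prev = 0                     # b_(-1), the implicit bit right of the LSB
    for i in range(n):
        cur = (b_bits >> i) & 1
        digit = prev - cur       # Booth digit in {-1, 0, +1}
        if digit:
            # Add or subtract the shifted multiplicand directly into acc,
            # so multiplication and accumulation share one running sum.
            acc += digit * (a << i)
        prev = cur
    return acc
```

A call such as `booth_mac(10, 3, -5)` accumulates the partial products of `3 * -5` onto the starting value `10`. Folding the partial products into the running sum loosely mirrors the idea of merging the accumulator into the adder tree, although the hardware does this with carry-save arithmetic rather than full additions.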

Efficient Low-Energy, Digit-Serial Exponentiator for large Finite Field GF

Author: Allah Cherigui F.
Mlynek D.
Publication date: 14/06/2006

Infoscience - École polytechnique fédérale de Lausanne

A Multi-Format Floating-Point Multiplier for Power-Efficient Operations

Author: Nannarelli Alberto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Online Research Database In Technology

FPGA based efficient Multiplier for Image Processing Applications using Recursive Error Free Mitchell Log Multiplier and KOM Architecture

Author: Satish S Bhairannawar
Venugopal K.R.
Publication date: 08/07/2014

Digital image processing applications such as medical imaging, satellite imaging, and biometric trait images rely on multipliers to improve image quality. However, existing multiplication techniques introduce errors in the output and take more time, so error-free, high-speed multipliers have to be designed. In this paper we propose an FPGA-based Recursive Error Free Mitchell Log Multiplier (REFMLM) for image filters. The 2x2 error-free Mitchell log multiplier, designed with zero error by introducing an error correction term, is used in higher-order Karatsuba-Ofman Multiplier (KOM) architectures. The higher-order KOM multipliers are decomposed into a number of lower-order multipliers using radix 2, down to the basic 2x2 multiplier block, which is built from the error-free Mitchell log multiplier. The 8x8 REFMLM is tested with a Gaussian filter to remove noise from a fingerprint image. The multiplier is synthesized on the Spartan 3 FPGA family device XC3S1500-5fg320. It is observed that performance parameters such as area utilization, speed, error, and PSNR are better for the proposed architecture than for the existing architecture.

ePrints@Bangalore University
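
The abstract above describes removing the error of a Mitchell log multiplier with a correction term. The sketch below illustrates that general idea under stated assumptions and is not the paper's 2x2 block or its KOM decomposition. It uses the simplified Mitchell-style approximation that ignores the mantissa-carry case, and the names `mitchell_approx` and `recursive_log_mul` are hypothetical. Writing a = 2^k1 + r1 and b = 2^k2 + r2, the exact product is 2^(k1+k2) + r1*2^k2 + r2*2^k1 + r1*r2; the approximation drops r1*r2, and the correction applies the same approximation to r1*r2 until a residue is zero.

```python
def mitchell_approx(a: int, b: int) -> tuple[int, int, int]:
    """Return (approximate product, r1, r2) for non-negative integers.

    r1 and r2 are the residues left after removing each operand's leading
    one; the approximation omits the r1 * r2 term.
    """
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    if a == 0 or b == 0:
        return 0, 0, 0
    k1 = a.bit_length() - 1  # position of the leading one of a
    k2 = b.bit_length() - 1  # position of the leading one of b
    r1 = a - (1 << k1)
    r2 = b - (1 << k2)
    approx = (1 << (k1 + k2)) + (r1 << k2) + (r2 << k1)
    return approx, r1, r2


def recursive_log_mul(a: int, b: int) -> int:
    """Exact product obtained by repeatedly approximating the error term."""
    total = 0
    while a and b:
        # Each pass adds an approximation and leaves r1 * r2 still to do.
        approx, a, b = mitchell_approx(a, b)
        total += approx
    return total
```

Each pass strips the leading one from both operands, so the loop runs at most as many times as the shorter operand has set bits. Only shifts and additions are used, which is the main attraction of logarithmic multipliers in hardware.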

Energy-efficient design and implementation of approximate floating-point multiplier

Author: Paparouni Theodora
Παπαρούνη Θεοδώρα
Publication date: 28/01/2020

DSpace at NTUA

VLSI Circuits for Approximate Computing

Author: Esposito Darjn
Publication date: 08/04/2017

Approximate Computing has recently emerged as a promising solution to enhance circuit performance by relaxing the requirement for exact calculations. Multimedia and Machine Learning constitute typical examples of error-resilient, albeit compute-intensive, applications. In this dissertation, the design and optimization of approximate fundamental VLSI digital blocks is investigated. In chapter one, the theoretical motivations of Approximate Computing, from the VLSI perspective, are discussed. In chapter two, my research activity on approximate adders is reported, covering approximate adders for both traditional non-error-tolerant applications and error-resilient applications. In chapter three, precision-scalable units are investigated. Real-time precision scalability allows the precision level of the unit to be adapted to the precision requirements of the applications. In this context, my research activities regarding approximate Multiply-and-Accumulate and memory units are described. In chapter four, a precision-scalable approximate convolver for computer vision applications is discussed. It is composed of both the approximate Multiply-and-Accumulate and memory units presented in chapter three.

Università degli Studi di Napoli Federico II Open Archive