30 research outputs found

Design and Implementation of a Radix-4 Complex Division Unit with Prescaling

Author: Dormiani Pouya, Ercegovac Milos, Muller Jean-Michel
Publication venue: IEEE Computer Society
Publication date: 07/07/2009

We present a design and implementation of a radix-4 complex division unit with prescaling of the operands. Specifically, we extend the treatment of the residual bound and errors due to the use of truncated redundant representation. The requirements for prescaling tables are simplified and a detailed specification of the table design is given. All principal components used in the design are described and the proposed optimizations are explained. The target platform for implementation was an Altera Stratix II FPGA [15], for which we report timing and area requirements. For a precision of 36 bits, the implementation uses 1093 ALUTs, achieving a latency of 97 ns. The maximum clock frequency is 268.53 MHz.

HAL-ENS-LYON

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

An Architecture for Improving Variable Radix Real and Complex Division Using Recurrence Division

Author: Ercegovac Miloš, Muller Jean-Michel, Stine James
Publication venue: HAL CCSD
Publication date: 01/11/2020

This paper shows the details of an implementation of variable radix floating-point complex division based on previous implementations of the algorithm. This implementation takes advantage of the easier prescaling offered by low-radix division and recodes it as necessary for higher radix iterations throughout the design. This, along with proper use of redundant digit sets, allows us to significantly alter performance characteristics relative to exclusively high-radix division implementations. Comparisons to existing architectures are shown, as well as common implementation optimizations for future iterations. Results are given in cmos32soi 32nm MTCMOS technology using ARM-based standard cells and commercial EDA toolsets.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Reliable and Fault-Resilient Schemes for Efficient Radix-4 Complex Division

Author: Manoharan Niranjan
Publication venue: RIT Scholar Works
Publication date: 01/05/2014

Complex division is commonly used in various applications in signal processing and control theory, including astronomy and nonlinear RF measurements. Nevertheless, unless reliability and assurance are embedded into the architectures of such structures, the suboptimal (and thus erroneous) results could undermine the objectives of such applications. As such, in this thesis, we present schemes to provide complex number division architectures based on SRT (Sweeney, Robertson, and Tocher) division with fault diagnosis mechanisms. Different fault-resilient architectures are proposed in this thesis, which can be tailored based on the eventual objectives of the designs in terms of area and time requirements. Among these, we highlight the schemes based on recomputing with shifted operands (RESO), which are able to detect both natural and malicious faults and, with proper modification, achieve high throughputs. The design also implements a minimized lookup table approach, which favors error-detection-based designs and provides high fault coverage with relatively low overhead. Additionally, to benchmark the effectiveness of the proposed schemes, extensive fault diagnosis assessments are performed for the proposed designs through fault simulations and FPGA implementations; the design is implemented on Xilinx Spartan-VI and Xilinx Virtex-VI FPGA families.

RIT Scholar Works

Low Precision Table Based Complex Reciprocal Approximation

Author: Dormiani Pouya, Ercegovac Milos, Muller Jean-Michel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

A recently proposed complex-valued division algorithm designed for efficient hardware implementations requires a prescaling step by a constant factor. Techniques for obtaining this prescaling factor have been mentioned by the authors; these serve to justify the feasibility of the algorithm but are inadequate for obtaining efficient implementations. Table-based solutions are formulated in this paper for obtaining the prescaling factor, a low precision reciprocal approximation for a complex value, using techniques adopted from univariate function approximations. Two separate designs are proposed, one using a single table (a reference design) and another using generalized multipartite tables. The main contribution of this work is the extension of generalized multipartite table methods to a function of two variables. The multipartite tables derived were up to 67% more memory efficient than their single table counterparts.

HAL-ENS-LYON

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Study of Recursive Divide Architectures and Implementation for Division and Multiplication

Author: Phadke Amey P.
Publication venue: Oklahoma State University Library
Publication date: 01/12/2011

Multipliers have been key and critical components for most application-specific and general-purpose computer architectures. However, these architectures have been transitioning towards multiple cores that can process large amounts of data through parallel approaches to computation. Unfortunately, traditional arithmetic functional units that worked well for single-core architectures have the side effect of incurring large amounts of area and power. Consequently, multi-core architectures need new ways of thinking about increased throughput to handle large amounts of data. This work discusses the implementation of different divider algorithms and presents a recursive high radix divide unit that is modified to handle both multiplication and division targeted at multi-core architectures. Results are obtained with a 65nm technology and show a significant decrease in area and power while still maintaining a low total latency by utilizing high radix encoding within the functional unit. School of Electrical & Computer Engineerin

SHAREOK repository

Solving Systems of Linear Equations in Complex Domain : Complex E-Method

Author: Ercegovac Milos, Muller Jean-Michel
Publication venue: HAL CCSD
Publication date: 24/01/2007

The E-method, introduced by Ercegovac, allows efficient parallel solution of diagonally dominant systems of linear equations in the real domain using simple and highly regular hardware. Since the evaluation of polynomials and certain rational functions can be achieved by solving the corresponding linear systems, the E-method is an attractive general approach for function evaluation. We generalize the E-method to complex linear systems, and show some potential applications such as the evaluation of complex polynomials and rational functions.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Complex Multiply-Add and Other Related Operators

Author: Ercegovac Milos, Muller Jean-Michel
Publication venue: Instytut Dermatologii Radoslaw Spiewak
Publication date: 26/08/2007

In this work, we present algorithms and schemes for computing common arithmetic expressions defined in the complex domain as hardware-implemented operators. The operators include Complex Multiply-Add (CMA: ab+c), Complex Sum of Products (CSP: ab+ce+f), Complex Sum of Squares (CSS: a^2+b^2) and Complex Integer Powers. The proposed approach is to map the expression to a system of linear equations, apply a complex-to-real transform, and compute the solutions to the linear system using a digit-by-digit, most-significant-digit-first recurrence method. The components of the solution vector correspond to the expressions being evaluated. The number of digit cycles is about m for m-digit precision. The basic modules are similar to left-to-right multipliers. The interconnections between the modules are digit-wide.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Efficient Design of Modified SRT Algorithms and Their Post-Synthesis Simulation

Author: Bakas Konstantinos
Μπάκας Κωνσταντίνος
Publication date: 21/06/2016

DSpace at NTUA

An Efficient Method for Evaluating Complex Polynomials

Author: A. H. Nutall, F. W. J. Olver, Jean-Michel Muller, K. Benmahammed, M. D. Ercegovac, Miloš D. Ercegovac
Publication venue: Springer Science and Business Media LLC

Crossref

High sample-rate Givens rotations for recursive least squares

Author: Walke Richard Lewis

The design of an application-specific integrated circuit of a parallel array processor is considered for recursive least squares by QR decomposition using Givens rotations, applicable in adaptive filtering and beamforming applications. Emphasis is on high sample-rate operation, which, for this recursive algorithm, means that the time to perform arithmetic operations is critical. The algorithm, architecture and arithmetic are considered in a single integrated design procedure to achieve optimum results. A realisation approach using standard arithmetic operators (add, multiply and divide) is adopted. The design of high-throughput operators with low delay is addressed for fixed- and floating-point number formats, and the application of redundant arithmetic is considered. New redundant multiplier architectures are presented enabling reductions in area of up to 25%, whilst maintaining low delay. A technique is presented enabling the use of a conventional tree multiplier in recursive applications, allowing savings in area and delay. Two new divider architectures are presented showing benefits compared with the radix-2 modified SRT algorithm. Givens rotation algorithms are examined to determine their suitability for VLSI implementation. A novel algorithm, based on the Squared Givens Rotation (SGR) algorithm, is developed enabling the sample-rate to be increased by a factor of approximately 6 and offering area reductions up to a factor of 2 over previous approaches. An estimated sample-rate of 136 MHz could be achieved using a standard cell approach and 0.35 µm CMOS technology. The enhanced SGR algorithm has been compared with a CORDIC approach and shown to benefit by a factor of 3 in area and over 11 in sample-rate. When compared with a recent implementation on a parallel array of general purpose (GP) DSP chips, it is estimated that a single application-specific chip could offer up to 1,500 times the computation obtained from a single GP DSP chip.

Warwick Research Archives Portal Repository