Bit-level pipelined digit-serial array processors

Author: Aggoun A
Ashur A
Ibrahim MK
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/1998

A new architecture for a high-performance digit-serial vector inner product (VIP) that can be pipelined to the bit level is introduced. The design of the digit-serial vector inner product is based on a new systematic design methodology using radix-2^n arithmetic. The proposed architecture allows a high level of bit-level pipelining to increase the throughput rate with minimum initial delay and minimum area. This gives designers greater flexibility in finding the best tradeoff between hardware cost and throughput rate. It is shown that a sub-digit pipelined digit-serial structure can achieve a higher throughput rate with much less area consumption than an equivalent bit-parallel structure. A twin-pipe architecture to double the throughput rate of digit-serial multipliers, and consequently that of the digit-serial vector inner product, is also presented. The effects of the number of pipelining levels and of the twin-pipe architecture on the throughput rate and hardware cost are discussed. A two's complement digit-serial architecture that can operate on both negative and positive numbers is also presented.

Crossref

Brunel University Research Archive
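
As a rough illustration of what "digit-serial" means in the abstract above, the following behavioral sketch consumes one radix-2^n digit of each multiplier operand per cycle and accumulates the shifted partial sums. It is only a software model under stated assumptions: unsigned operands, no pipelining, and hypothetical names (`digit_serial_inner_product`, `word_bits`, `digit_bits`) that do not come from the paper. It does not reproduce the authors' architecture.

```python
def digits_lsb_first(value, digit_bits, num_digits):
    """Split an unsigned integer into radix-2^digit_bits digits, least significant first."""
    mask = (1 << digit_bits) - 1
    return [(value >> (i * digit_bits)) & mask for i in range(num_digits)]


def digit_serial_inner_product(a_vec, b_vec, word_bits=8, digit_bits=4):
    """Behavioral model: one digit of every b operand is consumed per cycle.

    Assumption: operands are unsigned and fit in word_bits bits.
    """
    assert word_bits % digit_bits == 0
    num_digits = word_bits // digit_bits
    b_digits = [digits_lsb_first(b, digit_bits, num_digits) for b in b_vec]

    acc = 0
    for cycle in range(num_digits):
        # Partial inner product using only the current digit of each b operand.
        partial = sum(a * bd[cycle] for a, bd in zip(a_vec, b_digits))
        # Weight the partial result by the digit position (radix 2^digit_bits).
        acc += partial << (cycle * digit_bits)
    return acc
```

With `digit_bits` equal to `word_bits` the model degenerates to a bit-parallel computation in one cycle; smaller digits trade more cycles for narrower per-cycle arithmetic, which is the hardware cost versus throughput tradeoff the abstract refers to.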

Effects of mixing design styles on the synthesis of RTL components

Author: Gajski Daniel D.
Kipps James R.
Publication venue: eScholarship, University of California
Publication date: 01/01/1991

By mixing design styles during synthesis of RTL components such as adders, multipliers, and ALUs, it is possible to generate a range of designs from small to fast, where intermediate designs make favorable and possibly desirable tradeoffs between area and delay. Although module generators can be written to reflect design styles that reduce either area or delay, the current approach to generator execution does not examine the effects of mixing different design styles. We have developed an approach to RTL component synthesis that searches the space of design alternatives, and we have implemented this approach with the DTAS Design Language. The significance of our approach is that it allows DTAS to generate designs that use a combination of design styles and to compare the effects of mixing styles. In this paper, we outline the operation of DTAS and describe how DTAS expands and constrains the design space. We present results from applying DTAS to large RTL components using an MCNC benchmark library. We also present results of integrating DTAS with the MISII logic optimizer.

eScholarship - University of California

Fast Quantum Modular Exponentiation

Author: A. G. Fowler
A. M. Steane
A. Yao
D. Deutsch
D. E. Knuth
L. Grover
M. A. Nielsen
M. D. Ercegovac
M. Oskin
P. W. Shor
R. Cleve
R. Van Meter
S. Beauregard
Publication venue: 'American Physical Society (APS)'
Publication date: 29/03/2005

We present a detailed analysis of the impact on modular exponentiation of architectural features and possible concurrent gate execution. Various arithmetic algorithms are evaluated for execution time, potential concurrency, and space tradeoffs. We find that, to exponentiate an n-bit number, for storage space 100n (twenty times the minimum 5n), we can execute modular exponentiation two hundred to seven hundred times faster than optimized versions of the basic algorithms, depending on architecture, for n=128. Addition on a neighbor-only architecture is limited to O(n) time, while non-neighbor architectures can reach O(log n), demonstrating that physical characteristics of a computing device have an important impact on both real-world running time and asymptotic behavior. Our results will help guide experimental implementations of quantum algorithms and devices.

Comment: to appear in PRA 71(5); RevTeX, 12 pages, 12 figures; v2 revision is substantial, with new algorithmic variants, much shorter and clearer text, and revised equation formatting

arXiv.org e-Print Archive

Crossref

Low-Complexity and High-Speed Constant Multiplications for Digital Filters Using Carry-Save Arithmetic

Author: Lars Wanhammar
Oscar Gustafsson
Publication venue: 'IntechOpen'
Publication date: 11/04/2011

IntechOpen

High Speed and Low Power Consumption Carry Skip Adder using Binary to Excess-One Converter

Author: Sanyukta Vijaykumar Chahande
Prof. Mohammad Nasiruddin
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/07/2017

The Arithmetic and Logic Unit (ALU) is a vital component of any CPU. In an ALU, adders play a major role not only in addition but also in many other basic arithmetic operations such as subtraction and multiplication. Realizing an efficient adder is therefore required for better performance of the ALU and, in turn, the processor. For speed optimization in adders, the most important factor is carry generation. In a fast adder, the generated carry should be driven to the output as quickly as possible, reducing the worst-case path delay, which determines the ultimate speed of the digital structure. In a conventional carry skip adder, a multiplexer is used as the skip logic, which provides better performance and efficient operation with minimal circuitry. Although this affords significant advantages, the multiplexer may introduce a large critical path delay that leads to increased area usage and power consumption. The basic idea of this paper is to use Binary to Excess-1 Converters (BEC) to achieve lower area and power consumption.

International Journal on Recent and Innovation Trends in Computing and Communication
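
The abstract above does not say how the Binary to Excess-1 Converter is wired into the carry skip adder, so the sketch below only models the BEC itself and the property that makes it useful: applying a BEC to the result of a block addition with carry-in 0 gives the result for carry-in 1, without a second adder. The bit-list representation and the helper names (`to_bits`, `ripple_carry_add`, `bec`) are assumptions made for this illustration.

```python
def to_bits(value, width):
    """Unsigned integer -> list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (LSB first) -> unsigned integer."""
    return sum(b << i for i, b in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """Plain ripple-carry block; returns sum bits followed by the carry-out bit."""
    out = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out + [carry]


def bec(bits):
    """Binary to Excess-1 Converter: X0 = NOT B0, Xi = Bi XOR (B0 AND ... AND B(i-1)).

    Returns the bits of (value + 1) modulo 2^len(bits).
    """
    out = []
    all_ones_below = 1  # AND of all lower input bits; 1 for the LSB
    for b in bits:
        out.append(b ^ all_ones_below)
        all_ones_below &= b
    return out
```

For a 4-bit block, `bec(ripple_carry_add(a, b, 0))` and `ripple_carry_add(a, b, 1)` agree on all five output bits, so the carry-in-1 result needs only one inverter plus an XOR and an AND-chain stage per bit instead of a full adder per bit.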

VLSI design of high-speed adders for digital signal processing applications.

Author: Bazarjani Seyfollah Seyfollahi
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/1987

Scholarship at UWindsor

Pipelining Saturated Accumulation

Author: Chan Stephanie
DeHon André
Kapre Nachiket
Papadantonakis Karl
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/04/2008

Aggressive pipelining and spatial parallelism allow integrated circuits (e.g., custom VLSI, ASICs, and FPGAs) to achieve high throughput on many Digital Signal Processing applications. However, cyclic data dependencies in the computation can limit parallelism and reduce the efficiency and speed of an implementation. Saturated accumulation is an important example where such a cycle limits the throughput of signal processing applications. We show how to reformulate saturated addition as an associative operation so that we can use a parallel-prefix calculation to perform saturated accumulation at any data rate supported by the device. This allows us, for example, to design a 16-bit saturated accumulator which can operate at 280 MHz on a Xilinx Spartan-3 (XC3S-5000-4) FPGA, the maximum frequency supported by the component's DCM.

CiteSeerX

Caltech Authors
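
One way to make the reformulation in the "Pipelining Saturated Accumulation" abstract concrete is to treat each saturated addition as a function of the accumulator, `y -> clamp(y + x, LO, HI)`, and represent it as a triple `(offset, lo, hi)`. Such functions are closed under composition, and function composition is associative, so a prefix scan applies. The sketch below is a software model under that assumption; the triple encoding, the Kogge-Stone style scan, and all names are choices made for this illustration, not a description of the authors' circuit.

```python
LO, HI = -(1 << 15), (1 << 15) - 1  # 16-bit signed range


def clamp(v, lo, hi):
    return min(max(v, lo), hi)


def sat_add_fn(x):
    """Encode y -> clamp(y + x, LO, HI) as (offset, lo, hi)."""
    return (x, LO, HI)


def compose(f, g):
    """Return the triple for 'apply f first, then g'."""
    a1, lo1, hi1 = f
    a2, lo2, hi2 = g
    return (a1 + a2, clamp(lo1 + a2, lo2, hi2), clamp(hi1 + a2, lo2, hi2))


def apply_fn(f, y):
    a, lo, hi = f
    return clamp(y + a, lo, hi)


def sequential_sat_accumulate(xs, y0=0):
    """Reference: the cyclic dependency the abstract refers to."""
    out, y = [], y0
    for x in xs:
        y = clamp(y + x, LO, HI)
        out.append(y)
    return out


def prefix_scan(fs):
    """Inclusive scan in about log2(n) rounds; each round's composes are independent."""
    fs = list(fs)
    d = 1
    while d < len(fs):
        fs = [fs[i] if i < d else compose(fs[i - d], fs[i]) for i in range(len(fs))]
        d *= 2
    return fs


def parallel_sat_accumulate(xs, y0=0):
    return [apply_fn(f, y0) for f in prefix_scan(sat_add_fn(x) for x in xs)]
```

Within each scan round every `compose` call depends only on the previous round's values, which is what lets hardware pipeline the computation instead of waiting on the accumulator feedback loop.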

A Novel VLSI Design On CSKA Of Binary Tree Adder With Compaq Area And High Throughput

Author: Deepika Ms. M.
Harini Ms. VVVSSS
Publication venue: International Journal of Innovative Technology and Research
Publication date: 21/02/2023

Addition is one of the most basic operations performed in all computing units, including microprocessors and digital signal processors. It is also a basic building block in more complicated multiplication and division algorithms. Efficient implementation of an adder circuit usually revolves around reducing the cost of propagating the carry between successive bit positions. Multi-operand adders (MOAs) are important arithmetic design blocks, especially for adding the partial products of hardware multipliers. They are widely used in modern low-power, high-speed portable very-large-scale integration systems for image and signal processing applications such as digital filters, transforms, and convolutional neural network architectures. Hence, a new high-speed and area-efficient adder architecture is proposed that uses pre-computed bitwise addition followed by carry prefix computation logic to perform three-operand binary addition; it consumes substantially less area and power and drastically reduces the adder delay. The design is further enhanced with a modified carry bypass adder to reduce area and latency. The modified carry skip adder introduces simple, low-complexity carry skip logic to relax these constraints. In the proposed work, the designed binary tree adder (BTA) is analyzed to find possibilities for area minimization. Based on the analysis, the critical path of the carry is taken into the new logic implementation, and the corresponding designs of CSKP are proposed for the BTA with AOI, OAI

International Journal of Innovative Technology and Research (IJITR)
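
The "bitwise addition followed by carry prefix computation" scheme for three-operand addition mentioned in the abstract above can be sketched as a carry-save stage feeding a parallel-prefix adder. The abstract does not name the prefix network, so the Kogge-Stone style recurrence used here is an assumption, as are the function name and the `width` parameter. This is a bit-level software model, not the proposed circuit.

```python
def three_operand_add(a, b, c, width=16):
    """Add three unsigned width-bit integers using carry-save plus a prefix adder.

    Assumption: a, b, c are in [0, 2**width). The result needs width + 2 bits.
    """
    w = width + 2
    mask = (1 << w) - 1

    # Stage 1: bitwise (carry-save) addition, one full adder per bit position.
    s = a ^ b ^ c                                  # per-bit sum
    cy = ((a & b) | (a & c) | (b & c)) << 1        # per-bit carry, weighted x2

    # Stage 2: generate/propagate signals for the two-operand add s + cy.
    g = s & cy
    p = s ^ cy

    # Stage 3: prefix computation of carries (Kogge-Stone style, about log2(w) rounds).
    G, P = g, p
    d = 1
    while d < w:
        G = G | (P & (G << d))   # group generate, uses the previous round's P
        P = P & (P << d)         # group propagate
        d *= 2

    # Stage 4: the carry into bit i is the group generate of bits i-1..0.
    return (p ^ (G << 1)) & mask
```

The carry-save stage has constant depth regardless of `width`, and the prefix stage has logarithmic depth, which is where the delay advantage over rippling two adders in sequence comes from.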