Optimization of Speed using Compressors by Ghalyan, Aashu & Kadyan, Virender
 International Journal of Computer (IJC) 
 
ISSN 2307-4531 
 
http://gssrr.org/index.php?journal=InternationalJournalOfComputer&page=index 
 
Optimization of Speed using Compressors 
Aashu Ghalyan a,*, Virender Kadyan b 
a Aashu Ghalyan, S.L.D.C. Complex, Panipat, India 
b Virender Kadyan, Panipat, India 
a Email: aashughalyan19@gmail.com 
b Email: virenderkadyan89@Panipat.com 
Abstract  
The main objective of this review is to provide high-speed solutions for Very Large Scale Integration 
(VLSI) designers. In particular, we focus on reducing the time delay, which keeps growing as 
technologies are scaled down. Various techniques have been implemented at the circuit, architectural 
and system levels of the design process to reduce the time delay. High performance is obtained by 
using a new hierarchical structure of adders; these adders are called compressors. Compressors make 
multipliers faster than the conventional design. 
Keywords: Compressor, Multiplier, Xilinx tool 
1. Introduction 
Designing low-power, high-speed arithmetic circuits requires a combination of techniques at four levels: 
algorithm, architecture, circuit and system. This paper presents the design and architecture of a 
multiplication algorithm that is suitable for high-performance and low-power applications [2, 3]. Digital 
multipliers are among the most commonly used components in digital circuit designs. They are fast, reliable and 
efficient components that are used to implement a wide range of operations. Depending upon the arrangement of 
the components, different types of multipliers are available. A particular multiplier architecture is chosen 
based on the application. The time delay in a multiplier is a very important issue, as it determines the speed 
of the circuit and hence affects the performance of the device [4].  
------------------------------------------------------------------------  
* Corresponding author.  
E-mail address: aashughalyan19@gmail.com 
 
 
International Journal of Computer (IJC) (2014) Volume 12, No  1, pp 33-38 
 
Most digital signal processing (DSP) systems incorporate a multiplication unit to implement algorithms such as 
correlation, convolution, filtering and frequency analysis. In many DSP algorithms, the multiplier lies in 
the critical delay path and ultimately determines the performance of the algorithm [6]. The speed of the 
multiplication operation is of great importance in DSP as well as in general-purpose processors today, 
especially since media processing took off. In the past, multiplication was generally implemented with a 
sequence of addition, subtraction and shift operations [8]. Recently, many multiplication algorithms have been 
invented and developed, each having pros and cons in different fields. The multiplier is a fairly large block of 
a computing system. The amount of circuitry involved is proportional to the square of its resolution, i.e. a 
multiplier of size n bits has O(n^2) gates [9]. For multiplication algorithms performed in DSP applications, 
latency and throughput are the two major constraints from a delay perspective. Latency is the real delay of 
computing a function: a measure of how long after the inputs to a device are stable the final result becomes 
available on the outputs [11]. Throughput is the measure of how many multiplications can be performed in a 
given period of time. The multiplier is not only a high-delay block but also a significant source of power 
dissipation [13]. Therefore, if one also aims to minimize power consumption, it is of great interest to identify 
techniques that reduce delay through various delay optimizations. 
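The quadratic growth in circuitry mentioned above can be illustrated with a small sketch. It assumes unsigned n-bit operands and models only the partial-product stage, where every bit of one operand is ANDed with every bit of the other. The function names are hypothetical and chosen only for this illustration.

```python
def partial_products(a: int, b: int, n: int) -> list[list[int]]:
    """Return the n x n matrix of partial-product bits for unsigned n-bit a and b.

    Row i holds bit i of b ANDed with every bit of a, so each entry
    corresponds to one AND gate in a parallel multiplier.
    """
    return [[((a >> j) & 1) & ((b >> i) & 1) for j in range(n)]
            for i in range(n)]


def sum_partial_products(pp: list[list[int]]) -> int:
    """Add all partial-product bits; the bit at (i, j) has weight 2**(i + j)."""
    return sum(bit << (i + j)
               for i, row in enumerate(pp)
               for j, bit in enumerate(row))


n = 4
pp = partial_products(13, 11, n)
and_gates = sum(len(row) for row in pp)   # n * n AND gates for the bit products
result = sum_partial_products(pp)         # equals 13 * 11
```

Doubling the word length from 4 to 8 bits raises the AND-gate count in this model from 16 to 64, before any adders are counted.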
Minimizing delay time for digital systems involves optimization at all levels of the design. This optimization 
includes the technology used to implement the digital circuits, the circuit style and topology, the architecture 
for implementing the circuits and, at the highest level, the algorithms being implemented. In the past five 
decades, engineering ingenuity has moved multiplication away from the slow add-and-shift techniques [14] to 
faster, parallel multiplication schemes.  
2. History 
In the first large-scale digital systems, multiplication was performed as a series of additions and shifts [1]. The 
requisite hardware consisted only of a parallel adder and a few registers. In the early 1950s, multiplier 
performance was significantly improved with the introduction of Booth’s method, the modified Booth multiplier 
[22], and the development of faster adders [24] and memory components. For two’s complement numbers, Booth’s 
method and the modified Booth method do not require a correction of the product when either (or both) of the 
operands is negative. During the 1950s, adder designs moved away from the slow sequential formation 
of carries executed by ripple-carry adders. Carry-lookahead, carry-select and conditional-sum adders yielded 
speedy sums through the faster simultaneous or parallel generation of carries.  
3. Literature survey 
Multiplication is the mathematical operation of scaling one number by another. It is one of the four basic 
operations in elementary arithmetic (the others being addition, subtraction and division). Webster’s dictionary 
defines multiplication as “a mathematical operation that at its simplest is an abbreviated process of adding 
an integer to itself a specified number of times”. A number (the multiplicand) is added to itself a number of 
times specified by another number (the multiplier) to form a result (the product). 
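As a minimal sketch of this definition, assuming non-negative integer operands, the first function below multiplies by literal repeated addition. The second is the classic shift-and-add form, which adds a shifted copy of the multiplicand for each 1 bit of the multiplier. Both function names are hypothetical.

```python
def repeated_addition_multiply(multiplicand: int, multiplier: int) -> int:
    """Add the multiplicand to an accumulator `multiplier` times."""
    product = 0
    for _ in range(multiplier):
        product += multiplicand
    return product


def shift_and_add_multiply(multiplicand: int, multiplier: int) -> int:
    """Add a shifted multiplicand for every 1 bit of the multiplier."""
    product = 0
    while multiplier:
        if multiplier & 1:          # current multiplier bit is 1
            product += multiplicand
        multiplicand <<= 1          # next bit has twice the weight
        multiplier >>= 1
    return product
```

The repeated-addition loop runs once per unit of the multiplier value, whereas the shift-and-add loop runs once per multiplier bit, which is why hardware multipliers start from the latter.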
The research paper “Performance Evaluation of High Speed Compressors for High Speed Multipliers” 
by Ravi Nirlakalla, Thota Subba Rao and Talari Jayachandra Prasad describes high-speed compressors for 
high-speed parallel multipliers, such as the Booth Multiplier and the Wallace Tree Multiplier, in Digital Signal 
Processing (DSP). The authors present 4-3, 5-3, 6-3 and 7-3 compressors for high-speed multiplication. These 
compressors reduce the vertical critical path more rapidly than conventional compressors. A conventional 5-3 
compressor can take four steps to reduce bits from 5 to 3, but the proposed 5-3 compressor takes only two steps. 
The Power, Delay, Power Delay Product (PDP) and Energy Delay Product (EDP) of the compressors are calculated to 
analyze the total propagation delay and energy consumption. All the compressors are designed with half adders 
and full adders only. 
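To make the idea of a compressor concrete, the sketch below models a 5-3 compressor at bit level using only half adders and full adders. The arrangement is a generic one chosen for illustration and is an assumption; it is not the circuit proposed in the cited paper, and it says nothing about that paper's step count. All names are hypothetical.

```python
from itertools import product


def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) of two bits."""
    return a ^ b, a & b


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) of three bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_5_3(x0: int, x1: int, x2: int, x3: int, x4: int) -> tuple[int, int, int]:
    """Compress five equal-weight bits into a 3-bit count (weights 1, 2, 4)."""
    s1, c1 = full_adder(x0, x1, x2)      # first level, runs in parallel
    s2, c2 = half_adder(x3, x4)          # with this half adder
    out0, c3 = half_adder(s1, s2)        # weight-1 output bit
    out1, out2 = full_adder(c1, c2, c3)  # weight-2 and weight-4 output bits
    return out0, out1, out2


# Exhaustive check: the 3-bit output always equals the number of 1 inputs.
for bits in product((0, 1), repeat=5):
    out0, out1, out2 = compressor_5_3(*bits)
    assert out0 + 2 * out1 + 4 * out2 == sum(bits)
```

The exhaustive loop covers all 32 input patterns, so the model is verified as a correct ones counter for five inputs.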
The research paper “ASIC Implementation of 4 Bit Multipliers” by Prvinkumar G. Parate, Prafulla S. 
Patil and Dr (Mrs) S. Subbaraman presents a basic algorithm for multiplying 4-bit binary numbers 
that is suitable for CMOS implementation with a partial product generation technique. From the simulation 
results, the authors conclude that the Booth Multiplier is inferior in all respects and hence should be avoided. 
In terms of power-delay product, the Array Multiplier turns out to be better than the Wallace Tree Multiplier. 
The Array Multiplier gives the best power consumption and component count, but its delay is larger than that of 
the Wallace Tree Multiplier. Hence the Array Multiplier is suggested for low-power requirements and the 
Wallace Tree Multiplier for low-delay requirements. 
The research paper “Delay Power Performance Comparison of Multipliers in VLSI Circuit Design” by 
Sumit Vaidya and Deepak Dandekar notes that a typical central processing unit devotes a considerable 
amount of processing time to arithmetic operations, particularly multiplication. 
Multiplication is one of the basic arithmetic operations, and it requires substantially more hardware resources 
and processing time than addition and subtraction. In fact, 8.72% of all instructions in typical processing 
units are multiplications. The authors report that the Booth Multiplier is superior in all respects, such as 
speed, delay, area, complexity and power consumption. The Array Multiplier, by contrast, consumes more power; it 
gives the optimum number of components, but its delay is larger than that of the Wallace Tree Multiplier. 
Hence Booth’s multiplier is suggested for both low-power and low-delay requirements. The work can be extended 
by optimizing this multiplier further to improve the speed or to minimize the delay. 
The research paper “Fast and Power Efficient 16×16 Array of Array Multiplier using Vedic 
Multiplication” by Dr. K.S. Gurumurthy and M.S. Prahalad discusses the “Array of Array” multiplier, which is a 
derivative of the Braun Array Multiplier. Braun arrays are well suited to VLSI implementation because of their 
low space complexity, although they show larger time complexity. Tree multipliers, on the other hand, have a 
time complexity of O(log n) but are less suitable for VLSI implementation: being less regular, they require a 
larger total routing length, which leads to performance degradation. Simply put, they show higher space 
complexity. The main advantage of “Array of Array” multipliers is their inherent ability to reduce both time and 
space complexity, with intermediate relative performance. In that paper, a 16×16 unsigned “Array of Array” 
multiplier circuit is designed with hierarchical structuring and optimized using the Vedic Multiplication 
Sutra (algorithm) “UrdhvaTriyagbhyam” and the Karatsuba-Ofman algorithm. The proposed algorithm is useful for 
math coprocessors in the field of computers. The algorithm is implemented on a SPARTAN-3E FPGA (Field 
Programmable Gate Array). The proposed multiplier implementation shows a large reduction in average power 
dissipation and in time delay compared to a Booth-encoded radix-4 multiplier. 
4. Objective 
The main objective of this paper is to design and implement a fast multiplier that can be used in 
Digital Signal Processing (DSP) applications such as convolution, Fast Fourier Transform (FFT) and filtering, 
and in the arithmetic and logic units of microprocessors. 
In this work, the array multiplication algorithm and the multiplier with compressors are studied. 
Multipliers with compressors offer high-speed performance. With total delays that are proportional to the 
logarithm of the operand word length, multipliers with compressors are faster than array multipliers, whose 
delay grows linearly with the operand word length. The architecture of these multipliers is designed according 
to power, speed and area specifications. 
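The difference between logarithmic and linear delay growth can be sketched by counting reduction levels. The model below is a simplification under stated assumptions: the compressor tree uses 3:2 compressors (full adders) in a Wallace-style reduction, the array multiplier adds one partial-product row per adder row, and the final carry-propagate adder and wiring delay are ignored. The function names are hypothetical.

```python
def compressor_tree_levels(rows: int) -> int:
    """Count 3:2 compressor levels needed to reduce `rows` operands to two."""
    levels = 0
    while rows > 2:
        # Every group of three rows becomes two; leftover rows pass through.
        rows = 2 * (rows // 3) + rows % 3
        levels += 1
    return levels


def array_adder_rows(n: int) -> int:
    """An n-row array multiplier accumulates its rows with n - 1 adder rows."""
    return n - 1


for n in (4, 8, 16, 32):
    print(n, array_adder_rows(n), compressor_tree_levels(n))
```

In this model the array depth doubles whenever the word length doubles, while the tree gains only about two levels, which matches the logarithmic versus linear behaviour described above.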
5. Methodology 
Multiplication is one of the basic functions used in digital signal processing (DSP). It requires more hardware 
resources and processing time than addition and subtraction. In computers, a typical central processing unit 
devotes a considerable amount of processing time to arithmetic operations, particularly 
multiplication. Most high-performance digital signal processing systems rely on hardware 
multiplication to achieve high data throughput. 
Multiplication-based operations such as Multiply and Accumulate (MAC) are currently implemented in many 
Digital Signal Processing (DSP) applications such as convolution, Fast Fourier Transform (FFT) and filtering, 
and in the arithmetic and logic units of microprocessors [2]. 
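The role of the MAC operation in convolution can be sketched as follows. This is a plain software model, assuming finite, non-empty input sequences; the function names are hypothetical and it is not a description of any particular hardware MAC unit.

```python
def mac(acc: int, a: int, b: int) -> int:
    """One multiply-and-accumulate step: acc + a * b."""
    return acc + a * b


def convolve(x: list[int], h: list[int]) -> list[int]:
    """Linear convolution of x and h built entirely from MAC steps."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] = mac(y[i + j], xi, hj)
    return y


print(convolve([1, 2, 3], [1, 1]))  # [1, 3, 5, 3]
```

Convolving sequences of lengths N and M takes N × M MAC steps in this model, so the delay of each multiplication directly bounds the throughput of the filter.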
References 
[1] R. Nirlakalla, T. Subba Rao, T. Jayachandra Prasad, “Performance Evaluation of High Speed 
Compressors for High Speed Multipliers”, Serbian Journal of Electrical Engineering, 
Vol. 8, No. 3, pp. 293-306, November 2011. 
[2] Prvinkumar G. Parate, Prafulla S. Patil, S. Subbaraman, “ASIC Implementation of 4 Bit 
Multipliers”, IEEE First International Conference on Emerging Trends in Engineering and Technology, 
pp. 408-413, 2008. 
[3] S. Vaidya, D. Dandekar, “Delay-Power Performance Comparison of Multipliers in 
VLSI Circuit Design”, International Journal of Computer Networks & Communications (IJCNC), 
Vol. 2, No. 4, July 2010. 
[4] K.S. Gurumurthy, M.S. Prahalad, “Fast and Power Efficient 16×16 Array of Array Multiplier 
using Vedic Multiplication”, International Conference on Embedded System, Jan. 2011. 
[5] P. Ienne, Ajay K. Verma, “Arithmetic Transformations to Maximise the Use of Compressor 
Trees”, in Proceedings of the IEEE International Workshop on Electronic Design, Test and 
Applications, 2004. 
[6] R. Hussin, A. Yeon Md. Shakaff, N. Idris, Z. Sauli, Rizalafande C. Ismail, Afzan Kamarudin, “An 
Efficient Modified Booth Multiplier Architecture”, IEEE International Conference on Electronics Design, 
Dec. 2008. 
[7] A.K. Verma, P. Ienne, “Automatic Synthesis of Compressor Trees: Reevaluating Large Counters”, 
DATE07, EDAA, 2007. 
[8] S.R. Vaidya, D.R. Dandekar, “Performance Comparison of Multipliers for Power-Speed Trade-off in 
VLSI Design”, Recent Advances in Networking, VLSI and Signal Processing, pp. 262-265. 
[9] A. Deshpande, Jeff Draper, “Squaring Units and a Comparison with Multipliers”, IEEE, 
pp. 1266-1269, 2010. 
[10] M. Kamran, S. Feng, J. Weixing, “Large Data Handling Technique for Compression Pre-coder using 
Scalable Algorithm”, Journal of Information & Communication Technology, Vol. 1, No. 1, 
pp. 42-50, 2007. 
[11] A. Vazquez, E. Antelo, P. Montuschi, “Improved Design of High-Performance Parallel Decimal 
Multipliers”, IEEE Transactions on Computers, Vol. 59, No. 5, pp. 679-693, May 
2010. 
[12] O. Kwon, P. Alto, “A 16-Bit by 16-Bit MAC Design Using Fast 5:3 Compressor Cells”, Journal of 
VLSI Signal Processing, 31, pp. 77-89, 2002. 
[13] S.V. Siddamal, R.M. Banakar, B.C. Jinaga, “Design of High-Speed Floating Point Multiplier”, 4th 
IEEE International Symposium on Electronic Design, Test & Applications, pp. 285-289, 2008. 
[14] S. Devadas, S. Malik, “A Survey of Optimization Techniques Targeting Low Power VLSI Circuits”, 
32nd ACM/IEEE Design Automation Conference, 1995. 
[15] A. Silva, E. Costa, S. Almeida, M. Porto, S. Bampi, “High Performance Motion Estimation Architecture 
Using Efficient Adder-Compressors”, SBCCI’09, August 31 - September 3, 2009. 
[16] K.Z. Pekmestzi, “Multiplexer-Based Array Multipliers”, IEEE Transactions on Computers, 
Vol. 48, No. 1, January 1999. 
[17] H. Parandeh-Afshar, P. Brisk, P. Ienne, “A Novel FPGA Logic Block for Improved Arithmetic 
Performance”, FPGA’08, February 24-26, Monterey, California, USA, 2008. 
[18] Q. Li, G. Liang, Amine Bermak, “A High-speed 32-bit Signed/Unsigned Pipelined Multiplier”, 
Fifth IEEE International Symposium on Electronic Design, Test & Applications, pp. 207-211, 2010. 
[19] C.N. Marimuthu, P. Thangaraj, “Low Power High Performance Multiplier”, ICGST-PDCS, Volume 8, 
Issue 1, Dec. 2008. 
[20] Kuan-Hung Chen, Yuan-Sun Chu, “A Low-Power Multiplier With the Spurious Power Suppression 
Technique”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume 15, 
Issue 7, pp. 846-850, 2007. 
[21] V.N. Chaudhary, P.R. Deshmukh, “Analysis and Implementation of Low Power Wallace 
Tree Multiplier Kevin Nowka”, ISSN: 0975-6779, Nov 10 to Oct 11, Volume 01, Issue 02. 
[22] Y. Ben Asher, E. Stein, “Extending Booth Algorithm to Multiplications of Three Numbers on FPGAs”, 
IEEE, pp. 333-336, 2008. 
[23] C. Senthilpari, A.K. Singh, K. Diwakar, “Design of a low-power, high performance, 8×8 bit 
multiplier using a Shannon-based adder cell”, Microelectronics Journal, 39, pp. 812-821, 2008. 
[24] X. Zhang, A. Bermak, F. Boussaid, “Power Optimization in Multipliers Using Multi-Precision 
Combined with Voltage Scaling Techniques”, IEEE, pp. 79-82, 2009. 
 
 
 
