Performance evaluation of high speed compressors for high speed multipliers by Nirlakalla Ravi et al.
SERBIAN JOURNAL OF ELECTRICAL ENGINEERING 
Vol. 8, No. 3, November 2011, 293-306 
Performance Evaluation of High Speed 
Compressors for High Speed Multipliers 
Ravi Nirlakalla¹, Thota Subba Rao², Talari Jayachandra Prasad³
Abstract: This paper describes high speed compressors for high speed parallel 
multipliers, such as the Booth multiplier and the Wallace tree multiplier, used in 
Digital Signal Processing (DSP). It presents 4-3, 5-3, 6-3 and 7-3 compressors for 
high speed multiplication. These compressors reduce the vertical critical path more 
rapidly than conventional compressors. A conventional 5-3 compressor can take 
four steps to reduce 5 bits to 3, whereas the proposed 5-3 compressor takes only 
2 steps. The compressors are simulated with H-Spice at a temperature of 25°C and 
a supply voltage of 2.0 V using 90 nm MOSIS technology. The power, delay, Power 
Delay Product (PDP) and Energy Delay Product (EDP) of the compressors are 
calculated to analyze the total propagation delay and energy consumption. All 
the compressors are designed with half adders and full adders only.
Keywords: Compressors, Adders, Delay, Power, PDP, EDP. 
1 Introduction
With the recent trends toward greater mobility and performance in small 
hand-held mobile communication and portable devices, speed has become one 
of the main emphases in modern VLSI design among the three thrust areas, 
i.e. speed, area and power. Parallel multipliers can be used to speed up 
processors compared with serial multipliers.
There are two basic approaches to enhancing the speed of parallel multipliers: 
one is the Booth algorithm and the other is the Wallace tree with compressors or 
counters. As far as power is concerned, however, these two methods are not well 
suited, because energy dissipation is higher [1].
Multiplier architecture can be divided into three stages: a partial product 
generation stage, a partial product addition stage and a final addition stage. 
Multipliers consume a large amount of power and incur much of their delay 
during partial product addition. For higher order multiplications, a huge number 
of adders or compressors are used to perform the partial product addition [2]. 
The number of adders was minimized by introducing different high order 
compressors. The binary counter property has been merged with the compressor 
property to develop high order compressors such as 5-3, 6-3 and 7-3 
compressors [3, 4].

¹Department of Physics, RGM Engg College, Nandyal, AP-India, 518501; E-mail: ravi2728@gmail.com 
²Department of Physics, S.K. University, Anantapur, AP-India, 515003 
³Department of ECE, RGM Engg College, Nandyal, AP-India, 518501 
UDK: 004.383.3  DOI: 10.2298/SJEE1103293N
The paper is organized as follows: Section 1 is the introduction. The Wallace 
tree and compressors are described in Section 2. A review of adders is given in 
Section 3. The architectures of the compressors are discussed in Section 4. 
Section 5 deals with results and discussion. Finally, the conclusion of the paper 
is given in Section 6.
2 Wallace Tree 
When speed is not an issue in a multiplier, the partial products can be added 
serially to reduce the design complexity. In high-speed designs, for example 16 
bit [5], the Wallace tree method [6] is usually used to add the partial products. 
This method takes all the bits in each column at a time and compresses them 
into two or three bits. Adders and compressors can be used for vertical 
compression of bits in partial product reduction. A full adder is itself a 
compressor: it compresses three bits into two bits. Hence it is a 3-2 compressor. 
For high order multiplication, high order compressors are used to compress the 
bits [7-8]. In [3], the 16×16 bit multiplication is as shown in Fig. 1. There, the 
4-3, 5-3, 6-3 and 7-3 compressors are designed with half adders, full adders and 
a logic block, and are used for vertical compression of the bits. The proposed 
compressors, in contrast, are designed entirely with efficient half adders and 
full adders, which are discussed in later sections.
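The view of a full adder as a 3-2 compressor can be sketched at word level. This is a minimal illustration and not the authors' circuit: the function name `carry_save_add` and the sample operands are assumptions made here. One full adder per bit column turns three partial-product rows into a sum row and a carry row whose total is unchanged.

```python
def carry_save_add(a, b, c):
    """3-2 compression of three non-negative integers (one full adder per bit column)."""
    sum_word = a ^ b ^ c                              # per-column sum bit
    carry_word = ((a & b) | (b & c) | (a & c)) << 1   # per-column carry, moved to the next column
    return sum_word, carry_word


if __name__ == "__main__":
    rows = (0b1011, 0b0110, 0b1101)   # hypothetical partial-product rows
    s, c = carry_save_add(*rows)
    assert s + c == sum(rows)         # three rows became two without changing the total
    print(bin(s), bin(c))
```

Repeating this step on the remaining rows is what shortens the column height stage by stage; higher order compressors do the same job with more inputs per step.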
[Figure: partial-product columns labelled A to P]
Fig. 1 – 16×16 bit Wallace Tree multiplier.
3  A Review of Adders 
A. Half Adder: A half adder can be constructed with one AND gate and one XOR gate. 
In this paper an efficient low power design is used to construct the XOR gate [9].
B. Full Adder: The 1-bit full-adder functionality can be summarized by the 
following equations. Given the three 1-bit inputs $A$, $B$ and $C_{in}$, it is desired to 
generate the two 1-bit outputs $Sum$ and $C_{out}$, where: 
  $Sum = (A \oplus B) \oplus C_{in}$, (1) 
  $C_{out} = A \cdot B + C_{in} \cdot (A \oplus B)$. (2)
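The two full-adder equations can be checked exhaustively with a short script. The sketch below is only an illustration: the function name `full_adder` is chosen here, and the logical OR in the carry equation is written with the bitwise `|` operator. It confirms that the sum is the XOR of the three inputs and that the carry, taken with weight 2, makes the outputs equal to the number of ones at the inputs.

```python
from itertools import product


def full_adder(a, b, c_in):
    """1-bit full adder following Eqs. (1) and (2)."""
    total = (a ^ b) ^ c_in                 # Eq. (1): Sum = (A xor B) xor Cin
    c_out = (a & b) | (c_in & (a ^ b))     # Eq. (2): Cout = A.B + Cin.(A xor B)
    return total, c_out


if __name__ == "__main__":
    for a, b, c_in in product((0, 1), repeat=3):
        total, c_out = full_adder(a, b, c_in)
        # The carry has twice the weight of the sum bit.
        assert 2 * c_out + total == a + b + c_in
    print("all 8 input combinations agree")
```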
A transmission function full adder (TFA) based on the transmission 
function theory is shown in Fig. 2. A transmission-gate adder (TGA) using 
CMOS transmission gates is shown in Fig. 3. A transmission gate logic circuit is 
a special kind of pass-transistor logic circuit. It is built by connecting a pMOS 
transistor and an nMOS transistor in parallel, which are controlled by 
complementary control signals. When they are turned on simultaneously, the 
pMOS and nMOS transistors provide the path for the input logic “1” and “0”, 
respectively [10]. The 14-T full adder design shown in Fig. 4 uses only 
one inverter, but has the problem of output glitches and a subthreshold leakage 
power component. This is due to the incomplete voltage swing of the XOR gate 
output signal (an internal node of the adder) for the case A = B = 0, where the 
PMOS transistor will be ON while the NMOS will not be totally OFF, leading 
to a larger subthreshold current. Another 16-T full adder [11], shown in Fig. 5, 
uses the low power designs of XOR and XNOR gates along with pass 
transistors and transmission gates. The adder offers higher speed and lower 
power consumption than other implementations of the full adder. However, 
A = B = 1. The pass transistor logic based Static Energy-Recovery Full (SERF) 
adder with ten transistors, shown in Fig. 6, claimed superiority in energy 
consumption [12].
The performance of the adders is verified with 90 nm technology in 
terms of average power, propagation delay, PDP and EDP. The results for these 
adders are generated at a supply voltage V = 2.0 V. Better results are achieved 
with the 16-T adder, which has the lowest delay, PDP and EDP in Table 1.
Table 1 
Comparison of power, delay, PDP, EDP of Adders. 
Adder Type   Power [W]   Delay [s]   PDP [J]    EDP [Js] 
14-T         2.33E-05    8.97E-10    2.09E-14   1.87E-23 
16-T         1.36E-05    5.07E-10    6.89E-15   3.49E-24 
TFA          3.05E-05    2.51E-09    7.65E-14   1.92E-22 
TGCMOS       5.07E-05    9.36E-10    4.74E-14   4.44E-23 
SERF         9.04E-06    1.65E-09    1.49E-14   2.46E-23
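Table 1 lists PDP and EDP without stating how they are derived. Assuming the usual definitions, PDP as average power times delay and EDP as PDP times delay (the symbols $P_{avg}$ and $t_d$ are introduced here for illustration only), the 16-T row can be reproduced to within rounding:

```latex
\begin{align*}
\mathrm{PDP} &= P_{avg}\, t_d
  = 1.36\times10^{-5}\,\mathrm{W} \times 5.07\times10^{-10}\,\mathrm{s}
  \approx 6.9\times10^{-15}\,\mathrm{J},\\
\mathrm{EDP} &= \mathrm{PDP}\, t_d
  \approx 6.9\times10^{-15}\,\mathrm{J} \times 5.07\times10^{-10}\,\mathrm{s}
  \approx 3.5\times10^{-24}\,\mathrm{Js}.
\end{align*}
```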
 
Fig. 2 – TG CMOS Adder. 
 
Fig. 3 – TFA Adder.
 
Fig. 4 – 16-T Adders. 
 
Fig. 5 – 14-T Adder.
 
Fig. 6 – SERF Adder. 
 
4 Compressor Architectures 
A single bit full adder can be considered as a counter. If A, B, C and D are the 
inputs of a 4-3 counter and X, Y and Z are its three outputs, then X is the LSB 
and Z is the MSB. The input combinations and the corresponding decimal counts 
are shown in Table 2. Based on this counter property, the 4-3 compressor shown 
in Fig. 7 is constructed using a full adder and two half adders along with 
efficient XOR designs.
Table 2 
Adder as 4, 3 counter. 
Input                      Z  Y  X   Decimal Count 
All the inputs are zero    0  0  0   0 
Any one input is one       0  0  1   1 
Any two inputs are one     0  1  0   2 
Any three inputs are one   0  1  1   3 
All the inputs are one     1  0  0   4
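The counter property behind the 4-3 compressor can be sketched in software. The wiring below (one full adder followed by two half adders) is one possible arrangement consistent with the description and is an assumption, as are the function names; the paper's Fig. 7 may connect the cells differently. The script checks that the outputs, read as a binary number with Z as the MSB and X as the LSB, equal the decimal count of inputs that are one.

```python
from itertools import product


def half_adder(a, b):
    """Return (sum, carry) of two bits."""
    return a ^ b, a & b


def full_adder(a, b, c):
    """Return (sum, carry) of three bits."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def compressor_4_3(a, b, c, d):
    """4-3 counter from one full adder and two half adders (assumed wiring)."""
    s1, c1 = full_adder(a, b, c)   # s1 has weight 1, c1 has weight 2
    x, c2 = half_adder(s1, d)      # X is the LSB; c2 has weight 2
    y, z = half_adder(c1, c2)      # Y has weight 2; Z (MSB) has weight 4
    return z, y, x


if __name__ == "__main__":
    for bits in product((0, 1), repeat=4):
        z, y, x = compressor_4_3(*bits)
        assert 4 * z + 2 * y + x == sum(bits)
    print("4-3 outputs match the decimal count for all 16 inputs")
```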
The 5-3 compressor uses 2 full adders connected in ripple fashion, as shown in 
Fig. 8. The 6-3 compressor uses 3 full adders and one half adder, and the 7-3 
compressor uses 4 full adders, as shown in Figs. 9 and 10. Now consider 
column H in Fig. 1, which has 9 bits: a 6-3 compressor and one full adder are 
needed to reduce its bits. Column D has 13 bits; using just one 6-3 and one 7-3 
compressor we may compress them into 6 bits. Hence the multiplication will be 
very fast due to the reduction in critical path with these compressors. The truth 
table of the 4-3 compressor is shown in Table 2.
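The cell counts quoted for the 6-3 and 7-3 compressors, and the column bit counts, can be illustrated as follows. The wiring is an assumption chosen to match the stated numbers of full and half adders and may differ from Figs. 9 and 10; all function names are hypothetical. `bits_after` only counts bits, as the text does, and ignores the fact that compressor outputs carry different weights.

```python
from itertools import product


def half_adder(a, b):
    return a ^ b, a & b


def full_adder(a, b, c):
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def compressor_6_3(a, b, c, d, e, f):
    """Three full adders and one half adder (assumed wiring)."""
    s1, c1 = full_adder(a, b, c)
    s2, c2 = full_adder(d, e, f)
    z0, c3 = half_adder(s1, s2)       # weight-1 output
    z1, z2 = full_adder(c1, c2, c3)   # weight-2 and weight-4 outputs
    return z2, z1, z0


def compressor_7_3(a, b, c, d, e, f, g):
    """Four full adders (assumed wiring)."""
    s1, c1 = full_adder(a, b, c)
    s2, c2 = full_adder(d, e, f)
    z0, c3 = full_adder(s1, s2, g)
    z1, z2 = full_adder(c1, c2, c3)
    return z2, z1, z0


def bits_after(height, compressors):
    """Bits left in a column after applying (inputs, outputs) compressors once."""
    used = sum(n_in for n_in, _ in compressors)
    assert used <= height, "compressors consume more bits than the column holds"
    return height - used + sum(n_out for _, n_out in compressors)


if __name__ == "__main__":
    for n, comp in ((6, compressor_6_3), (7, compressor_7_3)):
        for bits in product((0, 1), repeat=n):
            z2, z1, z0 = comp(*bits)
            assert 4 * z2 + 2 * z1 + z0 == sum(bits)
    print(bits_after(9, [(6, 3), (3, 2)]))    # column H: 6-3 plus one full adder
    print(bits_after(13, [(6, 3), (7, 3)]))   # column D: 6-3 plus 7-3
```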
[Block diagram: inputs A, B, C, D; outputs Z0, Z1, Z2]
Fig. 7 – 4-3 compressor.
[Block diagram: inputs A, B, C, D, E; outputs Z0, Z1, Z2]
Fig. 8 – 5-3 compressor.
[Block diagram: inputs A, B, C, D, E, F; outputs Z0, Z1, Z2]
Fig. 9 – 6-3 compressor.
[Block diagram: inputs A, B, C, D, E, F, G; outputs Z0, Z1, Z2]
Fig. 10 – 7-3 compressor.
 
5  Results and Discussions 
The functionality of the compressors is verified at gate level with the Xilinx 
ISE 9.1 synthesis tool, describing them in Verilog HDL. The simulation 
waveforms of these compressors are shown in Figs. 11, 12, 13 and 14.
The average power, propagation delay, Power Delay Product (PDP) 
and Energy Delay Product (EDP) of the compressors are calculated at transistor 
level using H-Spice with different full adder designs at a temperature of 25°C 
and a frequency of 100 MHz, using the 90 nm MOSIS CMOS technology file. The 
focus is not only on speed but also on power, which is why a power efficient 
XOR design is used in the half adder of the 4-3 and 6-3 compressors. 
Monte-Carlo simulation has been used for better results.
Total average powers of the proposed compressors are given in Table 3 
and the comparison graph is shown in Fig. 15. Total power includes dynamic, 
static and leakage power. Leakage power starts to dominate in nanometer 
technologies. The total propagation delays of the compressors with the different 
adders are shown in Table 4 and the comparison graph is shown in Fig. 16. The 
delay is calculated for all input and output combinations, and the worst case 
delays of the compressors are compared. As far as power is concerned, the SERF 
and 16-T compressors show better results. In terms of speed, the 14-T 
compressors show better performance. In terms of power delay product, the 7-3 
and 6-3 TFA compressors are less economical than the 16-T compressors; in the 
remaining cases the 16-T compressors are the most energy efficient. The PDP 
comparisons of the compressors are shown in Table 5 and the comparison graph 
is shown in Fig. 17. The Energy Delay Product (EDP) comparison is given in 
Table 6 and the variation is shown in Fig. 18. In PDP and EDP, the 16-T and 
SERF compressors show an improvement over the compressors built with the 
other adders. The output voltage swing of the 16-T compressors is also better 
than that of the 14-T and SERF compressors.
Table 3 
Average power (in [W]) of compressors. 
Compressor Type   TG CMOS      TFA          16-T         14-T         SERF (10T) 
7-3               4.7562E-04   2.5090E-04   7.3960E-05   7.1241E-05   4.6221E-05 
6-3               5.1638E-04   2.2965E-04   9.8845E-05   2.7021E-04   7.3254E-05 
5-3               2.4322E-04   1.2191E-04   3.1278E-05   3.4854E-05   1.9450E-05 
4-3               1.3685E-03   8.3446E-04   5.0348E-04   7.2032E-04   4.0577E-04
Table 4 
Delay (in [s]) of the compressors. 
Compressor Type   TG CMOS      TFA          16-T         14-T         SERF (10T) 
7-3               1.0763E-09   1.0603E-09   1.0578E-09   1.0043E-09   1.0431E-09 
6-3               1.0835E-09   1.0674E-09   1.0583E-09   9.9821E-10   9.9129E-10 
5-3               1.0984E-09   1.0942E-09   9.7639E-10   9.6992E-10   1.0174E-09 
4-3               9.0469E-10   4.4560E-09   4.5822E-10   4.5102E-09   4.4632E-09
Table 5 
PDP (in [J]) of the compressors. 
Compressor Type   TG CMOS      TFA          16-T         14-T         SERF (10T) 
7-3               5.1191E-13   2.6603E-13   7.8235E-14   7.1547E-14   4.8213E-14 
6-3               5.595E-13    2.4513E-13   1.0461E-13   2.6973E-13   7.2616E-14 
5-3               2.6715E-13   1.3339E-13   3.054E-14    3.3806E-14   1.9788E-14 
4-3               1.2381E-12   3.7184E-12   2.2435E-12   3.2488E-12   1.811E-12
Table 6 
EDP (in [Js]) of the compressors. 
Compressor Type   TG CMOS      TFA          16-T         14-T         SERF (10T) 
7-3               5.5097E-22   2.8207E-22   8.2757E-23   7.1855E-23   5.0291E-23 
6-3               6.0622E-22   2.6165E-22   1.1071E-22   2.6924E-22   7.1983E-23 
5-3               2.9344E-22   1.4596E-22   2.9818E-23   3.2789E-23   2.0133E-23 
4-3               1.1201E-21   1.6569E-20   1.028E-21    1.4653E-20   8.083E-21
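How the PDP and EDP entries relate to the power and delay tables can be sketched for the 7-3 compressor row. The script assumes PDP = power × delay and EDP = PDP × delay, definitions the paper does not state explicitly; the dictionary and function names are hypothetical. The input numbers are copied from the 7-3 rows of Tables 3 and 4.

```python
# 7-3 compressor rows of Table 3 (average power, W) and Table 4 (delay, s)
POWER_7_3 = {"TG CMOS": 4.7562e-04, "TFA": 2.5090e-04, "16-T": 7.3960e-05,
             "14-T": 7.1241e-05, "SERF": 4.6221e-05}
DELAY_7_3 = {"TG CMOS": 1.0763e-09, "TFA": 1.0603e-09, "16-T": 1.0578e-09,
             "14-T": 1.0043e-09, "SERF": 1.0431e-09}


def pdp(power_w, delay_s):
    """Power delay product in joules (assumed definition)."""
    return power_w * delay_s


def edp(power_w, delay_s):
    """Energy delay product in joule-seconds (assumed definition)."""
    return pdp(power_w, delay_s) * delay_s


def lowest(metric, power, delay):
    """Name of the adder style with the smallest value of the given metric."""
    return min(power, key=lambda name: metric(power[name], delay[name]))


if __name__ == "__main__":
    for name in POWER_7_3:
        p, d = POWER_7_3[name], DELAY_7_3[name]
        print(f"{name:8s} PDP = {pdp(p, d):.4e} J   EDP = {edp(p, d):.4e} Js")
    print("lowest PDP:", lowest(pdp, POWER_7_3, DELAY_7_3))
    print("lowest EDP:", lowest(edp, POWER_7_3, DELAY_7_3))
```

The same functions can be applied to the other rows by substituting their power and delay values.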
 
 
Fig. 11 – Waveforms of 4-3 compressor. 
 
Fig. 12 – Waveforms of 5-3 compressor.
 
Fig. 13 – Waveforms of 6-3 compressor. 
 
Fig. 14 – Waveforms of 7-3 compressor. 
[Bar chart: total power (W) of the 7-3, 6-3, 5-3 and 4-3 compressors for the TGCMOS, TFA, 16T, 14T and SERF adders]
Fig. 15 – Proposed Compressors comparison of average power.
 
[Bar chart: propagation delay (s) of the 7-3, 6-3, 5-3 and 4-3 compressors for the TGCMOS, TFA, 16T, 14T and SERF adders]
Fig. 16 – Proposed Compressors comparison of propagation delay.
 
[Bar chart: power-delay product of the 7-3, 6-3, 5-3 and 4-3 compressors for the TGCMOS, TFA, 16T, 14T and SERF adders]
Fig. 17 – Proposed Compressors comparison of PDP.
[Bar chart: energy-delay product (Js) of the 7-3, 6-3, 5-3 and 4-3 compressors for the TGCMOS, TFA, 16T, 14T and SERF adders]
Fig. 18 – Proposed Compressors comparison of EDP.
 
6 Conclusion 
To speed up Dadda, Wallace tree and Booth multipliers, compressors are 
the key to partial product reduction. The use of compressors in multipliers 
not only reduces the vertical critical path but also reduces the number of stage 
operations at the same time. To show better performance, the compressors are 
tested with efficient adders. Multi-threshold logic can also be used to improve 
the performance of the compressors. A 16 bit multiplier effectively utilizes all of 
the above compressors for partial product reduction. Despite the better results 
of SERF, the 16-T full adder compressors are more suitable for partial product 
reduction in multipliers, because the threshold loss is greater in SERF. Hybrid 
adders can also be used instead of identical adders to design a compressor.
7 References 
[1]  A. Bellaouar, M.I. Elmasry: Low-power Digital VLSI Design Circuits and Systems, Kluwer 
Academic Publishers, Boston, USA, 1995. 
[2]  V.G. Oklobdzija, D. Villeger, S.S. Liu: A Method for Speed Optimized Partial Product 
Reduction and Generation of Fast Parallel Multipliers using an Algorithmic Approach, IEEE 
Transactions on Computers, Vol. 45, No. 3, March 1996, pp. 294 – 306.
[3]  A. Dandapat, S. Ghosal, P. Sarkar, D. Mukhopadhyaya: A 1.2-ns 16×16-bit Binary 
Multiplier using High Speed Compressors, International Journal of Electrical, Computer and 
Systems Engineering, Vol. 4, No. 3, 2010, pp. 234 – 239. 
[4]  A. Dandapat, P. Bose, S. Ghosh, P. Sarkar, D. Mukhopadhyay: Design of an Application 
Specific Low-power High Performance Carry Save 4-2 Compressor, 11th VLSI Design and 
Test Symposium, Kolkata, India, Aug. 2007.
[5]  C.F. Law, S.S. Rofail, K.S. Yeo: Low-power Circuit Implementation for Partial Product 
Addition using Pass Transistor Logic, IEE proceedings – Circuits, Devices and Systems, 
Vol. 146, No. 3, June 1999, pp. 124 – 129. 
[6]  C.S. Wallace: A Suggestion for a Fast Multiplier, IEEE Transactions on Electronic 
Computers, Vol. EC-13, No. 1, Feb. 1964, pp. 14 – 17. 
[7]  R. Menon, D. Radhakrishnan: High Performance 5:2 Compressor Architectures, IEE 
proceedings – Circuits, Devices and Systems, Vol. 153, No. 5, Oct. 2006, pp. 447 – 452. 
[8]  S.R. Chowdhury, A. Banerjee, A. Roy, H. Saha: Design, Simulation and Testing of a High 
Speed Low Power 15-4 Compressor for High Speed Multiplication Applications, First 
International Conference on Emerging Trends in Engineering and Technology, Nagpur, 
Maharashtra, India, July 2008, pp 434 – 438.  
[9]   J. Wang, S. Fang, W. Feng: New Efficient Designs for XOR and XNOR Functions on the 
Transistor Level, IEEE Journal of Solid-State Circuits, Vol. 29, No. 7, July 1994, pp. 780 – 786. 
[10]  C.H. Chang, J. Gu, M. Zhang: A Review of 0.18-μm Full Adder Performances for Tree 
Structured Arithmetic Circuits, IEEE Transactions on Very Large Scale Integration (VLSI) 
Systems, Vol. 13, No. 6, June 2005, pp. 686 – 695.
[11]  A.M. Shams, M.A. Bayoumi: A Novel High-performance CMOS 1-Bit Full-adder Cell, 
IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 
Vol. 47, No. 5, May 2000, pp. 478 – 481.
[12]  J.F. Lin, Y.T. Hwang, M.H. Sheu, C.C. Ho: A Novel High-speed and Energy Efficient 10-
Transistor Full Adder Design, IEEE Transactions on Circuits And Systems I: Regular 
Papers, Vol. 54, No. 5, May 2007, pp. 1050 – 1059. 
 