Abstract: This paper describes high-speed compressors for high-speed parallel multipliers, such as the Booth multiplier and the Wallace tree multiplier, used in Digital Signal Processing (DSP). It presents 4-3, 5-3, 6-3 and 7-3 compressors for high-speed multiplication. These compressors reduce the vertical critical path more rapidly than conventional compressors. A conventional 5-3 compressor can take four steps to reduce 5 bits to 3, whereas the proposed 5-3 compressor takes only 2 steps. The compressors are simulated with H-Spice at a temperature of 25°C and a supply voltage of 2.0 V using 90 nm MOSIS technology. The power, delay, Power Delay Product (PDP) and Energy Delay Product (EDP) of the compressors are calculated to analyze the total propagation delay and energy consumption. All the compressors are designed with half adders and full adders only.
Introduction
With the recent trend toward greater mobility and performance in small hand-held mobile communication and portable devices, speed has become one of the main emphases in modern VLSI design among the three thrust areas of speed, area and power. Parallel multipliers can be used to speed up processors compared with serial multipliers.
There are two basic approaches to enhancing the speed of parallel multipliers: one is the Booth algorithm and the other is the Wallace tree with compressors or counters. As far as power is concerned, however, these two methods are not well suited, because energy dissipation is higher [1].
A multiplier architecture can be divided into three stages: a partial product generation stage, a partial product addition stage and a final addition stage. Multipliers incur a large amount of power and delay during partial product addition. For higher-order multiplications, a huge number of adders or compressors are used to perform the partial product addition [2]. The number of adders is minimized by introducing different high-order compressors. The binary counter property has been merged with the compressor property to develop high-order compressors such as the 5-3, 6-3 and 7-3 compressors [3, 4].
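The three stages can be illustrated with a minimal Python sketch. It is not the design evaluated in this paper: the reduction below uses only 3-2 compressors (full adders), all function names are hypothetical, and ordinary integer addition stands in for the final carry-propagate adder.

```python
def partial_products(a, b, n):
    """Stage 1: AND each bit of a with each bit of b.

    The bit a_i & b_j has weight 2**(i + j), so it is placed in column i + j.
    """
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return cols


def reduce_columns(cols):
    """Stage 2: compress every column until it holds at most two bits.

    Assumption: only full adders (3-2 compressors) are used in this sketch.
    """
    while any(len(bits) > 2 for bits in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for w, bits in enumerate(cols):
            k = 0
            while len(bits) - k >= 3:
                x, y, z = bits[k:k + 3]
                nxt[w].append(x ^ y ^ z)                    # sum stays in column w
                nxt[w + 1].append((x & y) | (z & (x ^ y)))  # carry moves to column w + 1
                k += 3
            nxt[w].extend(bits[k:])                         # leftover bits pass through
        cols = nxt
    return cols


def final_add(cols):
    """Stage 3: add the remaining rows (integer addition as a stand-in)."""
    return sum(bit << w for w, bits in enumerate(cols) for bit in bits)


def multiply(a, b, n):
    return final_add(reduce_columns(partial_products(a, b, n)))


# Column heights of the partial product matrix for a 16x16 multiplication.
heights = [len(c) for c in partial_products(0, 0, 16)]
```

For a 16×16 multiplication the column heights rise from 1 to 16 and fall back again, which is why the tall middle columns benefit most from higher-order compressors.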
The paper is organized as follows: Section 1 introduces the compressors. The Wallace tree and the compressors are described in Section 2. A review of adders is given in Section 3. The architectures of the compressors are discussed in Section 4. Section 5 deals with results and discussions. Finally, the conclusion of the paper is given in Section 6.
Wallace Tree
When speed is not an issue in a multiplier, the partial products can be added serially to reduce the design complexity. In high-speed designs, for example 16-bit multipliers [5], the Wallace tree method [6] is usually used to add the partial products. In this method, all the bits in each column are taken at a time and compressed into two or three bits. Adders and compressors can be used for vertical bit compression in partial product reduction. A full adder is itself a compressor: it compresses three bits into two bits, and is hence a 3-2 compressor. For high-order multiplication, high-order compressors are used to compress the bits [7] [8]. A 16×16-bit multiplication, as in [3], is shown in Fig. 1. There, the 4-3, 5-3, 6-3 and 7-3 compressors are designed with half adders, full adders and a logic block, and are used for vertical compression of the bits. The proposed compressors, in contrast, are designed entirely with efficient half adders and full adders, which are discussed in later sections.
In this paper an efficient low-power design is used to construct the XOR gate [9].
B. Full Adder: The functionality of the 1-bit full adder can be summarized by the following equations. Given the three 1-bit inputs A, B and Cin, it is desired to generate the two 1-bit outputs Sum and Cout, where:
$$\mathrm{Sum} = A \oplus B \oplus C_{in}$$
$$C_{out} = A \cdot B + C_{in} \cdot (A \oplus B)$$
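The behaviour of a 3-2 compressor can be checked exhaustively in a few lines. The Python sketch below is only a behavioural illustration, assuming the standard full-adder equations Sum = A xor B xor Cin and Cout = A·B + Cin·(A xor B); the function names are hypothetical and nothing here models the transistor-level designs.

```python
from itertools import product


def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (sum, cout) for three 1-bit inputs (a 3-2 compressor)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


# Walk through all eight input combinations and confirm that the two
# output bits encode the number of ones among the three inputs.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    assert s + 2 * cout == a + b + cin
    print(a, b, cin, "->", "Sum =", s, "Cout =", cout)
```

The assertion expresses the compressor property directly: three bits of equal weight are replaced by one bit of the same weight and one bit of the next higher weight.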
A transmission function full adder (TFA), based on transmission function theory, is shown in Fig. 2. A transmission-gate adder (TGA) using CMOS transmission gates is shown in Fig. 3. A transmission gate logic circuit is a special kind of pass-transistor logic circuit. It is built by connecting a pMOS transistor and an nMOS transistor in parallel, controlled by complementary control signals. When they are turned on simultaneously, the pMOS and nMOS transistors provide the path for the input logic "1" and "0", respectively [10]. The 14-T full adder design shown in Fig. 4 uses only one inverter, but has the problems of output glitches and a subthreshold leakage power component. This is due to the incomplete voltage swing of the XOR gate output signal (an internal node of the adder) for the case 0
, where the pMOS transistor will be ON while the nMOS transistor will not be totally OFF, leading to a larger subthreshold current. Another full adder, the 16-T design [11] shown in Fig. 5, uses low-power designs of the XOR and XNOR gates along with pass transistors and transmission gates. This adder offers higher speed and lower power consumption than other implementations of the full adder. However,
. The pass-transistor-logic-based Static Energy-Recovery Full (SERF) adder with ten transistors, shown in Fig. 6, claims superiority in energy consumption [12].
The performance of the adders is verified with 90 nm technology in terms of average power, propagation delay, PDP and EDP. The results for these adders are generated at a supply voltage of 2.0 V. Better results can be achieved with the 16-T adder.
Compressor Architectures
A single-bit full adder can be considered a counter. If A, B, C and D are the inputs of a 4-input counter and X, Y and Z are its three outputs, then X is the LSB and Z is the MSB. The input combinations and the corresponding decimal counts are shown in Table 2. Based on this counter property, the 4-3 compressor shown in Fig. 7 is constructed using a full adder and two half adders along with efficient XOR designs.
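The counter property can be checked with a short behavioural sketch in Python. The wiring below (one full adder followed by two half adders) is an assumption that matches the stated adder count; it is not taken from Fig. 7, and the function names are hypothetical.

```python
from itertools import product


def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, c):
    """Return (sum, carry) for three 1-bit inputs."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def compressor_4_3(a, b, c, d):
    """Assumed 4-3 wiring: one full adder and two half adders.

    Returns (X, Y, Z) with X the LSB and Z the MSB of the count.
    """
    s1, c1 = full_adder(a, b, c)   # count the first three inputs
    x, c2 = half_adder(s1, d)      # fold in the fourth input
    y, z = half_adder(c1, c2)      # combine the two weight-2 carries
    return x, y, z


# List every input combination with its outputs and decimal count.
for a, b, c, d in product((0, 1), repeat=4):
    x, y, z = compressor_4_3(a, b, c, d)
    count = x + 2 * y + 4 * z
    assert count == a + b + c + d
    print(a, b, c, d, "->", "Z Y X =", z, y, x, "count =", count)
```

Because the outputs encode the number of ones among the inputs, Z is 1 only when all four inputs are 1.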
Performance Evaluation of High Speed Compressors for High Speed Multipliers
The 5-3 compressor uses 2 full adders connected in ripple fashion, as shown in Fig. 8. The 6-3 compressor uses 3 full adders and one half adder, and the 7-3 compressor uses 4 full adders, as shown in Figs. 9 and 10. Now consider column H in Fig. 1, which contains 9 bits: one 6-3 compressor and one full adder are needed to compress this column. Column D contains 13 bits; using just one 6-3 and one 7-3 compressor, they can be compressed into 6 bits. Hence the multiplication is very fast, because these compressors shorten the critical path. The truth table of the 4-3 compressor is shown in Table 2.
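The adder counts given above can be turned into a behavioural Python sketch. The wirings are assumptions chosen to match those counts, not reproductions of Figs. 8 to 10. In particular, the 5-3 sketch assumes that two rippled full adders deliver one sum bit plus two carry bits of equal weight; the function names are hypothetical.

```python
from itertools import product


def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, c):
    """Return (sum, carry) for three 1-bit inputs."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def compressor_5_3(a, b, c, d, e):
    """Assumed wiring: 2 full adders in ripple. Returns (sum, carry1, carry2)."""
    s1, c1 = full_adder(a, b, c)
    s2, c2 = full_adder(s1, d, e)
    return s2, c1, c2              # both carries have weight 2


def compressor_6_3(a, b, c, d, e, f):
    """Assumed wiring: 3 full adders and 1 half adder. Returns (X, Y, Z)."""
    s1, c1 = full_adder(a, b, c)
    s2, c2 = full_adder(d, e, f)
    x, c3 = half_adder(s1, s2)
    y, z = full_adder(c1, c2, c3)
    return x, y, z


def compressor_7_3(a, b, c, d, e, f, g):
    """Assumed wiring: 4 full adders. Returns (X, Y, Z)."""
    s1, c1 = full_adder(a, b, c)
    s2, c2 = full_adder(d, e, f)
    x, c3 = full_adder(s1, s2, g)
    y, z = full_adder(c1, c2, c3)
    return x, y, z


# Exhaustive check that each sketch preserves the number of ones.
for bits in product((0, 1), repeat=5):
    s, c1, c2 = compressor_5_3(*bits)
    assert s + 2 * (c1 + c2) == sum(bits)
for bits in product((0, 1), repeat=6):
    x, y, z = compressor_6_3(*bits)
    assert x + 2 * y + 4 * z == sum(bits)
for bits in product((0, 1), repeat=7):
    x, y, z = compressor_7_3(*bits)
    assert x + 2 * y + 4 * z == sum(bits)
```

In terms of bit counts, a 9-bit column handled by one 6-3 compressor and one full adder leaves 3 + 2 = 5 bits, and a 13-bit column handled by one 6-3 and one 7-3 compressor leaves 3 + 3 = 6 bits.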
Results and Discussions
The functionality of the compressors is verified at gate level with the Xilinx ISE 9.1 synthesis tool, with the compressors described in Verilog HDL. The simulation waveforms of these compressors are shown in Figs. 11, 12, 13 and 14.
The average power, propagation delay, Power Delay Product (PDP) and Energy Delay Product (EDP) of the compressors are calculated at transistor level using H-Spice with different full adder designs, at a temperature of 25°C and a frequency of 100 MHz, using the 90 nm MOSIS CMOS technology file. The focus is not only on speed; power is also considered, which is why a power-efficient XOR design is used in the half adders of the 4-3 and 6-3 compressors. Monte Carlo simulation has been used for better results.
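The two derived metrics follow from the measured quantities, assuming the usual definitions: PDP is average power times propagation delay, and EDP is PDP times propagation delay. A minimal Python sketch is given below; the function names are hypothetical and the input values are placeholders for illustration only, not measurements from this paper.

```python
def power_delay_product(avg_power_w, delay_s):
    """PDP in joules: average power (W) times propagation delay (s)."""
    return avg_power_w * delay_s


def energy_delay_product(avg_power_w, delay_s):
    """EDP in joule-seconds: PDP times propagation delay."""
    return power_delay_product(avg_power_w, delay_s) * delay_s


# Placeholder inputs, NOT values from the tables in this paper.
example_power_w = 10e-6   # 10 microwatts
example_delay_s = 1e-9    # 1 nanosecond

pdp = power_delay_product(example_power_w, example_delay_s)
edp = energy_delay_product(example_power_w, example_delay_s)
print(f"PDP = {pdp:.3e} J, EDP = {edp:.3e} J*s")
```

Because EDP weights delay twice, a design with slightly higher power but a shorter delay can rank better in EDP than in PDP.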
The total average powers of the proposed compressors are given in Table 3 and the comparison graph is shown in Fig. 15. Total power includes dynamic, static and leakage power. Leakage power starts to dominate at nanometer technology nodes. The total propagation delays of the compressors with the different adders are given in Table 4 and the comparison graph is shown in Fig. 16. The delay is calculated for all input and output combinations, and the worst-case delays of the compressors are compared. As far as power is concerned, the SERF and 16-T compressors show better results. In terms of speed, the 14-T compressors show better performance. In power delay product, the 7-3 and 6-3 TFA compressors are slightly more economical than the 16-T compressors; in the remaining cases the 16-T compressors are the most energy efficient. The PDP comparison of the compressors is given in Table 5 and the comparison graph is shown in Fig. 17. The Energy Delay Product (EDP) comparison is given in Table 6 and the variation is shown in Fig. 18. In PDP and EDP, the 16-T and SERF compressors show an improvement over the other adder-based compressors. The output voltage swing in the 16-T compressors is also better than in the 14-T and SERF compressors.
7-3   1.0763E-09   1.0603E-09   1.0578E-09   1.0043E-09   1.0431E-09
6-3   1.0835E-09   1.0674E-09   1.0583E-09   9.9821E-10   9.9129E-10
5-3   1.0984E-09   1.0942E-09   9.7639E-10   9.6992E-10   1.0174E-09
4-3   9.0469E-10   4.4560E-09   4.5822E-10   4.5102E-09   4.4632E-09
Conclusion
Compressors are the key to partial product reduction for speeding up Dadda, Wallace tree and Booth multipliers. The use of compressors in multipliers not only reduces the vertical critical path but also reduces the number of stage operations. To show better performance, the compressors are tested with efficient adders. Multi-threshold logic can also be used to improve the performance of the compressors. A 16-bit multiplier effectively utilizes all of the above compressors for partial product reduction. Despite the better results of SERF, the 16-T full adder compressors are more suitable for partial product reduction in multipliers, because the threshold loss is greater in SERF. Hybrid adders can also be used, instead of a single adder type, to design a compressor.
