Abstract
INTRODUCTION
Multiplication is one of the more complex arithmetic operations [6]. It is a core operation in most signal processing algorithms, yet multipliers occupy a large area, consume considerable power and have long latency. Low-power multiplier design is therefore an important part of low-power VLSI system design. The architecture of a parallel multiplier can generally be divided into three parts: generation of the primary partial-product bits, using simple AND gates or a recoding strategy; compression of the partial-product bits, using either a regular array or an irregular logarithmic tree; and the final addition [6].
The main contribution of this paper is the use of a reduction tree technique to design a new Baugh-Wooley multiplier architecture. The High Performance Multiplier (HPM) reduction tree [6], [8] is based on compression of the generated partial products [1]. It is completely regular, and the adding cells in HPM are connected in a triangular shape. The triangular cell placement is used because it gives shorter wire lengths in the reduction tree [8].
In this paper, conventional 8-bit Baugh-Wooley and Modified Booth multipliers are designed and implemented, and the results are compared with new 8-bit Baugh-Wooley and Modified Booth designs that use the HPM reduction tree [8], [6]. The comparative analysis is intended to show that the new Baugh-Wooley multiplier design is faster than the conventional Baugh-Wooley design and than both the conventional and the HPM Modified Booth designs [9]. The algorithm for the 5-bit Baugh-Wooley multiplier is shown in 
BAUGH WOOLEY MULTIPLIER
The Baugh-Wooley algorithm is an efficient method for handling the sign bits in multiplication. It was developed to allow regular multipliers to be designed for 2's complement numbers [2]. Let two n-bit numbers, the multiplier A and the multiplicand B, be multiplied. A and B can be represented as
\[ A = -a_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} a_i 2^i, \qquad B = -b_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} b_i 2^i \tag{1} \]
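where $a_i$ and $b_i$ are the bits of A and B, respectively, and $a_{n-1}$ and $b_{n-1}$ are the sign bits. The product, P = A × B, is given by the equation:
\[ P = a_{n-1} b_{n-1} 2^{2n-2} + \sum_{i=0}^{n-2} \sum_{j=0}^{n-2} a_i b_j 2^{i+j} - 2^{n-1} \sum_{i=0}^{n-2} a_i b_{n-1} 2^i - 2^{n-1} \sum_{j=0}^{n-2} a_{n-1} b_j 2^j \]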
The final product can be generated by subtracting the last two positive terms from the first two terms [2] .
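The subtraction described above can be checked numerically. The following Python sketch is an illustration only, not the authors' implementation; the function names `to_bits`, `product_terms` and `subtractive_product` are hypothetical. It splits each operand into its sign bit and its n-1 lower bits, forms the two positive terms and the two cross terms, and subtracts the cross terms from the positive ones.

```python
def to_bits(value, n):
    """Return the n two's-complement bits of value, least significant first."""
    return [(value >> i) & 1 for i in range(n)]


def product_terms(a, b, n):
    """Return the four terms of the signed product of two n-bit operands."""
    a_bits, b_bits = to_bits(a, n), to_bits(b, n)
    a_sign, b_sign = a_bits[n - 1], b_bits[n - 1]
    # Sign bit times sign bit, weight 2^(2n-2).
    sign_term = (a_sign & b_sign) << (2 * n - 2)
    # Product of the n-1 lower bits of both operands.
    magnitude_term = sum((a_bits[i] & b_bits[j]) << (i + j)
                         for i in range(n - 1) for j in range(n - 1))
    # Cross terms: lower bits of one operand times the sign bit of the other.
    cross_a = sum((a_bits[i] & b_sign) << i for i in range(n - 1)) << (n - 1)
    cross_b = sum((a_sign & b_bits[j]) << j for j in range(n - 1)) << (n - 1)
    return sign_term, magnitude_term, cross_a, cross_b


def subtractive_product(a, b, n):
    """Subtract the two cross terms from the two positive terms."""
    sign_term, magnitude_term, cross_a, cross_b = product_terms(a, b, n)
    return sign_term + magnitude_term - cross_a - cross_b
```

For example, `subtractive_product(-3, 5, 4)` evaluates the terms 0, 25, 0 and 40 and returns 25 - 40 = -15.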
Instead of performing a subtraction, it is possible to take the 2's complement of the last two terms and add all the terms to obtain the final product.
Each of the last two terms is n-1 bits wide and extends in binary weight from $2^{n-1}$ up to $2^{2n-3}$. The final product, on the other hand, is 2n bits wide and extends in binary weight from $2^0$ up to $2^{2n-1}$.
First, each of the last two terms in the equation for P is padded with zeros to obtain a 2n-bit number that can be added to the other terms. The padded terms then extend in binary weight from $2^0$ up to $2^{2n-1}$ [3].
Let X be one of the last two terms; with zero padding it can be represented as
The final product [3] , P = A x B becomes:
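One common way to write this step is sketched below. It follows from the definitions above and is not necessarily the exact form used in [3]. The symbol $x_i$ is introduced here for illustration and stands for $a_i b_{n-1}$ or $a_{n-1} b_i$. The zero-padded term is

\[ X = 0 \cdot 2^{2n-1} + 0 \cdot 2^{2n-2} + 2^{n-1} \sum_{i=0}^{n-2} x_i 2^i + \sum_{i=0}^{n-2} 0 \cdot 2^i \]

Its 2's complement is obtained by inverting all 2n bits and adding 1, which gives

\[ -X \equiv 2^{2n-1} + 2^{2n-2} + 2^{n-1} \sum_{i=0}^{n-2} \overline{x_i}\, 2^i + 2^{n-1} \pmod{2^{2n}} \]

Adding the complements of both cross terms and discarding the carry out of bit 2n-1 leaves the constants $2^{2n-1} + 2^n$, so that

\[ P \equiv a_{n-1} b_{n-1} 2^{2n-2} + \sum_{i=0}^{n-2} \sum_{j=0}^{n-2} a_i b_j 2^{i+j} + 2^{n-1} \sum_{i=0}^{n-2} \overline{a_i b_{n-1}}\, 2^i + 2^{n-1} \sum_{j=0}^{n-2} \overline{a_{n-1} b_j}\, 2^j + 2^n + 2^{2n-1} \pmod{2^{2n}} \]

In this form every term is added, and the only sign handling left is the inversion of the cross-term bits and the two constant '1' bits.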
Let A and B be 4-bit binary numbers; then the product [3], P = A × B, is 8 bits long. The block diagram of the 4-bit Baugh-Wooley multiplier is shown in Fig 3, and the detailed structure of each block is shown in 
MODIFIED BOOTH MULTIPLIER
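Let A be the multiplicand and B the multiplier in the multiplication of two n-bit integers, which can be represented in two's complement as
\[ A = -a_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} a_i 2^i, \qquad B = -b_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} b_i 2^i \tag{2} \]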
In the Modified Booth multiplier, B in (2) becomes, for even n,
\[ B = \sum_{i=0}^{n/2-1} \left( b_{2i-1} + b_{2i} - 2 b_{2i+1} \right) 2^{2i}, \qquad b_{-1} = 0 \]
See Table 1. The multiplier [8] is partitioned into the following blocks:
• the encoder unit (e-cell), which encodes the multiplier bits (Y bits) and sends signals for the generation of the partial products;
• the partial product generator (PPG), which decodes the signals from the encoder together with the multiplicand X in order to generate the partial products;
• the carry-save adder matrix (CAM), which adds all the partial products obtained in the previous step; and
• the final product adder (FPA), the last row of full adders and half adders, which adds the values from the CAM and produces the final product [10].
Fig 5 shows the architecture of the Modified Booth multiplier [11]. The basic units of the Modified Booth multiplier are the encoder (e-cell in Fig 6), which encodes the multiplier Y, and the partial product generator (g-cell in Fig 7), which receives the encoded signals and the multiplicand X. Both the CAM and FPA blocks are made up of full adders and half adders [11].
Fig-5: Architecture of Modified Booth Multiplier [6].
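The encoding step can be illustrated with a short Python sketch. It assumes the standard radix-4 recoding, in which overlapping groups of three multiplier bits give digits $d_i = b_{2i-1} + b_{2i} - 2 b_{2i+1}$ in the range -2 to 2 with $b_{-1} = 0$, and it assumes an even word length n. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the code models the arithmetic only, not the e-cell and g-cell circuits.

```python
def booth_radix4_digits(b, n):
    """Recode an n-bit two's-complement multiplier into n/2 digits in -2..2."""
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even word length")
    bits = [(b >> i) & 1 for i in range(n)]
    digits = []
    previous = 0  # b_{-1}, the implicit bit below the least significant bit
    for i in range(0, n, 2):
        digits.append(previous + bits[i] - 2 * bits[i + 1])
        previous = bits[i + 1]
    return digits  # least significant digit first


def booth_multiply(a, b, n):
    """Sum the partial products digit * a, each weighted by 4**position."""
    total = 0
    for position, digit in enumerate(booth_radix4_digits(b, n)):
        total += (digit * a) << (2 * position)
    return total
```

For the multiplier 7 with n = 8, the digits are -1, 2, 0, 0 (least significant first), since -1 + 2·4 = 7. Only n/2 partial products are generated, which is the reason for using the recoding.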
HIGH PERFORMANCE MULTIPLIER REDUCTION TREE
In the High Performance Multiplier (HPM) reduction tree technique, the primary partial-product bits are generated outside the tree. Once generated, these bits are fed into the reduction tree to compute the product. The tree is built from half adders and full adders arranged in a tree structure. The routing patterns for the half-adder, full-adder and wiring cells in the HPM reduction tree are shown in Fig 5. K partial-product bits enter at the top of a half-adder or full-adder cell. The role of the cell is to reduce this number by one and to produce an output carry that is passed rightwards to the next column.
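The idea of column compression with half adders and full adders can be sketched in Python. This is a generic stage-by-stage compression in the style of a Wallace tree, used here as a stand-in; it does not model the regular triangular cell placement or the wiring of HPM. All names (`full_adder`, `half_adder`, `and_partial_products`, `compress_columns`, `column_value`) are hypothetical, and the partial products are unsigned AND terms.

```python
def full_adder(x, y, z):
    """Return (sum, carry) for three input bits."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def half_adder(x, y):
    """Return (sum, carry) for two input bits."""
    return x ^ y, x & y


def and_partial_products(a, b, n):
    """Group the AND partial-product bits of two unsigned n-bit operands by weight."""
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return columns


def compress_columns(columns):
    """Reduce the columns until none holds more than two bits.

    columns[w] lists the bits of weight 2**w. Sum bits stay in their column
    and carry bits move to the next more significant column.
    """
    cols = [list(c) for c in columns]
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for w, col in enumerate(cols):
            if len(col) <= 2:
                nxt[w].extend(col)       # short column: bits pass through
                continue
            k = 0
            while len(col) - k >= 3:     # full adders take three bits each
                s, c = full_adder(col[k], col[k + 1], col[k + 2])
                nxt[w].append(s)
                nxt[w + 1].append(c)
                k += 3
            if len(col) - k == 2:        # a half adder takes a leftover pair
                s, c = half_adder(col[k], col[k + 1])
                nxt[w].append(s)
                nxt[w + 1].append(c)
            else:
                nxt[w].extend(col[k:])   # zero or one bit left over
        cols = nxt
    return cols


def column_value(cols):
    """Numeric value represented by a set of weighted bit columns."""
    return sum(bit << w for w, col in enumerate(cols) for bit in col)
```

Every adder preserves the weighted sum of its inputs, so `column_value(compress_columns(and_partial_products(a, b, n)))` equals `a * b` for unsigned operands. The two remaining rows would then go to the final adder.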
HPM Baugh Wooley Multiplier
The Baugh-Wooley algorithm is illustrated in Fig 1. It is based on Hatamian's scheme [4] and can be divided into three steps: 1) the most significant bit (MSB) of each of the first N-1 partial-product rows and all bits of the last partial-product row, except its MSB, are inverted; 2) a '1' is added to the Nth column; 3) the MSB of the final result is inverted [6].
Implementing the Baugh-Wooley multiplier with the HPM method [6] is straightforward and follows the algorithm directly. The partial products are computed using AND gates, and the inverted partial products using NAND gates. The insertion of the '1' and the partial products are shown in Fig 9, the block diagram of the 8-bit Baugh-Wooley multiplier using HPM [5].
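The three steps can be traced with a small Python sketch. It is an arithmetic model only, written under the assumption that columns are counted from 0, so that the Nth column has weight $2^N$; the function names `baugh_wooley_rows` and `baugh_wooley_multiply` are hypothetical.

```python
def baugh_wooley_rows(a, b, n):
    """Return the n partial-product rows as lists of (weight, bit) pairs."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]
    rows = []
    for j in range(n):                    # row j belongs to multiplier bit b_j
        row = []
        for i in range(n):
            bit = a_bits[i] & b_bits[j]   # AND gate
            # Step 1: invert the MSB of the first n-1 rows and every bit of
            # the last row except its MSB (an AND followed by an inversion,
            # that is, a NAND gate).
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            row.append((i + j, bit))
        rows.append(row)
    return rows


def baugh_wooley_multiply(a, b, n):
    """Multiply two n-bit two's-complement integers with the three steps."""
    total = sum(bit << weight
                for row in baugh_wooley_rows(a, b, n)
                for weight, bit in row)
    total += 1 << n                       # Step 2: add '1' in column n
    total &= (1 << (2 * n)) - 1           # keep the 2n result bits
    total ^= 1 << (2 * n - 1)             # Step 3: invert the MSB
    if total >> (2 * n - 1):              # read the result as a signed value
        total -= 1 << (2 * n)
    return total
```

For instance, `baugh_wooley_multiply(-8, -8, 4)` sums the rows to 176, adds 16, and inverts bit 7 of 192 to give 64. No subtraction is needed anywhere in the array.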
HPM Modified Booth Multiplier
The Modified Booth algorithm is illustrated in Fig 1. Implementing the Modified Booth multiplier with the HPM method [6], [8] is straightforward and follows the algorithm directly. The partial products are computed using the encoder and decoder, which are shown in Fig 3 and 
SIMULATION RESULTS

Simulation results of the 8-bit conventional Baugh-Wooley multiplier using Cadence RTL Compiler in 180 nm process technology.
Simulation results of the 8-bit HPM Baugh-Wooley multiplier using Cadence RTL Compiler in 180 nm process technology.
Simulation results of the 8-bit HPM Modified Booth multiplier using Cadence RTL Compiler in 180 nm process technology.
COMPARATIVE ANALYSIS
The conventional Baugh-Wooley and Modified Booth multipliers are compared with the HPM Baugh-Wooley and Modified Booth multipliers in terms of power, delay, area footprint and energy. The results are shown in Table 2.
