INTRODUCTION
A multiplier is one of the key hardware blocks in most digital signal processing systems, such as frequency-domain filtering (FIR and IIR), frequency-time transformations (FFT), correlation, and digital image processing [1, 4]. With advances in technology, many researchers have tried to design multipliers that offer one of the following: high speed, low power consumption, or regularity of layout and hence less area, or even a combination of these.
Many research efforts have been devoted to increasing the speed and reducing the power dissipation of multipliers, such as the Wallace Tree Multiplier (WTM) [5], Modified Booth Array (MBA) [6], Baugh Wooley Multiplier (BWM) [7], and Row Bypassing and Parallel Architecture (RBPA) [8] based multiplier. The basic idea behind all these attempts was fast addition of the partial products. For this purpose, the carry-save addition (CSA) technique has been used extensively. In this technique, intermediate results are always in a redundant form of two numbers [9]. The carry-select-adder (CSA)-based radix multipliers, which have lower area overhead, employ a greater number of active transistors for the multiplication operation and hence consume more power [10]. Among other multipliers, shift-and-add multipliers have been used in many applications for their simplicity and relatively small area requirement [11].
Higher-radix multipliers are faster but consume more power since they employ wider registers, and require more silicon area due to their more complex logic [12] .
At the algorithmic and structural levels, many multiplication techniques have been developed to enhance the efficiency of the multiplier. These techniques reduce the number of partial products and/or improve the methods for adding them, but the principle behind multiplication is the same in all cases. The Vedic mathematics approach is entirely different and is considered very close to the way a human mind works. Vedic Mathematics is the ancient system of Indian mathematics, which has a unique technique of calculation based on 16 Sutras (formulae). "Urdhva-tiryakbyham" is a Sanskrit term meaning "vertically and crosswise"; this formula is used for multiplication of smaller numbers. "Nikhilam Navatascaramam Dasatah" is also a Sanskrit term, meaning "all from 9 and last from 10"; this formula is used for multiplication of large numbers. All these formulas are adopted from ancient Indian Vedic Mathematics. Mehta et al. [13] proposed a multiplier design using the "Urdhva-tiryakbyham" sutra, which was adopted from the Vedas. The formulation using this sutra is similar to modern array multiplication, which also indicates the carry propagation issues. A multiplier design using the "Nikhilam Navatascaramam Dasatah" sutra was reported by Tiwari et al. [14] in 2009, but no hardware module for multiplication was implemented. Recently, Saha et al. [15] reported a multiplier based on the same "Nikhilam Navatascaramam Dasatah" principle for the special case of multiplier designs with the same bases, but the work was not extended to general-purpose multiplier design.
P. Saha, A. Banerjee, A. Dandapat, P. Bhattacharyya, Vedic Mathematics Based 32-Bit Multiplier Design for High Speed Low Power Processors
In this work we formulate this mathematics to design a 32×32 bit multiplier architecture at the transistor level with two clear goals in mind: (i) simplicity and modularity of multiplication for VLSI implementation, and (ii) elimination of carry propagation for rapid additions and subtractions. By employing Vedic mathematics, an (N×N) bit multiplier implementation is transformed into one small-number multiplication, one addition/subtraction, and shifting operations. The "Urdhva-tiryakbyham" method is used to implement the small-number multiplication, and the "Nikhilam Navatascaramam Dasatah" and "Urdhva-tiryakbyham" methods together are used to generate the whole (N×N) bit multiplier. The multiplier is fully parameterized, so any configuration of input and output word lengths can be elaborated. The performance parameters of the transistor-level implementation, namely propagation delay, dynamic leakage power, and dynamic switching power consumption, were calculated with Spice Spectre using 90 nm standard CMOS technology and compared with other designs: Wallace Tree Multiplier (WTM) [5], Modified Booth Array (MBA) [6], Baugh Wooley Multiplier (BWM) [7], and Row Bypassing and Parallel Architecture (RBPA) [8]. The calculated results show that the (32×32) bit multiplier has a propagation delay of only ~1.06 µs and consumes ~132 µW of dynamic switching power.
MATHEMATICAL FORMULATION OF VEDIC SUTRAS
The gifts of ancient Indian mathematics to the world history of mathematical science are not well recognized. The contributions of the saint and mathematician Sri Bharati Krsna Thirthaji Maharaja in the field of number theory, in the form of Vedic Sutras (formulas) [11], are significant for calculations. He explored the mathematical potential of Vedic primers and showed that mathematical operations can be carried out mentally to produce fast answers using the Sutras.
INTERNATIONAL JOURNAL ON SMART SENSING AND INTELLIGENT SYSTEMS VOL. 4, NO. 2, JUNE 2011
In this paper we concentrate on the "Urdhva-tiryakbyham" and "Nikhilam Navatascaramam Dasatah" formulas; the other formulas are beyond the scope of this paper.
"Nikhilam Navatascaramam Dasatah" Sutra
Nikhilam sutra means "all from 9 and last from 10". Mathematical description of this sutra can be formulated as:
Assume A and B are two n-bit numbers to be multiplied, and their product is equal to Z.
where (1)
where (2)
Multiplication Rule:
Equation (3) can be reformulated by adding and subtracting the term $10^{2n} + 10^{n}(A+B)$ on the right-hand side.
Equation (5) can be derived for both numbers, whether the number is greater than the base or less than the base.
If the number is greater than the base:
If the number is less than the base:
where the two complement terms are the $10^{n}$'s complements of A and B.
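The numbered equations of this derivation did not survive in the text above. As a hedged reconstruction only, the identity that the surrounding sentences describe can be written as follows. It assumes that equation (3) is $Z = AB$ and that the terms $10^{2n}$ and $10^{n}(A+B)$ are each added and subtracted once. The symbols $a$, $b$, $\bar{A}$ and $\bar{B}$ are introduced here for illustration and may differ from the authors' notation and equation numbering.

```latex
\begin{align*}
Z &= AB \\
  &= AB + 10^{2n} - 10^{2n} + 10^{n}(A+B) - 10^{n}(A+B) \\
  &= 10^{n}\,(A + B - 10^{n}) + (10^{n} - A)(10^{n} - B).
\end{align*}
% Both operands greater than the base, with a = A - 10^n and b = B - 10^n:
\[ Z = 10^{n}\,(A + b) + a\,b \]
% Both operands less than the base, with the 10^n's complements
% \bar{A} = 10^n - A and \bar{B} = 10^n - B:
\[ Z = 10^{n}\,(A - \bar{B}) + \bar{A}\,\bar{B} \]
```

Both special cases follow from the last line of the general identity, since $A + b = A - \bar{B} = A + B - 10^{n}$ and $ab = \bar{A}\bar{B}$.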
Example of multiplication using the "Nikhilam Navatascaramam Dasatah" Sutra: As shown in Figure 1, we write the multiplier and the multiplicand in two rows, each followed by its difference from (or excess over) the chosen base. We can now write two columns of numbers, one consisting of the numbers to be multiplied (Column 1) and the other consisting of their complements (Column 2). The product also consists of two parts, which are demarcated from each other.
The serious drawback of the Nikhilam sutra is that it is restricted to the following cases:
(i) Both the multiplier and the multiplicand are less than, or both are greater than, the base.
(ii) The multiplier and the multiplicand are near the base.
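The two-column procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `nikhilam_multiply` and the parameter `n` (the exponent of the chosen base 10^n) are assumptions made for the example, and signed deviations are used so that one expression covers operands both above and below the base.

```python
def nikhilam_multiply(a: int, b: int, n: int) -> int:
    """Multiply a and b using their deviations from the base 10**n."""
    base = 10 ** n
    dev_a = a - base        # signed deviation of a from the base (Column 2)
    dev_b = b - base        # signed deviation of b from the base (Column 2)
    left = a + dev_b        # cross addition; the same value as b + dev_a
    right = dev_a * dev_b   # small-number product of the two deviations
    return left * base + right  # left part weighted by the base, plus right part


print(nikhilam_multiply(97, 96, 2))    # 9312
print(nikhilam_multiply(104, 103, 2))  # 10712
```

The only full multiplication left is the product of the two deviations, which is why the method pays off when both operands are near the base.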
"Urdhva-tiryakbyham" Sutra
The meaning of this sutra is "vertically and crosswise", and it is applicable to all multiplication operations. Multiplication using the "Urdhva-tiryakbyham" Sutra is shown in Figure 2. The numbers to be multiplied are written on two consecutive sides of the square, as shown in the figure. Each of the small squares is partitioned into two equal halves by the crosswise lines. Each digit of the multiplier is then independently multiplied with every digit of the multiplicand, and the two-digit product is written in the common box. All the digits lying in the yellow boxes are added, producing sum and carry digits. The final result is obtained by adding the sum digits and the previous carry digits.
Carry for the first step is taken to be zero. 
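A digit-wise sketch of the vertically-and-crosswise procedure is given below. It is an illustration under assumptions, not a transcription of Figure 2: the helper names `to_digits` and `urdhva_multiply` are hypothetical, and the crosswise products are accumulated column by column rather than drawn in boxes. As in the text, the carry into the first step is zero.

```python
def to_digits(value: int, base: int = 10) -> list[int]:
    """Return the digits of a non-negative integer, least significant first."""
    digits = [value % base]
    value //= base
    while value:
        digits.append(value % base)
        value //= base
    return digits


def urdhva_multiply(a: int, b: int, base: int = 10) -> int:
    """Multiply two non-negative integers column by column (vertically and crosswise)."""
    da, db = to_digits(a, base), to_digits(b, base)
    result_digits = []
    carry = 0  # carry for the first step is taken to be zero
    for k in range(len(da) + len(db) - 1):
        column = carry
        # Crosswise products: every digit pair whose positions sum to k.
        for i in range(len(da)):
            j = k - i
            if 0 <= j < len(db):
                column += da[i] * db[j]
        result_digits.append(column % base)  # sum digit of this column
        carry = column // base               # carry passed to the next column
    while carry:  # flush the remaining carry into the top digits
        result_digits.append(carry % base)
        carry //= base
    return sum(d * base ** p for p, d in enumerate(result_digits))


print(urdhva_multiply(123, 456))  # 56088
```

Each column depends on the previous one only through the carry, so all digit products can be formed independently before the final additions.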
Assume that their product is equal to Z. Then Z can be represented as:
where $A_i, B_j \in \{0, 1, 2, \ldots, 9\}$ and N may be any number.
X and Y can be represented as:
Assume the product of the numbers is equal to P.
For fast multiplication using the extended rule of the sutra, the bases of the multiplicand and the multiplier are assumed to be the same, so equation (16) can be rewritten as:
From equation (17) it is observed that a large-number multiplication can easily be decomposed into a small-number multiplication, addition/subtraction, and shifting, leading to a reduction in hardware cost, propagation delay, and power consumption. The small-number multiplication can easily be implemented using the "Urdhva-tiryakbyham" sutra (formula).
Hardware Implementation of "Nikhilam Navatascaramam Dasatah" Sutra
The mathematical expression for the proposed algorithm is given in equation (17), and its hardware implementation is shown in Figure 3. It consists of five major segments: (i) radix selection unit (RSU), (ii) subtractor, (iii) adder/subtractor, (iv) multiplier, and (v) shifter. Since the radix is the same for both inputs, the radix generated by the RSU for input X is fed to the subtractor. The other input to the subtractor is Y. The output generated is the residue corresponding to Y. Both residues are multiplied by the multiplier. The residue corresponding to Y is added to or subtracted from X by the first adder/subtractor block. The result is shifted left based on the exponent generated by the RSU. The multiplier result is added to or subtracted from the shifted result by the second adder/subtractor.
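The data flow through the five segments can be modelled behaviourally as below. This is a sketch under stated assumptions rather than the circuit of Figure 3: the function names are hypothetical, the RSU is assumed to pick the power of two nearest to X, signed Python integers stand in for the add/subtract control of the adder/subtractor blocks, and the small multiplication is written as a plain `*` where the paper uses the "Urdhva-tiryakbyham" multiplier.

```python
def radix_selection_unit(x: int) -> int:
    """Return the exponent k of the radix 2**k chosen for x (assumed: nearest power of two)."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    k = x.bit_length() - 1        # position of the most significant '1'
    lower, upper = 1 << k, 1 << (k + 1)
    if x - lower > upper - x:     # x lies closer to the next power of two
        k += 1
    return k


def vedic_multiply(x: int, y: int) -> int:
    """Behavioural model: multiply via residues about a common radix 2**k."""
    k = radix_selection_unit(x)             # (i) RSU: exponent of the shared radix
    radix = 1 << k
    residue_x = x - radix                   # residue of X (assumed to come from the RSU)
    residue_y = y - radix                   # (ii) subtractor: residue of Y
    small_product = residue_x * residue_y   # (iv) multiplier on the two residues
    partial = x + residue_y                 # (iii) first adder/subtractor
    shifted = partial << k                  # (v) shifter: left shift by the exponent
    return shifted + small_product          # (iii) second adder/subtractor


print(vedic_multiply(30, 29))  # 870
```

With radix 32, the inputs 30 and 29 give residues -2 and -3, so the result is (27 << 5) + 6 = 870. The only multiplication left acts on the residues, which are small when both operands are close to the radix.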
Figure 3. Hardware Implementation of "Nikhilam Navatascaramam Dasatah" Sutra
Mathematical Background of "Urdhva-tiryakbyham" sutra for binary number system:
Assume the product of two N-bit words is described as
where $x_i, y_j \in \{0, 1\}$
Multiplication can be described as (20)
Let $k = i + j$
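With k = i + j, all bit products of equal weight 2^k fall into one column. A minimal sketch of that column (convolution) sum is shown below; the function names and the least-significant-bit-first list representation are assumptions made for the example, not notation from the paper.

```python
def column_sums(x_bits: list[int], y_bits: list[int]) -> list[int]:
    """Return c[k] = sum of x_i * y_j over all i + j = k (bits given LSB first)."""
    c = [0] * (len(x_bits) + len(y_bits) - 1)
    for i, xi in enumerate(x_bits):
        for j, yj in enumerate(y_bits):
            c[i + j] += xi & yj   # a one-bit product is a logical AND
    return c


def product_from_columns(c: list[int]) -> int:
    """Weight each column sum by 2**k and add them up."""
    return sum(ck << k for k, ck in enumerate(c))


x_bits = [1, 0, 1, 1]  # 13, least significant bit first
y_bits = [1, 1, 0, 1]  # 11, least significant bit first
c = column_sums(x_bits, y_bits)
print(c)                        # [1, 1, 1, 3, 1, 1, 1]
print(product_from_columns(c))  # 143
```

The column sums are the discrete convolution of the two bit sequences. Carry handling appears only when the columns are finally weighted and added.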
where (23)
Equation (21) shows that the coefficients of the multiplication can be obtained by the convolution sum of the two finite number sequences.
The hardware implementation of the exponent determinant is shown in Figure 5. The integer part, or exponent, of a binary fixed-point number is obtained as the maximum power of the radix. For a non-zero input, the shifting operation is executed using parallel-in parallel-out (PIPO) shift registers. The number of select lines of the PIPO shifter (denoted $S_1$, $S_0$ in Figure 5) is chosen according to the binary representation of the number $(N-1)_{10}$. A 'Shift' pin is provided on the PIPO shifter to control whether the number is to be shifted or not (to initialize the operation, the 'Shift' pin is set low). A decrementer is integrated in this architecture to track the maximum power of the radix. A sequential search procedure is implemented to find the first '1' starting from the MSB side by using the shifting technique.
For an N-bit number, the value $(N-1)_{10}$ is fed to the input of the decrementer. The decrementer is controlled by a signal generated from the searched bit. If the searched bit is '0', the control signal goes low and the decrementer decrements the input value (the decrementer operates in active-low logic). When the searched bit is '1', the control signal goes high, the decrementer stops decrementing, and the shifter stops shifting. The output of the decrementer is the integer part (exponent) of the number.
The architecture of the Mean Determinant is exhibited in Figure 6. The Mean Determinant takes the exponent (power of the MSB) as input and decrements it by one. The decrement operation is performed in the subtractor. The decremented value is fed to the shifter as its select input. The binary value "11" ($3_{10}$) is shifted to the left by the amount given by the select input, and the output of the shifter is the desired mean value.
Figure 6. Architecture of Mean Determinant.
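The exponent search and the mean computation described above can be modelled behaviourally as follows. This is a sketch, not the circuits of Figures 5 and 6: the function names are hypothetical, a Python loop stands in for the clocked shift-and-decrement sequence, and the mean is only defined here for exponents of at least 1.

```python
def exponent_determinant(x: int, n_bits: int) -> int:
    """Find the position of the first '1' from the MSB side by shift and decrement."""
    mask = (1 << n_bits) - 1
    if not 0 < x <= mask:
        raise ValueError("x must be a non-zero n_bits-wide value")
    exponent = n_bits - 1                 # decrementer is loaded with N - 1
    msb = 1 << (n_bits - 1)
    while not (x & msb):                  # searched bit is '0': keep going
        x = (x << 1) & mask               # shift left by one position
        exponent -= 1                     # decrement the tracked power
    return exponent                       # searched bit is '1': stop


def mean_determinant(exponent: int) -> int:
    """Shift binary '11' left by (exponent - 1) to get the mean of 2**e and 2**(e+1)."""
    if exponent < 1:
        raise ValueError("this sketch assumes exponent >= 1")
    return 0b11 << (exponent - 1)


e = exponent_determinant(0b00010110, 8)   # input value 22
print(e)                                  # 4
print(mean_determinant(e))                # 24
```

Shifting "11" left by (e - 1) gives 3·2^(e-1) = 1.5·2^e, which is the midpoint between 2^e and 2^(e+1).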
Comparator:
The architecture of the comparator is shown in Figure 7. The comparison is performed by a subtraction operation. The subtraction produces two outputs: (i) the result and (ii) Borrow Out. The result is not of interest for this architecture, so it is not shown in the figure.
The comparison decision is taken from the "Borrow Out" signal.
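A subtraction-based comparison of this kind can be sketched as below. The function names are hypothetical, and the two's-complement formulation is an assumption about how the subtraction is realized, not a description of Figure 7. The sketch does not say which operands the comparator receives in the proposed architecture.

```python
def borrow_out(a: int, b: int, n_bits: int) -> int:
    """Return the borrow-out of the n_bits-wide subtraction a - b (1 means a < b)."""
    mask = (1 << n_bits) - 1
    if not (0 <= a <= mask and 0 <= b <= mask):
        raise ValueError("operands must fit in n_bits")
    total = a + (~b & mask) + 1   # a - b computed as a + two's complement of b
    carry = total >> n_bits       # carry-out of the n_bits-wide adder
    return 1 - carry              # borrow is the inverted carry-out


def is_less_than(a: int, b: int, n_bits: int) -> bool:
    """Comparison decision taken from the borrow-out only; the difference is discarded."""
    return borrow_out(a, b, n_bits) == 1


print(borrow_out(3, 5, 4))  # 1
print(borrow_out(5, 3, 4))  # 0
```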
where $t_{pd}$ = total propagation delay, $t_{RSU}$ = propagation delay of the radix selection unit, $t_{adsb}$ = propagation delay of the addition/subtraction unit, $t_{m}$ = propagation delay of the multiplication unit, and $t_{shft}$ = propagation delay of the shifting module.
The performance parameters of the Vedic multiplier implementation, namely propagation delay, dynamic switching power consumption, and dynamic leakage power consumption, are shown in Figure 9.
For experimental purposes, the input data were applied in a regular fashion. Our main focus was on reducing the propagation delay, dynamic switching power, dynamic leakage power consumption, and energy-delay product.
