The performance of any processor depends on three important factors: speed, area and power. A good trade-off between these factors makes a processor effective. Multipliers are among the most commonly used blocks inside a processor, so improving their performance makes more powerful processors possible. In this paper, a multiplier design based on the sutra 'Urdhva Tiryakbhyam' of Vedic mathematics is analyzed, and its performance results are compared with those of conventional multipliers. Vedic mathematics is an ancient system of mathematics based on 16 sutras, which offer distinctive ways of obtaining solutions with ease. The design is written in Verilog HDL; simulation and synthesis are carried out using ModelSim ALTERA 6.5b and Xilinx ISE Design Suite 13.2.
Introduction
Many digital signal processing (DSP) systems include multipliers as one of their core hardware blocks. Multipliers play a significant role in DSP applications such as digital filtering, digital communication and the Fast Fourier Transform [4].
Multipliers are commonly classified by architecture into three types: 'serial multipliers', 'parallel multipliers' and 'serial-parallel multipliers'.
In this paper, a multiplier architecture based on the Urdhva Tiryakbhyam sutra [6], a concept from Vedic mathematics, is discussed.
The remainder of the paper is organized in five sections: section two describes Vedic mathematics, section three explains the Urdhva Tiryakbhyam sutra, section four presents the conventional multiplier types and the proposed multiplier, section five discusses the results and comparison, and the final section concludes the paper.
Vedic mathematics and its sutras
Vedic Mathematics is a book written by Jagadguru Shankaracharya Bharati Krishna Tirthaji Maharaja. The book includes 16 sutras which are said to be derived from the 'Ganita sutras' of the Atharva Veda.
Multiplication sutra: Urdhva Tiryakbhyam
The sutra 'Urdhva Tiryakbhyam' [2] is a general method that can be applied to all cases of multiplication. The process flow is shown diagrammatically in Fig. 1. Multiplication starts vertically from the Most Significant Bit (MSB), or equivalently from the Least Significant Bit (LSB), of both multiplicands to obtain the first cross product. At each following step one more bit of each multiplicand is included and the cross products between the included bits are computed, until all bits are in use. Bits are then dropped one at a time from the MSB (or LSB) end, and the cross products continue to be computed until only the LSB (or MSB) remains. The notable characteristic of this method is that the cross products of each step and the summation involved in that step are determined simultaneously.
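The step sequence described above can be sketched in software as follows. This is a minimal illustration rather than the Verilog design of the paper: the function names (`int_to_bits`, `bits_to_int`, `urdhva_tiryakbhyam`) are hypothetical, the operands are assumed to have equal width, the LSB-first direction is chosen arbitrarily, and the cross products of a step are evaluated in a loop, whereas in hardware they would be formed in parallel.

```python
def int_to_bits(value, width):
    """Return the bits of a non-negative integer, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def bits_to_int(bits):
    """Inverse of int_to_bits: rebuild the integer from LSB-first bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def urdhva_tiryakbhyam(a_bits, b_bits):
    """Multiply two equal-width, LSB-first bit lists column by column.

    Step k collects every cross product a[i] & b[j] with i + j == k.
    The number of cross products grows by one per step until all bits
    are in use, then shrinks by one per step, as in the sutra.
    """
    n = len(a_bits)
    if len(b_bits) != n:
        raise ValueError("operands must have the same width")

    result = []
    carry = 0
    for k in range(2 * n - 1):
        first = max(0, k - n + 1)
        last = min(k, n - 1)
        # Cross products of this step (formed in parallel in hardware).
        column = sum(a_bits[i] & b_bits[k - i] for i in range(first, last + 1))
        total = column + carry
        result.append(total & 1)   # product bit of this step
        carry = total >> 1         # carry passed to the next step

    # The last product bit is whatever carry remains.
    while len(result) < 2 * n:
        result.append(carry & 1)
        carry >>= 1
    return result


if __name__ == "__main__":
    product_bits = urdhva_tiryakbhyam(int_to_bits(13, 4), int_to_bits(11, 4))
    print(bits_to_int(product_bits))  # 143
```

For n-bit operands the loop runs 2n - 1 steps, which corresponds to the grow-then-shrink pattern of cross products in the description.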
Conventional multiplier vs. Proposed multiplier

Conventional multipliers
The two most widely used multipliers in digital circuits are the array multiplier and the Booth multiplier. An array multiplier takes two numbers, say P and Q, of a and b bits respectively. A group of ab AND gates generates the ab partial products simultaneously, and these are summed using half adders and full adders. The Booth multiplier uses the two's complement notation of signed binary numbers for multiplication. Large Booth arrays are needed for fast multiplication and exponential operations. Of the two, the Booth multiplier requires comparatively less computation time [5].
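Both conventional schemes can be illustrated with a short behavioural sketch. The function names `array_multiply` and `booth_multiply` are hypothetical; ordinary integer addition stands in for the half-adder and full-adder network, and the Booth variant shown is the basic radix-2 recoding, which is an assumption since the text does not say which variant is meant.

```python
def array_multiply(p, q, a, b):
    """Array-multiplier model for an a-bit P and a b-bit Q (unsigned).

    One AND gate per (i, j) pair gives a * b partial-product bits.
    Each row is then shifted by its weight and accumulated; integer
    addition replaces the half/full adder array of the hardware.
    """
    partial_products = [
        [((p >> i) & 1) & ((q >> j) & 1) for i in range(a)]
        for j in range(b)
    ]
    product = 0
    for j, row in enumerate(partial_products):
        row_value = sum(bit << i for i, bit in enumerate(row))
        product += row_value << j
    and_gate_count = a * b
    return product, and_gate_count


def booth_multiply(m, r, bits):
    """Radix-2 Booth multiplication of signed integers.

    r must fit in `bits`-wide two's complement. Adjacent multiplier
    bits (current, previous) select the operation at each position:
    (1, 0) subtracts the shifted multiplicand, (0, 1) adds it, and
    (0, 0) or (1, 1) does nothing.
    """
    product = 0
    previous = 0
    for i in range(bits):
        current = (r >> i) & 1  # two's complement bit, also for r < 0
        if (current, previous) == (1, 0):
            product -= m << i
        elif (current, previous) == (0, 1):
            product += m << i
        previous = current
    return product


if __name__ == "__main__":
    print(array_multiply(13, 11, 4, 4))  # (143, 16)
    print(booth_multiply(-3, 5, 4))      # -15
```

The sketch only models the arithmetic; it says nothing about the relative delay of the two structures, which depends on the adder network and is what [5] compares.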
Proposed multiplier
The 2 x 2 multiplier structure [1] is built from four AND gates and two half adder circuits, as shown in Fig. 2(A). This structure is similar to the hardware architecture of a conventional 2 x 2 array multiplier, and the overall delay is that of the two half adder circuits alone, which is also the case for the array multiplier. Multiplying two-bit binary numbers by this technique therefore does not bring any significant improvement in the efficiency of the multiplier. Hence we move to the 4 x 4 multiplier, which can be constructed from 2 x 2 multiplier blocks. In Fig. 2(B), all the square boxes indicate 2 x 2 multiplier blocks, and the inputs of each block are assigned as shown in the figure. The output is eight bits wide: Q7Q6Q5Q4Q3Q2Q1Q0. The block diagram is shown in Fig. 3. The same technique is used to construct multipliers with more bits, such as 8, 12, 32 and 64 bits. The following tabulations show the results of different multipliers.
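The block structure can be checked with a small gate-level model. The 2 x 2 block below follows the description (four AND gates, two half adders). The way the four 2 x 2 results are combined in `vedic_4x4` is an assumption, since Fig. 2(B) is not reproduced here: the partial results are simply weighted and added with integer addition. All function names are hypothetical.

```python
def half_adder(x, y):
    """Return (sum, carry) of two single bits."""
    return x ^ y, x & y


def vedic_2x2(a, b):
    """2 x 2 multiplier: four AND gates and two half adders."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    q0 = a0 & b0                               # vertical product (LSBs)
    q1, c1 = half_adder(a1 & b0, a0 & b1)      # crosswise products
    q2, q3 = half_adder(a1 & b1, c1)           # vertical product (MSBs) plus carry
    return (q3 << 3) | (q2 << 2) | (q1 << 1) | q0


def vedic_4x4(a, b):
    """4 x 4 multiplier assembled from four 2 x 2 blocks.

    Assumption: the block outputs are combined by weighting them
    with their bit positions and adding; the adder arrangement of
    the original figure may differ.
    """
    a_lo, a_hi = a & 0b11, (a >> 2) & 0b11
    b_lo, b_hi = b & 0b11, (b >> 2) & 0b11

    p_ll = vedic_2x2(a_lo, b_lo)   # weight 2^0
    p_hl = vedic_2x2(a_hi, b_lo)   # weight 2^2
    p_lh = vedic_2x2(a_lo, b_hi)   # weight 2^2
    p_hh = vedic_2x2(a_hi, b_hi)   # weight 2^4

    # Eight-bit result Q7..Q0.
    return (p_ll + ((p_hl + p_lh) << 2) + (p_hh << 4)) & 0xFF


if __name__ == "__main__":
    print(vedic_2x2(3, 3))     # 9
    print(vedic_4x4(13, 11))   # 143
```

The same decomposition applies recursively: under the same assumption, an 8 x 8 multiplier would use four 4 x 4 blocks with weights 2^0, 2^4, 2^4 and 2^8.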
Conclusion
The proposed multiplier has been implemented on the Xilinx Spartan3E xc3s500e-4fg320 (up to 32-bit) and xc3s1600e-4fg484 (for 64-bit); the latter device is used because the 64-bit design requires more than 100% of the resources of the former. The delay of the proposed multiplier for 64 x 64 bit multiplication is 45.601 ns. The proposed multiplier is therefore better than the conventional multipliers in terms of speed. Its performance can be improved further by using a technique such as Wallace tree addition in the adder block, and it can be applied to DSP applications such as FIR filters, IIR filters and the FFT. Implementing these filter and FFT applications on an FPGA is left as future work.
