I. Introduction
With the latest advances in VLSI technology, the demand for portable and embedded digital signal processing (DSP) systems has increased significantly. Multipliers are key components of many high-performance systems such as FIR filters, microprocessors and digital signal processors. For higher-order multiplications, a large number of adders or other components are used. The speed of the multiplier depends on the number of partial products and the speed of the adder.
Vedic mathematics was reconstructed from the 'Vedas' by Sri Bharati Krishna Tirthaji (1884-1960) after eight years of research on the Vedas [1]. According to him, Vedic mathematics is based mainly on sixteen principles or word-formulae, which are termed sutras. In this paper, after a gentle introduction to the Urdhva Tiryagbhyam sutra, the multiplier architecture is discussed and illustrated with two 8-bit numbers.
The multiplier and multiplicand are each grouped into 4-bit numbers so that the operation decomposes into 4x4 multiplication modules. After decomposition, the vertical and crosswise method is applied to carry out the multiplication on the first 4x4 multiply module [2]. A multiplier based on this sutra has the advantage that, as the number of bits increases, gate delay and area increase very slowly compared with conventional multipliers. Vedic sutras are applied to binary multipliers using carry-save adders. The Vedic multiplier implemented by us and discussed in this paper performs partial product generation and addition in parallel and has better performance in terms of speed and area.
II. Vedic Multiplier using 'Urdhva Tiryagbhyam' Sutra
The usefulness of Vedic mathematics lies in the fact that it reduces typical calculations of conventional mathematics to very simple ones. This is so because the Vedic formulae are claimed to be based on the natural principles on which the human mind works. Vedic mathematics is a methodology of arithmetic rules that allows faster implementation. Urdhva Tiryagbhyam is the general formula applicable to all cases of multiplication.
The words 'Urdhva' and 'Tiryagbhyam' are derived from Sanskrit literature. 'Urdhva' means 'vertically' and 'Tiryagbhyam' means 'crosswise' [3]. The sutra is based on a novel concept in which all partial products are generated concurrently with the addition of these partial products.
The sutra is illustrated in Fig. 1 and the hardware architecture is depicted in Fig. 2. In this example, the two decimal numbers 252 and 846 are multiplied.
The digits on both sides of each line are multiplied and added to the carry from the previous step. This generates one digit of the result and a carry. This carry is added in the next step, and the process continues. If a step contains more than one line, all the products are added to the previous carry. In each step, the least significant digit acts as the result digit and all other digits act as the carry for the next step. Initially the carry is taken to be zero. The 'Urdhva Tiryagbhyam' algorithm can be implemented for the binary number system in the same way as for the decimal number system. For the multiplication algorithm, let us consider the multiplication of two 4-bit numbers $a_3 a_2 a_1 a_0$ and $b_3 b_2 b_1 b_0$. We express the result of this multiplication as $c_6 r_6 r_5 r_4 r_3 r_2 r_1 r_0$.
The least significant bit $r_0$ is obtained by multiplying the least significant bits of the multiplicand and the multiplier, as shown in Fig. 3. The bits on both sides of the lines are multiplied and added to the carry from the previous step [4-7]. This generates one of the bits of the result ($r_n$) and a carry ($c_n$). This carry is added in the next step, and thus the process goes on.
Thus we obtain the following expressions (1) to (7):
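Written out for the 4-bit case, the seven steps can be sketched as below. This form is reconstructed from the procedure described above, under the assumption that each step sums all bit products of equal weight together with the incoming carry; $c_n r_n$ denotes the carry concatenated with the result bit, and a carry $c_n$ may be wider than one bit.

```latex
\begin{align}
r_0     &= a_0 b_0 \tag{1}\\
c_1 r_1 &= a_1 b_0 + a_0 b_1 \tag{2}\\
c_2 r_2 &= c_1 + a_2 b_0 + a_1 b_1 + a_0 b_2 \tag{3}\\
c_3 r_3 &= c_2 + a_3 b_0 + a_2 b_1 + a_1 b_2 + a_0 b_3 \tag{4}\\
c_4 r_4 &= c_3 + a_3 b_1 + a_2 b_2 + a_1 b_3 \tag{5}\\
c_5 r_5 &= c_4 + a_3 b_2 + a_2 b_3 \tag{6}\\
c_6 r_6 &= c_5 + a_3 b_3 \tag{7}
\end{align}
```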
With $c_6 r_6 r_5 r_4 r_3 r_2 r_1 r_0$ being the final product. Partial products are calculated in parallel and hence the delay involved is just the time it takes for the signal to propagate through the gates.
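The vertical and crosswise steps described above can be sketched in software. The following Python model is illustrative only and is not the hardware design of this paper; the function names `urdhva_multiply`, `to_digits` and `from_digits` are hypothetical, and digits are assumed to be stored least significant first. The same routine covers the decimal example and the 4-bit binary case by changing the base.

```python
def urdhva_multiply(a_digits, b_digits, base=10):
    """Multiply two equal-length digit lists (least significant digit first)
    with the vertical-and-crosswise procedure. Returns the product digits,
    least significant first."""
    n = len(a_digits)
    if len(b_digits) != n:
        raise ValueError("operands must have the same number of digits")
    result = []
    carry = 0  # the carry is initially zero
    # Step k sums every cross product a[i] * b[j] with i + j == k.
    for k in range(2 * n - 1):
        column = carry
        for i in range(max(0, k - n + 1), min(k, n - 1) + 1):
            column += a_digits[i] * b_digits[k - i]
        result.append(column % base)  # least significant digit is the result digit
        carry = column // base        # all remaining digits carry into the next step
    # Whatever carry remains after the last step forms the top digit(s).
    while carry:
        result.append(carry % base)
        carry //= base
    return result


def to_digits(value, width, base):
    """Split a non-negative integer into `width` digits, least significant first."""
    return [(value // base**i) % base for i in range(width)]


def from_digits(digits, base):
    """Recombine digits (least significant first) into an integer."""
    return sum(d * base**i for i, d in enumerate(digits))


if __name__ == "__main__":
    # Decimal example from the text: 252 x 846
    print(urdhva_multiply(to_digits(252, 3, 10), to_digits(846, 3, 10), 10))
    # -> [2, 9, 1, 3, 1, 2], i.e. 213192 read from the most significant end
    # 4-bit binary example: 13 x 11
    print(from_digits(urdhva_multiply(to_digits(13, 4, 2), to_digits(11, 4, 2), 2), 2))
    # -> 143
```

In hardware the inner sums of each step are formed concurrently; the sequential loop here only mirrors the order in which the carries are consumed.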
III. The Proposed Multiplier Architecture
The hardware architectures of the 4x4 and 8x8 bit Vedic multiplier modules are presented in the following sections. Here the Urdhva Tiryagbhyam (vertically and crosswise) sutra is used to propose an architecture for the multiplication of two binary numbers. The beauty of the Vedic multiplier is that partial product generation and addition are done concurrently. Hence it is well adapted to parallel processing, a feature that makes it attractive for binary multiplication. This in turn reduces delay, which is the primary motivation behind this work. Earlier literature describes Vedic multipliers based on array multiplier structures. In contrast, we propose a new architecture that is efficient in terms of speed. The arrangement of ripple-carry (RC) adders shown in Fig. 5 helps to reduce delay. Interestingly, 8x8 Vedic multiplier modules are easily implemented by using four 4x4 multiplier modules [8] [9] [10].
Vedic Multiplier for 8x8 bit module.
The 8x8 bit Vedic multiplier module is shown in the block diagram in Fig. 6.
Using the fundamentals of Vedic multiplication, taking four bits at a time and using a 4-bit multiplier block, we can perform the multiplication. The outputs of the 4x4 bit multipliers are added accordingly to obtain the final product [11] [12] [13]. Here a total of three 8-bit ripple-carry adders are required as shown in
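One way to combine four 4x4 products with three 8-bit ripple-carry additions is sketched below in Python. The exact wiring of the adders in the proposed design is not reproduced here, so this arrangement is an assumption chosen to match the adder count stated above; `ripple_carry_add`, `mul4x4` and `vedic_mul8x8` are hypothetical names, and `mul4x4` is a plain stand-in for a 4x4 Vedic module.

```python
def ripple_carry_add(x, y, width=8):
    """Bit-level ripple-carry addition.
    Returns (sum modulo 2**width, carry out)."""
    carry = 0
    total = 0
    for i in range(width):
        xi = (x >> i) & 1
        yi = (y >> i) & 1
        total |= (xi ^ yi ^ carry) << i             # full-adder sum bit
        carry = (xi & yi) | (carry & (xi ^ yi))     # full-adder carry out
    return total, carry


def mul4x4(a, b):
    """Stand-in for one 4x4 Vedic module: 8-bit product of two 4-bit inputs."""
    return (a & 0xF) * (b & 0xF)


def vedic_mul8x8(a, b):
    """8x8 multiplication from four 4x4 products and three 8-bit additions
    (assumed arrangement, not necessarily the one in the paper's figure)."""
    a_lo, a_hi = a & 0xF, (a >> 4) & 0xF
    b_lo, b_hi = b & 0xF, (b >> 4) & 0xF

    p0 = mul4x4(a_lo, b_lo)   # weight 2**0
    p1 = mul4x4(a_hi, b_lo)   # weight 2**4
    p2 = mul4x4(a_lo, b_hi)   # weight 2**4
    p3 = mul4x4(a_hi, b_hi)   # weight 2**8

    # Adder 1: the two middle products.
    s1, c1 = ripple_carry_add(p1, p2)
    # Adder 2: add the upper nibble of p0, which has the same weight.
    s2, c2 = ripple_carry_add(s1, p0 >> 4)
    # c1 and c2 are never both 1 (if c1 is set, s1 <= 194 and p0 >> 4 <= 14),
    # so a single OR gate merges them.
    upper_in = ((c1 | c2) << 4) | (s2 >> 4)
    # Adder 3: top product plus everything of weight 2**8 and above.
    s3, _ = ripple_carry_add(p3, upper_in)

    # Product bits: [15:8] = s3, [7:4] = low nibble of s2, [3:0] = low nibble of p0.
    return (s3 << 8) | ((s2 & 0xF) << 4) | (p0 & 0xF)


if __name__ == "__main__":
    print(vedic_mul8x8(255, 255))  # -> 65025
```

The low nibble of the first product passes straight to the output, which is why only the three additions above are needed in this arrangement.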
IV. Design Verification and Implementation
In this work, the 8x8 bit Vedic multiplier is designed in VHDL (Very High Speed Integrated Circuit Hardware Description Language). Logic synthesis and simulation are done in Xilinx ISE 8.2i Project Navigator and the ISim simulator integrated in the Xilinx package. The performance of the circuit is evaluated on the Xilinx device family Spartan 2, device XC2550 and package TQ144. The device description of the vertex FPGA used is summarized in Table 1.
Simulation Results
After successful compilation, the RTL view generated is shown in Fig. 7(a) and Fig. 7(b). Fig. 8 and Fig. 9 show the simulation results of the 8x8 bit Vedic multiplier for unsigned binary and decimal numbers respectively. Table 2 below shows the synthesis report of the Vedic multiplier with the logic resource utilization. Synthesis was done using Xilinx ISE 8.2i. The device chosen for synthesis is 2s15cs144-6. The computation path delay for the proposed 8x8 bit Vedic multiplier is found to be 8.460 ns.
Synthesis Results
The power consumption was measured using the XPower option available in Project Navigator in ISE 6.1. Power consumption is 9.29 mW. Total memory usage is 166744 kilobytes.
V. Conclusion
This paper presents an efficient method of multiplication based on the Urdhva Tiryagbhyam sutra of Vedic mathematics. It gives a method for hierarchical multiplier design and clearly indicates the computational advantages offered by Vedic methods.
The computational path delay of the proposed 8x8 bit Vedic multiplier is found to be less than that of the conventional multiplier. Hence our motivation to reduce delay is fulfilled. The Vedic multiplier requires fewer gates for a given 8x8 bit multiplier, so its power dissipation is very small compared with other multiplier architectures. In terms of area also, the proposed multiplier is better than the conventional multiplier. Awareness of Vedic mathematics can be effectively increased if it is included in engineering education, which may lead to significant improvement in many areas where fast arithmetic computations are critical, such as real-time DSP applications.
