Multiplication is a fundamental operation in Digital Signal Processing, required across a wide range of applications. This paper proposes a high-speed Vedic multiplier that uses Urdhva Tiryagbhyam, a sutra from Vedic Mathematics, for multiplication and the Kogge Stone algorithm for adding the partial products, and compares its characteristics with those of existing algorithms. These two algorithms enable parallel generation of partial products and faster carry generation, respectively, leading to better performance. The code is written in Verilog HDL and implemented on Xilinx Spartan 3 and Spartan 6 FPGA kits using Xilinx ISE 9.1i. The propagation delay of the implemented architecture is 28.699 ns and 15.752 ns, respectively.
INTRODUCTION
Vedic Mathematics is an ancient system of mathematics practiced during the Vedic age, which was reconstructed by Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja [1] between 1911 and 1918 from certain Sanskrit manuscripts. It is often described as a refined and efficient system of calculation. One such technique has been employed here to enhance the design of a multiplier. Multipliers are key blocks of a digital signal processor: an improvement in the computational speed of multiplication directly decreases the processing time of the processor. Convolution, Fast Fourier Transforms and various other transforms make use of multiplier blocks.
A faster method of multiplication based on ancient Indian Vedic mathematics is studied in this paper. Among the various multiplication methods in Vedic mathematics, Urdhva Tiryagbhyam is efficient [2]. It is a general multiplication formula applicable to all cases of multiplication. The Kogge Stone algorithm is used to add the partial products in the multiplier. The code is written in Verilog HDL [3], synthesized using Xilinx ISE 9.1i, and implemented on Spartan 3 and Spartan 6 FPGA devices.
LITERATURE REVIEW
The algorithms and the multiplier architecture were studied from [4, 5, 6, 7, 8, 9, 10, 11] and are presented below.
Urdhva Tiryagbhyam
Urdhva Tiryagbhyam means "vertically and crosswise". Consider ABC as the multiplicand and DEF as the multiplier. The steps of the multiplication are described in the figure above, and the examples are solved below for better understanding. The intermediate carry generated at each step is added to the next more significant position.
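The vertical and crosswise steps can be sketched in Python as a behavioural model. This is only an illustration, not the hardware description used in this work: the function name `urdhva_tiryagbhyam`, the digit-list representation and the `base` parameter are assumptions introduced here. Each result column collects every cross product whose digit positions sum to the same weight, and the carry out of a column is added into the next more significant column.

```python
def urdhva_tiryagbhyam(a_digits, b_digits, base=10):
    """Multiply two equal-length digit lists (most significant digit first).

    Column k of the result is the sum of all products a[i] * b[j] with
    i + j == k, counting positions from the least significant digit.
    """
    if len(a_digits) != len(b_digits):
        raise ValueError("operands must have the same number of digits")
    n = len(a_digits)
    a = a_digits[::-1]  # least significant digit first
    b = b_digits[::-1]

    result = []
    carry = 0
    for k in range(2 * n - 1):
        # Vertical (i == k - i) and crosswise (i != k - i) products of column k.
        column = sum(a[i] * b[k - i] for i in range(n) if 0 <= k - i < n)
        total = column + carry
        result.append(total % base)   # digit kept in this column
        carry = total // base         # carry passed to the next column

    # Any carry left after the last column becomes the leading digit(s).
    while carry:
        result.append(carry % base)
        carry //= base
    return result[::-1]


# ABC = 123, DEF = 456
print(urdhva_tiryagbhyam([1, 2, 3], [4, 5, 6]))           # [5, 6, 0, 8, 8]
# Binary operands 101 (5) and 110 (6)
print(urdhva_tiryagbhyam([1, 0, 1], [1, 1, 0], base=2))   # [1, 1, 1, 1, 0]
```

For 123 x 456 the five column sums are 18, 27, 28, 13 and 4 (least significant first); propagating the carries gives 56088. In hardware all column sums can be formed in parallel, which is the property the multiplier exploits.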
Kogge Stone Algorithm
The Kogge Stone algorithm (KSA) was developed by Peter M. Kogge and Harold S. Stone and published in a seminal IEEE paper in 1973 [6]. It generates carries in O(log n) time and is widely used in industry for high-performance arithmetic circuits, where it is regarded as one of the fastest adder designs. The KSA [9, 10, 11] computes carries faster at the cost of increased area. Its operation can be understood in three distinct steps:
Pre processing
This step involves computation of generate and propagate signals corresponding to each pair of bits in A and B. These signals are given by the logic equations below:
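In the standard formulation of the algorithm, which is assumed here, with $A_i$ and $B_i$ denoting the $i$-th bits of the two operands, the propagate signal $P_i$ and the generate signal $G_i$ are:

$$P_i = A_i \oplus B_i, \qquad G_i = A_i \cdot B_i$$

A bit position generates a carry when both input bits are 1, and propagates an incoming carry when exactly one of them is 1.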
Carry look ahead network
This block is responsible for the speed advantage. It computes the carry corresponding to each bit, using group propagate and group generate as intermediate signals, which are given by the logic equations below:
In Figure 4, a black box represents the computation of both Pi:j and Gi:j, a grey box represents the computation of Gi:j alone, and the white triangles are buffers.
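In the standard formulation, assumed here, a group spanning bits $i$ down to $j$ is formed by merging two adjacent sub-groups $i{:}k{+}1$ and $k{:}j$ (with $i > k \ge j$, and $P_{i:i} = P_i$, $G_{i:i} = G_i$):

$$P_{i:j} = P_{i:k+1} \cdot P_{k:j}$$

$$G_{i:j} = G_{i:k+1} + \left(P_{i:k+1} \cdot G_{k:j}\right)$$

With the carry-in taken as zero, the carry out of bit $i$ is the group generate reaching down to bit 0:

$$C_i = G_{i:0}$$

Because the merge operation is associative, the span of each group can be doubled at every stage, which is why all carries are available after about $\log_2 n$ stages.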
Post processing
This step computes the sum bits, using the logic given below:
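In the standard formulation, assumed here with a zero carry-in, each sum bit is the propagate bit XORed with the carry from the position below, that is $S_i = P_i \oplus C_{i-1}$ with $S_0 = P_0$. The three steps can be put together in a small bit-level Python model. It is a behavioural sketch for checking the logic, not the Verilog used in this work; the function name `kogge_stone_add` and the `width` parameter are assumptions introduced for illustration.

```python
def kogge_stone_add(a, b, width=8):
    """Bit-level model of a Kogge Stone adder with carry-in = 0.

    Returns (sum, carry_out), where sum holds the low `width` bits.
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    # Pre-processing: per-bit propagate and generate.
    p = [x ^ y for x, y in zip(a_bits, b_bits)]
    g = [x & y for x, y in zip(a_bits, b_bits)]

    # Carry look-ahead network: the span of each group doubles per stage.
    grp_p, grp_g = p[:], g[:]
    distance = 1
    while distance < width:
        new_p, new_g = grp_p[:], grp_g[:]   # positions below `distance` are buffered
        for i in range(distance, width):
            # Merge group ending at bit i with the group ending at bit i - distance.
            new_g[i] = grp_g[i] | (grp_p[i] & grp_g[i - distance])
            new_p[i] = grp_p[i] & grp_p[i - distance]
        grp_p, grp_g = new_p, new_g
        distance *= 2

    # grp_g[i] is now G(i:0), the carry out of bit i.
    carries = grp_g

    # Post-processing: S_0 = P_0, S_i = P_i XOR C_(i-1).
    s_bits = [p[0]] + [p[i] ^ carries[i - 1] for i in range(1, width)]
    total = sum(bit << i for i, bit in enumerate(s_bits))
    return total, carries[width - 1]


print(kogge_stone_add(200, 100))  # (44, 1), since 200 + 100 = 256 + 44
```

The model computes both group signals in every cell for simplicity. In hardware, cells whose group already reaches bit 0 need only the generate output, which corresponds to the cells that compute Gi:j alone.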
PROPOSED MULTIPLIER
The proposed architectures of the 2x2, 4x4 and 8x8 bit Vedic multiplier (VM) modules are displayed below. The basic architecture was adapted from the base paper [4] and modified to produce the correct output as well as to gain speed. The major change in the architecture is that the Kogge Stone algorithm is used to add the partial products rather than RCA, CLA or CSA.
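The hierarchy of 2x2, 4x4 and 8x8 modules can be illustrated with a behavioural Python sketch. It follows a commonly used decomposition in which each operand is split into a high and a low half, four half-width Vedic multipliers form the partial products, and adders combine them. The exact adder arrangement of the proposed architecture is not reproduced here, so the wiring below is an assumption. The names `vm2x2`, `vedic_multiply` and `add` are hypothetical, and `add` is a plain integer addition standing in for the Kogge Stone adder.

```python
def add(a, b):
    """Stand-in for the Kogge Stone adder (assumption: plain integer addition)."""
    return a + b


def vm2x2(a, b):
    """2x2 Vedic multiplier built from AND gates and two half adders."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    p0 = a0 & b0                      # vertical product of the low bits
    cross = (a1 & b0) + (a0 & b1)     # crosswise products (first half adder)
    p1, c1 = cross & 1, cross >> 1
    high = (a1 & b1) + c1             # vertical product of the high bits (second half adder)
    p2, p3 = high & 1, high >> 1
    return p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)


def vedic_multiply(a, b, width):
    """Multiply two unsigned `width`-bit values; width must be 2, 4, 8, ..."""
    if width == 2:
        return vm2x2(a, b)
    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    # Four half-width partial products, generated independently (in parallel in hardware).
    lo_lo = vedic_multiply(a_lo, b_lo, half)
    lo_hi = vedic_multiply(a_lo, b_hi, half)
    hi_lo = vedic_multiply(a_hi, b_lo, half)
    hi_hi = vedic_multiply(a_hi, b_hi, half)

    # Combine the partial products according to their bit weights.
    middle = add(lo_hi, hi_lo)
    return add(add(lo_lo, middle << half), hi_hi << width)


print(vedic_multiply(13, 11, 4))     # 143
print(vedic_multiply(200, 150, 8))   # 30000
```

In this sketch the adder appears three times per level, so its delay sits on the critical path of every module above 2x2. This is the reason the choice of adder has a direct effect on the overall propagation delay.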
RESULTS AND SIMULATIONS
The Verilog code of the 8x8 Vedic multiplier was synthesized using Xilinx ISE 9.1i and implemented on the FPGA device xc3s400-5tq144 of the Spartan 3 family. The results are shown below. DIP switches are used as input devices and LEDs as output devices. Table 1 compares the delays of the 8x8 modified Vedic multipliers using RCA and KSA, executed on xc3s700-afg484, with that of the VM using RCA reported in [4].

Table 1. Comparison of delays
Multiplier                 Delay (ns)
[5]                        32.01
Booth Multiplier [5]       29.549
VM8x8KSA                   23.644
CONCLUSION

