# Design and Implementation of High Speed Vedic Multiplier in SPARTAN 3 FPGA Device

### Prof. Parvaneh Basaligheh

University of Tech and Management Malaysia pbasaligeh@utmcc.my

| Article History                                                                                                                                                          | Abstract                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Article Submission<br>19 December 2016<br>Revised Submission<br>1 February 2017<br>Article Accepted<br>10 March 2017<br>Article Published<br>31 March 2017 | Rapid growth in technology demands more efficient digital systems. Multipliers play a key role in nearly every digital device, and designing a high-speed multiplier for ALU operations is an important aspect of digital signal processing, where multiplication is used in operations such as the DFT and convolution. Professionals in the DSP domain are therefore trying to develop innovative algorithms and hardware implementations, and an efficient multiplier is essential. Many standard algorithms exist to reduce the area and execution time. Vedic mathematics, described in the Vedic era, offers highly efficient algorithms in the form of 16 sutras. Here, we discuss the Urdhva Tiryagbhyam algorithm for multiplication, which provides better efficiency than conventional multipliers.<br><b>Keywords:</b> ALU, DFT, DSP, FPGA |

#### I. Introduction

The demand for high-speed processing keeps growing as new computing applications emerge, and ALU operations with high throughput are crucial to obtaining the desired output in real-time applications [1][2]. To reduce processing time, we adopt the Urdhva Tiryagbhyam algorithm so that ALU operations run at high speed, which raises the overall speed of the electronic system. Multipliers are key components of DSP processors, and the speed of the multiplier determines the speed of the DSP [3][4].

Multiplication can be performed using various algorithms. In an array multiplier, two binary numbers are multiplied in a single micro-operation by a combinational circuit that generates all the product bits at once, which is very fast because the only delay is that of the combinational logic. However, this approach requires a large number of gates, which makes it costly. Another way to improve multiplier efficiency is through the adder structure, for which there are two common algorithms: the carry save array (CSA) and the Wallace tree. CSA processes the operands bit by bit, passing a carry signal to an adder [5][6].

A multiplier basically consists of a partial product generation circuit and an adder circuit. The partial product generator takes two operands, the multiplier and the multiplicand, and multiplies each digit of the multiplier by the multiplicand; the adder circuit then sums these partial products to produce the final product. When designing a multiplier, efficient adders should be chosen to reduce its power and delay. Multipliers are important components of the hardware blocks used in many applications, and their efficiency is judged by area and speed [7][8].
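The two stages described above can be illustrated with a minimal Python sketch. The function names and the 4-bit default width are assumptions made for illustration; the paper does not give this code.

```python
def partial_products(multiplicand: int, multiplier: int, width: int = 4) -> list:
    """Partial product generation: one shifted copy of the multiplicand per multiplier bit."""
    products = []
    for i in range(width):
        bit = (multiplier >> i) & 1                  # i-th bit of the multiplier
        products.append((multiplicand * bit) << i)   # gate by the bit, weight by 2**i
    return products


def multiply(multiplicand: int, multiplier: int, width: int = 4) -> int:
    """Adder stage: sum all partial products to obtain the final product."""
    return sum(partial_products(multiplicand, multiplier, width))


print(partial_products(0b1010, 0b1001))  # [10, 0, 0, 80]
print(multiply(0b1010, 0b1001))          # 90
```

The number of partial products equals the number of multiplier bits, which is why the speed of the adder stage dominates the overall delay.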

#### **II. Existing method**

Multiplication involves two standard operations: (I) generation of partial products and (II) accumulation of these products. Multiplication speed therefore depends on the accumulation speed. The Booth algorithm is another method, implemented as a state machine. It reduces the number of partial products that are generated, but it does not reduce the complexity of the circuit. It is organized as a data path and a controller. The data path needs a counter, shift registers and an accumulator, while the controller manages the data flow, producing the required signals on each clock cycle based on the present state. The product is available after a fixed number of clock cycles [9].

The Booth recoding multiplier examines only three bits at a time to reduce the number of partial products. The three bits are the two bits of the current pair and the high-order bit of the adjacent lower-order pair. After examination, these bits are converted into a group of five bits used by the adder array to control the operations performed by the adder cells. This method reduces the number of adders and the delay needed to generate the partial sums [10].
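The three-bit examination can be sketched as follows, assuming the standard radix-4 (modified Booth) recoding of a two's-complement multiplier. The lookup table is the textbook one; the function names and the 4-bit width are illustrative assumptions, and the five control bits mentioned above are not modeled.

```python
def booth_radix4_digits(multiplier: int, width: int = 4) -> list:
    """Recode a signed `width`-bit multiplier into radix-4 digits in {-2, -1, 0, 1, 2}."""
    # Indexed by the 3-bit window (b[2i+1], b[2i], b[2i-1]).
    table = [0, 1, 1, 2, -2, -1, -1, 0]
    # Two's-complement bit pattern with the implicit b[-1] = 0 appended on the right.
    bits = (multiplier & ((1 << width) - 1)) << 1
    return [table[(bits >> i) & 0b111] for i in range(0, width, 2)]


def booth_multiply(multiplicand: int, multiplier: int, width: int = 4) -> int:
    """Sum one partial product per radix-4 digit (half as many as the bit count)."""
    digits = booth_radix4_digits(multiplier, width)
    return sum(d * multiplicand * 4 ** j for j, d in enumerate(digits))


print(booth_radix4_digits(6))  # [-2, 2], since 6 = -2 + 2*4
print(booth_multiply(5, 6))    # 30
```

A 4-bit multiplier yields only two recoded digits, so two partial products are accumulated instead of four.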

The CSA algorithm has the limitation that its execution time depends on the number of multiplier bits, which makes it difficult to reach high speed. In the Wallace tree algorithm, three bits are passed to a one-bit full adder, also known as a three-input Wallace tree circuit. The sum output is passed to the next-stage full adder of the same bit position, and the carry output is passed to the next-stage full adder one bit position higher. The Wallace tree operates at high speed, but its circuit layout is difficult to design.
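The three-input reduction step can be sketched at word level, where each bit position behaves like one full adder. This is an illustrative model with hypothetical function names, not the circuit evaluated in the paper, and it assumes non-negative operands.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple:
    """3:2 reduction: compress three operands into a sum word and a carry word."""
    sum_word = x ^ y ^ z                                # per-bit full-adder sum
    carry_word = ((x & y) | (y & z) | (x & z)) << 1     # carries move one bit higher
    return sum_word, carry_word


def wallace_style_sum(operands) -> int:
    """Reduce operands three at a time, then finish with one ordinary addition."""
    operands = list(operands)
    while len(operands) > 2:
        x, y, z = operands[:3]
        operands = operands[3:] + list(carry_save_add(x, y, z))
    return sum(operands)  # final carry-propagate addition


# Partial products of 13 x 11 (11 = 1011 in binary): 13, 26, 0, 104
print(wallace_style_sum([13, 26, 0, 104]))  # 143
```

No carry propagates inside `carry_save_add`, which is where the speed advantage comes from; only the last addition needs a carry-propagate adder.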

The FPGA device used for implementation is the Spartan-3 from Xilinx, on a kit provided by Digilent. The kit is cost effective compared with the other boards on the market and provides the tools needed for designing and verifying Spartan-3 designs. The existing multiplication algorithms consume more power and offer reduced efficiency. Here, we propose a multiplier that addresses these problems using the Urdhva Tiryagbhyam sutra of Vedic mathematics. The resulting multiplier designs are faster than existing multipliers.

#### **III. Proposed Work**

The proposed Vedic multiplier is based on the Urdhva Tiryagbhyam algorithm, also known as Vertically and Crosswise. The sutra is conventionally used to multiply two decimal numbers; the same principle is applied here to the binary system, which suits hardware, and it is suitable for any kind of multiplication. Below are the hardware architectures of the 2x2 and 4x4 bit Vedic multipliers. We use the Urdhva Tiryagbhyam algorithm to multiply two binary numbers, and the figure below shows the Vedic multiplier used for 4-bit multiplication.
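As a worked illustration of Vertically and Crosswise (the operands below are chosen for illustration and do not come from the paper), consider two-digit numbers $a_1a_0$ and $b_1b_0$ in base $r$:

$$
(a_1 r + a_0)(b_1 r + b_0) = \underbrace{a_1 b_1}_{\text{vertical}}\, r^2 + \underbrace{(a_1 b_0 + a_0 b_1)}_{\text{crosswise}}\, r + \underbrace{a_0 b_0}_{\text{vertical}}
$$

In decimal ($r = 10$), $23 \times 14 = (2 \cdot 1) \cdot 100 + (2 \cdot 4 + 3 \cdot 1) \cdot 10 + 3 \cdot 4 = 200 + 110 + 12 = 322$. In binary ($r = 2$), $11_2 \times 11_2 = (1 \cdot 1) \cdot 4 + (1 \cdot 1 + 1 \cdot 1) \cdot 2 + 1 \cdot 1 = 9 = 1001_2$.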

#### 4\*4 Vedic Multiplier Algorithm

```
4-bit Multiplicand a and Multiplier b

X(k): column sums for k = 0 to 6, initialized to 0

// Columns 0 to 3: the number of crosswise products grows
for k = 0 to 3
    for i = 0 to k
        X(k) = X(k) + a(i) × b(k - i)
    end for
end for

// Columns 4 to 6: the number of crosswise products shrinks
for k = 4 to 6
    for i = k - 3 to 3
        X(k) = X(k) + a(i) × b(k - i)
    end for
end for

// Add the column sums with their binary weights
Z = 0
for k = 0 to 6
    Z = Z + X(k) × 2^k
end for
```

ISSN: 2250-0839 © IJNPME 2017
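The vertically-and-crosswise scheme for 4-bit operands can be sketched in Python as follows. This is a behavioural model under the assumption that each output column `k` collects the bit products `a[i] & b[k - i]`; the function name and the `width` parameter are illustrative, and the sketch is not the VHDL/Verilog used for the FPGA implementation.

```python
def urdhva_multiply(a: int, b: int, width: int = 4) -> int:
    """Multiply two unsigned `width`-bit numbers column by column."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    # One accumulator per output column: 2*width - 1 columns in total.
    columns = [0] * (2 * width - 1)
    for k in range(len(columns)):
        lo = max(0, k - (width - 1))   # first valid index into a_bits
        hi = min(k, width - 1)         # last valid index into a_bits
        for i in range(lo, hi + 1):
            columns[k] += a_bits[i] & b_bits[k - i]   # crosswise bit product

    # Combine the column sums with their binary weights.
    return sum(col << k for k, col in enumerate(columns))


print(urdhva_multiply(0b1111, 0b1111))  # 225
```

All column sums depend only on the input bits, so in hardware they can be formed in parallel before the final addition.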



The reversible 2x2 Urdhva Tiryagbhyam multiplier takes fan-out into consideration. It uses five reversible gates in total: two BME gates, one Peres gate, one BVF gate and one CNOT gate. The BME gates produce the partial products, the BVF gate eliminates the fan-out problem, the Peres gate acts as a half adder, and the CNOT gate performs the XOR operation. I0 and I1 are intermediate outputs required to eliminate fan-out, as shown in the figure below.



Fig 2: Reversible 2\*2 Vedic Multiplier
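The roles of the Peres and CNOT gates can be illustrated with a small behavioural sketch. The gate definitions are the standard ones. The 2x2 wiring below is an assumption made for illustration: it uses ordinary AND operations as a stand-in for the BME gates, omits the BVF gate, and is not the exact gate arrangement of Fig 2.

```python
def cnot(a: int, b: int) -> tuple:
    """CNOT gate: (A, B) -> (A, A xor B)."""
    return a, a ^ b


def peres(a: int, b: int, c: int) -> tuple:
    """Peres gate: (A, B, C) -> (A, A xor B, (A and B) xor C)."""
    return a, a ^ b, (a & b) ^ c


def half_adder(a: int, b: int) -> tuple:
    """With C = 0, the Peres gate outputs the half-adder sum and carry."""
    _, s, carry = peres(a, b, 0)
    return s, carry


def vedic_2x2(a: int, b: int) -> int:
    """2x2 vertically-and-crosswise multiply built from two half adders."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    q0 = a0 & b0                             # vertical product of the low bits
    q1, c1 = half_adder(a1 & b0, a0 & b1)    # crosswise products
    q2, q3 = half_adder(a1 & b1, c1)         # vertical product of the high bits plus carry
    return (q3 << 3) | (q2 << 2) | (q1 << 1) | q0


print(format(vedic_2x2(0b11, 0b11), "04b"))  # 1001
```

Both gates map distinct inputs to distinct outputs, which is the property that makes them reversible.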

The 4\*4 Vedic and reversible Vedic multipliers are implemented on the FPGA SPARTAN 3 kit. FPGA implementation involves VHDL/Verilog coding followed by simulation to check the logical outputs. RTL synthesis then converts the code into a schematic, and finally the design is programmed onto the kit in order to analyse parameters such as power, area, delay, voltage swing, fan-in and fan-out.



Fig 3: FPGA Implementation process

#### **IV. Experimental Results**

The 4\*4 Vedic multiplier and the 2\*2 reversible Vedic multiplier were designed using the GENESYS software tool, and the logical outputs were verified by simulating the VHDL code in the XILINX simulation tool, as shown in Figure 4 and Figure 5.

| Name   | Value | Waveform values (time axis 1 us to 4 us) |
|--------|-------|------------------------------------------|
| q[3:0] | 1001  | 0001, 0100, 0110, 1001                   |
| a[1:0] | 11    | 01, 10, 11                               |
| b[1:0] | 11    |                                          |

| Name   | Value    | 0 ns     | 10 ns    | 20 ns    | 30 ns    | 40 ns    | 50 ns    |
|--------|----------|----------|----------|----------|----------|----------|----------|
| a[3:0] | 1111     | 0011     | 0101     | 1010     | 1100     | 1101     | 1111     |
| b[3:0] | 1111     | 0011     | 0111     | 1001     | 1110     | 1101     | 1111     |
| y[7:0] | 11100001 | 00001001 | 00100011 | 01011010 | 10101000 | 10101001 | 11100001 |

Fig 5: 2\*2 reversible simulation output
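The 4-bit simulation vectors listed above (a[3:0], b[3:0], y[7:0]) can be cross-checked with a short script. The vectors are copied from the waveform table; the variable names are illustrative, and the check uses ordinary integer multiplication as the reference.

```python
# (a, b, y) bit strings taken from the 4-bit waveform table.
vectors = [
    ("0011", "0011", "00001001"),
    ("0101", "0111", "00100011"),
    ("1010", "1001", "01011010"),
    ("1100", "1110", "10101000"),
    ("1101", "1101", "10101001"),
    ("1111", "1111", "11100001"),
]

mismatches = []
for a_bits, b_bits, y_bits in vectors:
    a, b, y = int(a_bits, 2), int(b_bits, 2), int(y_bits, 2)
    if a * b != y:
        mismatches.append((a_bits, b_bits, y_bits))
    print(f"{a:2d} x {b:2d} = {y:3d}")

print("mismatches:", mismatches)
```

Every listed output equals the product of its two inputs, so the list of mismatches is empty.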

The 4\*4 Vedic multiplier is compared with the Booth and Wallace multipliers in terms of area, power consumption, delay and speed. The table below indicates that the proposed method is better than the other fast multipliers in terms of these parameters.

| 4\*4 Multiplier | Area (Sq.m) | Delay (ns) | Speed (ns) | Power consumption (nW) |
|-----------------|------------|-----------|-----------|--------------------------|
| Vedic           | 0.21       | 12        | 23        | 0.89                     |
| Booth           | 0.4        | 15        | 12        | 1.23                     |
| Wallace         | 0.32       | 17        | 11        | 1.54                     |

Table 1: Comparison of multipliers

All three multipliers are implemented on the Spartan FPGA device to analyze these parameters, as shown in Figure 6.



Fig 6: Implementation in FPGA SPARTAN 3 device

#### V. Conclusion

The Vedic multiplier algorithm discussed above can be applied to any number of bits by restructuring the operand bits. The findings indicate that the algorithm is efficient in providing high speed while reducing area and power consumption. The algorithm is therefore applicable to high-speed, low-power DSP processors and to applications such as cryptography. Future work should aim to increase the speed of Vedic multipliers further.

#### References

- [1] S. Z. H. Naqvi, "Design and simulation of enhanced 64-bit Vedic multiplier," 2016 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT), Aqaba, 2016, pp. 1-4.
- [2] Sunjoo Hong, Taehwan Roh and Hoi-Jun Yoo, "A 145 µW 8×8 parallel multiplier based on optimized bypassing architecture," Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea, IEEE, pp. 1175-1178, 2011.
- [3] Badal Sharma, "Design and hardware implementation of 128 bit vedic multiplier," International Journal for Advance Research in Engineering and Technology, Vol. 1, Issue V, Jun 2013.
- [4] Himanshu Thapliyal and Nagarajan Ranganathan, 2009, "Design of Efficient Reversible Binary Subtractors Based on a New Reversible Gate", IEEE Computer Society Annual Symposium on VLSI, pp. 229-234.
- [5] Gowthami, P., and Satyanarayana, R.V.S., "Design of an Efficient Multiplier using Vedic Mathematics and Reversible logic", IEEE International Conference on Computational Intelligence and Computing Research, 2016.
- [6] Yogendri and A. K. Gupta, "Design of high performance 8-bit Vedic multiplier," 2016 International Conference on Advances in Computing, Communication, & Automation (ICACCA) (Spring), Dehradun, 2016, pp. 1-6.
- [7] G. C. Ram, Y. R. Lakshmanna, D. S. Rani and K. B. Sindhuri, "Area efficient modified vedic multiplier," 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), Nagercoil, 2016, pp. 1-5.
- [8] Zhang, J., Wang, W., Wang, X. et al. Enhancing Security of FPGA-Based Embedded Systems with Combinational Logic Binding. J. Comput. Sci. Technol. 32, 329–339 (2016).
- [9] K. D. Rao, C. Gangadhar and P. K. Korrai, "FPGA implementation of complex multiplier using minimum delay Vedic real multiplier architecture," 2016 IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics Engineering (UPCON), Varanasi, 2016, pp. 580-584, doi: 10.1109/UPCON.2016.7894719.

- [10] G. C. Ram, Y. R. Lakshmanna, D. S. Rani and K. B. Sindhuri, "Area efficient modified vedic multiplier," 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), Nagercoil, 2016, pp. 1-5, doi: 10.1109/ICCPCT.2016.7530294.
- [11] P. Gulati, H. Yadav and M. K. Taleja, "Implementation of an efficient multiplier using the vedic multiplication algorithm," 2016 International Conference on Computing, Communication and Automation (ICCCA), Noida, 2016, pp. 1440-1443, doi: 10.1109/CCAA.2016.7813946.
- [12] G. C. Ram, Y. R. Lakshmanna, D. S. Rani and K. B. Sindhuri, "Area efficient modified vedic multiplier," 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), Nagercoil, 2016, pp. 1-5, doi: 10.1109/ICCPCT.2016.7530294.
- [13] K. D. Rao, C. Gangadhar and P. K. Korrai, "FPGA implementation of complex multiplier using minimum delay Vedic real multiplier architecture," 2016 IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics Engineering (UPCON), Varanasi, 2016, pp. 580-584, doi: 10.1109/UPCON.2016.7894719.
- [14] Kartika S. (2016). Analysis of "SystemC" design flow for FPGA implementation. International Journal of New Practices in Management and Engineering, 5(01), 01 - 07. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/41
- [15] Prof. Naveen Jain. (2013). FPGA Implementation of Hardware Architecture for H264/AV Codec Standards. International Journal of New Practices in Management and Engineering, 2(01), 01 - 07. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/11
- [16] Dr. Bhushan Bandre. (2013). Design and Analysis of Low Power Energy Efficient Braun Multiplier. International Journal of New Practices in Management and Engineering, 2(01), 08 - 16. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/12
- [17] G. S. Lakshmi, K. Fatima and B. K. Madhavi, "Implementation of high speed Vedic BCD Multiplier using Vinculum method," 2016 IEEE Region 10 Conference (TENCON), Singapore, 2016, pp. 147-151, doi: 10.1109/TENCON.2016.7847978.
- [18] A. Jais and P. Palsodkar, "Design and implementation of 64 bit multiplier using vedic algorithm," 2016 International Conference on Communication and Signal Processing (ICCSP), Melmaruvathur, 2016, pp. 0775-0779, doi: 10.1109/ICCSP.2016.7754250.
- [19] G. C. Ram, D. S. Rani, R. Balasaikesava and K. B. Sindhuri, "VLSI architecture for delay efficient 32bit multiplier using vedic mathematic sutras," 2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore, 2016, pp. 1873-1877, doi: 10.1109/RTEICT.2016.7808160.