An efficient floating point multiplier design for high speed
  applications using Karatsuba algorithm and Urdhva-Tiryagbhyam algorithm by Arish, S & Sharma, R. K.
 
 
Abstract: Floating point multiplication is a crucial
operation in high performance computing applications such as
image processing and signal processing. Multiplication is also
the most time and power consuming operation in these
applications. This paper proposes an efficient method for
IEEE 754 floating point multiplication which gives a better
implementation in terms of delay and power. A combination of
the Karatsuba algorithm and the Urdhva-Tiryagbhyam algorithm
(Vedic Mathematics) is used to implement an unsigned binary
multiplier for mantissa multiplication. The multiplier is
implemented in Verilog HDL and targeted at Spartan-3E and
Virtex-4 FPGAs.
Keywords: FPGA, Floating point multiplier, Vedic
mathematics, Urdhva-Tiryagbhyam, Karatsuba
I. INTRODUCTION  
    Floating point multiplication units are an essential IP for
modern multimedia and high performance computing such as
graphics acceleration, signal processing and image processing.
Considerable effort has been made over the past few decades to
improve the performance of floating point computations. Floating
point units are not only complex, but also require more area
and hence consume more power than fixed point multipliers. The
complexity of the floating point unit increases further as
accuracy becomes a major issue. IEEE 754 [1] supports different
floating point formats such as Single Precision, Double
Precision and Quadruple Precision. But as the precision
increases, multiplier area, delay and power increase
drastically. In this paper, we present a new multiplication
method which uses a combination of the Karatsuba and
Urdhva-Tiryagbhyam (Vedic Mathematics) algorithms. This
combination not only reduces delay, but also reduces the
percentage increase in hardware as compared to conventional
methods.
    The two IEEE 754 formats considered here are the single
precision and double precision formats [1, 2]. Fig. 1 shows
these commonly used floating point formats. The single precision
format is 32 bits wide and the double precision format is 64
bits wide. The most significant bit is the sign bit. The
exponent is a signed integer, stored as an unsigned value by
adding a bias. In single precision format, the exponent is 8
bits wide and the bias is 127, i.e. the exponent has a range of
$(-127 \text{ to } 128)$. In double precision format, the
exponent is 11 bits wide and the bias is 1023, i.e. the exponent
has a range of $(-1023 \text{ to } 1024)$. The two end values of
each range (stored exponent all zeros or all ones) are reserved
for the exceptions described in Section II-E. The mantissa or
significand is 23 bits wide in single precision format and 52
bits wide in double precision format. The maximum value that can
be represented using floating point format is
    $\text{largest significand} \times \text{base}^{\text{largest exponent}}$
and the minimum value that can be represented is
    $\text{smallest significand} \times \text{base}^{\text{smallest exponent}}$.
II. FLOATING POINT MULTIPLIER DESIGN  
    A floating point number has four parts: sign, exponent,
significand (or mantissa) and the exponent base. A floating
point number is represented in IEEE 754 format [1, 2] as
$\pm s \times b^{e}$, or
$\pm \text{significand} \times \text{base}^{\text{exponent}}$.
The exponent base for the binary formats is 2. To multiply two
floating point numbers $\pm s_1 \times b^{e_1}$ and
$\pm s_2 \times b^{e_2}$, the significands are multiplied to get
the product mantissa and the exponents are added to get the
product exponent, i.e. the product is
$\pm (s_1 \times s_2) \times b^{(e_1 + e_2)}$. The hardware
block diagram of the floating point multiplier is shown in
Fig. 2.
Format              Sign    Exponent    Mantissa    (field widths in bits)
Single precision    1       8           23
Double precision    1       11          52

Fig. 1 Floating point formats in the proposed model
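
As an illustration of the field layout in Fig. 1, the following Python sketch splits a value into its sign, stored exponent and mantissa fields. It is only a software illustration and not part of the proposed hardware; the helper name `decode` and the `FORMATS` table are hypothetical names introduced here, and the bias is derived from the exponent width as 2^(k-1) - 1.

```python
import struct

# Hypothetical lookup table: exponent bits, mantissa bits, and the
# struct codes used to reinterpret a float as an unsigned integer.
FORMATS = {
    "single": (8, 23, ">f", ">I"),
    "double": (11, 52, ">d", ">Q"),
}

def decode(x, fmt="single"):
    """Return (sign, stored_exponent, mantissa, bias) for value x."""
    exp_bits, man_bits, float_code, int_code = FORMATS[fmt]
    # Reinterpret the floating point encoding as a raw bit pattern.
    bits = struct.unpack(int_code, struct.pack(float_code, x))[0]
    sign = bits >> (exp_bits + man_bits)                 # most significant bit
    stored_exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = bits & ((1 << man_bits) - 1)              # hidden 1 not included
    bias = (1 << (exp_bits - 1)) - 1                     # 127 or 1023
    return sign, stored_exp, mantissa, bias

if __name__ == "__main__":
    print(decode(1.0))             # single precision fields of 1.0
    print(decode(-2.5, "double"))  # double precision fields of -2.5
```

The actual exponent is the stored exponent minus the bias, so 1.0 is stored with an exponent field equal to the bias itself.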
 
R.K.Sharma 
School of VLSI Design and Embedded Systems 
National Institute of Technology Kurukshetra 
Kurukshetra, India 
rksharama@nitkkr.ac.in 
 
Arish S 
School of VLSI Design and Embedded Systems 
National Institute of Technology Kurukshetra 
Kurukshetra, India 
arishsu@gmail.com 
Cite as: S. Arish and R. K. Sharma, "An efficient floating point multiplier design for high speed applications using Karatsuba algorithm and Urdhva-Tiryagbhyam 
algorithm," 2015 International Conference on Signal Processing and Communication (ICSC), Noida, 2015, pp. 303-308. doi: 10.1109/ICSPCom.2015.7150666
Fig. 2 Floating point multiplier

    The important blocks in the implementation of the proposed
floating point multiplier [3] are described below.

A. Sign Calculation
    The MSB of a floating point number represents the sign bit.
The sign of the product is positive if both numbers have the
same sign and negative if they have opposite signs. So, to
obtain the sign of the product, a simple XOR gate can be used as
the sign calculator.
B. Addition of Exponents
    To get the product exponent, the input exponents are added
together. Since the floating point format stores each exponent
with a bias, the sum of the two stored exponents contains the
bias twice, and the bias must be subtracted once from the sum to
get the product exponent. The value of the bias is $127_{10}$
($01111111_2$) for single precision format and $1023_{10}$
($01111111111_2$) for double precision format. In the proposed
custom precision format also, a bias of 127 is used.
    The computational time of the mantissa multiplication is
much more than that of the exponent addition. So a simple ripple
carry adder and ripple borrow subtractor are sufficient for
exponent addition.
C. Karatsuba-Urdhva Tiryagbhyam binary multiplier
    In floating point multiplication, the most important and
complex part is the mantissa multiplication. Multiplication
requires more time than addition, and as the number of bits
increases, it consumes more area and time. Double precision
format needs a 53x53 bit multiplier and single precision format
needs a 24x24 bit multiplier. These operations take considerable
time and are the major contributor to the delay of the floating
point multiplier. To make the multiplication more area efficient
and faster, the proposed model uses a combination of the
Karatsuba algorithm and the Urdhva Tiryagbhyam algorithm.
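
The three blocks described above can be sketched together in software for single precision. This is a minimal illustration that assumes normalized, nonzero inputs, and it is not the authors' Verilog design. The names `fields` and `multiply_blocks` are hypothetical, and Python's built-in integer multiply stands in for the 24x24 mantissa multiplier.

```python
import struct

BIAS = 127  # single precision bias, as given in the text

def fields(x):
    """Split x (encoded as single precision) into sign, stored exponent, fraction."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def multiply_blocks(a, b):
    """Return (sign, biased exponent, raw significand product) for normal inputs."""
    sa, ea, fa = fields(a)
    sb, eb, fb = fields(b)
    sign = sa ^ sb                 # A. sign calculation: XOR of the sign bits
    exponent = ea + eb - BIAS      # B. add stored exponents, subtract the bias once
    ma = (1 << 23) | fa            # restore the hidden 1: 24-bit significand
    mb = (1 << 23) | fb
    product = ma * mb              # C. 24x24 multiplication, up to 48 bits
    return sign, exponent, product

if __name__ == "__main__":
    sign, exponent, product = multiply_blocks(1.5, -2.0)
    print(sign, exponent, hex(product))
```

The raw product still has to be normalized and checked for exceptions before it can be packed back into the 32-bit format.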
    The Karatsuba algorithm uses a divide and conquer approach
in which it breaks the inputs down into a most significant half
and a least significant half, and this process continues until
the operands are 8 bits wide. The Karatsuba algorithm is best
suited to operands of higher bit length. At lower bit lengths it
is not as efficient as it is at higher bit lengths. To eliminate
this problem, the Urdhva Tiryagbhyam algorithm is used at the
lower stages. The Karatsuba-Urdhva multiplier model is shown in
Fig. 3.

Fig. 3 Karatsuba-Urdhva multiplier model

    The Urdhva Tiryagbhyam algorithm is the best algorithm for
binary multiplication in terms of area and delay. But as the
number of bits increases, the delay also increases because the
partial products are added in a ripple manner. For example,
4-bit multiplication requires 6 adders connected in a ripple
manner, 8-bit multiplication requires 14 adders, and so on.
Compensating for the delay causes an increase in area. So the
Urdhva Tiryagbhyam algorithm is not optimal when the number of
bits is large. If we use the Karatsuba algorithm at the higher
stages and the Urdhva Tiryagbhyam algorithm at the lower stages,
the limitations of both algorithms are compensated to some
extent and hence the multiplier becomes more efficient. The
circuit is further optimized by using carry select and carry
save adders instead of ripple carry adders. This reduces the
delay to a great extent with minimal increase in hardware. The
two algorithms are explained in detail in the following
sections.

Urdhva Tiryagbhyam algorithm for multiplication
    Urdhva-Tiryagbhyam sutra is an ancient Vedic mathematics
method for multiplication [4, 5, 6, 7]. It is a general formula
applicable to all cases of multiplication. The formula is very
short, consists of only one compound word, and means
'Vertically and crosswise'. In the Urdhva Tiryagbhyam algorithm,
the number of steps required for multiplication can be reduced
and hence the speed of multiplication is increased.
    An illustration of the steps for computing the product of
two 4-bit numbers is shown below [8, 9]. Let the two inputs be
a3a2a1a0 and b3b2b1b0, and let p7p6p5p4p3p2p1p0 be the product.
The temporary partial products are t0, t1, t2, ... , t6.
The partial products are obtained from the steps given below.
The line notation of the steps is shown in Fig. 4.

Step 1: t0 (1 bit) = a0b0
Step 2: t1 (2 bit) = a1b0 + a0b1
Step 3: t2 (2 bit) = a2b0 + a1b1 + a0b2
Step 4: t3 (3 bit) = a3b0 + a2b1 + a1b2 + a0b3
Step 5: t4 (2 bit) = a3b1 + a2b2 + a1b3
Step 6: t5 (2 bit) = a3b2 + a2b3
Step 7: t6 (1 bit) = a3b3

Fig. 4 Line notation of Urdhva Tiryagbhyam sutra

The product is obtained by adding s1, s2 and s3 as shown below,
where s1, s2 and s3 are the partial sums obtained. Here tk[j]
denotes bit j of tk, which has weight 2^(k+j) in the product.

s1 = t6 t5[0] t4[0] t3[0] t2[0] t1[0] t0
s2 = t5[1] t4[1] t3[1] t2[1] t1[1]
s3 = t3[2]

                 p7   p6     p5     p4     p3     p2     p1     p0
Product =             t6     t5[0]  t4[0]  t3[0]  t2[0]  t1[0]  t0   +
                      t5[1]  t4[1]  t3[1]  t2[1]  t1[1]  0      0    +
                      0      t3[2]  0      0      0      0      0
          --------------------------------------------------------
                 p7   p6     p5     p4     p3     p2     p1     p0

This method can be further optimized to reduce the amount of
hardware. A more optimized hardware architecture [9, 10] is
shown in Fig. 5. This model eliminates the need for a
three-operand 7-bit adder and hence reduces hardware and delay.
The adders are connected in a ripple manner.

Fig. 5 Hardware architecture for 4x4 Urdhva Tiryagbhyam
multiplier.

The expressions for the product bits are as shown below, where
MSB(ADDER k) denotes the bits of the sum of adder k above its
LSB, which are carried into the next adder.

p0 = a0b0
p1 = LSB of (Sum(ADDER 1))
   = LSB of (a1b0 + a0b1)
p2 = LSB of (Sum(ADDER 2))
   = LSB of (MSB(ADDER 1) + a2b0 + a1b1 + a0b2)
p3 = LSB of (Sum(ADDER 3))
   = LSB of (MSB(ADDER 2) + a3b0 + a2b1 + a1b2 + a0b3)
p4 = LSB of (Sum(ADDER 4))
   = LSB of (MSB(ADDER 3) + a3b1 + a2b2 + a1b3)
p5 = LSB of (Sum(ADDER 5))
   = LSB of (MSB(ADDER 4) + a3b2 + a2b3)
p6 = LSB of (Sum(ADDER 6))
   = LSB of (MSB(ADDER 5) + a3b3)
p7 = Carry of ADDER 6
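
The 4x4 scheme above can be checked with a short Python model. This is a behavioural sketch only, with hypothetical function names. It computes the column sums t0 to t6 and then forms the product in two ways: by adding the partial sums s1, s2 and s3 at their bit weights, and by rippling the upper bits of each adder into the next one as in the product bit expressions.

```python
def column_sums(a, b):
    """t[k] = sum of a_i * b_j over all i + j == k (Steps 1 to 7)."""
    abit = [(a >> i) & 1 for i in range(4)]
    bbit = [(b >> i) & 1 for i in range(4)]
    return [sum(abit[i] & bbit[k - i] for i in range(4) if 0 <= k - i < 4)
            for k in range(7)]

def urdhva_4x4_partial_sums(a, b):
    """Product as s1 + s2 + s3, each bit t_k[j] placed at weight 2**(k + j)."""
    t = column_sums(a, b)
    s1 = sum((t[k] & 1) << k for k in range(7))               # bit 0 of every t_k
    s2 = sum(((t[k] >> 1) & 1) << (k + 1) for k in range(7))  # bit 1 of every t_k
    s3 = ((t[3] >> 2) & 1) << 5                               # bit 2 of t3 only
    return s1 + s2 + s3

def urdhva_4x4(a, b):
    """Product via the ripple structure: p_k is the LSB of adder k."""
    t = column_sums(a, b)
    product, carry = 0, 0
    for k in range(7):
        total = t[k] + carry          # adder k (k = 0 is just a0b0)
        product |= (total & 1) << k   # p_k = LSB of the sum
        carry = total >> 1            # upper bits feed the next adder
    return product | (carry << 7)     # p7 = carry of the last adder

if __name__ == "__main__":
    print(urdhva_4x4(13, 11), urdhva_4x4_partial_sums(13, 11))
```

Both forms agree with ordinary multiplication for all 256 input pairs, which is a convenient exhaustive check when testing a hardware description of the same block.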
 
Since there are more than two operands in adders 2 to 5, we
can use carry save addition to implement adders 2 to 5. This
technique reduces the delay to a great extent compared to the
ripple carry adder.

Karatsuba Algorithm for multiplication
    Karatsuba multiplication algorithm [11, 12] is best suited
for multiplying very large numbers. This method was published by
Anatoli Karatsuba in 1962. It is a divide and conquer method, in
which we divide the numbers into their most significant half and
least significant half and then multiplication is performed.
    The Karatsuba algorithm reduces the number of multipliers
required by replacing multiplication operations with addition
operations. Addition operations are faster than multiplications
and hence the speed of the multiplier is increased. As the
number of bits of the inputs increases, the Karatsuba algorithm
becomes more efficient. This algorithm is optimal if the width
of the inputs is more than 16 bits. The hardware architecture of
the Karatsuba algorithm is shown in Fig. 6. The Karatsuba
algorithm for two inputs X and Y can be explained as follows.

Product $= X \cdot Y$
X and Y can be written as
    $X = 2^{n/2} \cdot X_l + X_r$                                    (1)
    $Y = 2^{n/2} \cdot Y_l + Y_r$                                    (2)
where $X_l$, $Y_l$ and $X_r$, $Y_r$ are the most significant half
and least significant half of X and Y respectively, and n is the
number of bits.
Then,
    $X \cdot Y = (2^{n/2} \cdot X_l + X_r)(2^{n/2} \cdot Y_l + Y_r)$
    $          = 2^{n} \cdot X_l Y_l + 2^{n/2}(X_l Y_r + X_r Y_l) + X_r Y_r$    (3)

The second term in equation (3) can be optimized to reduce the
number of multiplication operations, i.e.
    $X_l Y_r + X_r Y_l = (X_l + X_r)(Y_l + Y_r) - X_l Y_l - X_r Y_r$    (4)

Fig. 6 Karatsuba multiplier
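
The identity above replaces the two cross products by one product of sums, so only three half-width multiplications are needed per level. A minimal recursive Python sketch of this idea follows. It assumes a base case at 8 bits, as described for the proposed model, and uses Python's built-in multiply as a stand-in for the Urdhva Tiryagbhyam multiplier. The function name and the `base_width` parameter are hypothetical.

```python
def karatsuba(x, y, n, base_width=8):
    """Multiply two unsigned n-bit integers using three half-width products."""
    if n <= base_width:
        return x * y  # stand-in for the 8-bit Urdhva Tiryagbhyam multiplier
    half = n // 2
    mask = (1 << half) - 1
    xl, xr = x >> half, x & mask     # most / least significant halves of X
    yl, yr = y >> half, y & mask     # most / least significant halves of Y
    hi_bits = n - half
    ll = karatsuba(xl, yl, hi_bits, base_width)   # Xl * Yl
    rr = karatsuba(xr, yr, half, base_width)      # Xr * Yr
    # (Xl + Xr)(Yl + Yr) - Xl*Yl - Xr*Yr equals Xl*Yr + Xr*Yl.
    # The sums can be one bit wider than the halves.
    mid = karatsuba(xl + xr, yl + yr, hi_bits + 1, base_width) - ll - rr
    return (ll << (2 * half)) + (mid << half) + rr

if __name__ == "__main__":
    # 24-bit operands, the mantissa width for single precision
    print(hex(karatsuba(0xABCDEF, 0x123456, 24)))
```

In hardware the recursion is unrolled into a fixed tree of sub-multipliers. The sketch only illustrates how the half-width products are recombined.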
   
Equation (3) can be rewritten as
    $X \cdot Y = 2^{n} \cdot X_l Y_l + X_r Y_r + 2^{n/2}\left((X_l + X_r)(Y_l + Y_r) - X_l Y_l - X_r Y_r\right)$    (5)

The recurrence of the Karatsuba algorithm is
    $T(n) = 3T(n/2) + O(n)$, which gives $T(n) = O(n^{\log_2 3}) \approx O(n^{1.585})$.

D. Normalization of the result
    Floating point representations have a hidden bit in the
mantissa, which always has the value 1 and hence is not stored
in memory, saving one bit. The leading 1 in the mantissa is
considered to be the hidden bit, i.e. the 1 immediately to the
left of the binary point. Usually normalization is done by
shifting, so that the MSB of the mantissa becomes nonzero, and
in radix 2, nonzero means 1. The binary point in the mantissa
multiplication result is shifted left if the leading 1 is not
immediately to the left of the binary point. For each left shift
of the binary point, the exponent value is incremented by one.
This is called normalization of the result. Since the value of
the hidden bit is always 1, it is called the 'hidden 1'.
E. Representation of exceptions
    Some numbers cannot be represented with a normalized
significand. To represent those numbers, a special code is
assigned to each of them. In the proposed model, we use four
output signals, namely Zero, Infinity, NaN (Not-a-number) and
Denormal, to represent these exceptions. If the product has
$\text{exponent} + \text{bias} = 0$ and $\text{significand} = 0$,
then the result is taken as Zero ($\pm 0$). If the product has
$\text{exponent} + \text{bias} = 255$ and
$\text{significand} = 0$, then the result is taken as Infinity
($\infty$). If the product has
$\text{exponent} + \text{bias} = 255$ and
$\text{significand} \neq 0$, then the result is taken as NaN.
Denormalized values or Denormals are numbers without a hidden 1
and with the smallest possible exponent. Denormals are used to
represent certain small numbers that cannot be represented as
normalized numbers. If the product has
$\text{exponent} + \text{bias} = 0$ and
$\text{significand} \neq 0$, then the result is represented as
Denormal. A Denormal is represented as
$\pm 0.s \times 2^{-126}$, where s is the significand.

III. IMPLEMENTATION AND RESULTS
    The main objective of this paper is to design and implement
a floating point multiplier that is efficient in terms of both
delay and area. Since mantissa multiplication is the most
complex part of the floating point multiplier, we designed a
multiplier which can operate at high speed and whose delay and
area increase significantly less with an increase in the number
of bits. A floating point multiplier in the IEEE 754 standard
format is implemented using Verilog HDL and tested. The
multiplier units are further optimized by replacing simple
adders with efficient adders such as carry select adders and
carry save adders. The model is synthesized and simulated using
Xilinx Synthesis Tools (ISE 14.7), targeted at Spartan-3E and
Virtex-4 FPGAs. A summary of the results on the Virtex-4 FPGA is
given in Table I and Table II. A comparison with various
multiplier units is given in Tables III, IV, V, VI and VII.
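
When testing an implementation of this kind, a small software reference model of the normalization step (Section II-D) and the exception flags (Section II-E) can be useful for comparison. The sketch below is one possible model for single precision and is not the authors' test bench. The function names are hypothetical. It assumes the 48-bit product comes from two 24-bit significands with the hidden 1 set, and it assumes the low-order bits are simply truncated, since rounding is listed as future work.

```python
def normalize(product, exponent):
    """Normalize a 48-bit significand product; return (23-bit fraction, exponent).

    With both hidden bits set, the product lies in [1, 4), so the leading 1
    is at bit 47 or bit 46. Low-order bits are truncated (assumption).
    """
    if product >> 47:                          # value in [2, 4)
        exponent += 1                          # binary point moves one place left
        fraction = (product >> 24) & 0x7FFFFF  # bits 46..24, hidden 1 dropped
    else:                                      # value in [1, 2)
        fraction = (product >> 23) & 0x7FFFFF  # bits 45..23, hidden 1 dropped
    return fraction, exponent

def classify(stored_exponent, fraction):
    """Map a stored exponent and fraction to the flags named in Section II-E."""
    if stored_exponent == 0:
        return "Zero" if fraction == 0 else "Denormal"
    if stored_exponent == 255:
        return "Infinity" if fraction == 0 else "NaN"
    return "Normal"

if __name__ == "__main__":
    # 1.5 * 1.5 = 2.25 = 1.125 * 2^1; both stored exponents are 127
    frac, exp = normalize(0xC00000 * 0xC00000, 127 + 127 - 127)
    print(hex(frac), exp, classify(exp, frac))
```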
 
IV. CONCLUSION AND FUTURE WORK 
    This paper shows how to effectively reduce the percentage 
increase in delay and area of a floating point multiplier by 
using a very efficient combination of Karatsuba and Urdhva-
Tiryagbhyam algorithms. The model can be further optimized 
in terms of delay by using pipelining methods and precision of 
the result can be increased by adding efficient truncation and 
rounding methods. 
 
 
 
 
 
TABLE I
Performance analysis of Karatsuba-Urdhva multipliers

                 8-bit        16-bit       24-bit       32-bit
                 multiplier   multiplier   multiplier   multiplier
Slices           113          410          972          1389
LUTs             120          451          1018         1545
IOBs             33           65           97           129
Delay            9.396ns      11.514ns     12.996ns     13.141ns
f_max (MHz)      274.469      248.964      226.508      209.606
Logic levels     14           22           31           39

TABLE II
Performance analysis of Floating point multipliers in the proposed
model.

                    Slices   LUTs   IOBs   Delay    f_max     Max. comb. path
                                           (ns)     (MHz)     delay (ns)
Single precision    977      1073   97     16.182   226.508   9.831
Double precision    3877     4033   193    18.966   173.952   10.736

TABLE III
Delay comparison of various 8-bit multipliers with proposed
Karatsuba-Urdhva multiplier

         Ref. [8]   Ref. [9]   Ref. [13]   Proposed multiplier
Width    8-bit      8-bit      8-bit       8-bit
Delay    28.27ns    15.050ns   23.973ns    9.396ns

TABLE IV
Delay comparison of various 16-bit multipliers with proposed
Karatsuba-Urdhva multiplier

         Ref. [14]-vedic multiplier   Ref. [7]   Proposed multiplier
Width    16-bit                       16-bit     16-bit
Delay    13.452ns                     27.148ns   11.514ns

TABLE V
Delay and area comparison of 24-bit multipliers with proposed
Karatsuba-Urdhva multiplier

                       Slices   LUTs   Delay
Ref. [15]              1306     2329   16.316ns
Proposed multiplier    972      1018   12.996ns

TABLE VI
Delay and area comparison of 32-bit multipliers with
proposed Karatsuba-Urdhva multiplier

                                                 LUTs   Delay
Ref. [14]- Modified Booth multiplier (Radix-8)   2721   12.081ns
Ref. [14]- Modified Booth multiplier (Radix-16)  7161   11.564ns
Ref. [14]                                        2704   9.536ns
Proposed multiplier                              1545   13.141ns

TABLE VII
Delay and area comparison of SP-floating point multiplier with
proposed SP FP multiplier

                       Slices   LUTs   Delay
Ref. [15]              1269     2270   18.783ns
Ref. [3]               1149     1146   --
Proposed multiplier    977      1073   16.182ns

                   REFERENCES

[1]   IEEE 754-2008, IEEE Standard for Floating-Point Arithmetic, 2008.
[2]   Computer Arithmetic, Behrooz Parhami, Oxford University Press, 2000.
[3]   B. Jeevan, S. Narender, C.V. Krishna Reddy, K. Sivani, "A High
Speed Binary Floating Point Multiplier Using Dadda Algorithm",
International Multi-Conference on Automation, Computing,
Communication, Control and Compressed Sensing, pp. 455-460, 2013.
[4]   "Vedic mathematics", Swami Sri Bharati Krsna Thirthaji Maharaja,
Motilal Banarasidass Indological publishers and Book sellers, 1965.
[5]   R. Sridevi, Anirudh Palakurthi, Akhila Sadhula, Hafsa Mahreen,
"Design of a High Speed Multiplier (Ancient Vedic Mathematics
Approach)", International Journal of Engineering Research
(ISSN: 2319-6890), Volume No. 2, Issue No. 3, pp. 183-186, July 2013.
[6]   Nivedita A. Pande, Vaishali Niranjane, Anagha V. Choudhari, "Vedic
Mathematics for Fast Multiplication in DSP", International Journal of
Engineering and Innovative Technology (IJEIT), Volume 2, Issue 8, pp.
245-247, February 2013.
[7]   R.K. Bathija, R.S. Meena, S. Sarkar, Rajesh Sahu, "Low Power High
Speed 16x16 bit Multiplier using Vedic Mathematics", International
Journal of Computer Applications (0975-8887), Volume 59, No. 6, pp.
41-44, December 2012.
[8]   Poornima M, Shivaraj Kumar Patil, Shivukumar, Shridhar K P, Sanjay
H, "Implementation of Multiplier using Vedic Algorithm", International
Journal of Innovative Technology and Exploring Engineering (IJITEE),
ISSN: 2278-3075, Volume-2, Issue-6, pp. 219-223, May 2013.
[9]   Premananda B.S., Samarth S. Pai, Shashank B., Shashank S. Bhat,
"Design and Implementation of 8-Bit Vedic Multiplier", International
Journal of Advanced Research in Electrical, Electronics and
Instrumentation Engineering, Vol. 2, Issue 12, pp. 5877-5882, December
2013.
[10]  Harpreet Singh Dhillon, Abhijit Mitra, "A Reduced-Bit Multiplication
Algorithm for Digital Arithmetic", World Academy of Science,
Engineering and Technology, Vol. 19, pp. 719-724, 2008.
[11]  N. Anane, H. Bessalah, M. Issad, K. Messaoudi, "Hardware
implementation of Variable Precision Multiplication on FPGA", 4th
International Conference on Design & Technology of Integrated
Systems in Nanoscale Era, pp. 77-81, 2009.
[12]  Anand Mehta, C. B. Bidhul, Sajeevan Joseph, Jayakrishnan. P,
"Implementation of Single Precision Floating Point Multiplier using
Karatsuba Algorithm", 2013 International Conference on Green
Computing, Communication and Conservation of Energy (ICGCE), pp.
254-256, 2013.
[13]  R. Sai Siva Teja, A. Madhusudhan, "FPGA Implementation of Low-
Area Floating Point Multiplier Using Vedic Mathematics", International
Journal of Emerging Technology and Advanced Engineering, ISSN
2250-2459, Volume 3, Issue 12, pp. 362-366, December 2013.
[14]  Jagadeshwar Rao M, Sanjay Dubey, "A High Speed and Area Efficient
Booth Recoded Wallace Tree Multiplier for fast Arithmetic Circuits",
2012 Asia Pacific Conference on Postgraduate Research in
Microelectronics & Electronics (PRIMEASIA), pp. 220-223, 2012.
[15]  Anna Jain, Baisakhy Dash, Ajit Kumar Panda, Muchharla Suresh,
"FPGA Design of a Fast 32-bit Floating Point Multiplier Unit",
International Conference on Devices, Circuits and Systems (ICDCS), pp.
545-547, 2012.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
