VLSI Implementation of Reconfigurable FFT Processor Using Vedic Mathematics by Babu, Bharatha K. & Nanthini, G.
  Vol. 05 No. 02 2015 
p-ISSN 2202-2821 e-ISSN 1839-6518 (Australian ISSN Agency) 82800502201501 
 
www.irj.iars.info  Page 1 
VLSI Implementation of 
Reconfigurable FFT 
Processor Using Vedic 
Mathematics 
K. Bharatha Babu, G. Nanthini  
IARS' International Research
Journal. International Association of
Research Scholars, 29 Aug. 2015.
Web. 29 Aug. 2015.
http://irj.iars.info/index.php/82800502201501
Abstract- The fast Fourier transform (FFT) is used in a wide range of applications such as
digital signal processing and wireless communications. In this paper we present an
implementation of a reconfigurable FFT processor using the single-path delay feedback (SDF)
architecture. To eliminate the read-only memories (ROMs) that are normally used to
store the twiddle factors, the proposed architecture applies bit-parallel multipliers and
reconfigurable complex multipliers, giving a ROM-less FFT processor that consumes less power.
The proposed reconfigurable FFT processor, based on Vedic mathematics, is designed, simulated
and implemented on a VIRTEX-5 FPGA. Urdhva-Tiryagbhyam, a sutra of ancient Vedic mathematics,
is used to achieve high performance. The resulting reconfigurable DIF-FFT has
higher speed and smaller area than conventional DIF-FFT designs.
Key words – Bit parallel multiplier, Complex multiplier, FFT, Vedic Mathematics
I. INTRODUCTION
The discrete Fourier transform (DFT) is a very important tool in digital signal processing
and communications. However, the DFT is computationally intensive, with a time
complexity of $O(N^2)$. The fast Fourier transform (FFT) was introduced by Cooley and Tukey [4]
to reduce the time complexity to $O(N \log_2 N)$, where N denotes the FFT
size.
Different FFT processors have been proposed for hardware implementation. These
implementations can be broadly classified into memory-based and pipeline architecture
styles. The memory-based architecture, also known as the single processing element (PE)
approach, is widely used to design FFT processors. It usually consists of a main PE and
several memory units, so both its power consumption and its hardware cost are lower than
those of the other architecture style [3]. On the other hand, this kind of architecture has
long latency and low throughput, and it cannot be parallelized. Pipeline FFT processors have
two popular design types: the single-path delay feedback (SDF) pipeline architecture
and the multipath delay commutator (MDC) pipeline architecture. The SDF pipeline
architecture requires less memory space, its multiplication computation utilization is less
than 50% [3], and its control unit is easy to design. These properties are advantageous in
portable low-power DSP devices. For these reasons, an SDF pipeline FFT
is selected in our work.
The FFT computation multiplies the input signals by different twiddle
factors to produce each output, which raises the hardware cost because a large
number of ROMs is needed to store the twiddle factor values. The complex multipliers
in the processor can instead be realized with shift-and-add operations. Hence, the processor
requires only a two-input digital multiplier and does not need any ROM to store the twiddle
factors [3]. Our proposed design uses a reconfigurable complex constant multiplier and
bit-parallel multipliers in place of ROMs.
II. FFT ALGORITHM
For an N-point input sequence x(n), the DFT is defined as

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1,$$

where $W_N^{nk} = e^{-j 2\pi nk/N}$ is called the twiddle factor. However, a straightforward
implementation of this algorithm is clearly impractical because of its high hardware
requirement. The fast Fourier transform (FFT) was developed to speed up the calculation and
reduce the hardware cost. The FFT decomposes the input sequence using either
decimation-in-frequency (DIF) or decimation-in-time (DIT)
decomposition, which yields an efficient signal flow graph (SFG). Our
work uses the DIF decomposition because it matches the single-path delay feedback pipeline
architecture. The SFG of the N = 16 point DIF FFT is shown in Figure 1.
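The DIF decomposition can be illustrated in software. The following is a minimal sketch, not the hardware architecture of this paper: a recursive radix-2 DIF FFT checked against a direct DFT. The function names `dft` and `dif_fft` are chosen here for illustration only.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dif_fft(x):
    """Recursive radix-2 decimation-in-frequency FFT.

    len(x) must be a power of two. Output is in natural order.
    """
    n_pts = len(x)
    if n_pts == 1:
        return list(x)
    half = n_pts // 2
    # Butterfly: the sums feed the even-indexed outputs, the twiddled
    # differences feed the odd-indexed outputs.
    upper = [x[n] + x[n + half] for n in range(half)]
    lower = [
        (x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / n_pts)
        for n in range(half)
    ]
    out = [0j] * n_pts
    out[0::2] = dif_fft(upper)
    out[1::2] = dif_fft(lower)
    return out
```

For a 16-point input, `dif_fft(x)` should agree with `dft(x)` to within floating-point rounding, while using only half-size sub-transforms at each level.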
The radix-2 DIF FFT of Figure 1 has a regular SFG and requires fewer
complex multipliers. It is well suited to hardware implementation because some complex
multiplications can be simplified to reduce the chip area. For example, an input
sequence multiplied by $W_{16}^{2}$ in Figure 1 can be expressed as:

$$(a + jb)\, W_{16}^{2} = \frac{\sqrt{2}}{2}\,\bigl[(a + b) + j(b - a)\bigr]$$

Here (a + jb) denotes a discrete-time signal in complex form.
Similarly, the complex multiplication by $W_{16}^{6}$ is given by:

$$(a + jb)\, W_{16}^{6} = \frac{\sqrt{2}}{2}\,\bigl[(b - a) - j(a + b)\bigr]$$

These two equations ease the hardware implementation because
they only require a multiplication by √2/2 and two real additions.
In particular, the multiplication by √2/2 can be implemented easily. The inverse discrete
Fourier transform (IDFT) of length N is given by:

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-nk}, \qquad n = 0, 1, \ldots, N-1$$

To reduce the chip area, the same hardware core is reused. The above equation
can be rewritten as:

$$x(n) = \frac{1}{N} \left[ \sum_{k=0}^{N-1} X^{*}(k)\, W_N^{nk} \right]^{*}$$
Here the symbol * denotes complex conjugation. This new form can be evaluated as an
ordinary DFT. In other words, the DFT and the IDFT can share the same hardware
core, although the IDFT needs some extra computations: conjugating the input data X(k),
conjugating the output of the DFT, and dividing that output by
N. Clearly, this reformulation also simplifies the design
of a DFT/IDFT processor and reduces the occupied chip area, provided that the DFT and the
IDFT are activated alternately rather than simultaneously.
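The simplifications of this section can be checked numerically. The sketch below is an illustration only, with function names chosen here: it evaluates the products by $W_{16}^{2}$ and $W_{16}^{6}$ using one scaling by √2/2 and two real additions each, and computes the IDFT by reusing a forward DFT routine with conjugation before and after.

```python
import cmath
import math

SQRT2_OVER_2 = math.sqrt(2) / 2


def mul_w16_2(z):
    """(a + jb) * W_16^2 using two real additions and a scaling by sqrt(2)/2."""
    a, b = z.real, z.imag
    return complex(SQRT2_OVER_2 * (a + b), SQRT2_OVER_2 * (b - a))


def mul_w16_6(z):
    """(a + jb) * W_16^6 using two real additions and a scaling by sqrt(2)/2."""
    a, b = z.real, z.imag
    return complex(SQRT2_OVER_2 * (b - a), -SQRT2_OVER_2 * (a + b))


def dft(x):
    """Direct forward DFT, standing in for the shared hardware core."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def idft_via_dft(spectrum):
    """IDFT computed as conj(DFT(conj(X))) / N, reusing the forward DFT."""
    n_pts = len(spectrum)
    conjugated = [complex(v).conjugate() for v in spectrum]
    return [v.conjugate() / n_pts for v in dft(conjugated)]
```

Applying `idft_via_dft` to the output of `dft` should return the original sequence up to rounding error, which is the property that lets a single core serve both transforms.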
III. PROPOSED ARCHITECTURE 
Hardware implementations of FFT processors generally use a ROM to store the
required twiddle factors and word-length complex multipliers to perform the FFT computation.
However, this leads to a high hardware cost. A bit-parallel complex constant
multiplier is used to overcome this problem.
3.1 BIT PARALLEL MULTIPLIER 
To reduce the chip area, the multiplication by 1/√2 can use a bit-parallel multiplier
in place of a word-length multiplier and a square-root evaluation. The bit-parallel
operation, expressed in powers of 2, is given by:
 
A direct implementation of this equation gives poor accuracy because of truncation
error and incurs a high hardware cost. To improve the accuracy and reduce the
hardware cost, the above equation is rewritten as:
 
The input is first shifted right by two bits, which divides it by 4, and the shifted
value is added to the input. The output of this first adder is then shifted right by
4 bits, and the shifted value is added to the adder output. The result is
shifted right by 2 bits and added to the input. Finally, the output of this last
adder is shifted right by 1 bit to give the final output.
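The idea of replacing a multiplier by shifts and additions can be sketched in software. The example below is an assumption-laden illustration, not the circuit of Figure 2: it uses the binary expansion of √2/2 truncated to eight fractional bits (0.10110101 in binary, i.e. the shift amounts 1, 3, 4, 6 and 8), and the names `mul_inv_sqrt2` and `relative_error` are hypothetical.

```python
import math

# Positions of the set bits in sqrt(2)/2 = 0.10110101...b, truncated to 8 bits.
# This term set is an illustrative assumption, not taken from the paper.
SHIFTS = (1, 3, 4, 6, 8)


def mul_inv_sqrt2(value: int) -> int:
    """Approximate value / sqrt(2) for an integer using right shifts and adds."""
    return sum(value >> s for s in SHIFTS)


def relative_error(value: int) -> float:
    """Relative error of the shift-and-add result against the exact quotient."""
    exact = value / math.sqrt(2)
    return abs(mul_inv_sqrt2(value) - exact) / exact
```

With this term set the scale factor is 181/256, so the result is slightly below the exact value; each right shift also discards low-order bits, which is the truncation error mentioned above.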
3.2 VEDIC MATHEMATICS 
'Veda' is a Sanskrit word meaning 'knowledge' [10]. The Vedic principles were derived
by Swami Bharathi Krishna Thirthaji in the early decades of the 20th century. Vedic
mathematics has 16 sutras and 13 sub-sutras. It is used to reduce
complexity, operating time, circuit area, power consumption, etc. Vedic mathematics
solves problems in one or two steps. It is a method of fast mental
calculation that turns difficult mathematics into a playful exercise, and a single
general technique is used to solve all cases. Vedic
mathematics is a mathematical elaboration of sixteen simple mathematical
formulae from the Vedas. Vedic multipliers are designed using the Urdhva-Tiryagbhyam
multiplication sutra, which reduces the complexity of
multiplying large numbers. Vedic mathematics can also handle recurring decimals and
auxiliary fractions. It forms a division of the Jyotish
Shastra, which is one of the six parts of the Vedangas. The Jyotish Shastra, or astronomy, is
made up of three parts called Skandas. A Skanda means a big branch of a tree shooting
out of the trunk.
3.3 URDHVA-TIRYAGBHYAM
Urdhva-Tiryagbhyam is the general formula applicable to all cases of multiplication and
to the division of a large number by another large number.
Example 1:
124 x 132
Proceeding from right to left:
i. 4 x 2 = 8. First digit = 8.
ii. (2 x 2) + (3 x 4) = 4 + 12 = 16. The digit 6 is retained and 1 is carried over to the left
side. Second digit = 6.
iii. (1 x 2) + (2 x 3) + (1 x 4) = 2 + 6 + 4 = 12. The carry of 1 from the previous step is added:
12 + 1 = 13. Now 3 is retained and 1 is carried over to the left side. Third digit
= 3.
iv. (1 x 3) + (2 x 1) = 3 + 2 = 5. The carry of 1 from the previous step is added: 5 + 1 = 6. It is
retained, so fourth digit = 6.
v. (1 x 1) = 1. There is no carry from the previous step, so 1 is retained.
Fifth digit = 1.
124 x 132 = 16368.
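The steps of Example 1 can be written as a short program. This is a minimal software sketch of the vertical-and-crosswise procedure on decimal digits; the function name `urdhva_multiply` is chosen here for illustration, and a hardware Vedic multiplier would apply the same pattern to binary digits.

```python
def urdhva_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers column by column, right to left."""
    digits_a = [int(c) for c in str(a)][::-1]  # least significant digit first
    digits_b = [int(c) for c in str(b)][::-1]
    result_digits = []
    carry = 0
    for column in range(len(digits_a) + len(digits_b) - 1):
        # Add every cross product whose digit positions sum to this column,
        # plus the carry from the previous column.
        total = carry
        for i, digit in enumerate(digits_a):
            j = column - i
            if 0 <= j < len(digits_b):
                total += digit * digits_b[j]
        result_digits.append(total % 10)  # digit retained in this column
        carry = total // 10               # digit(s) carried to the left
    while carry:
        result_digits.append(carry % 10)
        carry //= 10
    return int("".join(str(d) for d in reversed(result_digits)))
```

Calling `urdhva_multiply(124, 132)` walks through the same five columns as the example, with column sums 8, 16, 12 + 1, 5 + 1 and 1.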
IV. LITERATURE REVIEW 
1. A LOW POWER 64 POINT PIPELINE FFT/IFFT PROCESSOR 
The proposed architecture uses the single-path delay feedback style. The ROMs normally
used to store the twiddle factors are eliminated. Bit-parallel multipliers and
reconfigurable complex multipliers are used to obtain a ROM-less FFT processor, which
consumes less power. The design uses 33.6k gates and consumes about 9.8 mW.
FFT implementations are classified into memory-based and pipeline
architecture styles. The memory-based architecture style, also called the single
processing element approach, is widely used to design FFT processors. It is built from
a PE and some memory units, so its hardware cost is low and its power consumption is lower
than that of the other architecture style. Its disadvantages are long latency, low
throughput, and the fact that it cannot be parallelized. Pipeline architecture designs are
classified into two types: the single-path delay feedback (SDF) pipeline architecture and the
multipath delay commutator (MDC) pipeline architecture. The SDF architecture requires less
memory space, needs less multiplication computation, and has a control unit that is easy to
design. However, the FFT computation needs to multiply the input signals by different twiddle
factors for each output, which results in a higher hardware cost because a large
number of ROMs is required to store the twiddle factors. In this processor a complex
multiplier is realized with shift-and-add operations. The processor requires only two-input
digital multipliers and does not require a ROM for internal storage of coefficients. Reducing
the ROM size reduces the chip area [3].
2. A 1-GS/S FFT/IFFT PROCESSOR 
A mixed-radix multipath delay feedback (MRMDF) pipelined architecture is used in this
FFT design; its multiple data paths provide a higher throughput rate. In
MRMDF the hardware cost of the memory is 38.9% and the hardware cost of the complex
multipliers is 44.8%. A high-radix FFT algorithm is used to reduce the number of complex
multiplications. Power figures of 175 mW and 77.6 mW are reported. In our view, the
pipelined architecture is a good choice for a high throughput rate: a higher throughput
rate can be obtained at a tolerable hardware cost. Pipelined FFT architectures fall into two
groups: the multipath delay commutator and the single-path delay feedback [10].
In the MDC scheme, M parallel input data must be available simultaneously, and the scheme
provides M times the throughput rate of the SDF scheme. The MDC architecture places
some restrictions on the number of data paths, the FFT size, and the radix-r FFT
algorithm. The MDC scheme needs more memory and more complex multipliers than
the SDF scheme, which uses less memory and has a lower hardware cost. If the input
data are rearranged in the input buffer before they are loaded into the MDC
processor, the MDC architecture is more widely applicable than the SDF architecture. In
general, the throughput rate can be increased by increasing the number of data paths in the
MDC scheme. The MRMDF architecture has a lower hardware cost than the MDC scheme. A
higher-radix FFT is used to reduce the power consumption [10].
3. MEMORY SYSTEM DESIGN 
To compute the DFT through the FFT, memory is used to reorder the input data and the
intermediate results. The required memory size is proportional to N, and the number of
memory accesses is proportional to $N \log_r N$. Reducing both the memory size and the number
of memory accesses is important. For word-sequential I/O, where one sample is available per
clock cycle, the two samples of a butterfly are separated by N/2 clock cycles. Consequently,
the first N/2 samples have to be stored in a local memory until the other data samples
x(n + N/2) arrive. Similar constraints apply to other FFT algorithms [15].
Two buffering strategies are used in pipeline FFT architectures: the
delay commutator (DC) architecture and the delay feedback (DF)
architecture. In the DC approach, during the first N/2 cycles the
initial N/2 samples are stored in "N/2 FIFO_I". During the second N/2 cycles, the butterfly
receives x(n + N/2) from the input and x(n) from "N/2 FIFO_I" and produces its outputs.
Meanwhile, one of the results produced by the butterfly is saved into "N/2 FIFO_II", and
the other result is fed directly to the multiplier. Over these N cycles, data are written
into "N/2 FIFO_I" during the first N/2 cycles and read from it during the next
N/2 cycles. As a result, the utilization rate of the FIFO is only 50%. In the DF style, the
incoming samples are saved in the "N/2 FIFO" during the first N/2 cycles. When
x(n + N/2) arrives, the radix-2 butterfly unit takes x(n + N/2)
from the input and x(n) from the feedback FIFO. One of the butterfly outputs is
fed back to the "N/2 FIFO", which explains the name "delay feedback". Data are both
read from and written to each memory cell, so the utilization of the FIFO rises to 100% [15].
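The delay feedback behaviour can be modelled cycle by cycle. The sketch below is a simplified illustration under stated assumptions: it models a single radix-2 DF stage with an N/2-entry FIFO, omits the twiddle factor multiplier that would follow the stage, and uses the hypothetical name `df_stage`.

```python
from collections import deque


def df_stage(samples, n_pts):
    """Cycle-by-cycle model of one radix-2 delay feedback stage.

    Returns one output per input cycle; entries are None while the
    N/2 FIFO is still filling. The twiddle multiplier is not modelled.
    """
    half = n_pts // 2
    fifo = deque([None] * half)
    outputs = []
    for cycle, x_in in enumerate(samples):
        oldest = fifo.popleft()  # every cycle reads one FIFO entry...
        if cycle % n_pts < half:
            # First N/2 cycles of a block: store the input and emit the
            # difference that was fed back during the previous block.
            fifo.append(x_in)
            outputs.append(oldest)
        else:
            # Second N/2 cycles: butterfly on x(n) from the FIFO and
            # x(n + N/2) from the input; the difference is fed back.
            fifo.append(oldest - x_in)
            outputs.append(oldest + x_in)
        # ...and writes one entry, so the FIFO is used in every cycle.
    return outputs
```

For N = 4 and the input 1, 2, 3, 4 followed by two flush samples, the stage first emits the sums 4 and 6 and then the fed-back differences -2 and -2, with the FIFO read and written on every cycle.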
4. A LOW POWER, HIGH PERFORMANCE, 1024 POINT FFT PROCESSOR 
A single-chip, energy-efficient 1024-point FFT processor is presented in this work. The
460,000-transistor design was fabricated in a standard 0.7 µm CMOS process and is fully
functional on first-pass silicon. It calculates a 1024-point complex FFT in 330 µs at a
1.1 V supply voltage while consuming 9.5 mW, resulting in an adjusted energy efficiency
16 times greater than that of the previously most efficient known FFT processor. At 3.3 V, it
operates at 173 MHz, a clock frequency 16 times greater than the previously fastest rate
[1].
While advances in semiconductor processing technology have allowed the performance
and integration of FFT processors to rise steadily, these advances have unfortunately also
led to increased power consumption. Applications that are limited
by power rather than by performance, such as portable applications, are important and
increasing.
In many CMOS circuits, energy dissipation scales with the square of the supply
voltage [1]. Consequently, a large gain in efficiency can be achieved by reducing the
supply voltage. Unfortunately, a lower supply voltage also reduces the circuit performance.
The processor presented there operates at a low supply voltage Vdd, in the vicinity of the
transistor threshold voltage Vt, which increases the energy efficiency of the overall
system. To recover some of the lost performance, the processor uses a high-performance
algorithm and architecture, and it performs better than prior designs [1].
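As a rough worked example, assume the usual first-order model for dynamic switching energy, $E \approx C_{\mathrm{eff}} V_{dd}^{2}$, where $C_{\mathrm{eff}}$ is an effective switched capacitance introduced here only for illustration. Lowering the supply from 3.3 V to 1.1 V, the two operating points quoted above, would then scale the energy per operation by:

```latex
\frac{E(1.1\,\mathrm{V})}{E(3.3\,\mathrm{V})}
  = \frac{C_{\mathrm{eff}}\,(1.1)^{2}}{C_{\mathrm{eff}}\,(3.3)^{2}}
  = \left(\frac{1.1}{3.3}\right)^{2}
  = \frac{1}{9}
```

This simple model ignores leakage and the reduction in clock rate at low voltage, so it indicates only the order of the saving.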
5. DESIGN AND IMPLEMENTATION OF A 1024 POINT PIPELINE FFT 
PROCESSOR 
The design and implementation of a 1024-point pipeline FFT processor are presented here. The
design is based on a new form of FFT, the radix-2² algorithm. By exploiting
the spatial regularity of the new algorithm, the requirements for
both dominant components in a VLSI implementation are minimized. The pipelined 1K FFT
processor uses only 4 complex multipliers and a data memory of 1024 complex words. The chip is
fabricated in a 0.5 µm CMOS process and occupies an area of 40 mm². It can perform
$2^n$-point ($n = 0, 1, \ldots, 10$) complex forward and inverse FFTs in real time at up to
30 MHz sampling frequency with a supply voltage of 3.3 V. The signal-to-quantization-noise
ratio is above 500 dB for white noise input. The FFT is widely used in DSP applications
based on the OFDM principle. Such a system is acceptable only if device complexity and power
consumption are kept low by using a real-time FFT processor in place of a separate modulator
for each sub-carrier. The FFT operation is both computation intensive, in terms of arithmetic
operations, and communication intensive, in terms of data transfers to and from
storage [7].
For a real-time FFT, O(log N) arithmetic operations are needed per
sample cycle, where N is the size of the transform. High-speed real-time processing can
be achieved in two ways. In the conventional, general-purpose processor approach, a single
processor is driven at a high clock rate, O(log N)
times the sampling frequency. In the application-specific approach, parallel processors
run at a clock rate equal to the sampling rate to reach the required
performance [7].
The second approach is important when power consumption is constrained by the
application environment. The pipeline FFT processor is a class of architectures for
application-specific real-time DFT computation using fast algorithms. It is characterized by
non-stop processing at the clock frequency of the input data sampling. A lower clock rate is
an advantage of the pipeline architecture for high-speed processing. The pipeline
architecture is also regular, so it can be easily scaled and parameterized when an HDL is
used in the design, which gives flexibility when transforms of different sizes are to be
computed on the same chip. VHDL is used to model and synthesize the design. With voltage
scaling at the low operating frequency, the expected power consumption is very low. The
target technology is a 0.5 µm CMOS process [7].
6. A POWER SCALABLE RECONFIGURABLE FFT/IFFT BASED IC 
A single-chip reconfigurable FFT/IFFT processor based on a multiprocessor architecture is
introduced. Multilevel reconfigurability is achieved by dynamically allocating
the execution resources required by each application. The processor IC was designed
in a 0.25 µm CMOS process. It computes 8-point to 4096-point complex FFT/IFFT with
scalable power consumption and offers a useful tradeoff between algorithm
flexibility, implementation complexity and energy efficiency.
In signal processing for communications, rapidly evolving
standards and formats call for programmable solutions that offer both algorithm
flexibility and low implementation complexity. In portable applications, low power
dissipation is one of the toughest requirements. Microprocessor- and FPGA-based devices
achieve this flexibility only at a very high power dissipation compared with
ASIC solutions [16].
OFDM is a widely proposed modulation method today. The FFT, which is both computation
intensive and data-transfer intensive, is one part of such modulation, and the FFT processor
plays an important role in the system performance. In some digital
signal processing applications the FFT processor is combined with a RISC processor that
handles the protocols, so the system can be upgraded to a new standard with software
changes only [16].
To keep the power dissipation to a minimum, the processor computes the FFT with a
power dissipation that scales significantly with the FFT length. Choosing the smallest
adequate FFT minimizes the power dissipation of an OFDM modulator. The FFT
processor of a particular system has to meet the performance required by the worst-case
application. For example, if the requirement is a maximum capacity of a 2048-point FFT while
consuming no more than 200 mW, then variable-length FFT processors will consume 200 mW while
computing a 2048-point FFT [16].
The power dissipation of DSPs can be high and may not vary
with the FFT size. A custom-designed processor provides high performance and low
power requirements, but this limits the reconfiguration of the FFT processor for
variable-length FFTs. The single-chip reconfigurable FFT offers a solution between
ASICs and software-programmable general-purpose digital signal processors [16].
7. ORDERED PIPELINE FFT ARCHITECTURE
A pipelined N-point radix-4 FFT architecture has $\log_4 N$ stages. Within
each word cycle, each stage produces one output. Each stage consists of a butterfly
unit, a commutator and a complex multiplier. At each stage the outputs must be ordered
according to the value of m. In the first four word cycles the outputs are associated with
m1 = 0, and in the next four cycles they are associated with m1 = 1. The input data for each
summation at stage t are separated in time by Nt words. The necessary commutator
consists of six shift registers along with three multipliers [2].
Normally the fixed coefficients are fed to the complex multiplier, starting from m1 = 0
and ending with m1 = 3 for stage 1 of a 16-point FFT processor. The ordered coefficient
set is obtained by first arranging the imaginary parts of the coefficients on the basis of
Hamming distance. Then, for each coefficient, either the corresponding real part
or its two's complement is selected, depending on the Hamming distance with
respect to the previously ordered real part. A flag bit is assigned to indicate that
the real part is in two's complement form. This flag bit is used to selectively
complement the multiplier output [2].
In the new architecture for the 16-point ordered pipelined FFT processor,
DO is in normal order and can be fed directly to the
stage 2 commutator. As given, the stage 2 commutator is identical [2].
V. CONCLUSION 
A ROM-less, low-power pipeline FFT processor has been designed. Exploiting the
symmetry of the twiddle factors in the FFT, we have designed a reconfigurable
complex multiplier such that the size of the twiddle factor ROM is significantly reduced. The
result shows that our design requires a lower hardware cost and power consumption
than existing designs. Our proposed design can also be applied to large-size
FFT applications, with a smaller twiddle factor ROM.
The proposed architecture, Reconfigurable FFT processor based on Vedic mathematics 
will be designed, simulated and implemented using VIRTEX-5 FPGA. 
FIGURES AND TABLES 
FIGURES 
 
Figure 1: N=16 point DIF FFT signal flow graph 
 
Figure 2: Bit parallel multiplication by 1/√2
 
Figure 3: Multiplication by $W_N^{N/8}$
TABLES 
Table 1: Comparison  
 
REFERENCES 
[1]. Baas, B. M. (1999). "A low-power, high-performance, 1024-point FFT
processor," IEEE Journal of Solid-State Circuits, vol. 34, no. 3, pp. 380-387.
[2]. Bi Guoan and Jones, E. V. (1989). "A pipelined FFT processor for word-sequential
data," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37,
no. 12.
[3]. Chu Yu, Mao-Hsu Yen, Pao-Ann Hsiung, and Sao-Jie Chen (2011). "A low-power
64-point pipeline FFT/IFFT processor for OFDM applications," IEEE
Transactions on Consumer Electronics, vol. 57, no. 1.
[4]. Cooley, J. W. and Tukey, J. W. (1965). "An algorithm for the machine calculation of
complex Fourier series," Mathematics of Computation, vol. 19, pp. 297-301.
[5]. Hasan, M., Arslan, T., and Thompson, J. S. (2003). "A novel coefficient ordering
based low power pipelined radix-4 FFT processor for wireless LAN
applications," IEEE Transactions on Consumer Electronics, vol. 49, no. 1.
[6]. Hasan, M. and Arslan, T. (2003). "Implementation of low power FFT processor
cores using a novel order based processing scheme," accepted for publication in
IEE Proceedings on Circuits, Devices and Systems.
[7]. He, S. and Torkelson, M. (1998). "Design and implementation of a 1024-point
pipeline FFT processor," in Proc. IEEE Custom Integrated Circuits Conf.
(CICC'98), pp. 131-134.
[8]. Jen-chi Kuo, Ching-Hua Wen, Chih-Hsiu Lin, and An-Yeu Wu (2003). "VLSI
design of a variable length FFT/IFFT processor for OFDM based communication
system," EURASIP Journal on Applied Signal Processing, no. 13, pp. 1306-1316.
[9]. Jung, Y., Yoon, H., and Kim, J. (2003). "New efficient FFT algorithm and pipeline
implementation results for OFDM/DMT applications," IEEE Transactions on
Consumer Electronics, vol. 49, no. 1, pp. 14-20.
[10]. Lin Y-M, Liu H-Y, and Lee C-Y (2005). "A 1 GS/s FFT/IFFT processor for
UWB applications," IEEE Journal of Solid-State Circuits, vol. 40, no. 8, pp. 1726-1735.
[11]. Parhi, K. K. (1999). VLSI Digital Signal Processing Systems: Design and
Implementation, New York: John Wiley and Sons.
[12]. Sarada, V. and Vigneswaran, T. (2013). "Reconfigurable FFT processor," International
Journal of Engineering and Technology, vol. 5, no. 2.
[13]. Sri Sathya Sai Veda Pratistan, "Vedic Mathematics," book.
[14]. Wei Han, Arslan, T., Erdogan, A. T., and Hasan, M. (2004). "A novel low power pipelined
FFT based on sub expression sharing for wireless LAN applications," IEEE
Workshop on Signal Processing Systems, pp. 83-88.
[15]. Wen-Chang Yeh and Chein-Wei Jen (2003). "High-speed and low power split-radix
FFT," IEEE Transactions on Signal Processing, vol. 51, no. 3, pp. 864-874.
[16]. Zhong. G, Xu.F and Wilson. A.N, (2006). “A power-scalable reconfigurable 








 
 
 
 
