I. INTRODUCTION
The Fast Fourier Transform (FFT) is an efficient algorithm that plays an important role in many digital signal processing (DSP) applications. It has been applied in telecommunication fields such as linear filtering, spectrum analysis, digital video broadcasting and orthogonal frequency division multiplexing (OFDM). The rapidly increasing demand for OFDM-based applications, including modern wireless telecommunication such as wireless LAN, calls for real-time, high-speed computation of the FFT.
The FFT computes the same result as the Discrete Fourier Transform (DFT), which transforms a function from its time-domain representation to its frequency-domain representation. However, the FFT has a much lower computational complexity, on the order of N log N operations instead of N^2 for an N-point transform, because it uses a divide-and-conquer approach. This has made the design of the FFT processor a critical requirement for upcoming wireless technology. Meanwhile, the fast growth of semiconductor technology has also contributed to the implementation of FFT processors in various DSP applications. The critical issue for future wireless receivers is the combined requirement for high performance, low power and flexibility [1].
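As a rough illustration of the divide-and-conquer idea (a Python sketch, not the hardware design of this paper), the direct DFT below evaluates every output as a full sum over the inputs, while the recursive routine splits the input into four interleaved sub-sequences, which is the radix-4 decomposition. The names `dft` and `fft_radix4` are illustrative, and the input length is assumed to be a power of 4.

```python
import cmath

def dft(x):
    """Direct DFT: every output is a sum over all N inputs (O(N^2))."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    quarter = n // 4
    # Transform the four interleaved sub-sequences x[r], x[r+4], x[r+8], ...
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = []
    for k in range(n):
        # Combine the sub-transforms with twiddle factors W_N^(r*k).
        out.append(sum(cmath.exp(-2j * cmath.pi * r * k / n) * subs[r][k % quarter]
                       for r in range(4)))
    return out

# Arbitrary 16-point complex test input.
samples = [complex(i % 5, -i % 3) for i in range(16)]
max_diff = max(abs(a - b) for a, b in zip(dft(samples), fft_radix4(samples)))
```

Both routines should agree to within floating-point rounding, so `max_diff` is expected to be very small. The recursive version reaches that result with fewer multiplications as N grows.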
With the advent of these requirements, many different hardware architectures have been proposed for implementing FFT algorithms. Normally, the design approach concentrates on power consumption and architectural size. In multi-carrier code division multiple access (MC-CDMA) or OFDM receivers, the two most power-consuming blocks are the FFT and the Viterbi decoder [2]. This is due to the memory required to buffer the input data and the intermediate results of the computation. The power consumption of an FFT processor depends on the wordlength of the data and of the FFT coefficients [3]. A larger wordlength means a higher signal-to-noise ratio (SNR) for a fixed-point FFT. Since a higher SNR means lower errors in the FFT output, a larger wordlength is desirable for accuracy. However, power consumption increases because of the additional switching activity (SA) [4]. It is therefore highly desirable to design an FFT processor with a good trade-off between SA and SNR.
This paper presents a multi-objective genetic algorithm (MOGA) for the adaptation of an FFT processor. A specially tailored MOGA is developed to adapt the FFT processor while dynamically optimizing both the SA and the SNR. The impact of wordlength optimization of the fixed-point FFT coefficients on SA and SNR using the MOGA is analyzed. A Radix-4 Single Path Delay Feedback (R4SDF) FFT processor is designed in the Verilog Hardware Description Language. The MOGA is also written in Verilog and is applied to search for the FFT coefficients with the best performance in terms of SNR and SA. This project was identified as important in providing the necessary background for research on reconfigurable FFT processors.
II. FFT DESIGN
A 16-point R4SDF FFT processor is designed according to the 16-point Radix-4 Signal Flow Graph (SFG) shown in Figure 1 [5]. The Radix-4 architecture is chosen because of the lower complexity of its processing elements and control. It requires 15 memory elements (registers), 16 complex adders and 1 complex multiplier. The R4SDF block diagram is shown in Figure 2. The BF is the butterfly element, which performs the data shifting and the arithmetic computation. The first stage has 4 memory elements in each row and the second stage has 1. Figure 3 shows the four operation modes of the R4SDF architecture. Operation modes 1 to 3 shift the input data into the different register banks while data is simultaneously shifted out of the related register bank. The operation is carried out in first-in, first-out (FIFO) order. Operation mode 4 is the BF arithmetic computation, whose results are fed back to the register banks.
III. MULTI-OBJECTIVE GENETIC ALGORITHM
A multi-objective GA is crucial for optimizing multiple fitness measures in the many real-world problems that have multiple conflicting objectives [6]. In multi-objective optimization, more than one optimal solution may be obtained. In this work, the MOGA is used to optimize the wordlength of the fixed-point FFT coefficients (twiddle factors) for two objectives: the SNR value and the SA. These two objectives conflict with each other, since decreasing the wordlength reduces the SA but also reduces the SNR value, whereas a higher SNR value is desirable in FFT design.
MOGA is used in this optimization because it belongs to the class of evolutionary algorithms (EA), which can discover near-optimal solutions to multi-modal problems. It generates a population of candidate solutions of a given size and simultaneously discovers various solutions from that population. The objective is to find FFT coefficients that have the best performance in terms of SNR and SA values compared with the reference solution.
A. Flow Chart
Following [10], a simple GA works as illustrated in Figure 4. It starts with a randomly generated population of n l-bit chromosomes (candidate solutions to a problem). In this work, the chromosomes are the coefficients of the FFT processor, and the population size is chosen to be 50, which is sufficient to provide solution diversity. The fitness f(x) of each chromosome x in the population is then calculated. A pair of parent chromosomes is selected from the current population, the probability of selection being an increasing function of fitness. Selection is done "with replacement", meaning that the same chromosome can be selected more than once to become a parent. With probability p_c (the "crossover probability" or "crossover rate"), the pair is crossed over at a randomly chosen point (chosen with uniform probability) to form two offspring. If no crossover takes place, the two offspring are exact copies of their respective parents. The crossover rate is thus the probability that two parents cross over at a single point. There are also "multi-point crossover" versions of the GA in which the crossover rate for a pair of parents is the number of points at which a crossover takes place. The two offspring are then mutated at each locus with probability p_m (the "mutation probability" or "mutation rate"), and the resulting chromosomes are placed in the new population. These steps, starting from the selection of the parent chromosomes, are repeated until n offspring have been created. If n is odd, one new population member can be discarded at random. After that, the current population is replaced with the new population. Finally, the same procedure is repeated, starting again from the calculation of the fitness f(x) of each chromosome x in the population.
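The generation step described above can be sketched in Python as follows. This is a minimal illustration, not the Verilog test bench of this work. Each individual is simplified to a single 32-bit list, and the function name `next_generation`, the toy fitness, and the values p_c = 0.7 and p_m = 0.01 are placeholders chosen for the example.

```python
import random

def next_generation(population, fitness, p_c, p_m, rng):
    """Build one new population of bit-string chromosomes (lists of 0/1)."""
    n = len(population)
    weights = [fitness(c) for c in population]
    new_population = []
    while len(new_population) < n:
        # Fitness-proportionate selection with replacement.
        mother, father = rng.choices(population, weights=weights, k=2)
        child_a, child_b = list(mother), list(father)
        if rng.random() < p_c:
            # Single-point crossover at a uniformly chosen point.
            point = rng.randrange(1, len(mother))
            child_a = mother[:point] + father[point:]
            child_b = father[:point] + mother[point:]
        for child in (child_a, child_b):
            # Flip each locus independently with probability p_m.
            for locus in range(len(child)):
                if rng.random() < p_m:
                    child[locus] ^= 1
            new_population.append(child)
    # If n is odd, one surplus member is discarded at random.
    while len(new_population) > n:
        new_population.pop(rng.randrange(len(new_population)))
    return new_population

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(32)] for _ in range(50)]
# Toy fitness (an assumption for the demo): number of 1 bits, plus 1 so no weight is zero.
offspring = next_generation(population, lambda c: sum(c) + 1, p_c=0.7, p_m=0.01, rng=rng)
```

Calling `next_generation` repeatedly, each time on the population it returned, gives the outer loop of Figure 4. In the actual design the fitness would come from the SNR and SA evaluations rather than from a bit count.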
B. Chromosomes Representation
Each FFT coefficient, a 32-bit binary string, is represented as a chromosome. The R4SDF FFT processor has 16 coefficients in total, so there are 16 chromosomes in each set, and a population consists of 50 sets of chromosomes.
C. Error Fitness Evaluation
Figure 6 illustrates the methodology used to evaluate the error fitness [7]. Initially, with a sequence of input data x(n) and the 16-bit FFT coefficients, the FFT processor calculates the outputs X_1(k). Next, with the same x(n) and the optimized FFT coefficients, the FFT processor again calculates the outputs X_2(k). The two sets of outputs are then compared to calculate the error e(k). The corresponding SNR value over all the FFT outputs is then calculated using equation (1) [8].
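The comparison of the reference outputs X_1(k) with the outputs X_2(k) obtained from the optimized coefficients can be sketched as below. The sketch assumes the common definition of SNR as ten times the base-10 logarithm of the ratio of total signal energy to total error energy, which may differ in detail from equation (1). The function name `snr_db` and the sample values are hypothetical.

```python
import math

def snr_db(reference, test):
    """SNR in dB between reference outputs X_1(k) and test outputs X_2(k)."""
    signal = sum(abs(r) ** 2 for r in reference)
    # Error energy: sum of |e(k)|^2 with e(k) = X_1(k) - X_2(k).
    noise = sum(abs(r - t) ** 2 for r, t in zip(reference, test))
    if noise == 0:
        return math.inf  # identical outputs: no error
    return 10 * math.log10(signal / noise)

# Hypothetical 4-point outputs; only one bin differs slightly.
x1 = [4 + 0j, 1 - 1j, 0j, 1 + 1j]
x2 = [4 + 0j, 1 - 1.01j, 0j, 1 + 1j]
value = snr_db(x1, x2)
```

A smaller deviation between the two output sets gives a larger value, which is why a candidate with a shorter wordlength tends to score lower on this objective.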

D. Power Fitness Evaluation
Power evaluation is performed by summing the switching activity, based on the Hamming distance, in the FFT coefficients for a specific wordlength. Switching power, P_sw, is the main source of power consumption in a typical CMOS logic gate. Equation (2) shows how the switching power is calculated [9].
V_dd is the supply voltage, f is the clock frequency, C_load is the load capacitance of the gate and k is the switching activity factor, defined as the average number of times the gate makes an active transition in a single clock cycle. If C_load, V_dd and f are constant, then P_sw is directly proportional to k.
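A Hamming-distance count of switching activity can be sketched as follows. Two points are assumptions rather than details taken from this paper. The coefficient sequence is treated as wrapping around, so that 16 words give 16 transitions. The switching power uses the widely quoted textbook form P_sw = k * C_load * V_dd^2 * f. All function names are illustrative.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two words differ."""
    return bin(a ^ b).count("1")

def switching_activity(words):
    """Total bit toggles as the coefficient words are applied one after another."""
    total = 0
    for i, word in enumerate(words):
        nxt = words[(i + 1) % len(words)]  # assumption: the sequence wraps around
        total += hamming_distance(word, nxt)
    return total

def switching_power(k, c_load, v_dd, f):
    """Assumed textbook form: P_sw = k * C_load * V_dd^2 * f."""
    return k * c_load * v_dd ** 2 * f

# Sixteen 32-bit words in which every bit toggles at every step.
alternating = [0x00000000, 0xFFFFFFFF] * 8
worst_case = switching_activity(alternating)  # 16 transitions * 32 bits = 512
```

Under the wrap-around assumption the worst case matches the maximum of 512 quoted for sixteen 32-bit coefficients. With C_load, V_dd and f held fixed, `switching_power` scales linearly with k.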
E. Overall Fitness Evaluation
In a multi-objective GA, a fitness function is needed that accounts for the effect of both the SNR value and the switching activity. Many methods can be used to evaluate and find candidate solutions in a multi-objective GA. In this project, the weighted-sum method is used to evaluate the candidate solutions. The power consumption factor and the signal-to-noise ratio factor are defined to be equally important:
$$\text{fitness} = \frac{\text{SNR}}{\text{total SNR}} + \frac{\text{invert switching activity}}{\text{total switching activity}}$$
Both terms carry equal weight. SNR is the signal-to-noise ratio of the candidate solution, and total SNR is the sum of the signal-to-noise ratios of all 50 candidate solutions in the population. Invert switching activity is the inverse of the switching activity, calculated by subtracting the switching activity from 512, because the maximum possible switching activity for the processor is 32 bits multiplied by 16, which equals 512.
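A weighted-sum evaluation over a population can be sketched as below. The weights of 0.5 each are an assumption that reflects the stated equal importance (any equal pair gives the same ranking). Normalising the inverted switching activity by its population total is also an assumption. The function name and the three candidates are hypothetical.

```python
MAX_SA = 32 * 16  # 512, the maximum switching activity stated in the text

def weighted_sum_fitness(snrs, sas, w_snr=0.5, w_sa=0.5):
    """Return one fitness value per candidate from its SNR and switching activity."""
    inverted = [MAX_SA - sa for sa in sas]
    total_snr = sum(snrs)
    total_inverted = sum(inverted)  # assumption: normalise by the population total
    return [w_snr * snr / total_snr + w_sa * inv / total_inverted
            for snr, inv in zip(snrs, inverted)]

# Three hypothetical candidates given as SNR (dB) and switching activity.
scores = weighted_sum_fitness([67, 66, 61], [192, 164, 138])
best = max(range(len(scores)), key=scores.__getitem__)
```

Because each term is normalised by its population total, neither objective dominates merely because of its numeric range. A candidate that gives up a little SNR for a clearly lower switching activity can therefore rank first.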
IV. VERILOG IMPLEMENTATION
The GA is implemented in a Verilog Hardware Description Language test bench after the FFT processor has been designed in a Verilog file. The GA needs sets of random numbers for the population and selection processes, for which the Verilog function $random is used. It is needed for selecting the new population, for choosing chromosomes for crossover and mutation, and for deciding whether crossover and mutation occur with probabilities p_c and p_m. For example, if p_m is set to 0.5, the random function is used to generate a number from 0 to 100, and mutation takes place if the generated number lies between 0 and 50. The Verilog test bench also lets the user read and write text files as input and output using the $readmem and $writemem families of system tasks, which is useful for inputting the FFT data and coefficients in batches.
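The threshold test used to turn a random integer into a crossover or mutation decision can be sketched in Python as a stand-in for the Verilog $random call. The sketch draws from 0 to 99 rather than 0 to 100 so that the threshold matches the probability exactly. That range and the function name `event_occurs` are assumptions of the example.

```python
import random

def event_occurs(probability, rng):
    """Decide a crossover or mutation event by comparing a random integer with a threshold."""
    draw = rng.randrange(100)          # integer in 0..99
    return draw < probability * 100    # e.g. 0.5 -> draws 0..49 trigger the event

rng = random.Random(1)
trials = 10_000
hits = sum(event_occurs(0.5, rng) for _ in range(trials))
rate = hits / trials
```

Over many trials `rate` should settle close to the requested probability, which is a quick sanity check for any such thresholding scheme.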
V. RESULTS AND DISCUSSION
In this work, the multi-objective genetic algorithm is used to adapt the FFT coefficients at wordlengths of 10, 11, 12, 13, 14, 15 and 16 bits. The results are compared with reference values for the switching activity and the SNR. The reference value for the switching activity is 192, found in [5], which is the total switching activity of the original FFT coefficients before the GA is applied. The reference value for the SNR is found to be 66 dB for the R4SDF architecture, calculated from 100 sets of readings. Table 1 and Table 2 show the SNR and SA values before and after GA optimization. The wordlength, i.e. the number of bits used in the processor, is important for FFT accuracy and power consumption: a larger wordlength gives better FFT accuracy but needs more power because of the higher SA. In this work, the FFT wordlength is reduced bit by bit down to 10 bits and the MOGA is applied to seek better SNR and SA performance.

TABLE I
Without GA optimization
Bits    SNR (dB)    SA
16      67          192
15      67          178
14      67          166
13      66          152
12      63          140
11      59          128
10      53          116

TABLE II
SNR VALUE AND SA VALUE AFTER GA OPTIMIZATION
With GA optimization
Bits    SNR (dB)    SA
16      67          188
15      67          176
14      66          164
13      65          150
12      61          138
11      55          124
10      51          114

Figures 7 and 8 show example average SNR and SA values that the GA is able to find at 16 bits over 50 generations. The process trades a decrease in SNR value for a better SA value. As the SNR decreases, the GA compensates with a lower SA value, and the lower SA contributes to lower power consumption. For example, at 16 bits the SA falls from 192 to 188 while the SNR stays at 67 dB. This means that the power consumption of the FFT processor after the GA is applied is lower than that of the original FFT processor. The solutions also have an acceptable SNR value.
VI. CONCLUSION
In this paper we have presented a multi-objective genetic algorithm for the FFT processor, which may be used in many fields such as telecommunication applications. The results show that the GA can be a good approach for solving multi-objective problems and that it reduces the switching activity compared with the reference value while keeping an acceptable SNR value in FFT design.
