Abstract: Orthogonal frequency division multiplexing (OFDM) based on the discrete fractional Fourier transform (DFrFT) has recently gained interest due to its lower sensitivity to synchronization errors compared with conventional OFDM based on the discrete Fourier transform (DFT). Although this higher robustness to synchronization errors is well recognized, only a few works in the literature address the hardware implementation of the DFrFT. In this work, we consider its implementation on a field-programmable gate array (FPGA). To verify the design of the DFrFT-based OFDM system, we use the FPGA-in-the-Loop (FIL) co-simulation method to evaluate the bit error rate (BER) in the presence of carrier frequency offset (CFO) when transmission takes place over a frequency-selective Rayleigh fading channel.
I. Introduction
Orthogonal frequency-division multiplexing (OFDM) based on the discrete Fourier transform (DFT) has drawn major attention in wireless communication due to advantages such as efficient bandwidth utilization, high data rate, low-complexity equalization, and robustness against multi-path fading channels [1]. Hence, it has been adopted in many wireless communication standards, such as IEEE 802.11a, IEEE 802.16a, LTE, LTE-Advanced, and terrestrial digital video broadcasting [2].
In spite of these advantages, OFDM also presents some disadvantages, among which a prominent one is its high sensitivity to synchronization errors, i.e., symbol timing offset and carrier frequency offset (CFO) [3, 4]. CFO has several causes: Doppler spread, phase noise, and mismatch between the transmitter and receiver oscillator frequencies. The main negative effect of CFO is the introduction of inter-carrier interference (ICI), which is essentially a loss of orthogonality between subcarriers.
Nowadays, the discrete fractional Fourier transform (DFrFT) is emerging as an efficient tool for time-frequency analysis in many fields of digital signal processing [1]. The DFrFT is a generalization of the DFT. Many papers in the literature study DFrFT-based OFDM systems [5]-[10]. The main motivation behind the use of the DFrFT in an OFDM system is its higher robustness to synchronization errors, especially CFO, compared with the DFT.
A careful survey of the literature reveals that many discrete versions of the DFrFT have been proposed so far [10]. All the proposed versions fall into four categories: the linear combination-based method, the sampling-based method, the weighted summation-based method, and the eigenvector decomposition-based method. It is shown in [10] that when the length N of the block of samples used in the DFrFT computation is a power of 2, the complexity of the sampling-based and linear combination-based methods is on the order of O(N log2 N), which is the same as that of the fast Fourier transform (FFT), the efficient implementation of the DFT. In contrast, for the weighted summation-based and eigenvector decomposition-based methods the complexity is on the order of O(N^2). As shown in Table I of [10], the complexity of the different DFrFT algorithms depends on the constraints that are set on the discrete implementations. As far as the use of the DFrFT in an OFDM system is concerned, the most important constraint is reversibility, for which the inverse DFrFT (IDFrFT) satisfies the Hermitian property and, therefore, allows us to invert the direct transform operation performed by the DFrFT. For this reason, we focus here on the closed-form type of the sampling-based method, since it ensures reversibility with the lowest possible complexity.
The main contribution of this work is a software-defined radio (SDR) implementation of the closed-form type of sampling-based DFrFT and its hardware co-simulation in an OFDM system. We present an FPGA prototype of the DFrFT-based OFDM receiver, developed with the model-based design flow for the Xilinx ZedBoard, which is equipped with a Zynq-7000 family SoC device. The resulting hardware design is tested using the FPGA-in-the-Loop (FIL) co-simulation methodology. The bit error rate (BER) of the DFrFT-based OFDM system is evaluated for transmission over a frequency-selective Rayleigh fading channel in the presence of CFO. Zero-forcing equalization of the received signal is performed in the receiver. The SDR model is built using the toolboxes provided by MATLAB and Simulink.
The paper is organized as follows. Section II describes the DFrFT-based OFDM system. The FIL co-simulation set-up is described in Sec. III. The details about implementation results are given in Sec. IV. Section V concludes the paper.
II. Description of the DFrFT-based OFDM System
The block diagram of the DFrFT-based OFDM system is shown in Fig. 1. Starting from the encoder block, a high data rate stream is split into a number of low data rate streams that are first applied to a serial-to-parallel (S/P) converter and then transmitted in parallel using N sub-carriers. After S/P conversion, the diagram shows an OFDM modulator based on the sampling-based IDFrFT kernel, where the information block contains N symbols. To mitigate the effect of inter-symbol interference caused by the channel time spread, a cyclic prefix (CP) is inserted between two successive OFDM symbols. After that, we consider transmission over a frequency-selective Rayleigh multi-path fading channel, where additive white Gaussian noise (AWGN) is added to the received signal. Assuming perfect knowledge of the OFDM symbol start at the receiver, the CP is removed and the resulting samples are S/P converted. The DFrFT is then used to process the samples, and the resulting symbols are first equalized and then serialized again. The equalized samples are finally decoded to obtain the originally transmitted data.
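The role of the cyclic prefix described above can be illustrated with a minimal Python sketch. It assumes a CP at least as long as the channel memory and a noiseless channel; the function names (`add_cp`, `remove_cp`, `linear_conv`, `circular_conv`) are hypothetical and are not part of the Simulink model. Under these assumptions, removing the CP turns the linear convolution with the channel into a circular one, which is what makes per-subcarrier equalization possible.

```python
def add_cp(symbol, cp_len):
    """Prepend the last cp_len samples of the OFDM symbol as a cyclic prefix."""
    return symbol[-cp_len:] + symbol


def remove_cp(rx, cp_len, n):
    """Drop the cyclic prefix and keep the n samples of the OFDM symbol."""
    return rx[cp_len:cp_len + n]


def linear_conv(x, h):
    """Linear convolution, modelling a multi-path channel with taps h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_conv(x, h):
    """Circular convolution of one block x with the channel taps h."""
    n = len(x)
    return [sum(h[l] * x[(i - l) % n] for l in range(len(h))) for i in range(n)]


# Illustrative values only: 8 BPSK-like samples and a 2-tap channel.
block = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
taps = [0.8, 0.6]
cp_len = 2  # assumed to be >= len(taps) - 1

received = remove_cp(linear_conv(add_cp(block, cp_len), taps), cp_len, len(block))
```

With these values, `received` coincides with `circular_conv(block, taps)`.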
A. IDFrFT kernel at the transmitter
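The block of N transmitted symbols is applied to the input of the IDFrFT kernel. After the N-point IDFrFT computation, the m-th transmitted sample is written as
$$x(m) = \sum_{k=0}^{N-1} X(k)\, F_{-\alpha}(m,k), \quad m = 0, 1, \ldots, N-1 \qquad (1)$$
where $X(k)$ is the symbol transmitted on the k-th subcarrier and $F_{-\alpha}(m,k)$ is the IDFrFT kernel given by
$$F_{-\alpha}(m,k) = \sqrt{\frac{\sin\alpha + j\cos\alpha}{N}}\; e^{-j\frac{m^2 T_s^2 \cot\alpha}{2}}\; e^{-j\frac{k^2 u^2 \cot\alpha}{2}}\; e^{\,j\frac{2\pi m k}{N}} \qquad (2)$$
where $T_s$ and $u$ are the sampling intervals in the time and in the fractional Fourier domain, respectively, which are related as $u \times T_s = 2\pi|\sin(\alpha)|/N$. The fractional Fourier domain makes an angle $\alpha = a \times \pi/2$ with the time domain, with $0 \le a \le 1$. For $\alpha = \pi/2$, i.e., $a = 1$, the DFrFT reduces to its DFT counterpart.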
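As a floating-point reference, the kernel can be sketched in a few lines of Python. The sketch assumes the closed-form sampling-type kernel commonly used in the DFrFT-OFDM literature, with a square-root amplitude term, two chirp factors containing cot(α), and a DFT-like exponential. It also assumes equal sampling intervals in both domains unless a time-domain interval is passed in; that split is a choice made here for illustration, not one stated in the paper. The names `dfrft_kernel`, `idfrft_kernel`, and `apply_kernel` are hypothetical.

```python
import cmath
import math


def dfrft_kernel(n_points, alpha, ts=None):
    """N x N matrix F_alpha(q, n) of the sampling-type DFrFT (nested lists)."""
    step = 2.0 * math.pi * abs(math.sin(alpha)) / n_points  # product u * Ts
    if ts is None:
        ts = math.sqrt(step)  # assumption: equal intervals, u = Ts
    u = step / ts
    cot = math.cos(alpha) / math.sin(alpha)
    amp = cmath.sqrt((math.sin(alpha) - 1j * math.cos(alpha)) / n_points)
    return [[amp
             * cmath.exp(0.5j * cot * (q * u) ** 2)    # chirp, fractional domain
             * cmath.exp(0.5j * cot * (n * ts) ** 2)   # chirp, time domain
             * cmath.exp(-2j * math.pi * q * n / n_points)
             for n in range(n_points)]
            for q in range(n_points)]


def idfrft_kernel(n_points, alpha, ts=None):
    """Inverse kernel F_{-alpha}(m, k), the Hermitian transpose of F_alpha."""
    f = dfrft_kernel(n_points, alpha, ts)
    return [[f[k][m].conjugate() for k in range(n_points)]
            for m in range(n_points)]


def apply_kernel(kernel, block):
    """Matrix-vector product: transform one block of samples."""
    return [sum(row[c] * block[c] for c in range(len(block))) for row in kernel]


# Illustrative round trip with N = 8 BPSK symbols and a = 0.9.
N = 8
ALPHA = 0.9 * math.pi / 2
symbols = [1, -1, 1, 1, -1, 1, -1, -1]
tx_samples = apply_kernel(idfrft_kernel(N, ALPHA), symbols)
recovered = apply_kernel(dfrft_kernel(N, ALPHA), tx_samples)
```

Under these assumptions `recovered` matches `symbols` up to rounding, which is the reversibility property required in an OFDM system, and `dfrft_kernel(N, math.pi / 2)` coincides with the unitary DFT matrix.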
In order to get an efficient receiver implementation, a full fixed-point FPGA prototype is addressed in this paper. The IDFrFT kernel given in (2) involves several floating-point functions: square root, exponential, sin(·), cos(·), and cot(·). There are many hardware-efficient algorithms for converting a floating-point function into its fixed-point form. Among these, the set of shift-add algorithms known as CORDIC can be employed for computing a wide range of trigonometric, hyperbolic, linear, and logarithmic functions [11]. First, we need to convert the kernel functions into fixed-point functions using the CORDIC algorithms. While CORDIC implementations of the sin(·), cos(·), and exp(·) functions are already available, there is no direct implementation available for cot(·), which is present in the exponent of (2). By using the trigonometric identity cot(·) = cos(·)/sin(·), its implementation becomes possible. The Simulink model of the IDFrFT kernel obtained after the implementation of the CORDIC algorithms is shown in Fig. 2. Each factor in the mathematical expression of the IDFrFT kernel in eq. (2) is reported as a sub-block in the Simulink model. It is worth observing that HDL Coder will be used in the following to automatically generate the VHDL code for the FPGA. As given in eq. (2), the IDFrFT kernel has two variables, m and k, both taking values from 0 to N − 1. So, after the computation of the "1st_Term" and "2nd_Term" of the kernel, their product has to be obtained as a matrix multiplication. Since HDL Coder does not support the matrix multiplication operation, we had to develop our own "Matrix Multiplication" block, which is built on elementwise multiplication.
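The CORDIC idea and the cot(·) = cos(·)/sin(·) workaround can be sketched as follows. This is a textbook rotation-mode CORDIC written in floating-point Python for readability. It is not the Simulink CORDIC block used in the design, and the iteration count and function names are assumptions made for illustration. In hardware, each multiplication by 2^-i becomes a bit shift.

```python
import math

ITERATIONS = 16  # assumed iteration count; more iterations give finer accuracy

# Elementary rotation angles atan(2^-i), stored in a small lookup table.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Reciprocal of the accumulated CORDIC gain, used to pre-scale the start vector.
INV_GAIN = 1.0
for i in range(ITERATIONS):
    INV_GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin_cos(theta):
    """Rotation-mode CORDIC; returns (sin, cos) for theta in [-pi/2, pi/2]."""
    x, y, z = INV_GAIN, 0.0, theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0.0 else -1.0          # rotate towards residual angle 0
        shift = 2.0 ** -i                      # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ANGLES[i]
    return y, x


def cordic_cot(theta):
    """cot(theta) = cos(theta) / sin(theta), built from the CORDIC outputs."""
    s, c = cordic_sin_cos(theta)
    return c / s
```

For example, `cordic_cot(0.7)` agrees with `math.cos(0.7) / math.sin(0.7)` to within the accuracy allowed by the chosen number of iterations. The division itself would need a separate fixed-point divider in hardware.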
B. DFrFT kernel at the receiver
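At the receiving side, after DFrFT computation, the received signal on the q-th subcarrier is
$$Y(q) = \sum_{n=0}^{N-1} y(n)\, F_{\alpha}(q,n), \quad q = 0, 1, \ldots, N-1 \qquad (3)$$
where $y(n)$ is the received signal before the DFrFT block and $F_{\alpha}(q,n)$ is the kernel given by
$$F_{\alpha}(q,n) = \sqrt{\frac{\sin\alpha - j\cos\alpha}{N}}\; e^{\,j\frac{n^2 T_s^2 \cot\alpha}{2}}\; e^{\,j\frac{q^2 u^2 \cot\alpha}{2}}\; e^{-j\frac{2\pi q n}{N}} \qquad (4)$$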
The Hermitian property of the DFrFT kernel makes its implementation almost identical to that of the IDFrFT kernel and, therefore, the considerations made in Sec. II.A for the IDFrFT kernel hold for the DFrFT kernel as well.
III. FPGA-in-the-Loop Co-simulation of the Receiver with Equalization of DFrFT-based OFDM
As an example of a Simulink model that implements a DFrFT-based OFDM system, we consider the transmission of a video signal, as shown in Fig. 3. The video signal is captured from a generic webcam, and FIL co-simulation is run both to verify the correctness of the FPGA implementation and to accelerate the simulation at the receiver [12]-[14]. Since the receiver of the DFrFT-based system also includes zero-forcing equalization, its overall complexity is higher than that of the transmitter and, therefore, its rapid prototyping is more challenging. The Simulink setup of the receiver with zero-forcing equalization of the DFrFT-based OFDM system is given in Fig. 4. As a figure of merit, we analyse the quality of the reproduced video by comparing the output of the DFrFT-based OFDM system with that of the conventional DFT-based one for different CFO values [15].
A. Receiver with zero-forcing equalization of the DFrFT-based OFDM system
Zero-forcing equalization is implemented by assuming ideal knowledge of the channel. We compute the frequency response of the channel and divide the received signal after the DFrFT block by it, sub-carrier by sub-carrier.
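The per-subcarrier division can be sketched in Python for the DFT special case (α = π/2), where it inverts a circular channel exactly in the absence of noise. The sketch assumes perfectly known channel taps, a cyclic prefix already removed (modelled here by circular convolution), and illustrative tap values. The helper names are hypothetical and do not correspond to blocks of the Simulink model.

```python
import cmath
import math


def dft(x):
    """Unnormalized N-point DFT."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]


def idft(spectrum):
    """Inverse of dft(), including the 1/N factor."""
    n_pts = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_pts)
                for k in range(n_pts)) / n_pts for n in range(n_pts)]


def circular_channel(x, taps):
    """Multi-path channel after CP removal, modelled as circular convolution."""
    n_pts = len(x)
    return [sum(taps[l] * x[(n - l) % n_pts] for l in range(len(taps)))
            for n in range(n_pts)]


def zero_forcing(rx_symbols, taps):
    """Divide each received subcarrier by the channel frequency response."""
    n_pts = len(rx_symbols)
    response = dft(list(taps) + [0.0] * (n_pts - len(taps)))
    return [rx_symbols[k] / response[k] for k in range(n_pts)]


# Illustrative values only: 8 BPSK symbols and a 2-tap channel without spectral nulls.
tx_symbols = [1, -1, 1, 1, -1, 1, -1, -1]
channel_taps = [0.8, 0.6j]
rx_symbols = dft(circular_channel(idft(tx_symbols), channel_taps))
equalized = zero_forcing(rx_symbols, channel_taps)
```

Zero forcing amplifies noise on subcarriers where the channel response is small, which is why the taps above were chosen without deep fades.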
B. Floating-point to fixed-point conversion
The FPGA implementation requires conversion from floating-point to fixed-point data types. However, this conversion is very challenging and time consuming, typically demanding from 25% to 50% of the total design and implementation time. The conversion process introduces quantization errors that depend on the word length and the fraction length, which, in turn, affect the FPGA hardware resources. The optimization of the word length is an iterative process that is carried out using the Fixed-Point Toolbox, available in Simulink, with the goal of achieving the best possible performance.
Here, we compare floating-point and fixed-point results for the implementation of the IDFrFT/DFrFT kernels. Figure 5 shows an example of the output of the IDFrFT kernel for floating point and for fixed point with a word length of 12 bits. The design steps are repeated several times using different random inputs to average over different fixed-point outputs. The average quantization errors due to the conversion from floating point to fixed point, obtained for different values of the word length, are shown in Fig. 6. Similarly, the floating-point and fixed-point outputs of the DFrFT kernel are shown in Fig. 7, and the corresponding average quantization errors for different values of the word length are shown in Fig. 8.
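The dependence of the quantization error on the word length and the fraction length can be illustrated with a minimal Python sketch of signed fixed-point quantization. It assumes two's-complement saturation and round-to-nearest, and the function name `quantize` is hypothetical. The rounding and overflow modes actually configured in the Simulink model may differ.

```python
def quantize(value, word_length, frac_length):
    """Quantize a real value to signed fixed point and return it as a float.

    word_length: total number of bits, including the sign bit
    frac_length: number of fractional bits
    """
    scale = 1 << frac_length
    lowest = -(1 << (word_length - 1))       # most negative integer code
    highest = (1 << (word_length - 1)) - 1   # most positive integer code
    code = round(value * scale)              # round to the nearest code
    code = max(lowest, min(highest, code))   # saturate on overflow
    return code / scale


# Illustrative values only: 12-bit word with 10 fractional bits.
approx = quantize(0.3, 12, 10)
error = abs(approx - 0.3)
```

Without overflow, the error is bounded by half a least significant bit, i.e. 2^-(frac_length + 1). A longer fraction length reduces the error at the cost of more hardware resources.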
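The average quantization error $e_c(n)$ for a fixed-point representation with a codeword length of $c$ bits is defined as
$$e_c(n) = \frac{1}{B} \sum_{i=1}^{B} e_{c,i}(n) \qquad (5)$$
where $B$ is the number of input blocks containing the $N$ samples to be transformed and $e_{c,i}(n)$ is the quantization error of the $i$-th input block, given by
$$e_{c,i}(n) = z_i^{\mathrm{fl}}(n) - z_{c,i}^{\mathrm{fx}}(n) \qquad (6)$$
with $z_i^{\mathrm{fl}}(n)$ and $z_{c,i}^{\mathrm{fx}}(n)$ denoting the floating-point output and the $c$-bit fixed-point output, respectively. The maximum percentage quantization error (MPQE) is defined as
$$\mathrm{MPQE} = \frac{\mathrm{TE}}{\mathrm{MAE}} \times 100 \qquad (7)$$
where the top error TE = max(maximum positive error, maximum negative error) and the maximum absolute value MAE = max(abs(floating-point value)). Based on the definition of MPQE, we simulated the model by applying four different inputs as a test bench to our IDFrFT kernel. The MPQE computed for each input is given in the corresponding column of Table 1. We also averaged the MPQE over the different inputs for fixed word lengths. The results for the average MPQE are reported in Table 1, from which it is clear that our system performs better when a word length of 12 bits is selected.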
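The two error measures can be sketched in Python under stated assumptions: real-valued outputs, a signed per-sample error equal to the floating-point value minus the fixed-point value, and MPQE taken as 100 × TE / MAE, with the negative error counted by its magnitude. The function names and the sample values are hypothetical and are not data from the paper.

```python
def average_error(float_blocks, fixed_blocks):
    """Average the per-sample error over B blocks of N samples each."""
    num_blocks = len(float_blocks)
    num_samples = len(float_blocks[0])
    return [sum(fl[n] - fx[n] for fl, fx in zip(float_blocks, fixed_blocks))
            / num_blocks for n in range(num_samples)]


def mpqe(float_values, fixed_values):
    """Maximum percentage quantization error of one block (assumed definition)."""
    errors = [fl - fx for fl, fx in zip(float_values, fixed_values)]
    top_error = max(max(errors), -min(errors))      # largest error magnitude
    max_abs_value = max(abs(v) for v in float_values)
    return 100.0 * top_error / max_abs_value


# Made-up values for illustration only.
reference = [1.0, -2.0, 0.5]
quantized = [0.98, -2.05, 0.5]
percent_error = mpqe(reference, quantized)
mean_error = average_error([reference, reference], [quantized, reference])
```

Here the largest error magnitude is 0.05 and the largest reference magnitude is 2.0, so `percent_error` is about 2.5.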
C. Code Conversion (from .mdl and .m files to VHDL)
After the fixed-point implementation of the kernels, we used HDL Coder to generate the VHDL code. The HDL Workflow Advisor available in HDL Coder guides the user through the conversion of the Simulink model to VHDL code.
D. FPGA-in-the-Loop co-simulation
For verification purposes, HDL Verifier was used to run the FIL co-simulation of the DFrFT-based OFDM receiver together with the equalizer. Our goal was to run both the receiver and the equalizer on the FPGA board, in order to increase the simulation speed and to enable rapid development of these algorithms on the FPGA. The FIL co-simulation was validated through a comparison with the theoretical analysis given in [6].
IV. Implementation Results
First, the correctness of the DFrFT-based OFDM system is verified by setting α = π/2, which yields conventional DFT-based OFDM. An 8-point implementation of the DFrFT-based OFDM system is considered. Transmission takes place over a multi-path Rayleigh fading channel with a 2-tap equal-power delay profile in the presence of a normalized carrier frequency offset of 0.1. As shown in Fig. 9, the DFrFT-based OFDM system achieves a lower BER than the DFT-based one in the presence of CFO. In the Monte Carlo simulation of the DFrFT-based OFDM system, the IDFrFT/DFrFT block is implemented by the floating-point model.
We also computed the quantization error introduced by the conversion from floating point to fixed point for given word lengths. Figure 5 shows the floating-point and fixed-point outputs of the IDFrFT kernel for a word length of 12 bits. The quantization error was computed for different values of the word length, so that we were able to analyze the impact of the word length on the quantization error, as shown in Fig. 6. Similarly, the results for the DFrFT kernel output and for the quantization error are given in Fig. 7 and Fig. 8, respectively. A summary of the average maximum percentage quantization error for different inputs and word lengths is reported in Table 1.
After fixed-point conversion, we ran simulations based on the fixed-point model. In Fig. 9, a perfect match with the Monte Carlo simulations is observed. The figure reports BER versus signal-to-noise ratio per bit Eb/N0, where Eb is the received signal energy per bit and N0 is the power spectral density of the AWGN. After that, the VHDL code of the kernels was generated and optimized. In order to validate our design in hardware, we generated the bit stream for programming the FPGA on the ZedBoard using the Xilinx ISE Design Suite, which is integrated in HDL Coder. The ZedBoard is a complete prototyping development kit for the Xilinx Zynq®-7000 All Programmable SoC family.
Xilinx ISE compiles the design and generates the bit stream file, which is then loaded into the FPGA through JTAG over the USB connection. After generating the bit stream, we were able to run the FIL co-simulation and verify the correctness of our model. As shown in Fig. 9, a perfect match is observed among the FIL co-simulation (marked as ×), the fixed-point implementation, and the Monte Carlo simulation.
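For readers who want to reproduce the CFO impairment in a floating-point test bench, a minimal Python sketch is given below. It uses the usual model in which a normalized offset ε rotates the n-th time-domain sample by 2πεn/N. It is restricted to the DFT case (α = π/2) with no channel and no noise, and the function names are hypothetical. It measures how much power of a single active subcarrier leaks into the other subcarriers, i.e., the ICI.

```python
import cmath
import math


def dft(x):
    """Unnormalized N-point DFT."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]


def apply_cfo(samples, eps):
    """Rotate sample n by 2*pi*eps*n/N (eps: CFO normalized to subcarrier spacing)."""
    n_pts = len(samples)
    return [samples[n] * cmath.exp(2j * math.pi * eps * n / n_pts)
            for n in range(n_pts)]


def ici_power(n_pts, active, eps):
    """Power leaked to the other subcarriers when only one subcarrier is active."""
    # Time-domain samples of a unit symbol on subcarrier `active` (IDFT of a delta).
    tone = [cmath.exp(2j * math.pi * active * n / n_pts) / n_pts
            for n in range(n_pts)]
    spectrum = dft(apply_cfo(tone, eps))
    return sum(abs(spectrum[k]) ** 2 for k in range(n_pts) if k != active)


# Setting of Sec. IV: N = 8 and a normalized CFO of 0.1.
leak_without_cfo = ici_power(8, 2, 0.0)
leak_with_cfo = ici_power(8, 2, 0.1)
```

Without CFO the leakage is zero up to rounding, while with the offset of 0.1 a few percent of the symbol power appears on the other subcarriers.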
V. Conclusion
In this paper rapid prototyping of the receiver for a DFrFT-based OFDM system has been considered by adopting a Model-Based Design approach with the help of MATLAB and Simulink. Iterative verification of each step was done starting from floating-point to fixed-point representation, HDL code generation and, finally, hardware co-simulation. We have considered FIL co-simulation of the receiver with the implementation of the equalization of DFrFT-based OFDM system for BPSK transmission over a frequency selective Rayleigh fading channel in presence of CFO. Simulation results clearly demonstrate that the FPGA implementation of a DFrFT-based OFDM system in presence of CFO has the same performance as that obtained from Monte Carlo simulation. Also, the performance is validated with the fixed-point model of the DFrFT-based OFDM. The approach described in the paper constitutes an efficient way to convert the floating-point model into a fixed-point one to be run in an FPGA and then verify its correctness through FIL co-simulation.
