Abstract: A low-complexity, high-performance Rayleigh fading simulator and its Field Programmable Gate Array (FPGA) implementation are presented. The proposed method is a variant of the white Gaussian noise filtering method in which the filter is designed in the analog domain and then transferred into the digital domain. The proposed model is compared with the improved Jakes' model [1], auto-regressive filtering [2] and IDFT [3] techniques in terms of performance and computational complexity. The proposed method outperforms the AR(20) filter and the modified Jakes' generators in performance. Although the IDFT method achieves the best performance, it incurs a significant storage cost, which makes it undesirable. The proposed method achieves high performance with the lowest complexity, and its performance has been verified on Virtex4 and Spartan3e FPGA platforms. Our fixed-point Rayleigh fading-channel simulator utilizes only 2% of the configurable slices, 1% of the Look-Up Table (LUT) resources and 3% of the dedicated multipliers on a Xilinx Virtex4-xc4vsx35 FPGA platform.
I. INTRODUCTION
In this paper, a low-complexity prototype hardware architecture for modeling the Rayleigh fading process is developed, targeting the Xilinx [4] Virtex4-xc4vsx35 and Spartan3e-xc3s500e Field Programmable Gate Array (FPGA) development platforms. FPGAs offer considerable flexibility, fixed-point arithmetic units with parameterized precision, variable-length registers, numerous dedicated signal processing cores, speed, reliability, and a low turn-around time in the design phase, and they are a very cheap alternative to the expensive emulators available on the market [5], [6]. The FPGA realization of our method has been implemented using constant-coefficient multipliers, and the theoretical performance of the proposed method has been verified using this FPGA implementation. Hardware-based simulators can greatly reduce the simulation time compared to software-based simulators [7]. Many laboratory channel simulation tools today use hybrid DSP/FPGA solutions [8], [9] or stand-alone FPGAs to generate wireless multipath channel models [10], [11]. A much more cost-effective approach is to implement the entire simulator on a single FPGA chip [10]. The main contributions of our work include: 1) a novel low-complexity filter design scheme for fading-channel simulators with quantitative performance analysis; 2) an efficient hardware architecture for implementing a filter-based simulator, realized on a portion of an FPGA, that produces fading channel characteristics.
The fading caused by multipath propagation in wireless communication systems is commonly modeled as a random process having a Rayleigh distributed envelope, and is characterized by its power spectral density and its autocorrelation function. In the communications literature, Jakes' model [12], which is based on the sum-of-sinusoids approach, has been of great interest. Simulators based on white noise filtering methods [2], [13] and on the Inverse Discrete Fourier Transform (IDFT) method [3], [14] have also become popular. It was shown in [15] that the fading signals produced by the classical Jakes' simulator are not wide-sense stationary (WSS). On the other hand, simulators based on the IDFT method are of high quality and efficient. A disadvantage of the IDFT method is that all samples are generated with a single fast Fourier transform (FFT); hence the storage requirements make it unsuitable for generating a very large number of samples and for sample-by-sample simulations. In this paper, we consider using a fading filter to filter white Gaussian noise, an approach first proposed in [13]. Unlike other filter structures [1]-[3], [12], [14], a different optimization and design criterion is used to set the filter parameters in the analog domain, which yields the transfer function of the fading filter. The bilinear transform is then used to obtain the desired filter structure as an ARMA(γ, γ) filter, where γ is the filter order. Comparisons to other methods are made in terms of complexity and also in terms of performance, using the quantitative performance measures introduced in [16]. These quantitative performance measures [16] have not been investigated in previous similar studies [9]-[11].
The organization of this paper is as follows. After a brief overview of the Rayleigh fading statistics of wireless channels, Section II provides the derivation of the proposed fading filter, followed by performance and complexity comparisons. The hardware implementation of our proposed filter is provided in Section III. Finally, Section IV provides concluding remarks.
II. DERIVATION OF THE FADING FILTER

A. Rayleigh Fading Statistics
The Rayleigh fading process is characterized by the Gaussian Wide-Sense Stationary (WSS) uncorrelated scattering fading model [17]. In this model, the time variability of the channel is determined by its autocorrelation function. This statistic generally depends on the propagation geometry, the velocity of the mobile and the antenna characteristics. A common assumption is that the propagation path consists of two-dimensional isotropic scattering with a vertical monopole antenna at the receiver [12]. In this case, the in-phase and quadrature parts of the received signal envelope must be independent and each must have zero mean for Rayleigh fading, and the theoretical spectral density of the in-phase (or the quadrature) part of the received faded signal envelope is the S(f) given in (1).
978-1-4244-6689-0/10/$26.00 ©2010 IEEE
B. Novel Filter Design
A straightforward method to simulate a faded signal is to amplitude modulate the carrier signal with a low-pass filtered Gaussian noise source, as shown in Figure 1. In order to obtain a time-varying frequency-selective fading channel, we must have a bank of these fading filters, where each filter generates the corresponding fading channel tap. A fading filter with impulse response g(k) can be designed so that its output spectral density is an approximation to the theoretical spectral density S(f) of (1) of the complex envelope of the faded signal. We will use the filter structures that were proposed in [13], which take one form if the filter order γ is even and another if γ is odd. Here G1(s) and G2(s) are as given in (2), and Q is selected so that there is a pre-specified frequency response level at ω = ω_x rad/sec; for example, for the third-order filter, if Q = √10 then the magnitude of G(·) will have a gain of 7 dB at ω = ω_x (a 10 dB gain from the second-order filter and -3 dB from the first-order part, making the overall gain 7 dB). In order to find the parameters of the fading filter transfer function, G_γ(s), we first set the filter order γ and Q. An approximation to the theoretical spectral density of (1) is then defined.
With G_γ(s) provided in the s-domain, we can use the bilinear transform to get G_γ(z) with an ARMA(γ, γ) model, or the impulse invariance method to get a G_γ(z) with an AR(γ) model (all-pole filter), where the auto-regressive taps (k = 1, ..., γ) and the moving-average taps (k = 0, ..., γ) are the filter taps of the ARMA(γ, γ) model. The generated Rayleigh fading process has an autocorrelation function, R_xx[k], which can be found by directly using the Wiener-Khinchine theorem [18] in terms of σ², the variance of the complex zero-mean white Gaussian noise, and the filter impulse response, which is given as the inverse Z-transform of the transfer function G_γ(z).
In (8) and (9) 
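The 7 dB figure quoted for the third-order filter with Q = √10 can be checked numerically. The sketch below is a minimal check and not the paper's equation (2): it assumes the common normalized section forms G1(s) = ω0/(s + ω0) and G2(s) = ω0²/(s² + (ω0/Q)s + ω0²), and it assumes that ω_x coincides with ω0. The function names are hypothetical.

```python
import math


def gain_db(g):
    """Magnitude of a complex frequency-response value, in dB."""
    return 20.0 * math.log10(abs(g))


def section_gains_db(q):
    """Gains of the assumed first- and second-order sections at w = w0.

    Assumed forms (frequency normalized so that w0 = 1):
      G1(s) = 1 / (s + 1)
      G2(s) = 1 / (s^2 + s/q + 1)
    """
    s = 1j  # s = j*w evaluated at w = w0 = 1
    g1 = 1.0 / (s + 1.0)
    g2 = 1.0 / (s * s + s / q + 1.0)
    return gain_db(g1), gain_db(g2)


g1_db, g2_db = section_gains_db(math.sqrt(10.0))
total_db = g1_db + g2_db  # gains of cascaded sections add in dB
print(f"first-order: {g1_db:.2f} dB, second-order: {g2_db:.2f} dB, "
      f"cascade: {total_db:.2f} dB")
```

Under these assumptions the second-order section contributes 20·log10(Q) = 10 dB and the first-order section about -3 dB, so the cascade gain is about 7 dB, in agreement with the example in the text.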
D. Performance Quality Measures
The quality measures that were first introduced in [16] 
[Table column headings: Q_mean (dB), Q_max (dB)]
measures were then averaged over 50 independent simulation trials. Plots of the empirical autocorrelation functions of the AR(20) model, our proposed ARMA(3,3) Rayleigh fading generator and the IDFT method are shown in Figure 3.
E. Performance Comparisons
The quality measure comparison results, which are presented in Table II, compare the quality of the real part of the simulator outputs. A perfect Rayleigh fading sequence generation method corresponds to 0 dB for both measures.
C. Performance and Complexity Evaluation
The simulation methods compared in this study are as follows.
1) Our Proposed Filter Design Method: Our filter design was accomplished in the analog domain and transferred into the digital domain, implemented either as an ARMA model via the bilinear transform (MATLAB function bilinear) or as an AR model via the impulse invariance method (MATLAB function impinvar). After the filter coefficients were calculated, the Rayleigh fading sequence was generated by a direct-form structure using the MATLAB function filter.
2) IDFT Method: The simulator used was implemented as described in [3] . The MATLAB function ifft was used for IDFT computation.
3) AR Method: The method of [2] was implemented via the MATLAB function filtic to generate the first p (model order) stationary Rayleigh fading samples, and then the MATLAB function filter was used to generate the remaining samples.
4) WSS-improved Jakes' Model: The method used was based on the sum-of-sinusoids technique of [1]. The normalized low-pass discrete fading process is generated by a finite number of sinusoids; therefore this WSS simulator is not autocorrelation ergodic, and hence theoretical calculations of the quality measures cannot be done for this method.
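The design flow of the proposed method in item 1) (analog design, bilinear transform, direct-form filtering of white Gaussian noise) can be sketched in Python as follows. This is a minimal sketch and not the authors' MATLAB code: the analog sections G1(s) = ω0/(s + ω0) and G2(s) = ω0²/(s² + (ω0/Q)s + ω0²) are assumed forms, the sample rate, Doppler frequency and function names are hypothetical, and Python's random.gauss stands in for the noise source.

```python
import math
import random


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in powers of z^-1."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def bilinear_third_order(w0, q, fs):
    """Map the assumed analog cascade G1(s)*G2(s) to a digital ARMA(3,3) filter.

    Bilinear substitution: s = k (1 - z^-1) / (1 + z^-1), with k = 2 fs.
    """
    k = 2.0 * fs
    # First-order section w0 / (s + w0).
    b1 = [w0, w0]
    a1 = [k + w0, w0 - k]
    # Second-order section w0^2 / (s^2 + (w0/q) s + w0^2).
    b2 = [w0 * w0, 2.0 * w0 * w0, w0 * w0]
    a2 = [k * k + w0 * k / q + w0 * w0,
          2.0 * w0 * w0 - 2.0 * k * k,
          k * k - w0 * k / q + w0 * w0]
    b = poly_mul(b1, b2)
    a = poly_mul(a1, a2)
    a0 = a[0]  # normalize so that the leading denominator tap is 1
    return [c / a0 for c in b], [c / a0 for c in a]


def direct_form_filter(b, a, x):
    """y[n] = sum_i b[i] x[n-i] - sum_{j>=1} a[j] y[n-j], assuming a[0] = 1."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i in range(len(b)):
            if n - i >= 0:
                acc += b[i] * x[n - i]
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y.append(acc)
    return y


# Hypothetical example values, chosen only for illustration.
FS = 1000.0                   # sample rate in Hz
F_D = 50.0                    # Doppler-related design frequency in Hz
Q = math.sqrt(10.0)           # quality factor from the third-order example
N = 2000                      # number of fading samples

b, a = bilinear_third_order(2.0 * math.pi * F_D, Q, FS)

random.seed(0)
noise = [complex(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(N)]
fading = direct_form_filter(b, a, noise)   # complex Gaussian fading process
envelope = [abs(v) for v in fading]        # Rayleigh-distributed envelope
```

Filtering the real and imaginary noise components with the same real-coefficient filter keeps them independent and zero-mean, which is what gives the envelope its Rayleigh character. Because the filter is recursive, each output needs only the last three inputs and outputs, so samples can be produced one at a time.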
III. HARDWARE IMPLEMENTATION OF THE PROPOSED ALGORITHM
The FPGA implementation of our proposed algorithm, performed using Xilinx's System Generator, is shown in Figure 4.
The System Generator tool produces a design targeted at the Xilinx Virtex4 ML402 Development Kit, which contains a Xilinx xc4vsx35 FPGA chip. The primary simulation and debugging tool used for this paper was MATLAB [20].
The integrity of the third-order IIR filter has been confirmed with bit- and cycle-accurate modeling within the Matlab/Simulink environment. The communication between the development board and Matlab/Simulink on the host PC is via the USB interface.
Simulink version 7 and Xilinx's co-simulation tools were also used for debugging [4].
The results show that the IDFT method generally provides the highest quality Rayleigh samples. The AR model of [2] provides a more precise match to the desired autocorrelation function as the order of the model increases, but our proposed filter design method provides the same accuracy with much lower order models. Table III provides the number of real multiplications required to generate 2^20 complex Rayleigh variate samples. As an example, our ARMA(3,3) fading sequence generator has a significant computational and performance advantage over the AR(20) generator of [2]: the proposed ARMA(3,3) model requires less than one-third of the multiplications while achieving about 0.7 dB lower mean basis power margin and about 1 dB lower maximum basis power margin. The proposed ARMA(3,3) fading generator outperforms the modified Jakes' generator with 8 and 16 sinusoids by 32 dB and 2 dB, respectively, while requiring less than one-tenth of the multiplications required by the Jakes' generators with 8 and 16 sinusoids. The main advantage of the method provided herein is that the samples of the fading sequence can be generated as they are required (on a sample-by-sample basis) while achieving the lowest complexity of all the Rayleigh fading generators mentioned. The computational efficiency of the IDFT method comes at a cost in storage requirements, as all samples are generated using a single IFFT. Our proposed fading generator and the other generators do not have such a limitation.
IIR filters with general-purpose multipliers can be used as building blocks for cascade or parallel realizations of higher-order IIR filters. Figure 5 shows the Direct Form I structure with the seven multiplier coefficients, and the detailed implementation of the third-order Direct Form I filter using Xilinx System Generator for DSP.
System Generator translates the Simulink model into a hardware realization by mapping Xilinx blockset elements and converting the Simulink hierarchy into a hierarchical VHDL netlist. The multiplier blocks used are Xilinx IP cores.
The delay element z^-1 in an IIR filter signifies a full word delay. The delay is used to align data words and to propagate control signals that must also be properly synchronized.
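The third-order Direct Form I structure with seven constant-coefficient multipliers and z^-1 word delays can be sketched in fixed-point arithmetic as follows. This is an illustrative model only: the coefficients are hypothetical stable values and not those of the paper, the 14-bit fractional word format is an assumption, and Python integers do not model overflow or saturation. A floating-point version of the same structure is included for comparison.

```python
FRAC_BITS = 14              # assumed number of fractional bits
SCALE = 1 << FRAC_BITS

# Hypothetical stable third-order coefficients (not the paper's values).
B = [0.05, 0.15, 0.15, 0.05]    # four feed-forward multipliers
A = [1.0, -1.2, 0.7, -0.1]      # a[0] = 1 needs no multiplier: three feedback multipliers


def quantize(c):
    """Round a real coefficient to the assumed fixed-point grid."""
    return int(round(c * SCALE))


def df1_float(b, a, x):
    """Floating-point third-order Direct Form I filter."""
    x_hist = [0.0, 0.0, 0.0]    # x[n-1], x[n-2], x[n-3] (z^-1 delay line)
    y_hist = [0.0, 0.0, 0.0]    # y[n-1], y[n-2], y[n-3]
    out = []
    for sample in x:
        acc = b[0] * sample
        for i in range(3):
            acc += b[i + 1] * x_hist[i] - a[i + 1] * y_hist[i]
        x_hist = [sample] + x_hist[:2]
        y_hist = [acc] + y_hist[:2]
        out.append(acc)
    return out


def df1_fixed(bq, aq, xq):
    """Integer Direct Form I filter: 7 multiplies per output sample."""
    x_hist = [0, 0, 0]
    y_hist = [0, 0, 0]
    out = []
    for sample in xq:
        acc = bq[0] * sample                    # multiplier 1
        for i in range(3):
            acc += bq[i + 1] * x_hist[i]        # multipliers 2 to 4
            acc -= aq[i + 1] * y_hist[i]        # multipliers 5 to 7
        y = acc >> FRAC_BITS                    # rescale the product (truncation)
        x_hist = [sample] + x_hist[:2]
        y_hist = [y] + y_hist[:2]
        out.append(y)
    return out


B_Q = [quantize(c) for c in B]
A_Q = [quantize(c) for c in A]

# Compare impulse responses of the floating-point and fixed-point filters.
impulse = [1.0] + [0.0] * 63
float_out = df1_float(B, A, impulse)
fixed_out = df1_fixed(B_Q, A_Q, [quantize(v) for v in impulse])
max_err = max(abs(yq / SCALE - yf) for yq, yf in zip(fixed_out, float_out))
print("max |fixed - float| over 64 samples:", max_err)
```

With the leading denominator tap normalized to 1, an ARMA(3,3) filter needs four feed-forward and three feedback multipliers, which matches the seven multiplier coefficients mentioned for the Direct Form I structure.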
A random noise signal generator based on the Gaussian Ziggurat method was used to generate a discrete-time Gaussian white noise signal, which is passed through our ARMA(3,3) filter. Co-simulation has been used to verify the frequency response of our filter on the real hardware, as shown in Figure 6.
A. Resource Consumption
Our filter was targeted at the Xilinx Spartan3e-xc3s500e and Virtex4-xc4vsx35 development platforms. Design statistics and resource consumption of our filter are shown in Table IV for both platforms. As a comparison, our proposed design uses 277 slices on the Spartan3e-xc3s500e platform and 365 slices on the Virtex4-xc4vsx35 platform, whereas the reported slice usage in [19] is 6178 on the Virtex-xc2v1500-5 platform. Similarly, the LUTs utilized in our design number 473 and 574 on the Spartan3e and Virtex4 platforms, respectively, while the reported LUT usage in [19] is 6232 on the Virtex-xc2v1500-5 platform.
Using Xilinx ISE 9.2i, the fading filter presented in Section II was described in Simulink/VHDL, synthesized and tested on the Xilinx Spartan3e and Virtex4 platforms. The coefficients were generated using Matlab 7.2 [20].
The hardware co-simulation shows that the proposed channel simulator utilizes only 2% of the configurable slices, 4% of the dedicated multipliers and 1% of the available LUTs on the Virtex4 xc4vsx35 platform.
IV. CONCLUSION
A low-complexity and high-performance implementation of a Rayleigh fading channel simulator was presented. The hardware realization of our channel simulator was implemented using FPGAs and verified with Xilinx co-simulation. FPGAs proved well suited to the design of the fading filter for multipath Rayleigh channels, owing to the use of IP cores from Xilinx.
Our proposed ARMA(3,3) filter has been compared with the improved Jakes' model of [1], the AR fading filter approximation of [2], and the IDFT technique of [3], in terms of performance measures and computational complexity. Our ARMA(3,3) Rayleigh fading generator outperforms the AR(20) generator of [2] by about 1 dB in both performance measures provided, while requiring approximately a quarter of the multiplications required by the AR(20) generator. Similarly, our ARMA(3,3) fading generator outperforms the modified Jakes' generator with 8 and 16 sinusoids by 32 dB and 2 dB, respectively, while requiring less than one-tenth of the multiplications required by the Jakes' generators with 8 and 16 sinusoids. While the IDFT method of [3] achieves the best performance in terms of the quality measures, it brings a significant cost in storage requirements, as all samples are generated using a single IFFT. Thus the IDFT method is undesirable from a simulation point of view when the Rayleigh fading samples are to be generated as they are required. The main advantage of our ARMA(3,3) Rayleigh fading generator is that the samples of the Rayleigh fading sequence can be generated as they are required while achieving the lowest complexity of all the Rayleigh fading generators mentioned.
