I. INTRODUCTION
HARDWARE simulators of mobile radio channels are very useful for testing wireless communication systems [1], [2]. Current communication standards show a clear industry trend toward supporting Multiple-Input Multiple-Output (MIMO) functionality. In fact, several recently published studies present systems that reach an order of 8x8 and higher [3]. This is made possible by advances at all levels of the communication platform [4], [5]. With the continuous increase of field programmable gate array (FPGA) capacity, entire baseband systems can be mapped onto faster FPGAs for more efficient testing and verification. As shown in [6], [7], FPGAs provide the greatest flexibility in algorithm design.
The simulator is configured for Long Term Evolution (LTE) systems and the Wireless Local Area Network (WLAN) 802.11ac standard. The channel models used by the simulator can be obtained from standard channel models, such as the TGn 802.11n [8] and LTE [9] models, or from measurements using a MIMO channel sounder designed at IETR [10], [11].
At IETR, several architectures of the digital block of a hardware simulator have been studied [12], [13]. In [14], a method fitting the space time-frequency cross-correlation matrix to the estimated matrix of a real-world channel was presented. This solution shows that the error can be significant. Typically, wireless channels are simulated using finite impulse response (FIR) filters, as in [13], [15] and [16]. Fast Fourier Transform (FFT) modules can also be used to obtain an algebraic product, as in [12], [14] and [15].
In this paper, we present two approaches. The first one operates in the frequency domain, while the second is based on an FIR filter. The main contributions of the paper are:
- The frequency architectures considered in [12], [14] and [15] operate correctly only for signals not exceeding the FFT size. Thus, a new frequency architecture that avoids this limitation and works in streaming mode is presented.
- The time domain architecture presented in [13] and [15] occupies 11 % to 13 % of the FPGA slices for one SISO channel. In this paper, we present a time domain architecture with an occupation of 3 % for one SISO channel and up to 48 % for MIMO 4x4.
- The channel frequency responses can be represented in baseband with a complex envelope, or with the real signal band-limited between $f_c - B/2$ and $f_c + B/2$, where $f_c$ is the carrier frequency and B the bandwidth. To eliminate the complex multiplication and $f_c$, the hardware simulation operates between $\Delta$ and $B + \Delta$, where $\Delta$ (with $B \gg \Delta > 0$) depends on the band-pass filter and is used to prevent the overlap of the positive and negative sides of the responses.
- For indoor environments, tests have been made using 802.11ac signals [17], [18]. In this paper, tests are made for outdoor environments with LTE signals for a time-varying MIMO 2x2 channel.
- For the first time, a study relates the number of bits of the impulse responses to the relative error of the output signals. Thus, a trade-off between the occupation on the FPGA and the error becomes possible.
The rest of this paper is organized as follows. Section II presents the architectures for a SISO channel. Section III describes their hardware implementation and analyzes the accuracy of the architectures. Lastly, Section IV summarizes the conclusions and some prospects.
II. PRINCIPLE, ARCHITECTURE AND OPERATION
The design of the radio frequency (RF) blocks of the simulator was carried out in previous work [12], [13].
978-1-4673-2489-2/12/$31.00 ©2012 IEEE
The PALMYRE II project mainly concerns the MIMO channel models and their hardware implementation in the simulator.
The channel models can be obtained from standard channel models or from measurements using a MIMO channel sounder designed at our laboratory. However, in this paper we use the LTE models which are popular and well known.
A. LTE Channel Models
The LTE models are used for mobile wireless applications and cover the most common scenarios for LTE applications [9]. A set of 3 channel models is proposed in [9]: the Extended Pedestrian A model (EPA), the Extended Vehicular A model (EVA) and the Extended Typical Urban model (ETU). The sampling frequency is $f_s$ = 50 MHz and the sampling period is $T_s = 1/f_s$ = 20 ns.
B. Time-Varying EVA Channel Model
In the MIMO context, few experimental results have been obtained regarding channel time-variation, partly because of limitations in channel sounding equipment [19]. At a center frequency of 1.8 GHz and a vehicular speed of 80 km/h, the Doppler spread is $f_d$ = 133 Hz. Thus, we have chosen a refresh frequency $f_{ref}$ = 300 Hz > $2 f_d$.
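The Doppler figure above can be checked with a few lines of Python. This is only a sketch, assuming the classical maximum Doppler shift $f_d = v f_c / c$; the function name `doppler_spread_hz` is introduced here for illustration and does not come from the simulator.

```python
C = 299_792_458.0  # speed of light in m/s

def doppler_spread_hz(carrier_hz: float, speed_kmh: float) -> float:
    """Maximum Doppler shift f_d = v * f_c / c (assumed classical formula)."""
    speed_ms = speed_kmh / 3.6  # km/h -> m/s
    return speed_ms * carrier_hz / C

f_d = doppler_spread_hz(1.8e9, 80.0)   # values quoted in the text
f_ref = 300.0                          # refresh frequency chosen in the text
print(f"f_d = {f_d:.1f} Hz, 2*f_d = {2 * f_d:.1f} Hz, f_ref = {f_ref:.0f} Hz")
```

The refresh frequency of 300 Hz stays above twice the computed Doppler spread, which is the condition stated in the text.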
As a time-varying channel, we consider a 2x2 MIMO Rayleigh fading channel. The MIMO channel matrix H can be characterized by two parameters: the power $P_c$ of the constant channel components, which corresponds to the Line-Of-Sight (LOS), and the power $P_s$ of the channel scattering components, which corresponds to the Non-Line-Of-Sight (NLOS).
The ratio $P_c/P_s$ is the Ricean K-factor. Assuming all coefficients of H are Rice distributed, H is expressed by:
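$$H = \sqrt{P_c}\,H_F + \sqrt{P_s}\,H_V \qquad (1)$$
where $H_F$ and $H_V$ are the constant and the scattered matrices, respectively. The total received power is $P = P_c + P_s$. Thus:
$$H = \sqrt{P}\left(\sqrt{\frac{K}{K+1}}\,H_F + \sqrt{\frac{1}{K+1}}\,H_V\right)$$
where K is the Ricean factor, which is set to zero to obtain a Rayleigh fading channel. For the EVA model, P is given in [9] for each of the nine taps. For 2 transmit and 2 receive antennas:
$$H_V = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}$$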
where $X_{ij}$ (i-th receiving and j-th transmitting antenna) are correlated zero-mean, unit-variance, complex Gaussian random variables. The vector vec($H_V$) can be factored into the square root of a covariance matrix R and the vectorized form of a spatially white, independent and identically distributed (i.i.d.) Rayleigh MIMO channel:
$$\mathrm{vec}(H_V) = R^{1/2}\,\mathrm{vec}(H_w)$$
$H_w$ is a Rayleigh fading matrix of independent zero-mean, unit-variance, complex Gaussian random variables.
LTE has defined the correlation for all four SISO channels, which are considered identically distributed and normalized to provide unit average energy:
$$R = \begin{pmatrix} 1 & \alpha_1 & \beta_1 & \delta_1 \\ \alpha_1^* & 1 & \delta_2 & \beta_2 \\ \beta_1^* & \delta_2^* & 1 & \alpha_2 \\ \delta_1^* & \beta_2^* & \alpha_2^* & 1 \end{pmatrix}$$
$\alpha_1$, $\alpha_2$ represent the correlations between channels at two receive antennas, but originating from the same transmit antenna, Tx1 and Tx2 respectively (SIMO). $\beta_1$, $\beta_2$ represent the correlations between channels at two transmit antennas, but originating from the same receive antenna, Rx1 and Rx2 respectively (MISO). $\delta_1$, $\delta_2$ are the cross-correlations between antennas of the same side of the link.
For simplification, we consider:
1) $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2$; thus, they can be denoted as $\alpha$ and $\beta$.
2) $\delta_1 = \alpha \beta$ and $\delta_2 = \alpha^* \beta$.
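Then it is possible to define 2x2 transmit and receive correlation matrices, $R_t$ and $R_r$, to decompose any MIMO system into two interconnected MISO/SIMO subsystems. This decomposition gives a simpler and less general model of the covariance matrix R:
$$R = R_t \otimes R_r$$
where $\otimes$ is the Kronecker product, and $R_t$ and $R_r$ are the correlation matrices at the transmitter and the receiver, respectively. They are defined by:
$$R_t = \begin{pmatrix} 1 & \beta \\ \beta^* & 1 \end{pmatrix}, \qquad R_r = \begin{pmatrix} 1 & \alpha \\ \alpha^* & 1 \end{pmatrix}$$
The values $\alpha$ and $\beta$ are defined by the LTE models. For high correlation: $\alpha$ = 0.9, $\beta$ = 0.9. Therefore:
$$R = \begin{pmatrix} 1 & 0.9 & 0.9 & 0.81 \\ 0.9 & 1 & 0.81 & 0.9 \\ 0.9 & 0.81 & 1 & 0.9 \\ 0.81 & 0.9 & 0.9 & 1 \end{pmatrix}$$
$h_{22}$ at a given time, by taking the LOS impulse response as a reference, with $f_{ref}$ = 300 Hz between the successive profiles.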
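As an illustration of the Kronecker model described above, the following NumPy sketch builds the 4x4 covariance matrix from 2x2 correlation matrices and draws one correlated Rayleigh matrix. Several points are assumptions made for this example only: $\alpha$ is taken as the receive-side correlation and $\beta$ as the transmit-side correlation (following the SIMO/MISO description in the text), vec() stacks the columns of the matrix, and the function name `correlated_rayleigh` is hypothetical.

```python
import numpy as np

def correlated_rayleigh(alpha, beta, rng):
    """Return (R, H_v): Kronecker covariance and one correlated 2x2 Rayleigh draw."""
    # Assumed assignment: alpha on the receive side, beta on the transmit side.
    R_r = np.array([[1.0, alpha], [np.conj(alpha), 1.0]], dtype=complex)
    R_t = np.array([[1.0, beta], [np.conj(beta), 1.0]], dtype=complex)
    R = np.kron(R_t, R_r)  # covariance of vec(H_v) = [h11, h21, h12, h22]

    # Hermitian square root of R through its eigendecomposition.
    w, V = np.linalg.eigh(R)
    R_sqrt = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

    # Spatially white i.i.d. complex Gaussian vector with unit variance.
    h_w = (rng.standard_normal(4) + 1j * rng.standard_normal(4)) / np.sqrt(2.0)

    # Colour the white vector, then undo the column stacking.
    H_v = (R_sqrt @ h_w).reshape(2, 2, order="F")
    return R, H_v

R, H_v = correlated_rayleigh(0.9, 0.9, np.random.default_rng(0))
print(np.round(R.real, 2))
```

With $\alpha = \beta = 0.9$ the off-diagonal entries of R are 0.9 and 0.81, so the choice of which side carries $\alpha$ does not change the numerical result in the high-correlation case.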
C. Digital Block
In this section, an improved frequency architecture and a time domain architecture based on a FIR filter are presented.
1) New Frequency Domain Architecture
The new frequency architecture presented in Fig. 1 has been verified with Gaussian impulse signals [20]. It operates correctly for signals with a number of samples exceeding N, where $N = 2^n$ is the size of the FFT module.
For the EVA channel, the largest excess delay is $126\,T_s$. Thus, N = 128 samples. However, each partial input of N samples must be extended with a "tail" of N null samples, as in [20], to avoid a wrong result. Therefore, the FFT/IFFT modules operate with 256 samples. Due to the use of a 14-bit digital-to-analog converter (DAC), the final output must be truncated. The immediate solution is the "brutal" truncation, which keeps the first 14 bits. However, a better solution is the sliding truncation presented in Fig. 2, which uses the 14 most significant bits.
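The zero-padding rule above (blocks of N samples followed by N null samples, processed with 2N-point FFTs) reads like the classical overlap-add scheme. The sketch below illustrates that scheme in NumPy for real-valued signals; whether the architecture of Fig. 1 is organized exactly this way is an assumption, and `stream_filter_fft` is a name introduced only for this example.

```python
import numpy as np

def stream_filter_fft(x, h, n=128):
    """Filter a long real signal x with impulse response h using 2n-point FFTs."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    if len(h) > n:
        raise ValueError("impulse response must fit in n samples")

    H = np.fft.rfft(h, 2 * n)            # frequency response, zero-padded to 2n
    n_blocks = -(-len(x) // n)           # ceil(len(x) / n)
    y = np.zeros((n_blocks + 1) * n)

    for b in range(n_blocks):
        block = x[b * n:(b + 1) * n]     # n input samples (last block may be shorter)
        # rfft pads the block with null samples up to 2n, so the product is a
        # linear convolution rather than a circular one.
        y_block = np.fft.irfft(np.fft.rfft(block, 2 * n) * H, 2 * n)
        y[b * n:(b + 2) * n] += y_block  # the tail overlaps the next block
    return y[:len(x) + len(h) - 1]

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)            # longer than the FFT size
h = np.zeros(127)
h[[0, 2, 126]] = [1.0, 0.5, -0.25]       # hypothetical taps, for illustration only
y = stream_filter_fft(x, h)
```

Without the N null samples, the product in the frequency domain would correspond to a circular convolution and the end of each block would wrap onto its beginning.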
Fig. 2. Sliding truncation (example word: 00010100110011000).
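The difference between the two truncations can be sketched on bit strings. This is a simplified illustration, assuming unsigned words and a window that starts at the first non-zero bit; sign handling and the hardware details of Fig. 2 are not modelled, and both function names are hypothetical.

```python
def brutal_truncation(word: str, n_bits: int = 14) -> str:
    """Keep the first n_bits of the word, whatever its content."""
    return word[:n_bits]

def sliding_truncation(word: str, n_bits: int = 14):
    """Keep n_bits starting at the first '1' (assumed rule); return (window, shift)."""
    first_one = word.find("1")
    if first_one < 0:
        first_one = 0                          # all-zero word: nothing to slide over
    max_shift = max(0, len(word) - n_bits)     # the window must stay inside the word
    shift = min(first_one, max_shift)
    return word[shift:shift + n_bits], shift

word = "00010100110011000"                     # 17-bit example word
brutal = brutal_truncation(word)
window, shift = sliding_truncation(word)
print(brutal, window, shift)
```

The returned shift is the scale factor that has to be compensated after the DAC, which is consistent with the remark in Section III that the brutal truncation avoids a reconfigurable analog amplifier.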
2) Time Domain Architecture
For EVA, N = 127 samples, which imposes the use of 9 multipliers (Fig. 3). We have developed our own FIR filter instead of using the Xilinx MAC FIR filter so that the FIR coefficients can be reloaded. The general formula for an FIR filter with 9 multipliers is:
$$y(i) = \sum_{k=1}^{9} h_q(i_k)\, x_q(i - i_k)$$
The index q denotes quantized samples, and $h_q(i_k)$ is the attenuation of the k-th path with the delay $i_k T_s$. Fig. 4 shows the XtremeDSP Virtex-4 board from Xilinx [5] used to implement the simulator, and described in [20].
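A sparse FIR filter of this kind can be sketched as nine delayed and scaled copies of the input. The tap delays and relative powers below are the EVA values as recalled from the 3GPP specification and should be checked against [9]; rounding the delays up to multiples of $T_s$ = 20 ns is an assumption of this example (it reproduces the largest delay of 126 samples), quantization is ignored, and the function names are hypothetical.

```python
import numpy as np

EVA_DELAYS_NS = [0, 30, 150, 310, 370, 710, 1090, 1730, 2510]      # to be checked against [9]
EVA_POWERS_DB = [0.0, -1.5, -1.4, -3.6, -0.6, -9.1, -7.0, -12.0, -16.9]
TS_NS = 20                                                          # sampling period

def tap_indices(delays_ns, ts_ns=TS_NS):
    """Delay of each path in samples, rounded up (assumed convention)."""
    return [-(-d // ts_ns) for d in delays_ns]

def sparse_fir(x, taps, gains):
    """y(i) = sum_k gains[k] * x(i - taps[k]), using one multiplier per path."""
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x) + max(taps))
    for i_k, h_k in zip(taps, gains):
        y[i_k:i_k + len(x)] += h_k * x
    return y

taps = tap_indices(EVA_DELAYS_NS)
gains = [10.0 ** (p / 20.0) for p in EVA_POWERS_DB]   # amplitude of each path

rng = np.random.default_rng(2)
x = rng.standard_normal(500)
y = sparse_fir(x, taps, gains)
```

Only nine products are computed per output sample, although the impulse response spans 127 samples.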
III. IMPLEMENTATION AND TESTS
A. Implementation and Results of the Frequency Architecture
As the development board has 2 ADCs and 2 DACs, it can be connected to only 2 down-conversion RF units and 2 up-conversion RF units. Therefore, four SISO channels in the frequency domain are needed to simulate a one-way 2x2 MIMO radio channel. Fig. 5 shows the connection between the computer and the FPGA board used to reload the coefficients. The refresh period is (1/0.3) ms, during which all four SISO profiles must be refreshed, i.e. 256 x 4 = 1024 words of 32 bits (16 bits for the real part and 16 bits for the imaginary part) = 4096 bytes to transmit for one MIMO profile, which gives 4096 x 0.3 kHz = 1.2288 MB/s. The PCI bus is chosen to load the profiles. It has a speed of up to 30 MB/s.
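The transfer rate above follows from simple arithmetic, reproduced in the short sketch below. The helper name `profile_rate_bytes_per_s` is introduced for illustration; the same helper applied to 36 words of 16 bits gives the rate of the time domain architecture.

```python
def profile_rate_bytes_per_s(n_words: int, bits_per_word: int, f_ref_hz: int) -> int:
    """Bytes per second needed to reload one MIMO profile at every refresh."""
    bytes_per_profile = n_words * bits_per_word // 8
    return bytes_per_profile * f_ref_hz

F_REF_HZ = 300                                   # refresh frequency from the text
PCI_BYTES_PER_S = 30_000_000                     # 30 MB/s, as stated in the text

freq_rate = profile_rate_bytes_per_s(256 * 4, 32, F_REF_HZ)   # frequency architecture
time_rate = profile_rate_bytes_per_s(9 * 4, 16, F_REF_HZ)     # time domain architecture
print(freq_rate / 1e6, "MB/s;", time_rate / 1e3, "kB/s")
```

Both rates stay well below the 30 MB/s quoted for the PCI bus.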
The V4-SX35 utilization summary is given in Table II for the MIMO 2x2 frequency architecture, including the additional circuits used to dynamically reload the channel coefficients.
B. Implementation and Results of Time Domain Architecture
For the time domain architecture, the amount of data transmitted for a profile is 9 x 4 = 36 words of 16 bits = 72 bytes, which gives 72 x 0.3 kHz = 21.6 kB/s. In order to determine the accuracy of the digital block, a comparison is made between the theoretical and the Xilinx output signals. An input Gaussian signal x(t), long enough to be used in streaming mode, is considered, with $x_m = V_m/2$. The theoretical output signals are:
The relative error is computed for each output sample by:
$$\varepsilon(i) = \frac{\left| Y_{Xilinx}(i) - Y_{theory}(i) \right|}{\left| Y_{theory}(i) \right|}$$
Fig. 6. Xilinx output signals and SNR, using the frequency architecture.
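The error metrics used in this section can be sketched as follows. The per-sample relative error follows its usual definition, expressed here in percent; the global error and global SNR are written with Euclidean norms of the error vector, which is an assumption of this example since the exact expressions are not reproduced here. Function names are hypothetical.

```python
import numpy as np

def relative_error_percent(y_meas, y_ref):
    """Per-sample relative error |y_meas - y_ref| / |y_ref|, in percent."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    return 100.0 * np.abs(y_meas - y_ref) / np.abs(y_ref)  # undefined where y_ref == 0

def global_error_and_snr(y_meas, y_ref):
    """Global relative error (percent) and SNR (dB), assuming Euclidean norms."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    err_norm = np.linalg.norm(y_meas - y_ref)
    ref_norm = np.linalg.norm(y_ref)
    return 100.0 * err_norm / ref_norm, 20.0 * np.log10(ref_norm / err_norm)

y_theory = np.array([1.0, 2.0, -4.0])        # hypothetical reference samples
y_xilinx = np.array([1.01, 1.98, -4.04])     # hypothetical measured samples
eps = relative_error_percent(y_xilinx, y_theory)
global_eps, global_snr = global_error_and_snr(y_xilinx, y_theory)
```

The per-sample ratio grows without bound when the reference sample approaches zero, which is why a global figure is more informative than the per-sample values near the tails of a Gaussian test signal.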
The relative error is high only for small values of the output signal because the Gaussian test signal is close to 0.
Thus, in this case, the Xilinx output signal is smaller than
The global values of the relative error and of the SNR of the output signal, before and after the final truncations, are necessary to evaluate the accuracy of the architectures.
The global relative error and SNR are computed by:
Table IV shows the global values of the relative error and SNR using the frequency and time domain architectures. With the frequency architecture presented in Fig. 1, reducing the number of bits used for H is not a suitable way to reduce the slice occupation on the FPGA. In fact, the global error presented in Table IV increases sharply for a small reduction of the number of bits, while the occupation on the FPGA in Table III decreases by only about 10 %, which is not enough to allow the implementation of higher-order MIMO systems.
For the frequency architecture, the results are given in Table IV. The sliding truncation reduces the relative error obtained with the brutal truncation by only 16 %. Thus, the brutal truncation is more suitable. It also reduces the slice occupation on the FPGA and avoids the need for a reconfigurable analog amplifier after the DAC.
With the time domain architecture, while reducing the number of bits used for h, the global relative error increases as presented in Fig. 8 . We conclude that for a number of bits for h higher than 8, the average error is acceptable when using the sliding window truncation and the global SNR is more than 50 dB.
By reducing the number of bits of h from 16 to 8, we reduce the occupation on the FPGA from 12 % to 11.6 %.
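The trade-off between the word length of h and the output error can be explored in software before synthesis. The sketch below is only illustrative: it assumes a uniform round-to-nearest quantizer scaled to the largest tap, a white Gaussian test input, and hypothetical tap values; it does not reproduce the results of Fig. 8.

```python
import numpy as np

def quantize_uniform(h, n_bits):
    """Round h to signed n_bits levels, full scale set by the largest tap (assumed)."""
    h = np.asarray(h, dtype=float)
    full_scale = np.max(np.abs(h))
    levels = 2 ** (n_bits - 1) - 1
    return np.round(h / full_scale * levels) / levels * full_scale

def output_error_percent(x, h, n_bits):
    """Global relative error of the output when h is quantized to n_bits."""
    y_ref = np.convolve(x, h)
    y_q = np.convolve(x, quantize_uniform(h, n_bits))
    return 100.0 * np.linalg.norm(y_q - y_ref) / np.linalg.norm(y_ref)

rng = np.random.default_rng(3)
x = rng.standard_normal(2000)                      # illustrative test input
h = np.zeros(127)
h[[0, 2, 8, 16, 19, 36, 55, 87, 126]] = [1.0, 0.84, 0.85, 0.66, 0.93,
                                         0.35, 0.45, 0.25, 0.14]   # hypothetical taps

errors = {bits: output_error_percent(x, h, bits) for bits in (16, 12, 8, 4)}
for bits, err in errors.items():
    print(f"{bits:2d} bits -> global relative error {err:.5f} %")
```

Sweeping the word length in this way gives a first estimate of where the error starts to grow quickly, which can then be confronted with the slice occupation measured on the FPGA.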
The goal is to compare the frequency architecture with the time domain architecture, by considering three points: the precision, the FPGA occupation and the latency.
If we compare the results in Table IV, we observe that the SNR is higher for the time domain architecture with sliding window truncation. However, with the frequency domain architecture, the SNR is higher using the brutal truncation.
According to Tables II and III , the time domain architecture presents a slice occupation of 12 % on the FPGA Virtex-IV, which is better than the occupation of the frequency architecture (72 %). Thus, in the time domain, up to 22 SISO channels can be implemented on a single Virtex-IV.
The time domain architecture presents another advantage: it generates a latency of 103 ns for each simulated profile, while the frequency architecture has a latency of 8.8 µs.
IV. CONCLUSION
After a comparative study, in order to reduce the occupation on the FPGA, the error of the output signals and the latency of the digital block, the time domain architecture represents the best solution, especially for MIMO systems.
Simulations will be made using a Virtex-VII [5] XC7V2000T platform to simulate higher-order MIMO systems. Measurement campaigns will also be carried out with the MIMO channel sounder realized by IETR, for various types of environments. A Graphical User Interface will also be designed to allow the user to select the propagation environment, the channel model and the channel parameters.
The final objective of this work is to simulate realistic propagation channels for different MIMO standards and environments.