Introduction
Multiple-Input Multiple-Output (MIMO) systems make use of antenna arrays simultaneously at both transmitter and receiver to improve the channel capacity and the system performance. Because the transmitted electromagnetic waves interact with the propagation environment (indoor/outdoor), it is necessary to take into account the main propagation parameters for the design of the future communication systems.
Hardware simulators of mobile radio channels are very useful for the test and verification of wireless communication systems. These simulators are standalone units that provide the fading signals in the form of analog or digital samples [1], [2].
The current communication standards indicate a clear trend in industry toward supporting MIMO functionality. Support for higher-order antenna arrays will be required to enable higher channel capacity and system performance. In fact, several recently published studies present systems that reach a MIMO order of 8×8 and higher [3]. This is made possible by advances at all levels of the communication platform, for example the monolithic integration of antennas [4] and the design of the simulator platforms [5].
With the continuous increase of field programmable gate array (FPGA) capacity, entire baseband systems can be efficiently mapped onto faster FPGAs for more efficient prototyping, testing and verification. As shown in [6] , the FPGAs provide the greatest flexibility in algorithm design and visibility of resource utilization. Also, they are ideal for rapid prototyping and research use such as testbed [7] .
The simulator is reconfigurable for standards whose bandwidth does not exceed 100 MHz, which is the maximum for the Virtex-IV FPGA. However, in order to exceed 100 MHz bandwidth, a higher-performance FPGA such as the Virtex-VI can be used [5].
The simulator is configured with the Long Term Evolution System (LTE) and Wireless Local Area Networks (WLAN) 802.11ac standards.
The channel models used by the simulator can be obtained from standard channel models, such as the TGn 802.11n channel models [8] and the LTE channel models [9], or from real measurements conducted with the MIMO channel sounder designed and realized at IETR [10]. Different architectures of antenna arrays can be used for outdoor and indoor measurements [11].
At IETR, several architectures of the digital block of a hardware simulator have been studied, in both time and frequency domains [12], [13]. Moreover, [14] presents a new method based on determining the parameters of a channel simulator by fitting the space-time-frequency cross-correlation matrix of the simulation model to the estimated matrix of a real-world channel. This solution shows that the error obtained can be significant.
Wireless channels are commonly simulated using finite impulse response (FIR) filters, as in [13], [15] and [16]. Different approaches are now widely used in filtering, such as distributed arithmetic (DA) and canonical signed digits (CSD).
For a hardware implementation, it is easier to use the FFT (Fast Fourier Transform) module to obtain an algebraic product. Thus, frequency architectures are presented, as in [13] and [15] .
The frequency architectures previously considered in [13] operate correctly only for signals with a number of samples not exceeding the size of the FFT. In this paper, a new frequency domain architecture that avoids this limitation and a new time domain architecture are both tested for a scenario using TGn channel models.
The main contributions of the paper are:
• In general, the channel impulse responses can be represented in baseband with complex values, or as real signals with limited bandwidth B between $f_c - B/2$ and $f_c + B/2$, where $f_c$ is the carrier frequency. In this paper, to eliminate the complex multiplication and $f_c$, the hardware simulation operates between $\Delta$ and $B + \Delta$, where $\Delta$ depends on the band-pass filters (RF and IF). The value $\Delta$ is introduced to prevent spectrum aliasing. In addition, the use of a real impulse response reduces the size of the FIR filters by 50% and the number of multipliers by a factor of 4. Thus, within the same FPGA, larger MIMO channels can be simulated.
• In this study, we relate the number of bits used in the time domain architecture to the relative error of the output signals in order to identify the best trade-off between the occupation on the FPGA and the accuracy. An improved solution based on an Auto-Scale Factor (ASF) is then presented.
• Tests have been made for indoor [17] [18] and outdoor [19] fixed environments using standard channel models. In this paper, which is an extension of [17], tests are made with a scenario that switches from one indoor environment to another, which makes it possible to simulate heterogeneous networks [20]. Moreover, tests are made with time-varying channels.
• To decrease the number of multipliers on the FPGA and to switch from one environment to another, a solution is proposed to control the change of delays in the architecture for time-varying channels.
The rest of this paper is organized as follows. Section 2 presents the channel models and the scenario proposed for the test. Section 3 describes the new architectures of the digital block of the hardware simulator in frequency and time domain respectively. The prototyping platform used to implement these architectures and their occupation on the FPGA are also described. Section 4 presents the accuracy of the Xilinx output signals. The output SNR for the entire scenario is provided. Lastly, Section 5 gives concluding remarks and prospects.
Channel Description
A MIMO propagation channel is composed of several time-variant correlated SISO channels. For a 2×2 MIMO channel, the received signals $y_j(t, \tau)$ can be calculated using a time domain convolution:
The associated spectrum is calculated by the Fourier transform (using FFT modules): (2)
According to the considered environment, Table 1 summarizes some useful parameters. (3) and for the time domain by: (4) where $W_{tF}$ is the closest value to $W_{teff}$ which is imposed by the size $N_F = 2^n$ of the FFT modules.
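The time domain relation for the 2×2 channel described above can be sketched as follows. This is only an illustration for one fixed channel snapshot: the function names are hypothetical, and the convention that `h[i][j]` links transmit antenna i to receive antenna j is an assumption of this sketch, not something stated in the text.

```python
def convolve(x, h):
    """Full linear convolution of two real sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def mimo_2x2_output(x, h):
    """Return [y_1, y_2] for the two input signals x = [x_1, x_2].

    h[i][j] is the impulse response from transmit antenna i to receive
    antenna j (index convention assumed for this sketch).
    """
    outputs = []
    for j in range(2):
        # Each receive antenna sums the two filtered transmit signals.
        branches = [convolve(x[i], h[i][j]) for i in range(2)]
        y = [0.0] * max(len(b) for b in branches)
        for b in branches:
            for n, v in enumerate(b):
                y[n] += v
        outputs.append(y)
    return outputs
```

Each receive antenna therefore needs two SISO filters, which gives the four SISO channels of the 2×2 case.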
Channel Models
Two channel models are considered to cover indoor and outdoor environments: the TGn channel models (indoor) and the LTE channel models (outdoor). Moreover, using the channel sounder realized at IETR, measured impulse responses are obtained for specific environments: shipboard, outdoor-to-indoor.
TGn Channel Models
TGn channel models [8] have a set of 6 profiles, labeled A to F, which cover all the scenarios. Each model has a number of clusters. For example, model E has four clusters. Each cluster corresponds to specific tap delays, which overlap in certain cases. Table 2 summarizes the relative power of the impulse responses for TGn channel model E, taking the Line-Of-Sight (LOS) impulse response as reference [8]. The relative powers of all impulse responses for all TGn channel models are presented in [8]. According to the standard and the bandwidth, the sampling frequency is $f_s$ = 180 MHz and the sampling period is $T_s = 1/f_s$. The relative power of the first tap is different from zero because the impulse response is in Non-Line-Of-Sight (NLOS).
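As a small worked example of the sampling relation above, the sketch below converts a tap delay into a sample index at $f_s$ = 180 MHz. The helper names are hypothetical, and the 730 ns delay used in the usage note is an assumed illustrative value; the actual tap delays should be taken from [8].

```python
F_S_MHZ = 180  # sampling frequency quoted in the text, in MHz


def sampling_period_ns(f_s_mhz=F_S_MHZ):
    """Sampling period T_s = 1 / f_s, expressed in nanoseconds."""
    return 1000 / f_s_mhz


def delay_to_sample(delay_ns, f_s_mhz=F_S_MHZ):
    """Nearest sample index for an excess delay given in nanoseconds."""
    return round(delay_ns * f_s_mhz / 1000)
```

With these assumptions, `delay_to_sample(730)` gives 131, which is consistent with the effective length of 131 samples quoted for model E in the frequency domain architecture section.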
LTE Channel Models
LTE channel models are used for mobile wireless applications. A set of 3 channel models is used to simulate the multipath fading propagation conditions. A detailed description is presented in [9] .
Measurement Data
The channel models used by the simulator can also be obtained from measurements by using a time domain MIMO channel sounder designed and realized at the IETR [10] and shown in Fig. 1 .
The sounder uses a periodic pseudo binary sequence. It has 11.9 ns temporal resolution for 100 MHz bandwidth. The carrier frequencies are 2.2 GHz and 3.5 GHz.
Two measurement campaigns were carried out: The first campaign concerns a shipboard environment, while the second one considers an outdoor-to-indoor environment. The measured MIMO impulse responses are used thereafter by the hardware simulator.
For the shipboard measurement campaign [21] at 2.2 GHz, a Uniform Linear antenna Array (ULA) and a Uniform Rectangular antenna Array (URA) were used for the transmitter (Tx) and the receiver (Rx) respectively, to characterize the double directional channel over a 120° beamwidth in the horizontal plane.
For the outdoor-to-indoor measurements [22] at 3.5 GHz, it has been shown that the penetration of electromagnetic waves mainly occurs through openings like doors and windows. Thus, a receiver located inside a building receives signals coming from a few main directions. Two Uniform Circular Arrays (UCAs) were developed to characterize the 360° azimuthal double directional channel at both link sides.
Proposed Scenario
The proposed scenario covers indoor environments at different environmental speeds. It considers the movement from one environment to another using an 802.11ac signal which has a 180 MHz sampling frequency ($f_s$) at a central frequency of 5 GHz.
A person moves from an office environment to a large indoor environment, then to an outdoor environment. For this scenario, TGn channel models B, C and E cover the entire channel. Thus, three environments are considered in this scenario. Fig. 2 and Table 3 present the scenario and the movement of the person in it.
Time-Varying 2×2 MIMO Channel
In this section, we present the method used to obtain a model of a time variant channel, using the Rayleigh fading. A 2×2 MIMO Rayleigh fading channel [23] [24] is considered. The MIMO channel matrix H can be characterized by two parameters:
1) The relative power P c of constant channel components corresponds to LOS paths.
2) The relative power P s of the channel scattering components corresponds to NLOS paths.
The ratio $K = P_c/P_s$ is called the Ricean K-factor. Assuming that all the elements of the MIMO channel matrix H are Rice distributed, it can be expressed for each tap by:

$$H = \sqrt{P_c}\, H_F + \sqrt{P_s}\, H_V \qquad (6)$$

where $H_F$ and $H_V$ are the constant and the scattered channel matrices respectively.
The total relative received power is $P = P_c + P_s$. Therefore:

$$P_c = P\,\frac{K}{K+1} \qquad (7)$$

$$P_s = P\,\frac{1}{K+1} \qquad (8)$$

If we substitute Equation (7) and Equation (8) into Equation (6) we obtain:

$$H = \sqrt{P}\left(\sqrt{\frac{K}{K+1}}\, H_F + \sqrt{\frac{1}{K+1}}\, H_V\right) \qquad (9)$$

To obtain a Rayleigh fading channel, K is equal to zero, so H can be written as:

$$H = \sqrt{P}\, H_V \qquad (10)$$

P is the relative power of the impulse response. It is derived from Table 2 for each tap. For 2 transmit and 2 receive antennas:

$$H = \sqrt{P} \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \qquad (11)$$

where $X_{ij}$ (i-th receiving and j-th transmitting antenna) are correlated zero-mean, unit-variance, complex Gaussian random variables, the coefficients of the variable NLOS (Rayleigh) matrix $H_V$.
To obtain correlated $X_{ij}$ elements, a product-based model is used [23]. This model assumes that the correlation coefficients are independently derived at each end of the link: (12)
$H_w$ is a matrix of independent zero-mean, unit-variance, complex Gaussian random variables. $R_r$ and $R_t$ are the receive and transmit correlation matrices. They can be written as: (13)
where the receive correlation coefficient is the correlation between channels (between their average signal gains) at two receive antennas, but originating from the same transmit antenna (SIMO). It is the correlation between channels that have the same Angle of Departure (AoD).
The transmit correlation coefficient is the correlation between channels at two transmit antennas that have the same receive antenna (MISO).
The use of this model has two conditions:
1) The correlations between channels at two receive (resp. transmit) antennas are independent of the Rx (resp. Tx) antenna.
2) If $s_1$ (resp. $s_2$) is the cross-correlation between antenna AoD (resp. AoA) at the same side of the link, then: s 1 = + and s 2 = + .
The correlation coefficients are expressed by: (14) where $D = 2\pi d/\lambda$, $d = 0.5\lambda$ is the distance between two successive antennas, $\lambda$ is the wavelength and $R_{xx}$ and $R_{xy}$ are the real and imaginary parts of the cross-correlation function of the considered correlated angles:
The PAS (Power Angular Spectrum) closely matches the Laplacian distribution [25] [26]: (17) where $\sigma$ is the standard deviation of the PAS.
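The generation of one correlated Rayleigh tap described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the simulator: the product-based model is taken in the commonly used form $X = R_r^{1/2} H_w (R_t^{1/2})^T$, the correlation coefficients `rho_r` and `rho_t` (hypothetical names) are assumed real, a Cholesky factor is used as the matrix square root, and the relative power of the tap is assumed to be given in dB.

```python
import math
import random


def cholesky_2x2(rho):
    """Lower-triangular L with L @ L^T = [[1, rho], [rho, 1]] (real rho)."""
    return [[1.0, 0.0], [rho, math.sqrt(1.0 - rho * rho)]]


def matmul(a, b):
    """Product of two 2x2 matrices with real or complex entries."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Plain (non-conjugate) transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def complex_gaussian(rng):
    """Zero-mean, unit-variance complex Gaussian sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def rayleigh_tap(power_db, rho_r, rho_t, rng):
    """One 2x2 Rayleigh tap H = sqrt(P) * L_r @ H_w @ L_t^T (K = 0)."""
    h_w = [[complex_gaussian(rng) for _ in range(2)] for _ in range(2)]
    x = matmul(matmul(cholesky_2x2(rho_r), h_w),
               transpose(cholesky_2x2(rho_t)))
    amplitude = math.sqrt(10.0 ** (power_db / 10.0))  # dB to linear amplitude
    return [[amplitude * x[i][j] for j in range(2)] for i in range(2)]
```

Calling `rayleigh_tap` repeatedly with the same `random.Random` instance yields successive independent draws of the tap; averaging `H[1][0] * conj(H[0][0])` over many draws approaches `rho_r` for a 0 dB tap, which is a quick sanity check of the receive-side correlation.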
Architecture and Implementation on FPGA
In this section, improved frequency domain and time domain architectures are presented and implemented on a Virtex-IV FPGA.
Frequency Domain Architecture
The new frequency architecture for a SISO channel is presented in Fig. 3 . This architecture has been verified and tested with Gaussian impulse response and a description is presented in [27] . It operates correctly for signals with a number of samples exceeding the size of the FFT module. In general, for each SISO channel, the size of the FFT/IFFT modules is determined by the last excess delay of the impulse response of the channel. However, by simulating a scenario all the channels have to be considered. The highest last excess delay for the three environments is for E 3 (Model E).
For TGn channel model E, N eff = 131 samples. Thus, N = 128 samples (the last tap has a relative power of -24.6 dB, therefore it will be considered as zero). However, to test the new architecture, it is mandatory to extend each partial input of N samples with a "tail" of N null samples, as in [27] , to avoid a wrong result. Therefore, 256-FFT/IFFT modules are used.
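The role of the tail of N null samples can be illustrated with the standard overlap-add method, sketched below. Whether the architecture of Fig. 3 combines the overlapping block outputs in exactly this way is an assumption; the function names are hypothetical and the small recursive FFT only stands in for the FFT/IFFT modules.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def overlap_add_filter(x, h, n=128):
    """Filter x by h (len(h) <= n) using 2n-point FFTs on n-sample blocks."""
    size = 2 * n
    h_freq = fft(list(h) + [0.0] * (size - len(h)))
    y = [0.0] * (len(x) + size)
    for start in range(0, len(x), n):
        block = list(x[start:start + n])
        block += [0.0] * (size - len(block))  # tail of null samples
        prod = [a * b for a, b in zip(fft(block), h_freq)]
        seg = fft(prod, inverse=True)
        for i, v in enumerate(seg):
            y[start + i] += v.real / size     # overlapping tails are summed
    return y[:len(x) + len(h) - 1]
```

Without the null tail, the product of two N-point spectra would give a circular convolution, and the end of each block's response would wrap around onto its beginning. With N = 128 and 256-point transforms, each block's linear convolution (at most 255 samples) fits in one transform.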
H is the representation of h in the frequency domain. It can be calculated by: (18) where $h_q$ is h quantified on 16 bits and $W_q$ is computed by: (19) where (20) and each $w_l$ is quantified on 12 bits (which is the best trade-off between the occupation on the FPGA of the FFT block and its accuracy).
The truncation block is located at the output of the digital adder. It is necessary to reduce the number of bits after the sum of the signals computed by the IFFT blocks to 14 bits. Thus, these samples can be accepted by the digital-to-analog converter (DAC), while maintaining the highest accuracy.
The immediate solution is to keep the 14 most significant bits. This is a "brutal" truncation, which decreases the real value of the quantified output sample: 17 - 14 = 3 bits are eliminated. Thus, instead of an output sample y, we obtain $\lfloor y/2^3 \rfloor$, where $\lfloor u \rfloor$ is the largest integer less than or equal to u. However, for low voltages at the output of the digital adder, the brutal truncation generates zeros at the input of the DAC.
Therefore, a better solution is the sliding window truncation presented in Fig. 4, which uses the 14 most significant effective bits. This solution modifies the output sample values. Therefore, a reconfigurable amplifier must be used after the digital-to-analog converter to restore the correct output value. In order to implement the hardware simulator, the adopted solution uses a prototyping platform (XtremeDSP Development Kit-IV for Virtex-IV) from Xilinx [5], which is presented in Fig. 5 and described in [27]. The simulations and synthesis are made with Xilinx ISE [5] and ModelSim software [28].
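The difference between the two truncation schemes described above can be sketched on signed integer samples. This is an illustration only: the function names are hypothetical, and choosing the window position from the peak magnitude of a whole block of samples is an assumption of this sketch, since the text does not state how the window of Fig. 4 is positioned.

```python
def brutal_truncation(samples, in_bits=17, out_bits=14):
    """Keep the out_bits most significant bits: floor(y / 2**(in_bits - out_bits))."""
    shift = in_bits - out_bits
    return [y >> shift for y in samples]  # >> floors, also for negative values


def sliding_truncation(samples, in_bits=17, out_bits=14):
    """Drop only as many least significant bits as the peak value requires.

    Returns (truncated_samples, shift). The true values are approximately
    truncated * 2**shift, which an amplifier after the DAC would restore.
    """
    limit = 1 << (out_bits - 1)  # signed range is [-limit, limit - 1]
    shift = 0
    while shift < in_bits - out_bits and any(
            not -limit <= (y >> shift) < limit for y in samples):
        shift += 1
    return [y >> shift for y in samples], shift
```

For a low-level block such as `[5, -3, 7]`, the brutal truncation returns `[0, -1, 0]`, whereas the sliding truncation keeps `[5, -3, 7]` with a shift of 0, which illustrates why the latter preserves small output voltages.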
The V4-SX35 utilization summary for this architecture with FFT 256 and IFFT 256 blocks is given in Table 4 . 
Time Domain Architecture
In general, for each channel, the FIR width and the number of multipliers used are determined by the taps of that channel. However, when simulating a scenario, all the channels have to be considered. To use a limited number of multipliers on the FPGA and to switch from one environment to another, a solution is proposed to control the change of delays in the architecture by connecting each multiplier block of the FIR to the corresponding shift register block. Therefore, the number of multipliers in the FIR filters is equal to the maximum number of taps over all channels of all environments. $E_3$ has the highest number of taps, which is 18. Therefore, 4 FIR filters with 18 multipliers each are considered. For TGn channel model E, the length of the FIR filter is N = 131. Thus, the output signal can be computed as: (21)
Table 5 shows the device utilization for four FIR filters of length 131, with 18 selected positions of the channel impulse response considered as non-null, in one V4-SX35 after synthesis, mapping and routing.
Ideally, each sample x would be multiplied by an Auto-Scale Factor (ASF) of $2^k$, where k is an integer verifying $0.5 < 2^k \cdot x < 1$. However, we cannot predict x and multiply each sample by the ASF at a high sampling frequency. Therefore, we use the ASF on the MIMO impulse responses. If $h_{max} = \max(|h|) < 0.5$, it is multiplied by $2^k$, where k is the unique integer verifying $0.5 < 2^k \cdot h_{max} < 1$. In the case of a brutal truncation, ASF = $2^k$. However, for sliding truncation, if the output signals are presented on more than 14 bits, the sliding factor has to be considered to amplify the output signal in order to obtain the correct result. In this case, ASF = - . The ASF is sent to a reconfigurable analog amplifier to restore the true value of the output signals. The ASF can be presented on 14 bits (limited by the D/A converter). The first bit is "1" if it is a multiplication by the ASF, and "0" if it is a division by the ASF.
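The two ideas of this section, a FIR filter that only multiplies at the non-null tap positions and the scaling exponent of the ASF, can be sketched as follows. The function names and the `(delay, coefficient)` tap representation are hypothetical choices for this illustration; the sketch works in floating point and does not model the fixed-point word lengths of the FPGA implementation.

```python
import math


def sparse_fir(x, taps):
    """y(n) = sum of h_k * x(n - d_k) over the non-null taps (d_k, h_k).

    Only len(taps) multiplications are needed per output sample, even if
    the largest delay (the filter length) is much greater.
    """
    y = [0.0] * len(x)
    for n in range(len(x)):
        for delay, coeff in taps:
            if n >= delay:
                y[n] += coeff * x[n - delay]
    return y


def auto_scale_exponent(h):
    """Integer k such that 0.5 <= 2**k * max|h| < 1 (h must not be all zero)."""
    h_max = max(abs(v) for v in h)
    # frexp returns (m, e) with h_max = m * 2**e and 0.5 <= m < 1.
    _, exponent = math.frexp(h_max)
    return -exponent
```

For example, with a largest coefficient of 0.1 the exponent is k = 3, since 8 × 0.1 = 0.8 lies between 0.5 and 1. Note that `frexp` also accepts the boundary value 0.5, whereas the text uses a strict inequality.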
Results and Accuracy

Data Transfer Description
The channel impulse responses are stored on the hard disk of the computer and read via the PCI bus and then stored in the FPGA dual-port RAM. Fig. 7 shows the connection between the computer and the FPGA board to reload the coefficients. The successive profiles are considered for the test of a 2×2 MIMO time-varying channel. The MIMO profiles are stored in a text file on the hard disk of a computer. This file is then read to load the memory block which will supply RAM blocks on the simulator (one block for each tap of the impulse response).
Reading the file can be done through either the USB or the PCI interface, both available on the prototyping board used. The PCI bus is chosen to load the profiles. It has a speed of 30 MB/s and a width of 32 bits. Thus, on each clock pulse two samples of the impulse response are transmitted.
The Nallatech driver in Fig. 7 provides an IP sent directly to the "Host Interface" that reads it from the PCI bus and stores these data in a FIFO memory. The module called "Loading profiles" reads and distributes the impulse responses in "RAM" blocks.
While a MIMO profile is used, the following profile is loaded and will be used after the refresh period.
Accuracy
In order to determine the accuracy of the digital block, a comparison is made between the theoretical and the Xilinx output signals.
A Gaussian input signal x(t) is considered long enough (more than N = 256 samples) to be used in streaming mode. To simplify the calculation, we consider x 1 (t) = x 2 (t)=x(t) given by: (22) where 
Relative Error and SNR
The relative error is computed for each output sample by: (25)
The results are given with brutal truncation (B.T.) and sliding truncation (S.T.). Fig. 9 shows a snapshot of the Xilinx output signal $y_1$ with its relative error and SNR using the time domain architecture for $E_1$, $E_2$ and $E_3$.
Mean Global Relative Error and Global SNR with Time-Varying Profiles
The global values of the relative error and of the SNR computed for the output signal before and after the final truncations are necessary to evaluate the accuracy of the architecture. The global relative error is computed by: (27)
The global SNR is computed by: (28) where $E = Y_{Xilinx} - Y_{theory}$ is the error vector.
For a given vector $X = [x_1, x_2, \ldots, x_L]$, its norm $\|X\|$ is: (29)
Table 6 shows the mean global values of the relative error and the SNR for all profiles using the two architectures. The results are given with sliding truncation and using ASF. To compare the time domain architecture with the new frequency domain architecture, three points summarize the comparison: the precision, the occupation on the FPGA and the latency.
With sliding window truncation, the relative error does not exceed 1 % (for the worst case, with TGn model B), which is sufficient for the test. However, the time domain architecture presents the higher precision.
In terms of slice occupation on the Virtex-IV FPGA, the time domain architecture occupies 43 %, in contrast with 96 % for the frequency domain architecture. Thus, the time domain architecture presents another advantage. Moreover, with the time domain architecture, up to 8 SISO channels can be simulated. Therefore, a 4×2 MIMO system can be simulated, which uses 18×8 = 144 multipliers and occupies 87 % of the slices on the FPGA.
In terms of latency, the time domain architecture presents another advantage, with a latency of 165 ns, whereas the new frequency architecture has a latency of 7 µs. Therefore, the time domain architecture is more efficient to use, especially for MIMO systems. However, higher-performance FPGAs such as the Virtex-VII are required to solve the occupation problem of the new frequency domain architecture and to simulate high-order MIMO systems.
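The global accuracy metrics used in this section can be sketched as follows. The exact expressions are assumptions of this sketch: the global relative error is taken as $100 \cdot \|E\| / \|Y_{theory}\|$ and the global SNR as $20 \log_{10}(\|Y_{theory}\| / \|E\|)$, with a root-mean-square norm. The function names are hypothetical.

```python
import math


def rms_norm(v):
    """Root-mean-square norm; the 1/L factor cancels in the ratios below."""
    return math.sqrt(sum(a * a for a in v) / len(v))


def error_vector(y_xilinx, y_theory):
    """E = Y_Xilinx - Y_theory, sample by sample."""
    return [a - b for a, b in zip(y_xilinx, y_theory)]


def global_relative_error(y_xilinx, y_theory):
    """Global relative error in percent: 100 * ||E|| / ||Y_theory||."""
    e = error_vector(y_xilinx, y_theory)
    return 100.0 * rms_norm(e) / rms_norm(y_theory)


def global_snr_db(y_xilinx, y_theory):
    """Global SNR in dB: 20 * log10(||Y_theory|| / ||E||)."""
    e = error_vector(y_xilinx, y_theory)
    return 20.0 * math.log10(rms_norm(y_theory) / rms_norm(e))
```

Under these definitions, a global relative error of 1 % corresponds to a global SNR of 40 dB, which gives a quick way to relate the two quantities.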
Conclusion
This paper presents frequency domain and time domain architectures for the digital block of a hardware simulator of MIMO propagation channels. This simulator is used for WLAN 802.11ac applications. It characterizes an indoor scenario using TGn channel models. After the description of the general characteristics of the hardware simulator, the new architectures of the digital block have been presented and designed on a Xilinx Virtex-IV FPGA. Their accuracy, occupation on the FPGA and latency have been analyzed.
After a comparative study of the occupation on the FPGA, the error and the latency of the digital block, the time domain architecture presents the best solution for indoor environments.
For our future work, simulations made using a Virtex-VII [5] XC7V2000T platform will allow us to simulate up to 300 SISO channels. In parallel, measurement campaigns will be carried out with the MIMO channel sounder realized by IETR to obtain the impulse responses of the channel for specific and various types of environments. The final objective of these measurements is to obtain realistic MIMO channel models in order to supply the hardware simulator. A graphical user interface will also be designed to allow the user to reconfigure the simulator parameters.
