Abstract-This paper describes the design steps and final implementation of a MIMO OFDM prototype platform developed to enhance the performance of wireless LAN standards such as HiperLAN/2 and 802.11, using multiple transmit and multiple receive antennas. We present the design methodology, including cross-validation of the Matlab, C++ and VHDL components, and the final demonstrator architecture. We highlight the increased measured performance of the MIMO testbed over the single-antenna system.
I. INTRODUCTION
This paper describes in some detail the implementation aspects of a testbed for wireless LANs using OFDM and multiple antennas. The main goal of the testbed development is to validate the performance of MIMO algorithmic approaches in real-time hardware, in order to highlight the potential improvements brought by the different implemented modes in comparison with the single-antenna case. We also highlight the methodology used in the testbed development, which uses an FPGA for digital processing. The full setup is depicted in Figure 1. The steps followed include full C++ floating-point and fixed-point simulations and cross-validations with VHDL code simulations. Performance results have been checked at each step.
After an extensive algorithmic study (details of which can be found in the companion paper [1]), a limited set of MIMO algorithms was selected for further implementation, based on the following criteria: 1) best complexity vs. performance trade-off, assessed by high-level simulations, 2) acceptable level of complexity, given the available computational resources, 3) embedded reconfigurability with a single-antenna fallback mode. In view of the above, the selected solution for demonstration embeds up to N T = 2 Tx and N R = 2 Rx antennas. The modulation is based on the Alamouti Space-Time block code [2] for the 2 Tx modes, and on per-subcarrier Maximum Ratio Combining (MRC) for the 2 Rx modes, since this combination has very low complexity and still offers excellent performance on a realistic channel generated by the ray-tracing simulator of [3], [4]. The demonstrated system also embeds an OFDM modulator, with modulation schemes ranging from BPSK to 16QAM, and forward error correction mechanisms based on the HiperLAN/2 (or equivalently, IEEE802.11a) specifications, as well as synchronization and frequency-domain least-squares channel estimation procedures [5]. This results in a point-to-point setup, with embedded performance evaluation mechanisms (such as Bit Error Rate computation), usable in different locations and scenarios.
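The combination of the Alamouti code at the transmitter and MRC at the receiver can be illustrated on a single subcarrier. The following is a minimal sketch only, not the testbed implementation: it assumes a noiseless channel that is perfectly known and constant over the two symbol slots, unnormalized QPSK symbols, and hypothetical function names chosen for this illustration.

```python
import random


def alamouti_encode(s1, s2):
    """Return what (Tx antenna 1, Tx antenna 2) send in two consecutive slots."""
    slot1 = (s1, s2)
    slot2 = (-s2.conjugate(), s1.conjugate())
    return slot1, slot2


def alamouti_combine(r1, r2, h1, h2):
    """Combine the two slots received on ONE Rx antenna.

    h1, h2 are the channel gains from Tx antenna 1 and 2 to this Rx antenna.
    Returns unnormalized soft symbols and the channel gain |h1|^2 + |h2|^2.
    """
    s1_soft = h1.conjugate() * r1 + h2 * r2.conjugate()
    s2_soft = h2.conjugate() * r1 - h1 * r2.conjugate()
    gain = abs(h1) ** 2 + abs(h2) ** 2
    return s1_soft, s2_soft, gain


def decode_with_mrc(received, channels):
    """MRC across Rx antennas: sum the per-antenna combiner outputs and gains."""
    acc1, acc2, total_gain = 0j, 0j, 0.0
    for (r1, r2), (h1, h2) in zip(received, channels):
        s1_soft, s2_soft, gain = alamouti_combine(r1, r2, h1, h2)
        acc1 += s1_soft
        acc2 += s2_soft
        total_gain += gain
    return acc1 / total_gain, acc2 / total_gain


def simulate_noiseless_2x2(seed=0):
    """One subcarrier, 2 Tx and 2 Rx antennas, random channel, no noise."""
    rng = random.Random(seed)
    qpsk = [complex(a, b) for a in (-1, 1) for b in (-1, 1)]
    s1, s2 = rng.choice(qpsk), rng.choice(qpsk)
    # channels[j] = (gain from Tx 1, gain from Tx 2) into Rx antenna j
    channels = [(complex(rng.gauss(0, 1), rng.gauss(0, 1)),
                 complex(rng.gauss(0, 1), rng.gauss(0, 1))) for _ in range(2)]
    (a1, a2), (b1, b2) = alamouti_encode(s1, s2)
    received = [(h1 * a1 + h2 * a2, h1 * b1 + h2 * b2) for h1, h2 in channels]
    return (s1, s2), decode_with_mrc(received, channels)
```

With a single entry in `received` and `channels`, the same decoder covers the 2×1 mode; in an OFDM receiver the combining would be repeated independently on every data subcarrier.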
II. HARDWARE CHARACTERISTICS

A. Baseband platform
The algorithms have been ported to the Advanced Research Technologies (ART) platform. The hardware used in this demonstration platform consists of a radio module and a baseband processing card. The baseband processing card consists of a general-purpose microprocessor for the non-time-critical MAC code, a large FPGA (a 1.5M-gate ALTERA EP20K1500EBC652-1X) for the time-critical Data Link Control (DLC) and physical (PHY) layer functions, and the analog components needed to interface to the RF subsystems. Explicitly, there are two digital-to-analog converters (DACs) and two analog-to-digital converters (ADCs) on each baseband card. The FPGA transmitter and receiver designs include the automatic gain control (AGC), the synchronization, the OFDM engine and the digital I/Q modulator/demodulator. The base-band processor is connected to the radio front end via two 12-bit AD/DA converters (respectively Analog Devices AD9432BST-105 and AD9752ARU), sampled at 80MHz. The transmitter and receiver designs are based on proprietary blocks written in VHDL. The development was performed in three main steps: algorithm-level simulation (Matlab and C++), VHDL simulation, and FPGA implementation. The validation of the testbed relies on cross-verification: simulation and implementation environments are compared via common scenarios and bit-accurate test vectors. The base-band implementation of the receiver is split into two designs: one is dedicated to the testbed signal acquisition, while the other helps to validate the digital loop-back process of the cyclic prefix OFDM (CP-OFDM) modulator. In addition, the transmitter uses a frame scenario player relying on an instruction-pipelined sequencer. The system can handle BPSK, QPSK and 16-QAM constellations. The PCI base-band board is controlled by a C application under MS-Windows, which allows any base-band parameter and frame scenario to be set.
On the receiver side, the signal acquisition and analysis are handled by an additional Matlab environment including the OFDM demodulator chain for both the CP-OFDM and PRP-OFDM modes. All commands and transfers initiated by the host PC are relayed by the PowerPC available on the board; we chose to run this CPU under the VxWorks real-time operating system (RTOS). The signal acquisition modules handled by the PowerPC can capture approximately six OFDM frames of 256 data symbols each.
B. RF front-ends
The architecture of the receiver RF front-end is based on classical double-IF (Intermediate Frequency) heterodyne conversion. It performs the frequency down-conversion only, since the demodulation is performed digitally in baseband (digital I&Q separation) on a signal sampled at 80MHz, with a lower IF at 20MHz. At the transmitter, the frequency up-conversion is achieved with the same principle. The obvious advantage of this scheme is that only a single baseband processing card is needed for the 2 radios. A digital anti-aliasing filter is needed in the receiver to prevent the real and imaginary parts of the signal from being mixed; this filter can be combined with an adjacent channel rejection filter. Although analog I&Q would allow a lower sampling clock to be used, and would therefore reduce clock jitter and non-linearities due to skew, it requires compensation of the amplitude and phase imbalances introduced by the separate processing of the analog I and Q channels. Furthermore, digital I&Q involves only one DAC and one ADC per radio channel, versus two in the analog version. Two local oscillators are used for the frequency conversions (the same for up and down conversions) and are based on programmable PLL synthesizers which enable channel selection over the HiperLAN/2 band. The demonstrator uses the lower HiperLAN/2 band (5.180 − 5.320 GHz), in which the EIRP is limited to 23 dBm. The maximum received power to be supported according to the specifications is −30 dBm (class 2 receiver). The measured performance of the receiver fulfills the sensitivity and SNR specifications for the various data rates (from 6 Mbit/s to 54 Mbit/s). The demonstrator, pictured in Figure 2(a), is built upon purchased components and circuits that were mounted on dielectric substrates. Some circuits were specifically designed for the RF architecture, such as the local oscillators, the Image Rejection filter, and the Automatic Gain Control (AGC) amplifiers at the first and second IF.
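One reason an IF equal to a quarter of the sampling rate is convenient is that the digital carrier samples reduce to +1, 0 and −1, so the I/Q mixers need no multipliers. The sketch below illustrates this property for the 80MHz / 20MHz values quoted above. It is an illustration under stated assumptions, not the FPGA design: the function names are hypothetical, and the two-tap sum is only a crude stand-in for the anti-aliasing filter, exact here only because the test signal is constant.

```python
# Carrier at fs/4: cos(pi*n/2) and sin(pi*n/2) cycle through these values.
COS_FS4 = (1, 0, -1, 0)
SIN_FS4 = (0, 1, 0, -1)


def iq_modulate(baseband):
    """Real IF signal Re{x[n] * exp(j*pi*n/2)} = I*cos - Q*sin."""
    return [x.real * COS_FS4[n % 4] - x.imag * SIN_FS4[n % 4]
            for n, x in enumerate(baseband)]


def iq_mix_down(if_samples):
    """Multiply by exp(-j*pi*n/2); the result still contains an image at fs/2."""
    return [complex(r * COS_FS4[n % 4], -r * SIN_FS4[n % 4])
            for n, r in enumerate(if_samples)]


def two_tap_lowpass(mixed):
    """Sum adjacent samples: cancels the (-1)^n image and restores unit gain.

    Assumption: a real receiver would use a longer filter; this two-tap sum
    is only adequate for a slowly varying baseband signal.
    """
    return [mixed[n] + mixed[n + 1] for n in range(len(mixed) - 1)]


def loopback(baseband):
    """Digital loop-back: modulate to IF, mix down, filter."""
    return two_tap_lowpass(iq_mix_down(iq_modulate(baseband)))
```

For a constant input such as `[complex(0.5, -0.25)] * 8`, `loopback` returns the same value at every output sample, which mirrors the kind of base-band loop-back check that needs no RF front-end.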
Compliant with the IEEE802.11a/HiperLAN/2 definition of the PHY layer, the demonstrator architecture of the 5GHz part uses an intermediate fixed frequency at 930MHz and a voltage controlled oscillator (VCO) at 4.2GHz to address several 20MHz channels between 5.15GHz and 5.35GHz. At the receiver, the chain embeds an AGC with a dynamic range of 80dB that pushes the overall sensitivity below -85dBm. At the transmitter, the output power is controlled by an attenuator with 15dB of dynamic range followed by two power amplifiers in order to maintain the output power between -13dBm and +10dBm.
In order to reduce cost, size and volume, the integration of the RF front-end close to the antenna is essential. An RF front-end, pictured in Figure 2(b), was designed specifically for this project. Almost all of the RF and IF functions of a single front-end were implemented on one side of a 10cm×15cm double-sided PCB, with the DC power supply occupying the other side, together with the synthesizers and the second IF AGC amplifier. To achieve this, some circuits were redesigned: the Image Rejection filter was designed on a separate high-permittivity dielectric substrate and then mounted on the PCB, while the transmitter power amplifier, the receiver low-noise amplifier and the switch were mounted as separately packaged integrated circuits. The sheet metal antenna is placed on the side opposite the RF circuits.
C. Antenna characteristics
The antennas selected for the demonstration platform address different sets of requirements, and thus have different characteristics at the Mobile Terminal (MT) and the Access Point (AP). At the MT, we selected low-cost sheet metal patch antennas to be integrated into the lid of a laptop, on the corners, so as to get maximum space diversity (see Figure 3(a)). They are also oriented so as to get polarization diversity. The radiation pattern is omnidirectional, and such antennas present a good radiation efficiency, but a poor isolation in cross-polarization configuration. It was verified that in all cases the coupling between elements is below -30dB. At the AP, we again selected a low-cost sheet metal antenna as the baseline single element, having a radiation pattern with a 110° 3dB aperture in the vertical plane and 90° in the horizontal plane, and the 2 antennas used also present polarization diversity with 15dB decoupling. The antenna gain here is 6dBi. In addition, we considered a second possibility, which consists in using the baseline elements described above within an antenna array, having this time a radiation pattern of (110°, 30°) and a 9dBi gain. Again, it is possible to use polarization diversity in addition to space diversity, and the antennas are mounted on a board, so that they can be tilted (see Figure 3(b)).
III. DEVELOPMENT METHODOLOGY
The VHDL design is validated through the following steps: 1) the VHDL blocks (e.g. (I)FFT, Viterbi decoder, synchronization, etc.) are tested using software simulation, and all output vectors are compared to bit-accurate test vectors generated in C++ and/or Matlab; 2) the blocks are synthesized and their behavior is checked within the FPGA on the platform. During this process, the FPGA outputs must correspond exactly to the results of the bit-accurate simulations. This approach is valid for the Tx as well as for the Rx: concerning the Tx, the signals at the output of the I/Q modulation (which feed the D/A conversion) are read and validated; 3) these reference signals are used to test the Rx modules. They are stored in an internal memory of the platform and processed by the Rx in place of real received data. This way, it is ensured that no interference due to D/A or A/D conversion is present. Processing results of the Rx base-band blocks are read and compared to the results of bit-accurate simulations.
The base-band loop-back makes it possible to check the interoperation of the Tx and Rx without requiring any RF front-end. Nevertheless, a simple connection of the Tx outputs to the Rx inputs is insufficient, since neither of the analog front-ends deals with I/Q (de)modulation. Thus, digital I/Q (de)modulation at an IF of 20MHz is implemented on the base-band platform using an 80MHz sampling clock. Hence, the real-valued signal after the D/A conversion is centered around 20MHz with a 20MHz bandwidth.
An extensive evaluation of the precision of the different variables involved at the receiver side was performed using a fixed-point C++ simulator. First, each variable is isolated by assuming that all the others have full precision. The number of bits used for its fixed-point representation is then adjusted so as to avoid error propagation and keep the performance degradation reasonable.
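The wordlength study described above can be sketched for a single variable as follows. This is a simplified illustration, not the C++ simulator used for the testbed: it assumes a signed fixed-point format with round-half-up and saturation, two integer bits, and a signal-to-quantization-noise target as the acceptance criterion (the paper's criterion is the end-to-end performance degradation). All names and the 40 dB default are hypothetical.

```python
import math


def quantize(value, total_bits, frac_bits):
    """Signed fixed-point quantization with round-half-up and saturation."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    code = math.floor(value * scale + 0.5)
    code = max(lowest, min(highest, code))
    return code / scale


def sqnr_db(samples, total_bits, frac_bits):
    """Signal-to-quantization-noise ratio of one quantized variable, in dB."""
    signal = sum(s * s for s in samples)
    noise = sum((s - quantize(s, total_bits, frac_bits)) ** 2 for s in samples)
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)


def smallest_wordlength(samples, target_db=40.0, max_bits=24):
    """Smallest total wordlength (2 integer bits assumed) meeting the target."""
    for total_bits in range(4, max_bits + 1):
        if sqnr_db(samples, total_bits, total_bits - 2) >= target_db:
            return total_bits
    return None
```

Repeating this search for one variable at a time, with every other variable kept in floating point, gives a per-variable starting point that can then be refined against the measured BER degradation.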
IV. OVERALL TESTBED PERFORMANCE: MEASUREMENTS IN THE WIRELESS SET-UP
The wireless testbed was operated in our lab, within a demonstration scenario aimed at emphasizing the performance improvement brought by multiple antennas in the WLAN context. The exact configuration of the demonstration scenario is depicted in Figure 4, with the AP on the left and the MT on the right. Given the presence of lab furniture, the propagation channel was expected to contain multipath. Realizations of the MIMO propagation channel are plotted in Figure 5, both in the frequency domain and in the time domain. The two curves are the first channel response estimated during a given data acquisition period and the last channel response of the same acquisition period, which leads to the computation of a given Bit-Error Rate (BER) value. Figure 5 highlights several important characteristics of the propagation at hand in our demonstration context:
• During the data acquisition phase leading to the computation of a given BER value, the channel is static;
• The MIMO propagation channel encountered is frequency selective, with attenuations of up to 15dB from one carrier to another (power measurements);
• The multipath does not create any additional inter-block interference, since it is completely absorbed by the 16-sample (800ns) guard interval; however, this guard period is not over-dimensioned for our scenario;
• The 4 sub-channels display very different characteristics.
The BER was measured with respect to the transmitted power for all supported antenna configurations, and the benefits of the proposed MIMO extensions appear clearly: a gain of up to 14.5 dB is measured for the 2Tx, 2Rx configuration with respect to the reference single-antenna setup. These results have been measured for a 16QAM constellation and coding rate 1/2, which corresponds to a data rate of 24Mbps, and they are presented in Figure 6. Some comments on these results are in order. First of all, the performance improvement when going from the 2×1 (MISO) configuration to the 1×2 (SIMO) configuration is 3dB, as expected. Second, a large reduction in the required Tx power can be expected at a given BER level if a large difference in slope is observed. However, the slopes in the bottom part of Figure 6 appear to be quite similar for the different cases. This impression may arise because the implemented receiver only starts to work well at lower BER, an effect that would raise the curves at higher BER levels and create a steeper slope at lower BER levels. Also, as mentioned earlier, each point on these curves has been obtained with a different channel realization, which probably did not change much over the 20 frames of data gathered to produce each point. This lack of averaging leads to some variations in the curves.
It should be emphasized that the results obtained depend on the positioning of the antennas of the MT and AP. The results reported in Figure 6 correspond to a random positioning, which has not at all been chosen so as to lead to maximum gain. Several other positions have been tried, with less extensive measurement results, and in one other position a gain of 9dB was observed.
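The 24Mbps data rate and the 800ns guard interval quoted in this section follow from the OFDM symbol timing. The short calculation below reproduces them, assuming the standard IEEE802.11a/HiperLAN/2 parameters (20MHz baseband sampling rate, 64-point FFT, 48 data subcarriers), which are not restated in this paper; the function names are illustrative only.

```python
from fractions import Fraction

# Assumed standard 802.11a / HiperLAN/2 OFDM parameters.
SAMPLE_RATE_HZ = 20_000_000
FFT_SIZE = 64
GUARD_SAMPLES = 16
DATA_SUBCARRIERS = 48


def guard_interval_ns():
    """Guard interval duration in nanoseconds."""
    return Fraction(GUARD_SAMPLES * 1_000_000_000, SAMPLE_RATE_HZ)


def symbol_duration_us():
    """Duration of one OFDM symbol (useful part plus guard) in microseconds."""
    return Fraction((FFT_SIZE + GUARD_SAMPLES) * 1_000_000, SAMPLE_RATE_HZ)


def data_rate_mbps(bits_per_subcarrier, code_rate):
    """Information bits per OFDM symbol divided by the symbol duration."""
    info_bits = DATA_SUBCARRIERS * bits_per_subcarrier * Fraction(code_rate)
    return info_bits / symbol_duration_us()


# 16QAM carries 4 bits per subcarrier; with rate-1/2 coding:
rate_16qam_half = data_rate_mbps(4, Fraction(1, 2))
```

Under these assumptions, each OFDM symbol carries 96 information bits in 4 µs for 16QAM with rate-1/2 coding, and the same formula with BPSK at rate 1/2 gives the 6 Mbit/s lower bound mentioned for the receiver specifications.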
V. CONCLUSIONS
The measured results obtained with the 2 × 2 testbed confirmed the algorithmic choices made on the basis of analysis and simulations. In particular, the simple Alamouti code with standard channel coding and simple channel estimation provides impressive improvements for the 2 × 2 configuration over the 1 × 1 configuration, both in terms of robustness to the placement of the receive and transmit antennas and in terms of overall BER. This improved performance can be exploited to increase bitrate and/or decrease Tx power and/or increase range. Also, patch antennas were found to constitute a satisfactory low-cost solution, allowing for the integration of active RF components close to the antennas at the MT. The developed testbed demonstrates that interesting MIMO performance can be obtained with readily available hardware components, reasonable wordlengths and reasonable overall complexity. Experience with the testbed shows that the choice of antenna array (pattern and e.g. polarization diversity) is a critical issue and that the algorithms may be usefully adapted to the considered environment.
