Abstract-This paper describes the implementation of a digital compensation scheme, called CSAD, for correcting the effects of wideband gain and phase imbalances in dual-branch OFDM receivers. The proposed scheme is implemented on a Xilinx Virtex-4 field programmable gate array (FPGA). The flexible architecture of the implementation makes it readily adaptable to different broadband applications, such as DVB-T/H, WLAN, and WiMAX. The proposed correction scheme is resilient against multipath fading and frequency offset. When applied to DVB-T, it is shown that 11-bit arithmetic precision is sufficient to achieve the required BER of 2×10⁻⁴ at an SNR of 16.5 dB. Using this bit precision, the implementation consumes 1686 Virtex-4 slices, equivalent to about 42600 gates.
I. INTRODUCTION
A dual-branch direct-conversion receiver architecture is attractive as it supports a very high level of integration. With this architecture, radio frequency signals are downconverted to in-phase (I) and quadrature (Q) signal components in a single step. However, for correct operation, the I and Q signal branches must maintain equal gain and quadrature phase.
In wideband applications, precise matching of gain and phase is very difficult, if not impossible, to achieve over the entire signal bandwidth. This gives rise to frequency-dependent gain and phase errors, known as wideband IQ imbalances. Moreover, direct-conversion receivers require very large baseband amplification, often in excess of 60 dB, before signal digitization and demodulation. This large gain requirement makes accurate matching of gain and phase even more difficult to achieve. In practice, a well-designed wideband dual-branch receiver is likely to encounter average frequency-dependent gain and phase imbalances on the order of 0.5 to 1 dB and ±3° to ±5°, respectively [1]. When such a receiver is used to demodulate a high-order modulation scheme, such as 64-QAM, the resulting bit error rate will be degraded. For example, the presence of 1 dB gain and 5° phase deviations will give rise to about 2.5 dB degradation in SNR when 64-QAM is adopted in conjunction with an error correction code of rate 2/3 [2]. This degradation in SNR increases to over 10 dB when a code rate of 7/8 is used [2].

Rajitha B. Palipana is with the Department of Electrical and Computer Engineering, Curtin University of Technology, Perth, Australia (e-mail: r.palipana@curtin.edu.au).
Kah-Seng Chung is with the Department of Electrical and Computer Engineering, Curtin University of Technology, Perth, Australia (e-mail: k.chung@curtin.edu.au).
This problem of gain and phase imbalances needs to be overcome in order for a dual-branch receiver to operate successfully with high-order modulation schemes. A number of IQ compensation techniques have been published in the literature, but most of them target a single particular application [3, 4]. It is envisaged that future applications will call for a single receiving device to be able to operate with different modulation schemes, preferably with only software modifications. This suggests that any proposed architecture for the mismatch compensation scheme must be sufficiently flexible to allow such changes to be made through minor modifications in software. Furthermore, such a compensation scheme has to be robust. One such compensation scheme, called complex symmetric adaptive de-correlation (CSAD), has been proposed and analyzed in [5]. This paper focuses on the hardware implementation of this gain and phase imbalance compensation scheme.

This paper is organized as follows. Section II provides an introduction to the effects of receiver impairments in a direct-conversion receiver operating with an orthogonal frequency division multiplexing (OFDM) signal. This is followed by a methodical approach for correcting such adverse effects. Section III describes the hardware implementation of the CSAD scheme. Finally, results obtained through hardware-in-the-loop testing are presented in Section IV.
II. CORRECTION OF FRONT-END IMPAIRMENTS
For a direct-conversion receiver, the discrete-time representation of the received baseband OFDM signal in the presence of multipath, AWGN, frequency offset, and gain and phase imbalances can be expressed as in [5].

[...] are indeed uncorrelated even in the presence of frequency offset and multipath fading. According to [5], the CSAD filter algorithm operates as follows:
I. Obtain the signal estimates.
II. Update the weights.
Note that boldface letters denote vectors of length $L$, where $L$ is the length of the de-correlating adaptive filters.
Now, assume that the frequency offset $\Delta f$ is estimated accurately, its effect on [...]. The resulting signal becomes [...] h n is small and $N$ is large. After this operation, the signal $d_{1f}[n]$ is given by [...]
In the case that the length of the impulse response, h n h n ⊗, is less than the length of the cyclic prefix of the OFDM signal, an estimate of the original signal $c[n]$ can be recovered using a pilot-based LMS equalizer [5].

III. HARDWARE IMPLEMENTATION

Figure 2 shows the various functional blocks for the FPGA implementation of the CSAD filter. The inputs to the CSAD filter, denoted by 'in-phase signal in' and 'quadrature signal in', are the real and imaginary signal components of $r_1[n]$, which is the received signal that has been corrupted by a combination of gain and phase imbalances, frequency offset, AWGN and multipath fading. As given by (2.a), the desired signal [...] real multiplications. In order to reduce the actual number of multipliers needed, the same multipliers are reused. Suppose that the sampling period is $T_s$, and $4L$ real multiplications are required to be performed within this period. By upsampling the signals by a factor of $L$, each multiplication can be carried out within a time period of $T_s/L$. Instead of $4L$ multipliers, this allows the use of only four multipliers, one each for calculating the real-real, imaginary-imaginary, real-imaginary, and imaginary-real products involved in the multiplication of two complex quantities. Each of the four 'convolution sum' blocks, shown in Fig. 2, contains an addressable shift register, a multiplexer, a real multiplier and an accumulator for calculating the convolution sum.

Next, consider the weight update operation, as given in (2.c). To obtain the updated weights at time index $n+1$, it requires knowledge of [...]. In view of this observation, the computation of $\mathbf{w}_1[n+1]$ involving the evaluation of [...]

Normally, gain and phase imbalances in a receiver are relatively time invariant. However, being an adaptive system, the weights of the CSAD filter will continue to be influenced by the instantaneous values of the received signal. When a large step size is used, the adaptive weights tend to vary somewhat about the average convergent value.
This variation in weights can degrade the BER performance and make signal equalization difficult. A flywheel, or averaging circuit, has therefore been implemented to smooth out the instantaneous variations in the adaptive weights. The resulting smoothed weights are used to correct the signal, while the weight updates continue to operate using the instantaneous values. A multiplexer is used here to switch between the instantaneous and averaged weight values.
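The flywheel described above can be illustrated with a minimal Python sketch. The paper does not specify how the averaging is performed, so the first-order recursive average, the smoothing factor `alpha`, and the names `WeightFlywheel`, `update` and `select` are all assumptions made here for illustration only.

```python
import random


class WeightFlywheel:
    """Smooths complex adaptive weights and selects which set is output.

    Assumption: a first-order recursive (exponential) average is used;
    the actual averaging circuit is not specified in the paper.
    """

    def __init__(self, num_weights, alpha=1 / 64):
        self.alpha = alpha                    # hypothetical smoothing factor
        self.averaged = [0j] * num_weights    # smoothed weight values
        self._primed = False                  # True once the first update is seen

    def update(self, weights):
        """Fold the instantaneous weights into the running average."""
        if not self._primed:
            self.averaged = list(weights)
            self._primed = True
        else:
            self.averaged = [a + self.alpha * (w - a)
                             for a, w in zip(self.averaged, weights)]
        return self.averaged

    def select(self, weights, use_average):
        """Model of the multiplexer: averaged or instantaneous weights."""
        return self.averaged if use_average else list(weights)


# Illustrative run: weights that jitter about a fixed (made-up) value.
rng = random.Random(0)
nominal = [0.10 + 0.05j, -0.02 + 0.01j, 0.005j]
flywheel = WeightFlywheel(num_weights=len(nominal))
for _ in range(2000):
    instantaneous = [w + complex(rng.uniform(-0.02, 0.02),
                                 rng.uniform(-0.02, 0.02)) for w in nominal]
    flywheel.update(instantaneous)
print(flywheel.select(instantaneous, use_average=True))
```

In this sketch the adaptive update would keep using the instantaneous weights, while the signal correction path reads the output of `select`, mirroring the split described in the text.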
The operation of the CSAD filter does not require synchronization of the OFDM signal to be achieved. Also, its insensitivity to the presence of frequency offset and multi-path fading suggests that the CSAD filter can operate directly on the digitised outputs of the down-converted in-phase and quadrature components of the received signal.
Often, the accuracy of a frequency offset estimator is affected by the presence of gain and phase imbalances, especially when a short cyclic prefix is used to estimate the frequency offset. In this situation, it is beneficial for the gain and phase imbalances to be first compensated using the CSAD filter prior to estimation of any residual frequency offset [5] .
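As an illustration of the multiplier sharing described earlier in this section, the following Python sketch models one complex convolution sum computed with only four real multiply-accumulate units, each used once per fast clock cycle of duration $T_s/L$. It is a behavioural sketch under stated assumptions, not the authors' hardware description: the function name `shared_multiplier_convolution` is hypothetical, and the sum is taken as a plain sum of products without any conjugation or tap reordering.

```python
def shared_multiplier_convolution(weights, samples):
    """Complex sum of products using four real multiply-accumulate units.

    weights, samples: equal-length lists of L complex numbers, where
    samples[k] is the shift-register entry paired with weights[k].
    Each loop iteration models one fast clock cycle (T_s / L), in which
    each of the four real multipliers is used exactly once.
    """
    if len(weights) != len(samples):
        raise ValueError("weights and samples must have the same length")

    acc_rr = acc_ii = acc_ri = acc_ir = 0.0   # the four accumulators
    for k in range(len(weights)):             # k acts as the multiplexer address
        w, x = weights[k], samples[k]
        acc_rr += w.real * x.real             # real-real product
        acc_ii += w.imag * x.imag             # imaginary-imaginary product
        acc_ri += w.real * x.imag             # real-imaginary product
        acc_ir += w.imag * x.real             # imaginary-real product

    # Combine the accumulators into the complex result after L cycles.
    return complex(acc_rr - acc_ii, acc_ri + acc_ir)


# Illustrative values only, with L = 3 taps.
w = [0.10 + 0.05j, -0.02 + 0.01j, 0.005j]
x = [1.0 - 0.5j, 0.3 + 0.8j, -0.7 + 0.2j]
print(shared_multiplier_convolution(w, x))
```

With $L$ taps, the loop performs $4L$ real multiplications in total, which is the count quoted in the text, while only four multipliers exist at any one time.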
IV. RESULTS
The setup used for the hardware-in-the-loop testing is shown in Fig. 3 . The test signal used is DVB-T operating in the 2k mode with non-hierarchical 64-QAM modulation and a convolutional code rate of 2/3 [6] . Pilot insertion, inverse Fourier transform and cyclic prefix attachment are carried out in the OFDM TX block. Baseband in-phase and quadrature signal components, corrupted by IQ imbalances and frequency offset, are saved in two data files along with the timing information. These files are accessed by the FPGA through a JTAG interface.
The CSAD filter is implemented on a Xilinx ML401 development board [7] using Xilinx System Generator v8.1 and Xilinx ISE foundation edition 8.1. After the in-phase and quadrature components have been processed by the CSAD filter, the resultant signals are output via the JTAG interface and stored in another data file on a PC. This data file is then imported to Matlab where frequency offset correction (FOC), cyclic prefix removal, FFT and channel equalization (CEQ) are carried out. The correction of the frequency offset is performed by multiplying the CSAD filtered signal by $e^{j 2\pi \Delta f n / (N f_s)}$. The residual scaling introduced by the CSAD filter and the multipath channel is then corrected by a frequency domain LMS filter [5].

The computational complexity of the CSAD filter is proportional to the number of adaptive weights, $L$. According to [2], the average BER performance is relatively insensitive to $L$ once it is chosen to be larger than a certain minimum value. For practical values of gain and phase imbalances, $L = 3$ is adequate. For the test, the gain and phase imbalances are allowed to vary by ±1.5 dB and ±10°, respectively, over the DVB-T signal bandwidth of 7.61 MHz. These variations are produced by combining the constant gain and phase imbalances of 1 dB and 5° together with the frequency-dependent values derived from 1 2 ( ) 0.01 0.01
In the implementation, the performance of the CSAD filter as a function of bit precision is studied using the average BER as the performance metric. In this case, all the mathematical operations required by the CSAD filter are carried out with the same bit precision. Fig. 4 shows the average bit error rates achieved with different values of bit precision, ranging from 8 bits to 16 bits, in the presence of AWGN. For quasi-error-free reception in DVB-T, it is specified that the bit error rate after Viterbi decoding has to be less than 2×10⁻⁴ at an SNR of 16.5 dB (Table A.1 in [6]). This level of performance could just be achieved using a wordlength of 11 bits.
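The effect of a finite wordlength can be explored in software with a simple quantizer, sketched below in Python. The paper states only that all operations share the same bit precision; the signed two's-complement format, the split between integer and fractional bits (`frac_bits`), the round-to-nearest rule and the saturation on overflow are assumptions made here, and `quantize` is a hypothetical helper name.

```python
import math


def quantize(value, word_length, frac_bits):
    """Quantize a real value to signed fixed point (assumed format).

    word_length: total number of bits, including the sign bit.
    frac_bits:   number of fractional bits (an assumption; the paper
                 does not state the binary point position).
    Rounds to the nearest code and saturates on overflow.
    """
    scale = 1 << frac_bits
    lowest = -(1 << (word_length - 1))        # most negative integer code
    highest = (1 << (word_length - 1)) - 1    # most positive integer code
    code = math.floor(value * scale + 0.5)    # round to nearest
    code = max(lowest, min(highest, code))    # saturate
    return code / scale


def quantize_complex(z, word_length, frac_bits):
    """Apply the same quantizer to the real and imaginary parts."""
    return complex(quantize(z.real, word_length, frac_bits),
                   quantize(z.imag, word_length, frac_bits))


# Quantization error of one illustrative value for several wordlengths,
# keeping one sign bit and placing all remaining bits after the binary point.
for bits in (8, 11, 16):
    q = quantize(0.3, bits, bits - 1)
    print(bits, q, abs(q - 0.3))
```

Wrapping each arithmetic result of a floating-point CSAD model in such a quantizer is one way to reproduce a BER-versus-wordlength study in simulation before committing to hardware.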
Next, the performance of the CSAD filter operating in the Ricean channel as given in the DVB-T standard [6] is investigated in the presence of a frequency offset close to half the sub-carrier spacing, i.e., 2000 Hz. The BER curve obtained after correction of the gain and phase imbalances and of the frequency offset is shown in Fig. 5. When compared with the BER curve obtained assuming no circuit impairments, the degradation in SNR is less than 0.3 dB at the required BER of 2×10⁻⁴. For this example, the CSAD filter is implemented with 11-bit precision.
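The frequency offset correction applied after the CSAD filter amounts to a sample-by-sample multiplication by a complex exponential, which the following Python sketch illustrates. The sign convention (the offset is modelled as a positive rotation and removed by the opposite rotation), the function names, and the sample rate of 64/7 MHz for an 8 MHz DVB-T channel are assumptions of this sketch and are not stated in the paper.

```python
import cmath


def apply_frequency_offset(samples, delta_f, sample_rate):
    """Rotate sample n by exp(j * 2 * pi * delta_f * n / sample_rate)."""
    return [x * cmath.exp(2j * cmath.pi * delta_f * n / sample_rate)
            for n, x in enumerate(samples)]


def correct_frequency_offset(samples, delta_f_estimate, sample_rate):
    """Undo an offset by applying the opposite rotation (assumed convention)."""
    return apply_frequency_offset(samples, -delta_f_estimate, sample_rate)


# Illustrative round trip with a 2000 Hz offset.
SAMPLE_RATE = 64e6 / 7            # assumed DVB-T rate for an 8 MHz channel
clean = [complex(1, 0), complex(0, 1), complex(-1, 0), complex(0, -1)] * 4
received = apply_frequency_offset(clean, 2000.0, SAMPLE_RATE)
restored = correct_frequency_offset(received, 2000.0, SAMPLE_RATE)
print(max(abs(a - b) for a, b in zip(clean, restored)))
```

In practice the estimate of the offset is imperfect, so a residual rotation remains and is left to the subsequent channel equalizer.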
To provide an indication of the hardware complexity of the CSAD filter, the Xilinx tool called XFLOW is used to obtain the gate count for implementing a given bit precision. Table 2 shows the number of lookup tables (LUTs) and flip-flops used by the design for four different precision levels, ranging from 11 bits to 16 bits. It is shown that a 12-bit implementation consumes only about 10% of the gate resources offered by the Virtex-4 FPGA, which has a total of 21504 LUTs and 21504 flip-flops.
V. CONCLUSIONS
This paper describes an FPGA implementation of a digital IQ imbalance compensation scheme based on a CSAD filter. It is shown that the required BER of less than 2×10⁻⁴ specified for DVB-T reception can be met when the CSAD filter is implemented with a bit precision of at least 11 bits. This is achieved in the presence of severe gain and phase imbalances of ±1.5 dB and ±10°, respectively, varying over a bandwidth of 7.61 MHz. Furthermore, the operation of the CSAD filter does not depend on the synchronization of the OFDM signal being achieved. In fact, the CSAD filter can operate on any Gaussian-like signals, including those OFDM signals associated with WLAN and WiMAX.
