



Sun, Y., Nix, A. R., Milford, D. J., & Bull, D. R. (2000). Low complexity synchronisation, equalisation and diversity combining for home-based Hiperlan/1 transceivers. In Proceedings of the 51st IEEE Vehicular Technology Conference, Tokyo, Japan. (Vol. 3, pp. 2242 - 2246). Institute of Electrical and Electronics Engineers (IEEE). 10.1109/VETECS.2000.851671

Link to published version (if available): 10.1109/VETECS.2000.851671

Link to publication record in Explore Bristol Research PDF-document

# University of Bristol - Explore Bristol Research

## General rights

This document is made available in accordance with publisher policies. Please cite only the published version using the reference above. Full terms of use are available: http://www.bristol.ac.uk/pure/about/ebr-terms.html

## Take down policy

Explore Bristol Research is a digital archive and the intention is that deposited content should not be removed. However, if you believe that this version of the work breaches copyright law please contact open-access@bristol.ac.uk and include the following information in your message:

- Your contact details
- Bibliographic details for the item, including a URL
- An outline of the nature of the complaint

On receipt of your message the Open Access Team will immediately investigate your claim, make an initial judgement of the validity of the claim and, where appropriate, withdraw the item in question from public view.

## Low Complexity Synchronisation, Equalisation and Diversity Combining for Home-Based Hiperlan/1 Transceivers

Y. Sun\*, A. Nix, D. Milford and D. Bull

CCR, University of Bristol Room 4.09, Merchant Venturers Building Woodland Road, Bristol BS8 1UB, UK

Tel: +44 117 954 5169, E-mail: Andy.Nix@bristol.ac.uk

Abstract: This paper discusses the development of a Hiperlan/1 radio modem for the simultaneous transmission of video, voice and data in the home environment. In particular, the performance of the wireless modem has been optimised to support MPEG II video streams. The modem will eventually be integrated with digital set-top box equipment to support wireless Internet and digital television applications around the home.

This paper describes a number of enhanced architectures and algorithms that enable an efficient FPGA implementation of the baseband digital equaliser, synchroniser and diversity combiner. Results show that the entire baseband receiver can easily fit within a single Xilinx Virtex device and that error-free packet rates in excess of 99% are possible, even in worst case radio channels.

#### I. Introduction

The use of Hiperlan/1 technology for the wireless distribution of voice, video and data is an innovative concept offering mass-market applications. Integrating a high-speed wireless modem into a digital set-top box offers the possibility of high volume products. These large volumes can then be used to drive down the price of the radio technology and thus broaden its application. In this paper we promote the development of multi-media services by bringing into the home a flexible and powerful wireless data network. The ESPRIT WINHOME (Wireless INnovation in the HOME) project has constructed a comprehensive set of multi-media demonstrators based on the idea of wireless home distribution for video and Internet services using a 23.5 Mb/s serial modem.

Hiperlan/1 represents the best wireless LAN standard currently available for the home distribution of digital video. The standard enjoys a large European bandwidth allocation in a dedicated and unlicensed band (5.15-5.3 GHz). Hiperlan products are also feasible for operation in North America and Japan. The lack of Hiperlan/1 products demonstrates that commercial implementation is non-trivial. This is particularly true when it comes to an economic implementation of the high-speed modem. In particular, there is a strong need to develop a low-cost equaliser/synchroniser. A wide range of equaliser techniques was investigated in WINHOME. Using these studies, the consortium chose to develop an adaptive Decision Feedback Equaliser (DFE) to eliminate the harmful effects of Inter-Symbol Interference (ISI).

A standard DFE is a considerable implementation challenge at bit rates in excess of 20 Mb/s. Within WINHOME, a Delayed Least Mean Squares (DLMS) equaliser architecture was proposed together with a novel real-error driven adaption algorithm that offers low complexity and high performance [2]. Hardware implementation has been performed and complete integration is expected in the first quarter of 2000. A full VHDL description of the high-speed equaliser/synchroniser is presented in this paper for implementation in FPGA. A number of important design considerations and structural modifications are also proposed to ease implementation and ensure high quality performance. Issues such as time and frequency synchronisation, signal quantisation and pipeline architectures and algorithms are discussed. To verify our designs, the results of the final bit-level VHDL simulation have been compared with earlier theoretical and high-level simulation results. Timing simulations have also been performed based on the latest Xilinx VIRTEX process. Chip-level optimisation and complexity estimates are also studied and presented in this paper.

A number of further antenna enhancements have also been studied within WINHOME, in particular the feasibility of down-converting two RF signals from a pair of spaced antennas. Using optimal combining at baseband, the proposed diversity scheme has been shown to significantly improve modem performance.

#### **II.** General System Description

The WINHOME project aims to develop a fully functioning wireless LAN demonstrator for the home environment. The prototype is optimised for video transmission and makes full use of state-of-the-art research in physical layer equalisation and error resilient video transcoding. The block diagram for the hardware demonstrator is shown in figure 1.



Figure 1: Block diagram of demonstrator

The hardware can be broken down into three main sections: the network card (Thomson CSF), the baseband modem (University of Bristol) and the RF and antenna (Grundig). The first version of the main board is shown in figure 2. Since the instantaneous gross 'air rate' is approximately 24 Mb/s, a single general purpose Digital Signal Processor (DSP) is not fast enough to support the high-speed equalisation required. Furthermore, pipelining methods to enhance throughput and reduce clock rates are not feasible for DSP solutions.

<sup>\*</sup> Y.Sun is now with the Telecommunications Research Lab of Toshiba Research Europe Ltd, Bristol, UK.

<sup>0-7803-5718-3/00/\$10.00 ©2000</sup> IEEE.



Figure 2: Main board implementation

The realisation of the final WINHOME design was fully produced in VHDL for FPGA implementation. The design of the synchro-equaliser is shown in figure 3.



Figure 3: Block diagram of designed synchro-equaliser

The core of the design is based around the DLMS DFE. In general, the DFE has been implemented using a pipelined architecture that was made possible using the *delayed* LMS algorithm (see section V). Several improvements have been implemented in this equaliser including a pipelined architecture for the feedforward filter, a transposed transversal filter structure for the feedback filter, a real-error DLMS adaption algorithm and a double frequency complex multiplier structure.

Synchronisation is a very important aspect of the modem and must be kept as simple and robust as possible. The synchronisation process is based on the complex output from a sequence correlation circuit. The method proposed in [2] works well in theory; in practice, however, it is complex to implement and suffers from phase ambiguity. To overcome these limitations an enhanced double frequency fully complex correlation method has been developed (see section III). To complement this approach, a simplified phase calculation architecture has been implemented for the purposes of frequency offset correction (see section IV).

### **III. Enhanced Synchronisation Architecture**

The implementation of time and frequency synchronisation is based on three main sections (complex correlation, magnitude calculation and peak detection) as shown in figure 4. For GMSK, a 31-symbol complex correlation can be implemented using two real filters [2].



Figure 4: Synchronisation implementation

However, problems with the real and imaginary timing elements make some aspects of the required bit toggling difficult to achieve without the introduction of phase ambiguity. In the final WINHOME correlator design an enhanced complex correlation solution was used based on double frequency multipliers as shown in figure 5.



Figure 5: Fully complex correlation design

As shown above, the received signals are sampled at the symbol clock rate (24 MHz). However, the toggling between the real and imaginary parts is performed at twice the symbol rate – i.e. a double frequency of 48 MHz. After the two real filters, four outputs are combined to realise the fully complex correlation. The real filter architecture is demonstrated in figure 6.



Figure 6: Real filter architecture for correlation circuit

The implementation of the real filter shown in figure 6 has been carefully designed to minimise the number of multipliers and thus reduce the resulting chip size. The multipliers used in figure 6 are real and operate at twice the bit clock frequency (less than 20 ns in our final implementation). Using the *shift* (5-bit) function, all adder outputs are truncated to the same 8-bit resolution. The adders also operate at double frequency to produce two alternative output streams. Since a conventional complex correlation process requires four real filters [3], the WINHOME architecture halves the implementation size.
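As an illustration of the four-real-filter baseline mentioned above, the following Python sketch forms one complex correlation output from four real dot products. It does not model the WINHOME double frequency circuit; the function names and the short 4-symbol reference sequence are hypothetical and chosen only for the example.

```python
import cmath

def real_correlate(samples, taps):
    """One output of a real correlation filter: a dot product over one window."""
    return sum(s * t for s, t in zip(samples, taps))

def complex_correlate(r_i, r_q, c_i, c_q):
    """Correlate r = r_i + j*r_q against the conjugate of c = c_i + j*c_q.

    Uses four real correlations, as in the conventional approach:
      real part = r_i.c_i + r_q.c_q
      imag part = r_q.c_i - r_i.c_q
    """
    ii = real_correlate(r_i, c_i)
    qq = real_correlate(r_q, c_q)
    qi = real_correlate(r_q, c_i)
    iq = real_correlate(r_i, c_q)
    return ii + qq, qi - iq

# Hypothetical reference sequence (not the Hiperlan/1 m-sequence)
reference = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]
# Received copy with an arbitrary channel phase of 0.3 rad
received = [s * cmath.exp(0.3j) for s in reference]

corr_re, corr_im = complex_correlate(
    [s.real for s in received], [s.imag for s in received],
    [s.real for s in reference], [s.imag for s in reference],
)
```

At the correlation peak the angle of the complex output equals the channel phase, which is the quantity used later for frequency offset correction.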

The magnitude of the complex correlation is conventionally computed as  $\sqrt{V^2+W^2}$ , where V and W represent I and Q samples. However, since this requires two real multiply operations and a square root, the magnitude-squared function is often preferred. Furthermore, the so-called four-region approximation may be sufficiently accurate for the simple purpose of synchroniser peak detection. This is defined as:

$$\text{mag} = \max\left(G,\ \tfrac{7}{8}G + \tfrac{1}{2}L\right) \tag{1}$$

where G and L denote the Greater and Lesser of $\{|V|, |W|\}$. A suitable architecture for implementing this magnitude calculation is given in figure 7. The implemented performance of the final synchronisation circuit is shown in figure 8 (right), where the left-hand figure shows the ideal theoretical result for the purposes of comparison.
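Equation (1) can be sketched in a few lines of Python. The decomposition of 7/8 G into G minus a 3-bit shift, and of 1/2 L into a 1-bit shift, is an assumption made here to mimic a multiplier-free datapath; it is not taken from figure 7, and the function name is hypothetical.

```python
def approx_magnitude(v, w):
    """Four-region magnitude approximation of equation (1) for integer I/Q samples."""
    g = max(abs(v), abs(w))  # Greater of |V|, |W|
    l = min(abs(v), abs(w))  # Lesser of |V|, |W|
    # 7/8*G computed as G - G/8 and 1/2*L as L/2, using shifts only (assumed form)
    return max(g, g - (g >> 3) + (l >> 1))
```

For example, with V = W = 64 the sketch returns 88, against an exact magnitude of about 90.5, which is adequate when the result is only compared against a threshold.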




Figure 7: Implementation of magnitude calculation

The implementation results are plotted below for the output magnitude calculation.

Figure 8: Implementation results of synchronisation (left panel: theoretical result; right panel: implementation results; horizontal axis: iterations)

Once the correlation has been performed, the optimum time-synchronisation point must be determined by searching for a global peak. This can be realised using a simple comparator that checks the magnitude-squared output from the correlator at each sample period against a predetermined threshold. The threshold level should be set above the worst-case sidelobe levels exhibited by the cross-correlation sequence. Once the threshold level is exceeded, a peak detection algorithm must be applied within a fixed time window. Since the home wireless channel generally experiences rms delay spreads of less than 50 ns (excess delays of less than 250 ns), the time window in the design was set at 8 symbol periods (~340 ns).
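The threshold-then-window search described above might be sketched as follows. This is a behavioural illustration only; the function name, the list-based interface and the example threshold are assumptions, while the default window of 8 samples follows the 8 symbol periods quoted in the text.

```python
def find_sync_peak(mag_sq, threshold, window=8):
    """Return the index of the correlation peak, or None if none is found.

    The search starts at the first sample exceeding the threshold and picks
    the largest value within the following `window` samples.
    """
    for start, value in enumerate(mag_sq):
        if value > threshold:
            segment = mag_sq[start:start + window]
            # Index of the first maximum inside the window
            best = max(range(len(segment)), key=segment.__getitem__)
            return start + best
    return None

# Hypothetical correlator output with sidelobes below a threshold of 9
example = [1, 2, 1, 3, 10, 14, 12, 4, 2, 1, 1, 1, 1, 1]
peak_index = find_sync_peak(example, threshold=9)
```

Limiting the search to a short window keeps the comparator logic small while still covering the excess delays expected in the home channel.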

## **IV. Frequency Offset Correction**

The Hiperlan/1 high bit rate synchronisation sequence comprises five 31-bit m-sequences. Each m-sequence is repeated three times, starting with m1 as shown below.



Figure 9: Synchronisation sequence and its application

From figure 9, it can be seen that there are a total of 14 full m-sequences that are suitable for synchronisation, channel estimation and coarse frequency offset detection.

In order to simplify the practical WINHOME implementation, only the first m1 sequences are used for timing synchronisation and coarse frequency offset correction (the remaining sequences are used for equaliser training). Figure 9 also shows the timing of the two m1-sequence correlation peaks obtained using this scheme. The duration between the two peaks will be 62 symbol periods. The maximum possible frequency offset in Hiperlan/1 is 104 kHz (10 ppm for each of the Tx and Rx at a carrier frequency of 5.2 GHz), which equals $1.59^{\circ}$ per symbol. Therefore, the maximum phase rotation during the 62 symbols is $\pm 98.58^{\circ}$. Since this angle is much less than 360 degrees, it avoids the possibility of ambiguity.
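The arithmetic behind these figures can be reproduced with a short Python calculation. The symbol rate of 23.5 MHz is an assumption (one GMSK symbol per bit at the 23.5 Mb/s rate quoted in the introduction), and the constant and function names are hypothetical.

```python
CARRIER_HZ = 5.2e9          # carrier frequency quoted in the text
TOLERANCE_PPM = 10.0        # per-oscillator tolerance quoted in the text
SYMBOL_RATE_HZ = 23.5e6     # assumption: one symbol per bit at 23.5 Mb/s
PEAK_SPACING_SYMBOLS = 62   # spacing between the two correlation peaks

def max_frequency_offset_hz():
    """Worst case: the Tx and Rx oscillator errors add."""
    return 2 * TOLERANCE_PPM * 1e-6 * CARRIER_HZ

def degrees_per_symbol(offset_hz):
    """Phase rotation accumulated over one symbol period."""
    return 360.0 * offset_hz / SYMBOL_RATE_HZ

offset = max_frequency_offset_hz()
per_symbol = degrees_per_symbol(offset)
total = per_symbol * PEAK_SPACING_SYMBOLS
print(f"{offset / 1e3:.0f} kHz, {per_symbol:.2f} deg/symbol, {total:.1f} deg over 62 symbols")
```

Under this assumed symbol rate the unrounded per-symbol rotation gives a total a fraction of a degree above the quoted $98.58^{\circ}$, which corresponds to multiplying the rounded value of $1.59^{\circ}$ by 62.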

At each of the detected correlation peaks the channel phase must be calculated from the I-Q correlation samples. The phase calculation is based on equation (2).

$$\theta = \arctan(y/x) \tag{2}$$

where x and y represent the values of the real and imaginary components respectively.

Conventional implementation of equation (2) needs a division operation combined with a look up table (LUT). The hardware cost of this approach is high. In the WINHOME design a 'Cordic' rotation is used to obtain the phase estimate. The method relies on the 2-D rotation of the peak-detected vector towards the x-axis with a pre-set iteration step (e.g., $3.6^{\circ}$ or $1.8^{\circ}$ depending on the accuracy required). Once the rotated vector reaches the x-axis, the value of $\theta$ in equation (2) can be estimated from the number of fixed iteration steps required. The implementation block diagram is shown in figure 10, where the *shift* block performs a 4-bit shift equating to a pre-set iteration step of approximately $3.6^{\circ}$.





The VHDL implementation of the above phase calculation requires just 108 CLB slices using the Xilinx Virtex device. The critical part of the circuit shown in figure 10 has just a single adder and a single shift. Hence, the speed of the calculation is mainly dependent on the clock rate. With the 'sort' and 'Transfer function' approach shown in figure 10, the actual phase region for hardware operation lies between  $0^{\circ}$  and  $45^{\circ}$ . Thus, with a step size of  $3.6^{\circ}$ , the maximum number of iterations will be 12.
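A behavioural Python sketch of this fixed-step rotation is given below. It is not a description of the circuit in figure 10: the folding into the $0^{\circ}$ to $45^{\circ}$ region and the unfolding afterwards are an assumed reading of the 'sort' and 'Transfer function' blocks, the safety cap of 16 iterations is an addition for the software model, and all names are hypothetical.

```python
import math

STEP_SHIFT = 4                                         # 4-bit shift, as in the text
STEP_DEG = math.degrees(math.atan(2 ** -STEP_SHIFT))   # about 3.58 degrees per step
MAX_STEPS = 16                                         # assumed cap for the software model

def first_octant_phase(x, y):
    """Phase in degrees of an integer vector with x >= y >= 0.

    Each iteration rotates the vector towards the x-axis by one fixed step
    using one shift and one add per component, and counts the steps taken.
    Inputs need enough magnitude for the shifted terms to be non-zero.
    """
    steps = 0
    while y > 0 and steps < MAX_STEPS:
        x, y = x + (y >> STEP_SHIFT), y - (x >> STEP_SHIFT)
        steps += 1
    return steps * STEP_DEG

def estimate_phase(x, y):
    """Estimate atan2(y, x) in degrees by folding into the first octant."""
    ax, ay = abs(x), abs(y)
    if ay > ax:
        theta = 90.0 - first_octant_phase(ay, ax)  # mirror about 45 degrees
    else:
        theta = first_octant_phase(ax, ay)
    if x < 0:
        theta = 180.0 - theta                      # left half-plane
    return theta if y >= 0 else -theta             # lower half-plane
```

Because the step count is only incremented in whole steps, the estimate in this sketch resolves the phase to within roughly one iteration step; halving the step size (a 5-bit shift) would tighten this at the cost of more iterations.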

The implementation performance of the phase calculation circuit is shown in figure 11, where the right-hand figure shows the error property with an iteration step of  $3.6^{\circ}$ . More accurate calculations can easily be achieved with a decrease in the iteration step size (e.g.,  $1.8^{\circ}$ ).



Figure 11: Performance of phase calculation circuit

VTC2000

The proposed simplified phase calculation is now performed at each of the two correlation peak points and the coarse frequency offset estimated from the phase difference. The performance of the coarse frequency offset detection algorithm is shown in figure 12.
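The conversion from the phase difference between the two peaks to a coarse frequency offset can be sketched as below. The symbol rate of 23.5 MHz, the sign convention and the function names are assumptions made for illustration; the 62-symbol spacing is taken from the text.

```python
SYMBOL_RATE_HZ = 23.5e6     # assumption: one symbol per bit at 23.5 Mb/s
PEAK_SPACING_SYMBOLS = 62   # spacing between the two m1 correlation peaks

def wrap_degrees(angle):
    """Wrap an angle into the range [-180, 180) degrees."""
    return (angle + 180.0) % 360.0 - 180.0

def coarse_offset_hz(phase1_deg, phase2_deg):
    """Frequency offset implied by the phase change between the two peaks."""
    delta = wrap_degrees(phase2_deg - phase1_deg)
    cycles = delta / 360.0
    return cycles * SYMBOL_RATE_HZ / PEAK_SPACING_SYMBOLS
```

For instance, a phase change of $98.58^{\circ}$ between the peaks maps back to an offset close to the 104 kHz maximum discussed earlier.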



Figure 12: Coarse frequency offset detection

The performance was achieved assuming a Hiperlan/1 modem operating with an average Eb/No of 17 dB and a channel rms delay spread of 40 ns. The average coarse frequency offset detection error can be seen to be less than 2 kHz. This value is sufficiently small to enable equaliser training *after* coarse correction.

## **V. DLMS DFE Equaliser Implementation**

The design/entity hierarchy for the FPGA synchro-equaliser is presented in figure 13, where 'syctfqof' represents the entity for synchronisation and frequency offset correction and 'eqblkbox' the entity for the DLMS DFE equaliser.



Figure 13: Design/entity hierarchy of synch-equaliser

The DLMS equaliser is based on a DFE architecture with 6 feedforward filter (*FFF*) taps and 5 feedback filter (*FBF*) taps. The feedforward filter is implemented using the pipelined architecture shown below [2].



In figure 14, $\mathit{fRe\_V}$ and $\mathit{fIm\_V}$ represent the input I and Q signals, $\mathit{Re\_fc*}$ and $\mathit{Im\_fc*}$ the feedforward coefficients and $S1 \sim S30$ the mapping nets in the VHDL design. The multipliers are complex and implemented using the double-frequency architecture shown in figure 15.



Figure 15: Double frequency complex multiplier

The proposed double frequency complex multiplier requires just 58 CLB slices; however, the latency becomes 2 symbol periods. Therefore, the *FFF* has a total latency of 4 symbol durations. The outputs of the complex multipliers and adders are limited to an 8-bit resolution to trade off dynamic range for reduced area.
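Pipeline latency of this kind is what a delayed LMS update accommodates: the coefficients are adjusted using an error and input vector that are several symbols old. The Python sketch below shows the generic textbook delayed LMS on real-valued data in a simple channel identification setting. It is not the real-error variant of [2] and not the WINHOME equaliser; the function name, step size and delay are assumptions for illustration.

```python
import random
from collections import deque

def dlms_identify(inputs, desired, n_taps, mu, delay):
    """Adapt an FIR filter with the delayed LMS rule
    w(n+1) = w(n) + mu * e(n - delay) * x(n - delay)."""
    weights = [0.0] * n_taps
    window = deque([0.0] * n_taps, maxlen=n_taps)  # newest sample first
    pending = deque()                              # (error, input vector) pairs awaiting use
    for x, d in zip(inputs, desired):
        window.appendleft(x)
        taps_in = list(window)
        error = d - sum(w * v for w, v in zip(weights, taps_in))
        pending.append((error, taps_in))
        if len(pending) > delay:
            old_error, old_in = pending.popleft()
            weights = [w + mu * old_error * v for w, v in zip(weights, old_in)]
    return weights

# Hypothetical test signal: random +/-1 symbols through a 3-tap channel
rng = random.Random(1)
symbols = [rng.choice((-1.0, 1.0)) for _ in range(3000)]
channel = [0.8, -0.3, 0.1]
observed = [
    sum(channel[k] * symbols[n - k] for k in range(len(channel)) if n - k >= 0)
    for n in range(len(symbols))
]
estimate = dlms_identify(symbols, observed, n_taps=3, mu=0.02, delay=4)
```

With a sufficiently small step size the delayed update converges to the same solution as the standard LMS, which is why the delay can be spent on pipelining the filter.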

The implementation block diagram for the feedback filter is shown in figure 16. This uses a transposed transversal filter to avoid latency. For the FBF, latency cannot be tolerated and all operations must be complete within a single symbol period.
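The benefit of the transposed form can be seen in a small behavioural model. In the Python sketch below, each new decision is multiplied by all taps in parallel and accumulated into a register chain, so the feedback term for the next symbol is already waiting in the first register. The class and method names are hypothetical, real-valued taps are used for brevity, and the model is not derived from figure 16.

```python
class TransposedFeedbackFilter:
    """Transposed transversal FIR acting on past decisions."""

    def __init__(self, taps):
        self.taps = list(taps)
        self.regs = [0.0] * len(self.taps)

    def output(self):
        """Feedback term for the current symbol: sum of taps[k] * d(n-1-k)."""
        return self.regs[0]

    def push(self, decision):
        """Fold the latest decision into every register in one step."""
        n = len(self.taps)
        self.regs = [
            self.taps[k] * decision + (self.regs[k + 1] if k + 1 < n else 0.0)
            for k in range(n)
        ]
```

Each register update involves only one multiply and one add, so the critical path does not grow with the number of feedback taps.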



Figure 16: Implementation of FBF architecture

In the FBF, the structure of the GMSK modulation format can be exploited by generating a real error for use with the real-error LMS algorithm [2]. This modification simplifies the resulting error generation and filter tap update processes in addition to improving the overall performance (see figure 18).

The resulting implementation performance for the synchro-equaliser described here is shown in figure 17 for (a) an ideal channel and (b) a worst-case channel. In an ideal channel rapid convergence is achieved and the resulting constellation diagram shows two distinct levels in I and Q. In the worst-case channel there is significant ISI prior to equalisation and the resulting constellation diagram shows sample points scattered all over the I-Q plane. After the training process two distinct I and Q levels are achieved. Distinct levels, rather than precise phase points, are generated because of the use of the real-error scheme, which only constrains either the I or Q signal at any given sample point [2].

Figure 17 confirms the performance of the final VHDL equaliser design in a low received power, high delay spread channel. Under such conditions, the modem is expected to suffer from a small residual packet error rate, somewhere in the region of 4-5%. While for most applications this will be corrected using the Hiperlan/1 ARQ protocol, within WINHOME the use of antenna diversity has been studied to further enhance performance.



Figure 17: Performance of synchro-equaliser

Using the synchro-equaliser described previously, simple switched antenna diversity can be easily implemented with small modifications. There are two main diversity schemes proposed for future WINHOME use: (i) power based switched diversity using the real-error LMS algorithm; (ii) DFE internal dual antenna diversity combining [5]. Assuming that dual antenna down-conversion is realisable at reasonable cost (which appears to be the case) then the existing synchro-equaliser design can be used to implement both diversity schemes.

Figure 18 shows the simulated packet throughput without ARQ for the various equaliser-diversity structures (a channel rms delay spread of 40ns was used for these simulations). The results indicate that a 99% error-free packet throughput can be achieved using the simple switched diversity scheme. Using the more complex dual down-conversion DFE combined architecture, a value approaching 100% can be achieved.






### **VI.** Conclusions

In this paper a detailed design for a DLMS DFE(6,5) equaliser has been presented. Through a combination of a real-error scheme, feedforward filter pipelining, I-Q signal quantisation, double frequency multipliers and advanced antenna diversity, the results demonstrated the viability of a low complexity, high performance modem. The issue of synchronisation was shown to be critical in the development of an equaliser for Hiperlan/1. Here we have demonstrated how the training sequence can be used together with a complex correlator to obtain accurate timing and phase information. The correlator was shown to be more complex for GMSK, and unless care is taken there is the possibility of phase ambiguity at the correlation peak. Through the use of a real filter correlation architecture, a magnitude approximation algorithm, and the Cordic approach for phase calculation, a low complexity complex correlator was implemented. Based on the phase difference between two complex correlation peaks, the degree of phase rotation was estimated and a coarse frequency offset derived. Incoming training data was then corrected and fine frequency tracking was performed using LMS data-derived updates. Simulation results from our VHDL implementation verified the accuracy and relatively low complexity of this approach.

The Delayed LMS feedforward and feedback filter architectures have been described and the use of double frequency real multipliers proposed to reduce area size. VHDL simulations showing equaliser convergence and the resulting constellation diagrams have confirmed the correct operation of these circuits.

Finally, to further improve performance, the addition of antenna diversity was considered. Here results indicate that packet throughputs well in excess of 99% are possible (compared with 94% for a standard receiver).

The final WINHOME synchro-equaliser design was implemented in FPGA using approximately 30% of the Xilinx Virtex capacity. This result demonstrates the practicality and commercial viability of high-speed equalisation, provided that careful design considerations are applied.

#### **VII.** Acknowledgements

The authors would like to thank all the members of the WINHOME consortium: Thomson CSF (Fr), Grundig (UK), SCT (UK) and Eutelsat (Fr) for their support during the implementation of this work. This study was performed as part of the ESPRIT WINHOME project (25048).

#### **VIII. References**

- [1] "WINHOME 25048: Wireless Innovation for Multimedia in the Home", Project Programme, August 1997.
- [2] Y. Sun, A.R. Nix, et al., "Design of a Novel Delayed LMS Decision Feedback Equaliser for HIPERLAN/1 FPGA Implementation", VTC'99, USA.
- [3] J.G. Proakis, Digital Communications (Third Edition), USA, McGraw-Hill Int. Editions, 1995.
- [4] ETSI HIPERLAN Type 1 specifications, ETS 300 652, October 1996.
- [5] Y. Sun, A.R. Nix & J.P. McGeehan, "HIPERLAN Performance Analysis with Dual Antenna Diversity and Decision Feedback Equalisation", VTC'96, Atlanta, USA.