Abstract - In this paper, results of an IEEE 802.11a compliant low-power baseband processor implementation are presented. The detailed structure of the baseband processor and its constituent blocks is given. Additionally, the design flow is briefly described, and synthesis and layout results are reported.
I. INTRODUCTION
Fourth generation (4G) wireless and mobile systems are currently a very attractive area of research and development. With 4G devices, new types of services will become universally available to consumers and for industrial applications. Broadband wireless networks will enable packet-based high-speed data transfer suitable for video transmission and mobile Internet applications.
This paper is based on the outcomes of a project that aims to develop a wireless broadband communication system in the 5 GHz band, compliant with the IEEE 802.11a [1] standard. This standard specifies broadband communication systems using OFDM (Orthogonal Frequency Division Multiplexing) with data rates ranging from 6 to 54 Mbit/s. According to the standard, the physical layer computational requirements can be met by adequate digital baseband processing. A practical implementation of an IEEE 802.11a compliant baseband processor can be realized in several ways. In general, software-based baseband processing can be done using either a multiprocessor system or a single DSP processor with a number of hardware accelerators. The standard defines computationally very intensive operations, and a software solution would lead to increased power dissipation.
ming critical control functions a token-flow approach was adopted [2]. Every block in the baseband processor has an input signal which indicates that valid data is ready for processing. A similar signal is generated by every block at its output to indicate that data can be processed by the subsequent block. The token-flow approach can easily be extended with clock gating, which provides an efficient power-saving mechanism.
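The token-flow idea with clock gating can be illustrated by a small behavioural model. This is only a sketch under our own assumptions, not the VHDL implementation: the class `Block`, the function `run_pipeline`, the stage names and the stage functions are hypothetical. Each stage works only in cycles where its valid-in token is set, and the counter `active_cycles` stands in for the cycles in which a gated clock would be enabled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Block:
    """One pipeline stage with a valid-in / valid-out token handshake."""
    name: str
    func: Callable[[int], int]
    active_cycles: int = 0  # cycles in which the (gated) clock was enabled

    def clock(self, valid_in: bool, data_in: Optional[int]) -> Tuple[bool, Optional[int]]:
        """Advance one cycle and return the new (valid_out, data_out) register."""
        if not valid_in:
            # No token at the input: the clock stays gated, nothing toggles.
            return False, None
        self.active_cycles += 1
        return True, self.func(data_in)


def run_pipeline(blocks: List[Block], stream: List[Optional[int]]) -> List[int]:
    """Feed `stream` into a chain of registered stages.

    A value of None in `stream` means "no valid data in this cycle".
    """
    regs: List[Tuple[bool, Optional[int]]] = [(False, None)] * len(blocks)
    outputs: List[int] = []
    # Extra idle cycles at the end let the last tokens drain out of the chain.
    for sample in list(stream) + [None] * len(blocks):
        # Each stage sees the register of its predecessor from the previous cycle.
        inputs = [(sample is not None, sample)] + regs[:-1]
        regs = [blk.clock(v, d) for blk, (v, d) in zip(blocks, inputs)]
        valid, data = regs[-1]
        if valid:
            outputs.append(data)
    return outputs
```

In this model, a two-stage chain fed with three valid samples spread over six input cycles is active for only three cycles per stage; the remaining cycles are the ones a gated clock would suppress.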
II. ARCHITECTURE OF THE BASEBAND PROCESSOR
A block diagram of the baseband processor is shown in Figure 1. To achieve low power dissipation and low silicon usage, the architecture is divided into three principal blocks: Transmitter, Receiver and EPP (Enhanced Parallel Port) block. With this division, the baseband processor provides two almost independent dataflow directions: transmit and receive.
The EPP block is a standard interface block and is used to provide board communication with the higher MAC (Medium Access Control) layer. The main component of this block is a 2 KByte buffer for temporary data storage in the receive and transmit directions.
The transmitter block consists of an Input buffer, Scrambler, Signal field generator, Encoder, Interleaver, Mapper, Interpolation filter, 64-point IFFT/FFT (Inverse Fast Fourier Transform / Fast Fourier Transform) and circuitry for Pilot insertion (with pilot scrambler), Guard interval insertion, and Preamble insertion. The IFFT/FFT is a single block used in both the receive and transmit directions in order to optimize the baseband processor structure. On the other hand, this solution is more complex to implement, because the transmitter and receiver datapaths are not completely decoupled.
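How the shared IFFT/FFT block switches between the two directions is not detailed here. One well-known way to reuse a forward FFT core for the inverse transform is the identity IFFT(X) = conj(FFT(conj(X))) / N. The sketch below illustrates this identity only; it is an assumption for illustration, not a description of the implemented hardware, and the function names `fft` and `ifft_via_fft` are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def ifft_via_fft(X):
    """Inverse transform computed with the forward core only.

    Uses IFFT(X) = conj(FFT(conj(X))) / N, so the only additions to the
    forward datapath are two conjugations and a scaling by 1/N.
    """
    n = len(X)
    y = fft([complex(v).conjugate() for v in X])
    return [v.conjugate() / n for v in y]
```

For a 64-point transform the scaling by 1/N is a right shift by six bits in fixed-point arithmetic, which is one reason this form of sharing is attractive.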
The standard [1] defines the procedure for the receiver and transmitter datapath processing. A fundamental issue not addressed by the standard is the mechanism for synchronization and channel estimation. The solution to this problem is one of the most important outcomes of this work. Channel estimation is based on a decision-directed method [3] with a simplified residual phase estimation and correction mechanism. This type of channel estimation is based on a feedback loop, and for that reason our receiver involves additional encoding, interleaving and mapping (Fig. 2). An interesting point of this concept is that it uses a division unit to correct the data samples (equalizer).
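The principle of decision-directed estimation with a division-based equalizer can be sketched per subcarrier as follows. This is a strongly simplified illustration under our own assumptions: the decision is a plain QPSK hard decision instead of the decode, re-encode, interleave and map chain of the receiver, the residual phase correction is omitted, the update is applied to the very next symbol, and the smoothing factor `alpha` as well as the function names are hypothetical.

```python
def qpsk_decide(z):
    """Hard decision to the nearest unit-energy QPSK constellation point."""
    s = 2 ** -0.5
    return complex(s if z.real >= 0 else -s, s if z.imag >= 0 else -s)


def dd_equalize(symbols, h_init, alpha=0.5):
    """Equalize OFDM symbols and track the channel from the decisions.

    symbols: list of received symbols, each a list of subcarrier values Y_k.
    h_init:  initial channel estimate per subcarrier (e.g. from the preamble).
    alpha:   smoothing factor of the estimate update (illustrative value).
    Returns (list of decided symbols, final channel estimate).
    """
    h = list(h_init)
    decided = []
    for y in symbols:
        # Equalizer: divide every received sample by the channel estimate.
        eq = [yk / hk for yk, hk in zip(y, h)]
        d = [qpsk_decide(z) for z in eq]
        decided.append(d)
        # Feedback: the decisions act as known data, so Y_k / X_k is a new
        # channel observation that is blended into the running estimate.
        h = [(1 - alpha) * hk + alpha * (yk / dk) for hk, yk, dk in zip(h, y, d)]
    return decided, h
```

As long as the decisions are correct, each update halves the remaining estimation error for `alpha = 0.5`, so an initial estimate that is slightly off converges to the true channel within a few symbols.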
0-7803-7963-2/03/$17.00 ©2003 IEEE
Fig. 1 Block diagram of the Baseband processor
The estimator is designed in such a way that the samples of symbol i are used to calculate an estimate of the channel, which is then used to correct symbol i+D, where D is the delay introduced by the feedback loop.
The synchronizer has to perform the following operations: frame detection, carrier frequency offset estimation, symbol timing estimation, extraction of the reference channel and data reordering. A block scheme of the synchronizer is given in Fig. 3. In order to obtain a power-efficient design, the synchronizer structure was split into two mutually exclusive paths: the tracking datapath and the processing datapath [4].
The main function of the tracking datapath is to detect an incoming frame by searching for the periodic structure of the preamble symbols and to estimate the carrier frequency offset. In our design, a wide range of frequency offsets (±80 ppm) can be estimated using only two autocorrelators. The output of one of them is also used in the frame detection mechanism. This provides a significant core area reduction in comparison with other proposed solutions, as in [5], where the range of estimated frequency offsets is ±40 ppm and three autocorrelators are used for the frame detection, but only two of them for the frequency offset estimation. Frame detection is performed by a plateau detector, which has to detect a specific plateau shape in the incoming preamble symbols.
The activity of the processing datapath starts after the frame is detected and the estimated value of the frequency offset is available. This part of the synchronizer performs the carrier frequency error correction, estimates the symbol timing and obtains the reference channel estimate. It consists of an NCO (Numerically Controlled Oscillator, in this case a CORDIC processor operating in rotational mode), an FFT processor and a simplified cross-correlator based on XNOR
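The autocorrelation principle behind the frequency offset estimation, followed by an NCO-style correction, can be sketched as below. This is a floating-point illustration under stated assumptions, not the fixed-point CORDIC datapath of the chip: the 20 MHz sample rate and the 16-sample period of the short training symbols follow the IEEE 802.11a preamble, while the function names and the choice of a single lag are ours; how the two autocorrelators of the design are configured is not shown here.

```python
import cmath

FS = 20e6  # assumed baseband sample rate in Hz (802.11a channel spacing)


def estimate_cfo(rx, lag, fs=FS):
    """Estimate the carrier frequency offset from a periodic preamble.

    For a preamble with period `lag`, rx[n + lag] equals rx[n] rotated by
    2*pi*f_off*lag/fs, so the angle of the autocorrelation gives f_off.
    The estimate is unambiguous for |f_off| < fs / (2 * lag).
    """
    acc = 0j
    for n in range(len(rx) - lag):
        acc += rx[n + lag] * rx[n].conjugate()
    return cmath.phase(acc) * fs / (2 * cmath.pi * lag)


def correct_cfo(rx, f_off, fs=FS):
    """Derotate the samples with a complex exponential, as an NCO would."""
    return [v * cmath.exp(-2j * cmath.pi * f_off * n / fs)
            for n, v in enumerate(rx)]
```

With a lag of 16 samples the unambiguous range is ±625 kHz. An offset of 80 ppm at a carrier near 5.8 GHz corresponds to about 464 kHz and therefore lies inside this range, whereas a lag of 64 samples alone would only cover ±156.25 kHz.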
Other blocks on the receiving path, mainly defined by the standard, are Decimator filter, Demapper, Deinterleaver, Descrambler, Viterbi decoder, and additional buffers.
In order to simplify processing of the data and optimize power consumption to the maximal extent, the complete structure was divided into three clock domains.
Computationally complex blocks without high data throughput requirements were designed for 20 MHz and high data throughput demanding circuits were designed for 80 MHz. 
III. IMPLEMENTATION
The complete baseband processor was modeled in VHDL and synthesized with our in-house 0.25 µm SiGe:C BiCMOS standard cell library. Its cell area, including all blocks from Figure 1, is 24 mm² (the equivalent transistor count is 1.55 million). Table I gives some synthesis results for our technology, where Tx-* indicates transmitter components and Rx-* stands for receiver components. From Table I it can be seen that the dominant hardware part is the receiver, which uses 52 % of the chip cell area. Generally, the most silicon-consuming components are the Synchronizer, Channel Estimator, Viterbi decoder and FFT/IFFT. According to post-synthesis power estimation with Synopsys Digital Compiler, the expected power consumption is 860 mW in the receive direction and 795 mW in the transmit direction.
TABLE I BASEBAND PROCESSOR SYNTHESIS RESULTS
These expected power figures are largely attributable to the pessimistic nature of the power estimation in the CAD tool used. From our experience, we expect the actual power consumption to be lower.
The layout of the produced baseband chip is given in Figure 4. The chip was floorplanned into four main layout blocks of about equal size: EPP (6.7 mm²), Transmitter (11 mm²), Receiver 20 MHz block (9 mm²) and Receiver 80 MHz block (10.4 mm²). Additionally, there is one small block which performs clock division and clock distribution. The baseband chip is currently in fabrication. After layout, the silicon area of this design is 59 mm², including pads. The number of pins is 107 and the core area is only 46 mm².
building blocks of the system could not be given because of the lack of space. However, the general scheme is explained, and synthesis and layout results are presented. The circuitry described here is designed for the IHP in-house 0.25 µm BiCMOS technology and is currently under fabrication.
