Abstract: A low-voltage low-power analogue-baseband chain designed for IEEE 802.11a/b/g wireless local-area network (WLAN) receivers is described. Architecturally, it features a 'two-step channel selection' to complement the radio front-end, and a flexible intermediate frequency (IF) reception capability to ease the cancellation of frequency and DC offsets. In circuit implementation, a double-quadrature downconverter based on a 'series-switching' mixer-quad realises a wideband-accurate I/Q demodulation. A 'switched-current-resistor' programmable-gain amplifier (PGA) minimises the bandwidth variation and transient in gain tuning by concurrently stabilising the PGA's feedback factor and quiescent operating point. An 'inside-OpAmp' DC-offset canceller area-efficiently creates a high-pass pole with a very low cut-off frequency close to DC while providing a fast settling of DC-offset transients. Fabricated in a 0.35-µm complementary metal-oxide semiconductor (CMOS) process without resorting to any specialised device, the prototype consumes 14 mW per channel at 1 V. The transient time in a 52-dB gain step is <1 µs and the stopband rejection ratio at 20/40 MHz is 32/90 dB. The error vector magnitudes are -27 and -17 dB for 802.11a/g and b modes, respectively.
1 Introduction
The widespread adoption of wireless local-area networks (WLANs) in the past few years has led to the development of multi-band multi-mode WLAN transceivers [1], such that the most accepted protocols (e.g. the IEEE 802.11 family [2]) can be supported by a single terminal. Although WLAN system-on-chip (SoC) solutions [3] have been successfully developed in complementary metal-oxide semiconductor (CMOS) technology to address this demand, the silicon area (45 mm²) and power dissipation (400 mW) are still sizable for many embedded systems. Dominated by the digital logic and memory that perform various system-level functions in the physical layer, WLAN SoCs can indeed take advantage of nanoscale technologies for cost and power reduction [4]. Yet, the thinner gate oxide of nanoscale transistors poses unprecedented challenges to the design of the analogue parts. While a sub-1 V supply is needed to maintain device reliability, a relatively large threshold voltage is also necessary to limit the leakage current. A continuous drop of the supply-to-threshold voltage ratio is therefore anticipated, rendering inefficient the re-use of conventional solutions that were developed without that constraint in mind.
The present work explores multi-mode transceiver architectures and low-voltage circuits that will be useful for the full integration of multi-band multi-mode wireless systems [5] in future nanoscale technologies. In this paper, we report the design and implementation of an experimental 1-V receiver-analogue-baseband (BB) chain designed in compliance with the IEEE 802.11a/b/g standards. Novel architecture and circuit techniques are proposed for CMOS integration, to meet the standard requirements and to overcome the challenges of low-power operation under low-voltage constraints.
Although the fabrication remains at 0.35-µm CMOS (V_T,n = 0.52 V, |V_T,p| = 0.65 V), the techniques used lead to the implementation of the lowest-voltage receiver-analogue-BB chain ever reported [6] [7] [8] [9] [10] [11]. Moreover, the achieved 14-mW power consumption is competitive with a state-of-the-art solution [10] designed in 90-nm CMOS, which consumes 13.5 mW at 1.4 V. Consequently, the overall architectural techniques presented and experimentally demonstrated here hold the promise of remaining effective in the upcoming sub-1 V CMOS technologies [12].
Section 2 briefly summarises the basic features of the IEEE 802.11a/b/g WLAN standards. Section 3 analyses the standards and presents the proposed architectural techniques. The realised analogue-BB chain and its circuit implementation are described in Sections 4 and 5, respectively. Finally, the experimental results are reported in Section 6.
2 IEEE 802.11a/b/g WLAN standards
The IEEE 802.11 family is dedicated to high-speed WLAN communications. Currently, the most relevant physical layers are 802.11a/b/g. The basic 802.11 mode is seldom used today, while the latest 802.11n will be ratified soon. This work focuses on the 802.11a/b/g standards. The 11a is based on the orthogonal frequency-division multiplexing (OFDM) technique. It can deliver a data rate of up to 54 Mb/s by using 64-quadrature amplitude modulation (64-QAM) in the 5-GHz band. The 11b operates in the 2.4-GHz band, where the complementary code keying (CCK) modulation can deliver a maximum data rate of 11 Mb/s. The 11g is a mix of the 11a and 11b: it supports a data rate of up to 54 Mb/s by using the OFDM technique with 64-QAM modulation in the 2.4-GHz band, and it is backward compliant with the 11b for lower data-rate options. Their main characteristics are tabulated in Table 1.
3 Proposed architectural techniques
3.1 Flexible-IF reception for multi-standard compliance
Architecturally, the receiver radiofrequency (RF) front-end experiences no difference between the zero-intermediate frequency (ZIF) and the low-IF (LIF) downconversion. Thus, the analogue-BB chain can flexibly choose the best-fit IF for each mode of operation. In this paper, a ZIF-LIF mixed solution is proposed, justified as follows. ZIF is well suited for the 11b/g-CCK mode because of the wideband nature of the channel: DC-offset and 1/f noise can simply be removed by using high-pass filters (HPFs) throughout the BB chain. However, this is less straightforward for the 11a/g-OFDM mode, where an improper choice of the high-pass pole frequency may significantly distort the close-to-zero subcarriers. Alternatively, a non-zero IF (e.g. 10 MHz for 11a, 12.5 MHz for 11g) appears to be more effective, since the image-rejection ratio (IRR) requirement at such a LIF value is still practical to achieve (i.e. 30 dB) for an error vector magnitude (EVM) of -25 dB. In addition, a LIF can alleviate the trade-offs encountered by DC-offset cancellation.
This idea is illustrated by comparing the ZIF (Fig. 1a) and +LIF-to-BB (Fig. 1b) downconversion of an OFDM channel. First, compared with the ZIF-HPF, the cut-off frequency of the LIF/HPF1 can be greatly increased, leading to significant area saving while shortening the receiver settling time in DC-offset transients. Second, since the tolerable instability of the reference crystal is ±20 ppm, a mixed-mode automatic frequency control (AFC) would be essential for ZIF [13]. The AFC digitally estimates the frequency error (which is as high as 214 kHz at 5.35 GHz), and then compensates it in the analogue domain by offsetting the frequency of the RF local oscillator (LORF) by the same amount. In contrast, a LIF can endure much larger frequency errors at HPF1 (i.e. 3.75 MHz). The compensation can therefore be transferred to the IF LO (LOIF), which benefits from the simplicity of a much lower operating frequency (up to 12.5 MHz).
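The 214-kHz figure quoted above can be reproduced with a short calculation. This is a minimal sketch that assumes the worst case is two crystals (transmitter and receiver), each off by the full ±20 ppm in opposite directions; that pairing is an assumption made here, as the text only quotes the tolerance and the result. The function name is illustrative.

```python
def worst_case_offset_hz(carrier_hz: float, ppm_per_crystal: float,
                         n_crystals: int = 2) -> float:
    """Worst-case carrier offset when every crystal errs by its full tolerance.

    Assumption (not stated in the text): the transmitter and receiver
    crystals drift in opposite directions, so their errors add.
    """
    return carrier_hz * ppm_per_crystal * 1e-6 * n_crystals


offset_hz = worst_case_offset_hz(5.35e9, 20.0)
hpf1_tolerance_hz = 3.75e6  # error tolerated at HPF1 in the LIF mode (from the text)

print(f"worst-case offset: {offset_hz / 1e3:.0f} kHz")
print(f"LIF tolerance is {hpf1_tolerance_hz / offset_hz:.1f}x larger")
```

Under this assumption the LIF tolerance exceeds the worst-case offset by more than an order of magnitude, which is why the correction can be moved to the low-frequency LOIF.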
3.2 BB-signal conditioning for cost-efficient reconfiguration
The efficiency of the proposed flexible-IF reception is critically determined by the permutation of the functional blocks. The system partition of the proposed receiver architecture is presented in Fig. 2a, and its block-level operation is illustrated in Fig. 2b. The RF channels after the RF-to-IF downconversion are filtered by a complex filter with a tunable centre frequency (i.e. at +IF, -IF or DC) [5]. Afterwards, an IF-to-BB downconversion is performed, placing the desired channel at DC before filtering and amplification. With such a permutation, the specifications of the channel-selection low-pass filter (LPF), the programmable-gain amplifier (PGA) and even the analogue-to-digital (A/D) converter (assumed to reside in the digital BB) can be maintained in any of the modes, thereby maximising block sharing.
3.3 Two-step channel selection for radio front-end simplification
The proposed receiver has two frequency conversions (i.e. RF → flexible-IF → BB), allowing the use of 'two-step channel selection' [5] to simplify the RF front-end, as illustrated in Fig. 2b. In addition, adopting this technique relaxes the specification of the RF frequency synthesiser (assuming an integer-N type), since it needs to lock only once for every channel pair. Applying this concept to the 11a mode, ten LO locking positions are sufficient to cover the 19 channels in the 5.15-5.725 GHz band (Fig. 3). Compared with conventional single-step channel selection, the number of locking positions is almost halved, whereas the LORF step size is doubled (from 20 to 40 MHz). A doubled step size permits the use of a doubled reference frequency to enlarge the loop bandwidth of the phase-locked loop (PLL), and to reduce the modulus of the division ratio. The former shortens the PLL settling time and lowers the LORF in-lock phase noise, whereas the latter lowers the LORF close-in phase noise. The trade-offs of designing an RF frequency synthesiser can be found in [14].
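The pairing idea can be sketched in a few lines. This is a minimal sketch that assumes a hypothetical contiguous grid of 19 channels on a 20-MHz raster and a 10-MHz IF; the real 11a channel allocation is not contiguous, and the function and variable names are illustrative only. Each LO position sits midway between two adjacent channels, and the second step picks the lower or upper sideband.

```python
def lo_plan(channels_mhz, if_mhz=10.0):
    """Pair adjacent channels so that one LO position serves both.

    Returns a list of (lo_mhz, [channels served]) tuples. The LO is placed
    if_mhz above the lower channel of each pair, so the pair appears at
    -IF and +IF after the first downconversion.
    """
    plan = []
    for i in range(0, len(channels_mhz), 2):
        pair = channels_mhz[i:i + 2]
        plan.append((pair[0] + if_mhz, pair))
    return plan


# Hypothetical contiguous grid, for illustration only.
channels = [5180.0 + 20.0 * k for k in range(19)]
plan = lo_plan(channels)

lo_positions = [lo for lo, _ in plan]
lo_steps = {b - a for a, b in zip(lo_positions, lo_positions[1:])}
print(len(channels), "channels ->", len(plan), "LO positions, step", lo_steps, "MHz")
```

With 19 channels the sketch yields ten LO positions on a 40-MHz step, matching the counts quoted above.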
4 Proposed analogue-BB chain
The functions and features of each block are presented briefly here, whereas the implementation details are given in Section 5. Fig. 4 shows the block diagram of the implemented analogue-BB chain. The dual-mode preselect filter features a single/double-channel BW for the ZIF/LIF mode. Its main function is to prevent the residual in-band channels and out-of-band white noise from folding back into the signal band in the subsequent mixing. The double-quadrature downconverter (DQDC) is made up of a series-switching mixer-quad. The mixing signal is generated by a digital clock generator (CLKGEN), which has a clock-rate-defined output to match different IFs. Together, the DQDC and CLKGEN realise a wideband and mismatch-insensitive I/Q demodulation.
To complete the second step of channel selection, the output of the DQDC is designed to have a sideband selection feature [i.e. switching the phase (0°, 180°) of its I/Q-coupled paths]. Since the downconversion is performed before the filtering and amplification, mode switching involves just two simple steps: double/halve the BW of the preselect filter, and enable/disable the DQDC and CLKGEN for the LIF/ZIF mode, respectively.
With a ZIF or LIF receiver architecture, the signal levels arriving at the BB are scaled to the vicinity of 0 dBm for the A/D conversion. The dynamic-range requirement of 802.11a/b/g from the antenna to the BB is 0-80 dB, with the majority of this gain provided in the BB. Assuming the RF front-end offers a 0-30 dB gain range, the BB LPF and PGA have to provide another 0-50 dB of controllable gain. In practice, although a cascade of multiple PGAs can attain such a high gain range, the PGA has to feature an excess BW, roughly ten times wider than that of the LPF, to ensure that the selectivity is stable against the gain. The proposed switched-current-resistor (SCR) PGA addresses this issue. It realises a constant-BW transient-free gain control that minimises the BW requirement of the PGA and shortens its settling time in gain changes. The reduced BW also enhances the stopband rejection, relaxing the LPF's order from fifth to third.
The DC-offset can easily saturate the LPF and the PGA because of the large cascaded gain. In the 11b/g-CCK mode, the composite high-pass pole must be as low as tens of kilohertz to avoid severely distorting the signal. The resulting chip-area impact would therefore be very large, and a long DC-offset transient can occur in the automatic gain control (AGC). To eliminate this drawback, this paper introduces an inside-OpAmp DC-offset canceller (DOC) for area savings and switchability. A switchable high-pass pole is created inside each LPF's and PGA's OpAmp, by which the differential signals are locally balanced and the composite high-pass pole is agilely switchable during the AGC (just 5.6 µs is allowed in the physical layer convergence protocol preamble field). The receiver settling time is therefore only governed by the AC-coupler (not shown) that eventually interfaces the PGA to the A/D converter (typically of 9-bit resolution [15]). The AC-coupler has an extended lower -3 dB point of 1 MHz to receive only the pilot tones. In the ZIF mode, all DOCs are switched on but maintain a composite lower -3 dB point of <10 kHz. This low-frequency value ensures a low intersymbol interference in processing the CCK channel.
The gain, BW and operating mode are all controlled digitally. A built-in setup and additional 50-Ω test buffers enable both full-chip and functional-block measurements.
5 Circuit implementation
5.1 Preselect filter, DQDC and CLKGEN
The schematics of the I-channel (Q-channel is identical) dual-mode preselect filter and DQDC are shown in Fig. 5a. The front-end resistor-capacitor (i.e. R_PF and C_PF) matrix offers single-pole preselect low-pass filtering and linear voltage-to-current conversion. The value of R_PF determines the noise figure (NF) of the entire analogue-BB chain. With R_PF = 2.5 kΩ, the desired NF specification (<30 dB) is safely met. The DQDC has a double-balanced structure to cancel the unwanted I/Q demodulating carrier at the output. Using the clock phases of the timing diagram in Fig. 5b, the co-switching of the swapper and S_M switches produces two quasi-I/Q sequences with a normalised
and their frequency is a quarter of the reference clock (CLK). The reset-switch S RS is activated during swapping of the differential branches for minimisation of the conversion loss and memory effect.
The conversion gain (CG) of this switching mixer is determined by the duty cycle d as given by
In this work, the duty cycle is 25%, implying a CG of 0.45. Although this attenuation can raise the input-referred NF by 7 dB, the standard-recommended NF of 14 dB can still be safely met by adopting an RF front-end with a 30-dB gain and a 5-dB NF [13]. They jointly yield an overall receiver NF of 8.5 dB. Linearity is determined by two factors: (i) the ratio of R_PF to the overall on-resistance of the swapper and S_M and (ii) the overdrive voltage of the switches. Here, r_on is 213 Ω (8.5% of R_PF) such that a third-harmonic distortion of -80 dB is guaranteed. Moreover, to maximise the overdrive voltage under a low-voltage supply of just 1 V, the differential virtual ground is biased to a value very close to ground (i.e. 0.1 V) by means of an input common-mode feedback circuit (CMFB).
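The NF budget above can be cross-checked with the Friis cascade formula. This is a minimal sketch that assumes a simple two-stage cascade (RF front-end followed by the analogue-BB chain lumped into one stage) and treats the quoted gain as available power gain; neither assumption is stated in the text, and the helper names are illustrative. It back-solves the input-referred BB noise figure implied by the quoted 30-dB gain, 5-dB front-end NF and 8.5-dB overall NF.

```python
import math


def db_to_lin(db: float) -> float:
    return 10 ** (db / 10)


def lin_to_db(x: float) -> float:
    return 10 * math.log10(x)


def cascade_nf_db(nf1_db: float, gain1_db: float, nf2_db: float) -> float:
    """Friis formula for two stages: F = F1 + (F2 - 1) / G1."""
    f = db_to_lin(nf1_db) + (db_to_lin(nf2_db) - 1) / db_to_lin(gain1_db)
    return lin_to_db(f)


def implied_second_stage_nf_db(total_nf_db: float, nf1_db: float,
                               gain1_db: float) -> float:
    """Invert Friis to find the second-stage NF that gives total_nf_db."""
    f2 = 1 + (db_to_lin(total_nf_db) - db_to_lin(nf1_db)) * db_to_lin(gain1_db)
    return lin_to_db(f2)


bb_nf_db = implied_second_stage_nf_db(total_nf_db=8.5, nf1_db=5.0, gain1_db=30.0)
print(f"implied BB input-referred NF: {bb_nf_db:.1f} dB")
print(f"round trip: {cascade_nf_db(5.0, 30.0, bb_nf_db):.2f} dB overall")
```

Under these assumptions the implied BB figure comes out near 36 dB, which is in line with a chain NF just under 30 dB raised by the roughly 7-dB mixer attenuation.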
The clock phases are generated by the digital circuitry shown in Fig. 6. A 40/50-MHz CLK generates a 10/12.5-MHz pseudo-I/Q waveform through the multiplication of the main and auxiliary phases in the analogue domain. The main phases are co-generated by two D-flip-flops (D-FFs), D1 and D2, and a pair of non-overlapping clocks that have a matched duty cycle. The required 90° phase shift is generated in the auxiliary clock by inserting an inverter N1 between D3 and D5. Note that N1 induces only a minor phase error, as the auxiliary clock swaps the inverse terminals recursively at zero-crossings. A global reset initialises all D-FFs at startup. Finally, the IF channel selection is executed by switching the phases between SW_Q and SW_Q'. In addition to the main features mentioned, other advantages of the proposed DQDC and CLKGEN are worth emphasising: (i) a duty-cycle variation of the CLK only produces an amplitude mismatch between the I and Q outputs and does not affect the phase; (ii) because the waveforms of the I/Q sequences are non-overlapping and always return-to-zero, the orthogonal relationship remains exact against a large timing error (i.e. T_AE1 of 25/20 ns and T_AE2 of 50/40 ns for the 10/12.5-MHz IF, as shown in Fig. 5b), which ensures that the I/Q demodulation is wideband-accurate and mismatch-insensitive; (iii) as the swapper is only activated when S_M is in the open state, the charge injection from the swapper and self-mixing (in the overall circuit) are avoided. S_M itself induces charge injection, but it falls out of the signal band (i.e. at two times the IF); (iv) because the preselect filter exhibits a symmetric low-pass function between its input and output terminals, the swapping-induced charge that couples back to the RF front-end is also suppressed, improving the reverse isolation.
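The non-overlapping, return-to-zero property in (ii) can be illustrated with a toy model. This is a minimal sketch that assumes one IF period is divided into four CLK slots and that the quasi-I/Q sequences take the three-level patterns shown below; the exact patterns of Fig. 5b are not reproduced in this text, so `I_SEQ` and `Q_SEQ` are assumptions chosen to match the 25% duty cycle and the CLK rate of four times the IF.

```python
import cmath

# Assumed three-level patterns over one IF period (four CLK slots).
I_SEQ = [1, 0, -1, 0]
Q_SEQ = [0, 1, 0, -1]


def fundamental(seq):
    """First DFT bin of a one-period sequence (the IF-rate component)."""
    n = len(seq)
    return sum(x * cmath.exp(-2j * cmath.pi * k / n) for k, x in enumerate(seq))


# Non-overlapping: the two sequences are never non-zero in the same slot.
overlap = any(i != 0 and q != 0 for i, q in zip(I_SEQ, Q_SEQ))

# Orthogonality over one period.
dot = sum(i * q for i, q in zip(I_SEQ, Q_SEQ))

# Phase of Q relative to I at the IF.
phase_deg = cmath.phase(fundamental(Q_SEQ) / fundamental(I_SEQ)) * 180 / cmath.pi

print("overlap:", overlap, "dot product:", dot, "phase (deg):", round(phase_deg, 6))
```

In this model the quadrature relation follows from the slot positions alone, so an edge that moves within its own slot changes the amplitude of one sequence but not the slot it occupies.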
Although the tolerable timing error of the proposed DQDC is the same as that in [16], the current design features a higher CG of 0.45 (0.24 in [16]) while requiring a lower CLK rate of four times the IF (eight times in [16]).
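The closed-form CG expression is not reproduced in this text. One expression that matches both quoted values is CG = (2/π)·sin(πd), i.e. half the fundamental Fourier amplitude of a three-level return-to-zero ±1 waveform whose positive and negative pulses each occupy a fraction d of the period. The sketch below assumes that expression, and also assumes that the eight-times CLK of [16] corresponds to d = 1/8; both are assumptions made here, not statements from the source.

```python
import math


def conversion_gain(duty_cycle: float) -> float:
    """Assumed CG of a return-to-zero switching mixer: (2/pi) * sin(pi * d)."""
    return (2 / math.pi) * math.sin(math.pi * duty_cycle)


cg_this_work = conversion_gain(0.25)   # CLK at 4x the IF
cg_ref16 = conversion_gain(0.125)      # assumed duty cycle for an 8x CLK

print(f"d = 25.0%: CG = {cg_this_work:.2f} ({20 * math.log10(cg_this_work):.1f} dB)")
print(f"d = 12.5%: CG = {cg_ref16:.2f}")
```

The 25% case corresponds to an attenuation of about 7 dB, which agrees with the NF penalty quoted earlier in this section.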
5.2 Channel-selection filter and PGA
The block schematics of the implemented channel-selection LPF and PGA are depicted in Fig. 7. A third-order Butterworth active-RC LPF (1 uniquad plus 1 biquad with Q = 1.3065) in conjunction with a 3-stage 17-MHz constant-BW PGA provides the required selectivity and a controllable gain range from -2 to 50 dB with a 2-dB step size. Two coarse stages (6-dB step size) followed by a fine stage (2-dB step size) optimise the gain-switching transients in the PGA. With a positive zero (R_ff and C_z) added to the PGA's third stage and optimised through iterative simulations, the group-delay peaking at the band edge is 14.8 ns. The resistor R_BW is a resistor array for tuning the BW digitally with a 5-bit control word. To operate the OpAmp at a minimum supply voltage, the common-mode voltage of its virtual ground (V_cm,in) has to be close to one of the supply rails (here: V_cm,in = V_DD/10 = 0.1 V) [17]. This constraint creates transients in gain tuning, since the output common-mode voltage of the OpAmp (V_cm,out) must be at mid-rail for maximum output swing (here: V_cm,out = V_DD/2 = 0.5 V). To tackle this problem, a current-source bank [I_fb,1, ..., I_fb,n] and a resistor bank [R_x,1, ..., R_x,n] are added. The former replaces the OpAmp in delivering the gain-dependent DC current for the gain-tuning resistors [R_fb,1, ..., R_fb,n], whereas the latter sinks the same current out from the virtual ground. The whole operation is governed by
The next step is to involve [R_fb,1, ..., R_fb,n] in (3) such that I'_fb,n ∝ 1/R_fb,n. Matching R_3 to R_fb,1 with R_3 = R_fb,1/4 simultaneously meets that goal and equalises the numerator of (3) to that of the second term in (2) (i.e.
Substituting (4) back into the first term of (2), and replacing R_fb,n and R_x,n according to a_n R_u = R_fb,n = 4R_x,n for n = 1, 2, 3, ... (where R_u denotes the unit resistor and a_n is a positive integer representing a resistive ratio), a practical expression of (2) can be obtained, that is
Recalling that V_z, V_cm,out and V_cm,in are mirrors of V_x (V_DD/10), V_ref,out (V_DD/2) and V_ref,in (V_DD/10), respectively, the error voltage (V_Δ) associated with V_DD and the error resistance (R_Δ) associated with R_u have no effect on the balancing of (5), as given by
As a result, the SCR is made PVT-insensitive. The static and dynamic performances of the SCR technique are further improved by applying the following circuit practices: (i) The current mirroring,
The second property (i.e. constant-BW gain control) of the SCR PGA is related to its feedback factor β_PGA as given by
where it can be observed that, by keeping the ratio of [R_fb,1, ..., R_fb,n] to [R_x,1, ..., R_x,n] identical to that of R_fb to R_ff, a stable β_PGA can be achieved. Considering this requirement together with (2), a transient-free and constant-BW gain control can be simultaneously achieved under the following two conditions, that is
and
A simple example illustrates this concept: with a 1 V supply, V_cm,in and V_cm,out are set to 0.1 and 0.5 V, respectively. To offer a gain range of -12 to 12 dB with a 6-dB step size, we set R_fb = 4R_ff and 4R_x,3-n = R_fb,3-n = 2^n R_ff for n = -1, 0, 1, 2, resulting in a constant β_PGA of 0.2 while satisfying (2) for a transient-free operation. Without applying this technique, β_PGA can vary between 0.2 (at 12 dB) and 0.8 (at -12 dB), which is equivalent to a 4-fold BW variation.
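The 0.2-to-0.8 spread quoted for the case without the SCR technique follows from the textbook inverting amplifier, whose feedback factor is β = R_ff/(R_ff + R_fb). This is a minimal sketch that assumes that conventional topology and a single-pole OpAmp, so that the closed-loop BW scales with β; it does not model the SCR network itself, and the names are illustrative.

```python
def beta_conventional(rfb_over_rff: float) -> float:
    """Feedback factor of a plain inverting amplifier: R_ff / (R_ff + R_fb)."""
    return 1.0 / (1.0 + rfb_over_rff)


# Gain steps of about 6 dB: R_fb / R_ff = 2**n for n = -2 ... 2 (about -12 ... +12 dB).
betas = {6 * n: beta_conventional(2.0 ** n) for n in range(-2, 3)}
for gain_db, beta in betas.items():
    print(f"{gain_db:+3d} dB: beta = {beta:.3f}")

# With a single-pole OpAmp the closed-loop BW is roughly beta * GBW,
# so the BW spread equals the beta spread.
bw_spread = max(betas.values()) / min(betas.values())
print(f"BW spread without SCR: {bw_spread:.1f}x")
```

Holding β at 0.2 for every gain setting, as the SCR arrangement is described as doing, removes this spread.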
5.3 OpAmp and DOC
Negative feedback inherently extends the BW of an OpAmp in closed loop [18]. Exploiting this property in the design of the DOC, each OpAmp is internally made high-pass prior to closed-loop use, such that its high-pass pole is shifted to a lower frequency by the loop gain. Thus, instead of using area to realise the time constant, a very low cut-off frequency DOC can be attained.
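The pole shift can be checked numerically. This is a minimal sketch that assumes a first-order model, A(s) = A0·s/(s + ω_hp), for the internally high-passed OpAmp and ignores its low-pass poles; closing the loop with feedback factor β then gives A0·s/((1 + A0·β)·s + ω_hp), whose high-pass pole sits at ω_hp/(1 + A0·β). The gain, feedback factor and open-loop corner below are hypothetical values for illustration, not figures from the paper.

```python
import math


def closed_loop_highpass_hz(open_loop_hp_hz: float, dc_gain: float,
                            beta: float) -> float:
    """High-pass corner after closing the loop: f_hp / (1 + A0 * beta)."""
    return open_loop_hp_hz / (1 + dc_gain * beta)


def closed_loop_gain(f_hz: float, open_loop_hp_hz: float, dc_gain: float,
                     beta: float) -> complex:
    """Closed-loop response A / (1 + beta * A) with A = A0 * s / (s + w_hp)."""
    s = 1j * f_hz  # frequencies in Hz on both sides, so the 2*pi factors cancel
    a = dc_gain * s / (s + open_loop_hp_hz)
    return a / (1 + beta * a)


# Hypothetical values for illustration only.
A0, BETA, F_HP_OPEN = 1000.0, 0.2, 1.0e6

f_hp_closed = closed_loop_highpass_hz(F_HP_OPEN, A0, BETA)
passband = A0 / (1 + A0 * BETA)
gain_at_corner = abs(closed_loop_gain(f_hp_closed, F_HP_OPEN, A0, BETA))

print(f"open-loop corner: {F_HP_OPEN / 1e3:.0f} kHz")
print(f"closed-loop corner: {f_hp_closed / 1e3:.2f} kHz")
print(f"gain at corner / passband: {gain_at_corner / passband:.4f}")
```

The gain at the shifted corner is 1/√2 of the passband value, confirming that it is the -3 dB point of the closed-loop response in this model.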
The circuit structure of the proposed OpAmp with DOC is shown in Fig. 7. Since the DOC loop has to be switchable for fast DC-offset transients, the feedback node is selected in between the transconductance (TCA) and transimpedance amplifiers. Together they form the first gain stage of the OpAmp to drive the second-stage voltage amplifier, while offering a low impedance level at their interface, which is important to minimise the switching transient of the DOC. Moreover, since the DOC is preceded by the TCA and is operated inside the OpAmp, its induced noise is effectively lowered by the TCA and the loop gain. The DOC is realised by two balanced current amplifiers (CAs) and a differential capacitor (C_oc). By using a resistive input (R_oc), the CAs can directly absorb the high-swing output signal from the OpAmp. The transistor-level implementation is summarised next. To realise a large time constant, in the order of 0.1 ms, two circuit techniques are applied to the DOC: (i) The first uses a self-biased subthreshold cascode current mirror to realise A_i(s). As shown in Fig. 9b, M_oc5 and M_oc6 are biased in the saturation region to absorb the DC current (10 µA) from V_outp and V_outn, respectively. Owing to the body effect associated with M_oc9 and M_oc10, they can simply be biased into the subthreshold region by using long-channel devices for M_oc13 and M_oc14. It is known that a subthreshold-biased MOS transistor offers a very high intrinsic DC gain that is independent of device geometry [20], making it highly appropriate for realising a large time-constant integrator on-chip; (ii) A sink-/source-exchangeable charge pump (M_oc1-M_oc2) serves as the output stage, to relax the linearity requirement of A_i(s) and to reduce the signal swing applied at C_oc. The low signal swing across C_oc allows it to be implemented by weakly nonlinear depletion-mode MOS capacitors (M_oc17 and M_oc18) for further area savings (a 9-fold reduction in area is achieved when compared with poly-poly capacitors).
6 Experimental results
The prototype is fabricated in a 0.35-µm CMOS process. The chip micrograph is shown in Fig. 10. Dual channels (I and Q) are integrated differentially with a 3-mm² core. The dynamic performances of the implemented analogue-BB chain are characterised in three different ways to demonstrate its compliance with the standards: that is, all tuning operations must settle within the 5.6 µs available in the short preamble field. Fig. 11a shows the switching of the DOC loop, where a negligible transient settling at the start and stop slots is observed. For the gain tuning, as shown in Fig. 11b, the gain-switched transient in a 52-dB gain step settles within 1 µs. The last test is the channel-selection transient, which is measured to be 0.38 µs (not shown).
In the frequency domain, the I/Q channel isolation is measured to be >60 dB (Fig. 12) by applying the test source to only one channel while measuring both. Fig. 12 also shows the achieved stopband rejection ratio at the adjacent (32 dB at 20 MHz) and alternate (90 dB at 40 MHz) channels. Both values safely meet the requirements of the 11a/g-OFDM mode: 16 dB at 20 MHz and 32 dB at 40 MHz. However, additional filtering would be necessary in the digital domain for the 11b/g-CCK mode [21], since the demanded adjacent-channel rejection is 35 dB.
We assess the image rejection by measuring the mismatches of the I and Q channels. The gain/phase mismatches are measured to be 0.17 dB/0.39° and 0.16 dB/0.7° for 11a/g, respectively. These results correspond to an averaged IRR of 40 dB over the entire signal band, which is much better than the targeted 30 dB (Section 3.1).
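The link between I/Q mismatch and IRR can be checked with the standard single-tone expression IRR = (1 + 2g·cosθ + g²)/(1 − 2g·cosθ + g²), where g is the linear gain ratio and θ the phase error. This is a minimal sketch that assumes that textbook model applies to the measured mismatches; the source states only the mismatches and the averaged result, and the function name is illustrative.

```python
import math


def irr_db(gain_mismatch_db: float, phase_mismatch_deg: float) -> float:
    """Image-rejection ratio from I/Q gain and phase mismatch (textbook model)."""
    g = 10 ** (gain_mismatch_db / 20)
    c = math.cos(math.radians(phase_mismatch_deg))
    return 10 * math.log10((1 + 2 * g * c + g * g) / (1 - 2 * g * c + g * g))


irr_case_1 = irr_db(0.17, 0.39)
irr_case_2 = irr_db(0.16, 0.7)

print(f"0.17 dB / 0.39 deg -> {irr_case_1:.1f} dB")
print(f"0.16 dB / 0.70 deg -> {irr_case_2:.1f} dB")
```

Both cases land between 39 and 40 dB in this model, in line with the averaged figure quoted above and well clear of the 30-dB target.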
System performances are measured with modulated signals to test the conformity of the analogue-BB chain. Fig. 13a shows the constellation diagram and EVM results of the 11a/g mode obtained by injecting a 54-Mb/s, -31.8-dBm, 64-QAM OFDM signal. The measured EVM of -27.03 dB (4.45%) meets the standard-allowed -25 dB (5.6%) with a margin of about 2 dB. Similar results for the 11b mode (Fig. 13b) are measured by using an 11-Mb/s, -32.7-dBm, DSSS-CCK signal as the test source. It achieves an EVM of -17.04 dB (14.07%), which comfortably satisfies the standard-allowed -9 dB (35.5%).
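The dB and percentage EVM figures above are related by EVM(%) = 100·10^(EVM_dB/20). The short sketch below applies that conversion to the four quoted values and prints the margins; the helper names and the dictionary layout are illustrative.

```python
import math


def evm_db_to_percent(evm_db: float) -> float:
    return 100 * 10 ** (evm_db / 20)


def evm_percent_to_db(evm_percent: float) -> float:
    return 20 * math.log10(evm_percent / 100)


# (measured dB, standard limit dB) as quoted in the text.
results = {"11a/g": (-27.03, -25.0), "11b": (-17.04, -9.0)}

for mode, (measured_db, limit_db) in results.items():
    margin_db = limit_db - measured_db
    print(f"{mode}: {evm_db_to_percent(measured_db):.2f}% measured, "
          f"{evm_db_to_percent(limit_db):.1f}% allowed, margin {margin_db:.2f} dB")
```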
The main performance metrics are summarised in Table 2 , and the block-level measurement results are summarised elsewhere [22] . 
7 Conclusions and benchmarks
In summary, we have demonstrated the feasibility of realising the required reconfigurable analogue-BB functions of IEEE 802.11a/b/g-WLAN receivers under a low-voltage supply of 1 V. A proper permutation of the functional blocks together with the use of a ZIF/LIF-mixed downconversion has successfully optimised the reception of OFDM and CCK channels through a simple reconfiguration. In terms of circuit implementation, the newly proposed functional blocks, namely the series-switching mixer-quad, the switched-current-resistor PGA and the inside-OpAmp DOC, have together met the standard requirements with low power consumption. All circuit techniques are generally applicable to other wireless systems.
A comparison of this work with state-of-the-art implementations is made in Table 3 [6] [7] [8] [9] [10] [11]. This work exhibits the lowest-voltage standard-compliant analogue-BB chain ever reported, while delivering a competitive performance when compared with [10], which targets the same WLAN applications.
Acknowledgment
This work is funded by the Research Committee of the University of Macau and the Macau Science and Technology Development Fund (FDCT).
