Abstract-A 1-Mb/s 916.5-MHz on-off keying (OOK) transceiver for short-range wireless sensor networks has been designed in a 0.18-μm CMOS process. The receiver has an envelope detection based architecture with a highly scalable RF front-end. Untuned RF circuits are leveraged and optimized in the receiver to achieve superior energy efficiency compared to tuned RF circuits. The receiver power consumption scales from 0.5 mW to 2.6 mW, with an associated sensitivity of −37 dBm to −65 dBm at a BER of $10^{-3}$. The transmitter consumes 3.8 mW to 9.1 mW with output power from −11.4 dBm to −2.2 dBm. The receiver achieves a startup time of 2.5 μs, allowing for efficient duty cycling.
I. INTRODUCTION
IN RECENT YEARS, the semiconductor industry has produced consistently improving wireless chipsets in terms of functionality, cost, form factor and power consumption. These advances have enabled the emergence of wireless microsensor networks, which consist of a group of sensor nodes that are deployed remotely and used to relay sensing data to the end-user. Applications for sensor networks range from military, such as target tracking, to consumer and industrial, such as hospital monitoring or distributed sensing of factory equipment. As sensor networks mature, it is expected that nodes will further reduce in size and cost, allowing for the emergence of large scale sensor networks consisting of thousands of nodes [1].
Sensor nodes are typically deployed in remote or inaccessible locations and must be powered either by batteries or through energy harvesting. Regardless of the energy source, energy efficiency is of paramount importance to enable node lifetimes on the order of months to years. The wireless transceiver is a critical block in sensor nodes as it often consumes the majority of available energy. Typical tracking and monitoring applications require transceivers to support average data rates up to tens of kilobits per second. A key metric for measuring the energy efficiency of wireless transceivers is energy per bit, representing the average amount of energy required by a transceiver to transmit or receive a single bit of data. Early transceivers designed for sensor networks achieved an energy per bit as low as 10 nJ/bit, and recent transceivers have achieved approximately 1 nJ/bit [2]-[4]. Although the energy per bit metric is valuable for comparing transceivers, there are numerous other specifications that must be considered, including receiver sensitivity, transmitter output power and data rate. For example, a passive RFID system has an energy per bit of 0 but at the cost of extremely low receiver sensitivity.
This paper presents a 0.18-μm CMOS wireless transceiver suitable for large scale sensor networks with closely spaced nodes (less than 10 m apart) [5]. Through the use of high data rates and a scalable gain architecture, the transceiver achieves energy per bit values as low as 0.5 nJ/bit for the receiver and 3.8 nJ/bit for the transmitter. Particular focus is placed on optimizing the energy efficiency of the receiver RF front-end, as the receiver is typically active more often than the transmitter and the front-end consumes the majority of power in the receiver. Untuned RF circuits are leveraged in a scalable configuration to minimize both power consumption and die area when compared to tuned RF circuits.
Due to the transceiver's scalability, energy efficiency can be maximized over a wide range of operating conditions. To further reduce energy requirements for short packets, the receiver startup time is less than three bit periods.
Architecture and system specifications of the transceiver are presented in Section II. Implementation details of the transceiver are presented in Section III, and Section IV presents measured results.
II. ARCHITECTURE AND SYSTEM SPECIFICATIONS
A block diagram of the transceiver is shown in Fig. 1. The transceiver operates in a single channel centered at 916.5 MHz and employs on-off keying (OOK) modulation. The receiver uses an envelope detection based architecture that eliminates the need for a local oscillator. The transmitter generates the OOK waveform by modulating the output of a surface acoustic wave (SAW) stabilized oscillator. Passive SAW components allow for reduced power consumption and die area at the expense of reduced integration and flexibility. Motivations for these system and architectural specifications are described in the following subsections.
0018-9200/$25.00 © 2007 IEEE
A. Modulation
To reduce infrastructure costs and increase user capacity, traditional cellular and wireless local area network (WLAN) systems place a high value on both spectral efficiency and receiver sensitivity. Thus, transceivers employ bandwidth efficient modulation schemes like Gaussian minimum-shift keying (GMSK) or quadrature amplitude modulation (QAM) combined with coherent receiver architectures; however, these transceivers consume too much energy for sensor network applications. To address this problem, in 2003 the IEEE approved the 802.15.4 standard for low-power wireless personal area networks (WPANs). 802.15.4 supports both binary phase-shift keying (BPSK) and offset quadrature phase-shift keying (O-QPSK) modulation at a maximum data rate of 250 kb/s. Current 802.15.4 transceivers consume tens of milliwatts and have approximate energy per bit values of 100 nJ/bit, which is less than cellular systems but still too high for many sensor network applications.
To achieve an improved energy per bit and lower power levels than 802.15.4, OOK modulation with a noncoherent receiver architecture is proposed. Noncoherent OOK detection allows the receiver to be built around an envelope detector. In contrast with a coherent receiver architecture, no oscillator is required for phase synchronization and the receiver can turn on quickly. Furthermore, when received power levels are large, the power consumption can be dramatically decreased, as little RF gain is required and no RF oscillation must be sustained.
Two limitations of OOK modulation are that it is spectrally inefficient and that it is strongly susceptible to interferers. These two limitations, however, are acceptable given that energy efficiency is the primary design consideration. For sensor nodes that are deployed in remote environments, there is ample bandwidth available and few interferers. An additional limitation of OOK modulation is that the implemented energy detection based receiver requires an energy per bit to noise spectral density ratio ($E_b/N_0$) approximately 9 dB greater than coherent BPSK at a given bit error rate (BER) [6]. For a short-range link of less than 10 m with a maximum free-space path loss of 51.6 dB, this 9-dB performance penalty is acceptable because there is sufficient margin in the link budget.
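The path-loss figure quoted above can be checked with the standard Friis free-space formula. The sketch below is illustrative only: the function names are hypothetical, and it assumes isotropic (0-dBi) antennas with no fading or implementation losses. At 916.5 MHz and 10 m it evaluates to roughly 51.7 dB, in line with the 51.6 dB quoted in the text to within rounding.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB between two isotropic antennas (Friis)."""
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / frequency_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


def link_margin_db(tx_power_dbm: float, path_loss_db: float,
                   sensitivity_dbm: float) -> float:
    """Margin between received power and receiver sensitivity.

    Assumption: 0-dBi antennas, no fading, no implementation losses.
    """
    received_dbm = tx_power_dbm - path_loss_db
    return received_dbm - sensitivity_dbm


if __name__ == "__main__":
    loss_db = free_space_path_loss_db(10.0, 916.5e6)
    print(f"path loss at 10 m: {loss_db:.1f} dB")
    # Minimum and maximum transmit power and the best-case sensitivity
    # (-65 dBm) are the values reported for this transceiver.
    for tx_dbm in (-11.4, -2.2):
        margin = link_margin_db(tx_dbm, loss_db, -65.0)
        print(f"TX {tx_dbm:+.1f} dBm -> margin {margin:.1f} dB")
```

Under these idealized assumptions the link closes at 10 m even at the lowest transmit power, though with only a small margin.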
B. Data Rate
For sensor network applications like acoustic tracking and detection, a data rate of only hundreds to thousands of bits per second is required; however, for optimal energy efficiency it is often advantageous to operate the transceiver at a higher instantaneous data rate and turn off the radio periodically. In practice, there is an upper limit above which increased duty cycling and higher instantaneous data rates decrease energy efficiency. One key problem is that the startup time associated with turning on a transceiver has an associated energy that cannot be reduced by increasing data rates [3] . A second problem is that increased instantaneous power consumption results in worse battery efficiency or requires larger decoupling capacitors [7] . Accounting for the aforementioned concerns, the transceiver operates at a data rate of 1 Mb/s, which allows many nodes to share the same channel through time division multiplexing.
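The fixed startup cost described above can be made concrete with a small calculation. The sketch below is a simplified model, not the authors' analysis: it assumes the radio draws its full active power during startup, and the function names are hypothetical. The 0.5-mW receiver power and 2.5-μs startup time are the values reported for this receiver.

```python
def energy_per_bit_nj(active_power_mw: float, data_rate_bps: float,
                      packet_bits: int, startup_time_s: float) -> float:
    """Energy per bit in nJ for one packet, including startup.

    Assumption: the radio consumes its full active power while starting up.
    """
    on_time_s = startup_time_s + packet_bits / data_rate_bps
    energy_j = active_power_mw * 1e-3 * on_time_s
    return energy_j / packet_bits * 1e9


def startup_overhead(packet_bits: int, data_rate_bps: float,
                     startup_time_s: float) -> float:
    """Fractional increase in energy per bit caused by startup."""
    return startup_time_s * data_rate_bps / packet_bits


if __name__ == "__main__":
    for bits in (50, 1000):
        e = energy_per_bit_nj(0.5, 1e6, bits, 2.5e-6)
        o = startup_overhead(bits, 1e6, 2.5e-6)
        print(f"{bits:5d} bits: {e:.3f} nJ/bit, overhead {o:.2%}")
```

Because the overhead is the startup time divided by the packet duration, raising the data rate shortens the packet but not the startup, so the overhead fraction grows with data rate for a fixed packet length.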
C. Regenerative Versus Nonregenerative Amplification
The system architecture shown in Fig. 1 achieves frequency selectivity with a SAW filter whose output is amplified by a nonregenerative amplifier. An alternate receiver architecture that is often used for low-power applications is the super-regenerative receiver. To achieve equally precise frequency selectivity as the proposed architecture, a super-regenerative receiver requires an off-chip SAW resonator to be placed in a feedback configuration along the signal path. Such an approach was not chosen due to its high sensitivity to wiring parasitics at 900 MHz; however, a promising research direction is integrating such resonators directly on or next to the receiver's bare die [8] .
III. TRANSCEIVER IMPLEMENTATION

A. Receiver Front-End
A detailed block diagram of the receiver front-end is shown in Fig. 2. A common-gate low noise amplifier (LNA) with a tuned LC load first amplifies the RF input. The load capacitor is implemented as an accumulation mode on-chip capacitor to allow for tuning and the load inductor is off-chip. Following the LNA are five separate signal paths, each corresponding to a different gain setting. The RF front-end startup time is less than a microsecond and is set by the settling time of bias currents.
To support gain scalability and achieve optimal energy efficiency over a wide range of operating conditions, the front-end has multiple, digitally controlled gain settings. A sequential gain architecture is used, in which a variable number of gain stages can be activated [9]. At any given time, only one horizontal signal path is active. At the lowest gain setting, the LNA output is directly fed to the envelope detector, whereas at the highest gain setting, the LNA output is amplified by five resistively loaded RF amplifiers. The first amplifier in each slice acts as a single-to-differential converter. As the single-to-differential amplifier does not provide significant gain, it is always followed by at least one amplifier. Later amplifiers are differential amplifiers with capacitively coupled source terminals. This circuit topology serves to mitigate the effect of cascaded DC offsets while allowing for good high-frequency common-mode rejection.
Parallel slices are used instead of a single slice with multiple output taps so that each gain setting can have different device sizing and bias currents. To meet noise constraints at high gain settings, the single-to-differential converter load resistance is decreased and bias currents are increased from their optimal power efficiency values. An additional benefit of having parallel slices is that there is no unnecessary loading on intermediate nodes by unused envelope detectors. The area overhead of these parallel slices is minimal, due to the small size required by each untuned amplifier. The motivation for and optimization of these untuned amplifiers is described in the following section.
B. Energy-Efficient RF Gain
For the receiver envelope detector to properly function, the RF input must be amplified to approximately 60 mV. Given an input signal of −65 dBm, a voltage gain of 45 dB is required. Even if it were possible to achieve such a large gain in a single stage, it would not necessarily be the most energy efficient approach. The optimal number of stages to minimize power consumption for BJT and CMOS amplifiers has been derived for a variety of configurations [10]. This approach serves as the basis for the following general energy-efficiency metric:
$$\eta = \frac{20\log_{10} A_v}{P} \qquad (1)$$
where $A_v$ is the total voltage gain and $P$ is the total power consumption.
This metric can be intuitively explained by noting that for cascaded, identical gain stages, the total gain increases exponentially with the number of stages whereas power consumption increases linearly. The efficiency metric is more rigorously justified with the following two tenets. The first tenet of the metric is that only the total gain and power consumption affect energy efficiency. Other factors such as the number of gain stages, power supply rejection, input-referred noise and linearity are not modeled in the metric. Given an efficiency metric $\eta$, total gain $A_v$ and total power $P$, this tenet implies that
$$\eta = f(A_v, P) \qquad (2)$$
The second tenet of the efficiency metric is that the number of times a stage is cascaded has no effect on its energy efficiency. This allows for a fair comparison between different gain blocks, regardless of how many stages are cascaded. Thus, the efficiency metric can be used for both single-stage and multi-stage gain topologies. Given $N$ cascaded identical gain stages, each with gain of $A$ and power $P_0$, the second tenet implies that
$$f(A^N, N P_0) = f(A, P_0) \qquad (3)$$
These two tenets combine to form (1).
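The second tenet can be checked numerically. The sketch below assumes the metric takes the form of gain in decibels divided by power, which is one form consistent with both tenets; the per-stage gain and power values are hypothetical and chosen only for illustration.

```python
import math


def efficiency_db_per_mw(voltage_gain: float, power_mw: float) -> float:
    """Assumed metric: total gain in dB divided by total power in mW."""
    return 20.0 * math.log10(voltage_gain) / power_mw


def cascade(stage_gain: float, stage_power_mw: float, stages: int):
    """Total gain and power of identical cascaded stages.

    Gain multiplies (exponential in the stage count) while power adds (linear).
    """
    return stage_gain ** stages, stage_power_mw * stages


if __name__ == "__main__":
    stage_gain, stage_power_mw = 2.5, 0.1  # hypothetical per-stage values
    for n in (1, 3, 5):
        gain, power = cascade(stage_gain, stage_power_mw, n)
        eta = efficiency_db_per_mw(gain, power)
        print(f"{n} stage(s): gain {gain:8.2f}, power {power:.1f} mW, "
              f"efficiency {eta:.2f} dB/mW")
```

The efficiency is the same for every stage count, which is what allows single-stage and multi-stage gain blocks to be compared on equal terms.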
The metric described in (1) is used to determine the optimal amplifier topology as well as appropriate biasing and sizing of devices. In particular, the efficiency of a tuned amplifier [Fig. 3(a)] is compared to that of an untuned, resistively loaded RF amplifier [Fig. 3(b)]. For the tuned gain circuit, the resonant load LC tank is assumed to have an impedance of 800 Ω, corresponding to a tank quality factor of approximately 28 for typical tank configurations. At 900 MHz in a standard CMOS process, a quality factor of 28 would necessitate an off-chip inductor. In the untuned amplifier, a voltage drop of 0.6 V across the load resistors is assumed, and hence the amplifier operates at a higher supply voltage than the tuned amplifier. The source-coupling capacitor in Fig. 3(b) is used for DC offset compensation. It is connected between the source terminals of the input transistors, rather than inserting coupling capacitors at their gates, for two key reasons: 1) Capacitors at the gates would degrade per-stage gain by acting as a capacitive divider between stages, whereas the source-coupling capacitor can be sized to minimize gain reduction. 2) To first order, the parasitic capacitance of the source-coupling capacitor does not degrade gain, but instead worsens high-frequency input common-mode rejection. In contrast, parasitic capacitance of capacitors at the gate terminals would degrade gain significantly. To accurately model the capacitive loading at the output of the untuned gain stage, identical untuned gain stages are cascaded. For both topologies, the bias current is swept over multiple values and the width of the input transistors is also varied. Fig. 4 presents the simulated gain for both tuned and untuned amplifiers, and Fig. 5 presents the corresponding energy efficiency using the metric in (1). At 915 MHz in the given 0.18-μm CMOS process, the maximum energy efficiency of an untuned RF amplifier is superior to that of a tuned RF amplifier by approximately 20%.
This conclusion motivates the RF front-end architecture shown in Fig. 2. The first amplifier is a tuned LNA to improve noise performance of the receiver, but later stages are all untuned amplifiers to maximize energy efficiency. Untuned amplifiers have the additional benefit of not occupying significant die area, thereby reducing the cost and size of the sensor node.
C. Analysis of the Untuned RF Amplifier
To complement the simulation results presented in the preceding section, an analytical model of the untuned RF amplifier and its optimal sizing is developed. The untuned amplifier can be represented by the following transfer function: (4) Equation (4) represents a first-order low-pass amplifier with a pole-zero pair introduced by the capacitor . To analytically solve for the energy-efficiency metric, the following approximations are made:
(5) (6) (7) where , , and are constants that scale load capacitance. Because is sized sufficiently large to ensure little gain degradation at 915 MHz, it can be assumed to be an AC short and thus the amplifier is represented by a simple first-order transfer function. As the per-stage gain is small and of the input transistors is significantly less than , the Miller effect is ignored. Assuming a voltage across the load resistor, can be represented in terms of : (8) With the above assumptions, it is possible to obtain an analytic expression for the gain and the energy efficiency in terms of , and constants.
Equations (9) and (10) are plotted in Fig. 6 , which closely matches the plots in Figs. 4(b) and 5(b). To obtain insight into the analytical model, one can derive an expression for the optimal width for a given current. Since the device width does not affect power consumption, the point of maximum gain corresponds to the point of maximum efficiency: (11) By differentiating the gain expression in (11), one can solve for the optimum width .
where (12) MHz (13)
In Fig. 6(b), the optimal width is relatively constant regardless of power consumption. This is because the numerator in (13) is dominated by the term, which represents the fixed interconnect capacitance. The term is much less significant than and at low currents MHz . Thus, for optimum energy efficiency, interconnect capacitance must be minimized through careful layout. If active MOSFET loads are used instead of resistive loads, the load resistance can be considered an independent parameter in the optimization analysis. This allows for additional control over the tradeoffs between energy efficiency, linearity, and noise.
D. Envelope Detector
Fig. 7 presents the schematic of the pseudo-differential envelope detector. The envelope detector operates similarly to a diode-based rectifier and is a differential pair with the output at the source node of the input pair. This topology can easily support the 1-Mb/s data rate while consuming a total of 10 μW. A single differential pair is enabled depending on the gain setting. For a given OOK input amplitude, the change in output voltage can be represented by the following equation: (14) where represents a modified Bessel function of the first kind [11]. Fig. 8 presents a measured plot of envelope detector output amplitude versus antenna input amplitude for each of the five gain settings. For correct operation, the pseudo-differential envelope detector output must have an amplitude greater than approximately 5 mV, corresponding to the dashed horizontal line in Fig. 8. At lower amplitudes, thermal noise and kickback from the analog-to-digital converter (ADC) introduce bit errors.
E. Baseband Amplifier and ADC
The baseband amplifier consists of three resistively loaded amplifiers, with the first two amplifiers employing output offset compensation. The offset compensation includes preset switches to reduce the settling time at startup. When switching between gain settings, the DC output voltage of the envelope detector can change, requiring these preset switches to be briefly closed. Later stages of the baseband amplifier can be disabled to reduce the overall gain. Given the extensive scalability preceding the ADC and the simplicity of OOK modulation, the 8-MS/s ADC needs only 3 bits of resolution. The multi-bit ADC output can be used by an off-chip digital AGC to determine the appropriate RF gain setting.
A common problem limiting the use of passive, low-frequency offset compensation in baseband amplifiers is that the capacitors and resistors required are extremely large [12]. For a 1-Mb/s, pulse-amplitude modulated signal, a lower cutoff frequency of approximately 1 kHz is appropriate [12]. To implement an RC high-pass filter with a cutoff frequency of 1 kHz and a series capacitance of 1 pF, the shunt resistor must be 160 MΩ. By employing Manchester encoding, the cutoff frequency increases significantly; however, the required shunt resistance remains on the order of 1 to 10 MΩ. Implementing such a high resistance with on-chip poly or N-well resistors would result in substantial parasitic capacitance, thereby increasing the power consumption of the associated amplifier. Transistors can potentially be used to implement high-impedance resistors with low parasitic capacitance; however, their impedance is typically nonlinear over a large voltage range. To overcome this linearity problem, a high-impedance resistor is implemented using forward-biased MOS diodes [13]. For a small, positive voltage drop across a MOS diode, the I-V relationship can be approximated as (15) (16) (17) where is the thermal voltage and both and are empirical parameters. Thus, the effective resistance of a forward-biased MOS diode is . For negative values of , the current is negligible and the resistance increases quickly. To achieve linearity over both positive and negative voltages, one can connect a second diode in parallel with the original diode but with the anode and cathode nodes reversed. This technique, however, results in an asymmetric resistance due to body bias effects. An alternate technique that can reduce this asymmetry is to place a pMOS diode in parallel with an nMOS diode. Fig. 9 shows the resistance of an nMOS diode, a pMOS diode and the two diodes in parallel versus the voltage across them.
When the nMOS and pMOS diodes are placed in parallel, they exhibit a resistance varying from 8 GΩ to 17 GΩ over a 200-mV range.
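The shunt resistance quoted earlier in this subsection for a 1-kHz cutoff with a 1-pF series capacitor follows from the first-order RC high-pass relation $f_c = 1/(2\pi RC)$. The sketch below evaluates that relation in both directions; the function names are hypothetical and the filter is assumed to be an ideal first-order RC network.

```python
import math


def shunt_resistance_ohm(cutoff_hz: float, series_capacitance_f: float) -> float:
    """Resistance that places a first-order RC high-pass corner at cutoff_hz."""
    return 1.0 / (2.0 * math.pi * cutoff_hz * series_capacitance_f)


def cutoff_frequency_hz(resistance_ohm: float, series_capacitance_f: float) -> float:
    """Corner frequency of a first-order RC high-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * series_capacitance_f)


if __name__ == "__main__":
    r = shunt_resistance_ohm(1e3, 1e-12)
    print(f"1 kHz corner with 1 pF: {r / 1e6:.0f} Mohm")
    # Corner frequencies for the 1 to 10 Mohm range mentioned for
    # Manchester-encoded data, still with 1 pF of series capacitance.
    for r_ohm in (1e6, 10e6):
        f = cutoff_frequency_hz(r_ohm, 1e-12)
        print(f"{r_ohm / 1e6:.0f} Mohm with 1 pF: {f / 1e3:.1f} kHz")
```

The result of about 159 MΩ rounds to the 160 MΩ stated in the text.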
Even though an nMOS and pMOS diode in parallel can realize a relatively linear resistor over a 200-mV range, the baseband amplifier requires a linear resistor over an 800-mV range. To achieve a larger linear range, multiple diodes can be stacked in series. Fig. 10 shows the circuit that is implemented to realize a high-impedance resistance with a linear range of 800 mV. Four nMOS and pMOS diodes are stacked in series. The internal nodes of the nMOS and pMOS stacks are connected together to improve transient behavior.
Problems associated with the high-impedance diode resistor are its acute sensitivity to process and temperature variation, its slow time constant, and that the high-resistance node is susceptible to capacitive coupling. To minimize the effect of the slow time constant, a zeroing switch is placed in parallel with the effective resistor to eliminate initial offsets. This switch is a near-minimum-sized transistor to maximize its dependent OFF resistance. The resistors are laid out to minimize capacitive coupling.
F. Transmitter
The transmitter schematic is shown in Fig. 11. The transmitter generates a 1-Mb/s Manchester encoded OOK signal. Although the transceiver can support other types of modulation, Manchester encoding is used to simplify the off-chip baseband demodulator. The SAW stabilized oscillator consists of a Colpitts oscillator coupled to a SAW resonator. The startup time of the oscillator is set by the SAW resonator and is measured to be less than 60 μs. Since the oscillator startup time is longer than a bit period, the oscillator cannot be power gated during transmission of a packet.
The mixer is integrated with the power amplifier (PA), which consists of two cascaded untuned differential pairs followed by a tuned amplifier. The first two PA stages buffer the single-ended oscillator output and convert it to a differential signal. The differential signal allows for a 500-ns OOK pulsewidth with minimal switching transients. When transmitting a "0", the final two stages of the PA are disabled to reduce power consumption. The final amplifier is loaded with an off-chip balun to generate a single-ended output while attenuating harmonic distortion. The PA power output is adjustable through digital control of the final amplifier's bias current.
IV. MEASURED RESULTS
The receiver power consumption scales from 0.5 mW at the lowest gain setting to 2.6 mW at the highest gain setting. The noise figure of the RF front-end, including the 3.5-dB loss of the SAW filter, is between 14 dB and 15 dB for all gain settings, indicating that the tuned LNA dominates the noise figure. The tuned amplifier is supplied 0.8 V and all other receiver circuits are supplied 1.4 V. The startup time of the receiver is 2.5 μs, significantly faster than most PLL based receivers.
The receiver achieves a maximum sensitivity of −65 dBm at a BER of $10^{-3}$. Fig. 12 presents the measured BER of the receiver versus input power for each of the five RF gain settings. BER measurements are made by processing the receiver ADC output with a MATLAB based demodulator. At high gain settings, the RF front-end noise figure (NF) limits sensitivity, whereas at low gain settings the RF front-end gain limits sensitivity. Since the envelope detector accepts a wideband input, the bandwidth of the LNA directly affects the maximum sensitivity through the following equation:
$$P_{\mathrm{sens}} = -174\ \mathrm{dBm/Hz} + 10\log_{10}(BW) + NF + SNR_{\min} \qquad (18)$$
where $BW$ is the LNA bandwidth in hertz, $NF$ is the front-end noise figure and $SNR_{\min}$ is the required signal-to-noise ratio (SNR), both in decibels. The receiver's required SNR for a $10^{-3}$ BER is approximately 16 dB, and the measured 3-dB bandwidth of the LNA is 90 MHz. The calculated sensitivity of −64.5 dBm closely matches the measured sensitivity. Due to the wide LNA bandwidth, the frequency selectivity of the RF front-end is determined primarily by the SAW filter.
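The calculated sensitivity can be reproduced from the numbers given in this section. The sketch below assumes that the sensitivity relation takes the standard noise-floor form (thermal noise density of −174 dBm/Hz at room temperature, plus bandwidth, noise figure and required SNR in decibels) and that the 14-dB lower end of the measured noise-figure range applies; the function name is hypothetical.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # kT at roughly 290 K, a standard approximation


def sensitivity_dbm(bandwidth_hz: float, noise_figure_db: float,
                    required_snr_db: float) -> float:
    """Minimum detectable input power for a noise-limited receiver.

    Assumption: the noise bandwidth equals the 3-dB bandwidth of the LNA.
    """
    noise_floor_dbm = THERMAL_NOISE_DBM_PER_HZ + 10.0 * math.log10(bandwidth_hz)
    return noise_floor_dbm + noise_figure_db + required_snr_db


if __name__ == "__main__":
    # 90-MHz bandwidth and 16-dB SNR are from the text; the noise figure
    # is swept over the measured 14 dB to 15 dB range.
    for nf_db in (14.0, 15.0):
        p = sensitivity_dbm(90e6, nf_db, 16.0)
        print(f"NF {nf_db:.0f} dB -> sensitivity {p:.1f} dBm")
```

The relation also makes the bandwidth tradeoff explicit: under the same assumptions, each tenfold reduction in front-end bandwidth would improve the noise-limited sensitivity by 10 dB.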
The transmitter supports seven output power levels from −11.4 dBm to −2.2 dBm and achieves a peak efficiency of 6.9%. Power numbers provided include off-chip balun losses and trace losses to the antenna. The oscillator is supplied 0.8 V, the final PA stage is supplied 1 V, and all other circuits are supplied 1.4 V. The oscillator achieves a phase noise of −114.3 dBc/Hz at a 1-MHz offset from the carrier. Fig. 13 presents the envelope and frequency spectrum of the PA output when transmitting a Manchester encoded sequence of "1"s.
The energy per bit of the transceiver compares favorably with existing transceivers. This comparison does not normalize the energy per bit values to transmit power or receiver sensitivity; however, for short-range low-power applications such as sensor networks or RFID, it can be argued that receiver sensitivity is ultimately less important than energy requirements. At a link margin of 51.6 dB, which corresponds to a 10-m free-space path loss, the combined transmit and receive energy per bit is 6.3 nJ/bit. For a short packet of 50 bits, the startup energy increases energy per bit values by approximately 5% in the receiver and 25% in the transmitter. A power breakdown of the transceiver is presented in Table I. Fig. 15 presents a chip micrograph of the transceiver. The active area of the chip is 0.27 mm², and the die area is 1.3 mm by 1.4 mm as the chip is pad limited. The radio has been integrated and tested on an acoustic sensor board that includes a microphone, ADC, and DSP. A summary of results is presented in Table II.
V. CONCLUSION
A 1-Mb/s energy-efficient transceiver for wireless sensor networks has been demonstrated. The transceiver is highly scalable and achieves a fast receiver startup time, which allows for efficient operation in low duty cycle, energy-starved scenarios. The transceiver architecture lends itself well to process and voltage scaling due to the absence of op amps and precise analog feedback loops. The radio achieves an energy per bit as low as 0.5 nJ/bit for the receiver and 3.8 nJ/bit for the transmitter.
