Abstract-A 2.4 GHz TX in 65 nm CMOS defines three channels using three high-Q FBARs and supports OOK, BPSK and MSK. The oscillators have -132 dBc/Hz phase noise at 1 MHz offset, and are multiplexed to an efficient resonant buffer. Optimized for low output power around -10 dBm, a fully-integrated PA implements 7.5 dB dynamic output power range using a dynamic impedance transformation network, and is used for amplitude pulse-shaping. Peak PA efficiency is 44.4% and peak TX efficiency is 33%. The entire TX consumes 440 pJ/bit at 1 Mb/s.
This output power level is also the recommended transmit power in the IEEE 802.15.6 BAN standard [2].
Hence, the transmitter architecture must be optimized considering these low radiated output power levels. A generic transmitter architecture is shown in Fig. 1(b). It requires a Local Oscillator (LO), modulation blocks, and a power amplifier. For a -10 dBm, or 100 µW, output power, the PA is no longer the dominant power consumer, and the power of the remaining blocks starts to dominate. If simple binary modulation schemes are used, the power of modulation can be greatly reduced, as is done in [5], [6]. This reduces the problem to developing efficient LO generation schemes. In this work, OOK, BPSK and MSK are implemented.
Multiple schemes have been employed in the literature to generate efficient LOs for low power transmitters. The PLL in [7], for example, is duty-cycled and turned on only just before the packet transmission occurs. This reduces the average power consumption of the PLL, although not the peak power consumption of the transmitter. The PLL allows the LO to be set at any frequency; however, this comes with a slow startup time due to the finite bandwidth of the PLL. A slow frequency-correction loop is employed in [5], [8]. The base station, which has an accurate frequency reference, sends correction signals to the sensor node. The energy penalty on the sensor node is negligible. However, for very low duty-cycle applications, this scheme is difficult to implement, since the drift in the sensor node frequency can become large.
Another approach is to use high-Q direct-RF resonators that provide stable oscillation frequencies at RF, eliminating the need for a PLL. Film Bulk Acoustic wave Resonators (FBAR) or Surface Acoustic Wave (SAW) resonators have been used as frequency references in [6], [9], [10]. The circuit model of these resonators is shown in Fig. 2, along with the impedance plot of a representative FBAR. The device has a series resonance where it presents a low impedance and an intrinsic parallel resonance where it has a high impedance. Most oscillators use this parallel resonance [11], where the high impedance results in higher gain, allowing for low current consumption in the oscillator.
The parallel resonance frequency can be shifted lower by capacitive loading, but this also brings down the parallel resonance impedance, thus increasing power consumption and reducing the filtering provided by the resonator. Typically, the frequency tuning of these oscillators is limited, and one resonator operates only in one frequency channel [9] . However, in an increasingly congested band such as the 2.4 GHz ISM band, multi-channel operation is essential for reliable communication. This can be achieved by adding resonators with different parallel resonance frequencies to define channels, and having an architecture that can efficiently select/multiplex between them. Such two-channel implementations have been demonstrated in [6] , [9] , but these architectures are not as scalable to a large number of channels.
This work aims to exploit the advantages of low power, frequency stability and low noise provided by the resonators, and address the issue of single-channel operation by developing a scalable multi-resonator multi-channel architecture. FBARs are used to demonstrate the ideas in a three-channel implementation. Also, the architecture is optimized to provide high transmit efficiencies for the low -10 dBm output power levels typical in BANs.
II. TRANSMITTER ARCHITECTURE
The multi-channel FBAR-based transmitter architecture is shown in Fig. 3 [12] . Multi-channel capability is achieved through the use of multiple oscillators, each attached to one FBAR. The outputs of the oscillators are then multiplexed through transmission gates onto a low input capacitance resonant buffer. This buffer then drives an efficient push-pull type power amplifier. The architecture supports three modulation schemes, OOK, BPSK and MSK at data rates up to 1 Mbps, all with pulse-shaping capability for improved spectral efficiency. Phase inversion for the BPSK modulation is achieved through an alternate path in the buffer stage that employs matched but inverted delays.
For the pulse shaping, rather than employing a linear mixer and linear power amplifier, the architecture employs a polar scheme. Amplitude modulation is achieved through a digitally tuned impedance transformation network that is part of the PA. The digital logic implementing the pulse shaping based on the input data is implemented in an FPGA. The input data is up-sampled at a chosen oversampling rate (typically 8× to 10× the data rate) and then passed through the appropriate filter (Gaussian for OOK and MSK, and Square Root Raised Cosine for BPSK). The outputs are then quantized to the levels that can be generated by the chip, and the corresponding digital signals are sent to the chip.
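The up-sample, filter and quantize flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the FPGA logic used with the chip: the filter span (3 bit periods), the number of amplitude levels (8) and the function names are assumptions made for the example.

```python
import math


def gaussian_taps(bt, sps, span=3):
    """Odd-length Gaussian FIR taps, normalised to unit sum.

    bt   : bandwidth-time product of the Gaussian filter
    sps  : samples per bit (the oversampling factor)
    span : filter length in bit periods (assumed value)
    """
    sigma = math.sqrt(math.log(2.0)) / (2.0 * math.pi * bt)  # std dev in bit periods
    half = (span * sps) // 2
    taps = [math.exp(-0.5 * ((k / sps) / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def shape_and_quantize(bits, sps=10, bt=0.3, levels=8):
    """Up-sample, Gaussian-filter and quantize a bit stream to amplitude codes."""
    held = [float(b) for b in bits for _ in range(sps)]  # zero-order-hold up-sampling
    taps = gaussian_taps(bt, sps)
    half = len(taps) // 2
    codes = []
    for n in range(len(held)):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = n + k - half  # taps are symmetric, so this equals a convolution
            if 0 <= idx < len(held):
                acc += tap * held[idx]
        codes.append(round(acc * (levels - 1)))  # nearest available amplitude level
    return codes


if __name__ == "__main__":
    print(shape_and_quantize([0, 1, 1, 0, 1, 0]))
```

Each returned code would select one setting of the impedance transformation network; the mapping from code to capacitor-bank word is chip-specific and is not modelled here.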
A low voltage design (0.7 V) coupled with rail-to-rail swing on all RF nodes is used for improved power efficiency of the transmitter. Full swing on the oscillator minimizes short-circuit current in the buffers, while full swing at the input of the PA maximizes overdrive. Each of the individual circuit blocks in the architecture is explained in detail in the following sections, followed by the measurement results.
III. FBAR OSCILLATORS
Resonator-based oscillator designs, including those with FBARs, have been studied extensively in the literature [11], [13], [14]. The high Q and frequency stability of the resonators, consequences of their mechanical properties, enable stable LO generation. The high Q also results in low phase noise, and the intrinsic stability of the resonator allows for PLL-less systems, achieving fast startup times.
An inverter-style Pierce oscillator topology is chosen in this work [13]. The schematic is shown in Fig. 3. This topology was chosen because it provides rail-to-rail output, which is important for driving the next stage efficiently. Also, due to the current re-use in the NMOS and PMOS, the power consumption is reduced by 2× compared to an NMOS-only implementation. The inverter is biased at mid-rail through the bias resistor. The current of the circuit at this bias point should be such that the startup condition, $g_m > g_{m,crit}$, is met, where $g_{m,crit}$ is the critical transconductance below which oscillations cannot be sustained [14]. The oscillator has an enable signal to turn on/off the particular channel.
The oscillator is operated at a lower-than-nominal voltage in order to reduce undesired short-circuit power in the NMOS and PMOS devices. As the supply voltage lowers, the devices become biased in the sub-threshold region, which further improves their transconductance efficiency. However, biasing the devices too deep into sub-threshold requires a proportional increase in their size to achieve start-up, thereby significantly increasing parasitic capacitance, which loads the FBAR and brings down the effective parallel resonance impedance. Taking these issues into consideration, this work employs a 0.7 V power supply for the oscillator.
A. MSK Modulation
One way of implementing Minimum Shift Keying (MSK) is through FSK modulation by setting the frequency deviation to be $\pm f_b/4$, where $f_b$ is the data rate. Hence, in order to achieve MSK modulation at 1 Mbps, center frequency tuning of ±250 kHz must be provided. This is achieved by a digitally controlled capacitor bank of 150 fF, as shown in Fig. 3. This number has been calculated, through simulations, for the worst-case resonators, which corresponds to the highest resonator Q. Since the digitally switched capacitors have a finite quality factor $Q_C$, they present an effective resistive load on the oscillator, given by $R_p = Q_C/(\omega C)$. This effective resistive load (about 5-10 kΩ) is designed to be much larger than the intrinsic parallel resonance impedance of the FBAR (about 2 kΩ for the resonators used in this work).
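As a rough numerical check of the two relations above, the following sketch computes the MSK deviation for a given data rate and the equivalent parallel resistance of a lossy capacitor. The capacitor quality factor of 15 and the 2.45 GHz carrier are illustrative assumptions; the text only states that the resulting load is about 5-10 kΩ.

```python
import math


def msk_deviation_hz(bit_rate_hz):
    """Peak frequency deviation for MSK: one quarter of the data rate."""
    return bit_rate_hz / 4.0


def cap_parallel_resistance(q_cap, c_farad, f_hz):
    """Equivalent parallel resistance of a capacitor with quality factor q_cap."""
    return q_cap / (2.0 * math.pi * f_hz * c_farad)


def parallel(r1, r2):
    """Parallel combination of two resistances."""
    return r1 * r2 / (r1 + r2)


if __name__ == "__main__":
    dev = msk_deviation_hz(1e6)                      # 1 Mbps data rate
    r_p = cap_parallel_resistance(15.0, 150e-15, 2.45e9)  # Q = 15 is an assumed value
    print(f"deviation = +/-{dev / 1e3:.0f} kHz")
    print(f"capacitor-bank loading = {r_p / 1e3:.1f} kOhm")
    print(f"combined with a 2 kOhm FBAR = {parallel(2e3, r_p) / 1e3:.2f} kOhm")
```

With these assumed numbers the bank loading lands inside the 5-10 kΩ window quoted in the text, and the combined tank impedance stays close to the 2 kΩ of the FBAR alone.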
B. Considerations for Multi-Channel Capability
Since direct multiplexing of the high-Q FBARs is not feasible, multi-channel capability is achieved by replicating the oscillator circuits for each resonator. However, since each resonator is about 400 µm × 600 µm, the floor-planning of the chip becomes important. It is essential that the bondwires used to connect the resonator to the chip be as short as possible. This is because the bondwire inductance can cause parasitic oscillations with the parallel capacitance of the FBAR and the parasitic capacitors of the oscillator [15]. The shorter the bondwires, the higher the parasitic oscillation frequency, and the less likely it is to affect the circuit, since the bondwire resistance also increases with frequency. Simulations suggest that the bondwires be under 400 µm, in which case the parasitic oscillations would have to occur at frequencies greater than 15 GHz. The floor-planning problem is solved by connecting the three resonators to three different sides of the chip, as shown in the package photo in Fig. 12(b).
IV. FBAR OSCILLATOR MULTIPLEXING FOR CHANNEL SELECTION
Since one FBAR oscillator defines a single channel, multi-channel operation in this architecture is achieved through oscillator multiplexing. The input to this block is a rail-to-rail swinging signal generated by the Pierce oscillator. The output must be a rail-to-rail swinging signal driving the PA, which presents about 200 fF input capacitance. Two methods for multiplexing are considered.
A. Direct Multiplexing to the PA
Shown in Fig. 4(a), this scheme ideally has near zero power overhead, since the PA capacitance is in resonance with the oscillator. In practice, the finite effective Q of this capacitance presents an equivalent resistive load to the oscillator, which de-Qs it, increasing power consumption.
The following analysis calculates the effective resistance seen by the oscillator. Let $C_L$ be the capacitance of the load that is being multiplexed. Let $C_{sw}$ be the parasitic capacitance of each transmission gate as seen on the load side. Thus, if the number of channels multiplexed is $N$, the total capacitance presented to the oscillator would be $C_L + N C_{sw}$. Let $R_{sw}$ be the series resistance of an 'on' transmission gate and note that the product $R_{sw} C_{sw}$ is approximately a constant for a given process. The $Q$ of this load to the oscillator is $1/[\omega R_{sw} (C_L + N C_{sw})]$. The effective loading resistance presented to the oscillator is thus (assuming high $Q$):
$$R_{eff} \approx Q^2 R_{sw} = \frac{1}{\omega^2 R_{sw} (C_L + N C_{sw})^2}$$
For a given $C_L$ and channel number $N$, the maximum $R_{eff}$ is obtained when $N C_{sw} = C_L$. Thus, the maximum resistance presented to the oscillator is given by:
$$R_{eff,max} = \frac{1}{4 \omega^2 (R_{sw} C_{sw}) N C_L}$$
Thus, given a specification for a minimum $R_{eff}$, denoted $R_{min}$, that avoids oscillator performance degradation, the maximum number of channels possible is given by:
$$N_{max} = \frac{1}{4 \omega^2 (R_{sw} C_{sw}) C_L R_{min}}$$
While multiplexing to the 200 fF PA, if an effective loading of at least 5 kΩ is desired, only 1 channel is supported in the 65 nm technology used in this work. For three-channel operation, the maximum effective loading resistance is only 2.9 kΩ. Thus, self-loading of the switches limits scaling to a large number of channels.
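The scaling argument can be checked numerically. The sketch below assumes the series-RC model of the analysis above, with the switch time constant (the product of on-resistance and parasitic capacitance) back-calculated from the quoted operating point of 2.9 kΩ for three channels on a 200 fF load. The 5 kΩ minimum loading, the 2.45 GHz carrier and all function names are assumptions of this example rather than values taken from the chip.

```python
import math

F0_HZ = 2.45e9  # assumed carrier frequency in the 2.4 GHz ISM band


def omega(f_hz=F0_HZ):
    return 2.0 * math.pi * f_hz


def r_eff(c_load, n_ch, c_sw, tau, f_hz=F0_HZ):
    """Effective parallel loading for a given switch parasitic c_sw (high-Q approximation)."""
    r_sw = tau / c_sw  # on-resistance follows from the constant R*C product
    return 1.0 / (omega(f_hz) ** 2 * r_sw * (c_load + n_ch * c_sw) ** 2)


def r_eff_max(c_load, n_ch, tau, f_hz=F0_HZ):
    """Best-case loading, reached when n_ch * c_sw equals c_load."""
    return 1.0 / (4.0 * omega(f_hz) ** 2 * tau * n_ch * c_load)


def max_channels(c_load, r_min, tau, f_hz=F0_HZ):
    """Largest channel count that still keeps the loading above r_min."""
    return math.floor(1.0 / (4.0 * omega(f_hz) ** 2 * tau * c_load * r_min))


def tau_from_point(c_load, n_ch, r_max, f_hz=F0_HZ):
    """Back out the switch R*C product from one known (N, C_L, R_max) point."""
    return 1.0 / (4.0 * omega(f_hz) ** 2 * n_ch * c_load * r_max)


# Calibrate on the quoted point: 3 channels, 200 fF load, 2.9 kOhm.
TAU = tau_from_point(200e-15, 3, 2.9e3)

if __name__ == "__main__":
    print(f"switch R*C product ~ {TAU * 1e12:.2f} ps")
    print("channels into 200 fF:", max_channels(200e-15, 5e3, TAU))
    print("channels into 20 fF: ", max_channels(20e-15, 5e3, TAU))
```

Under these assumptions the model supports one channel when driving 200 fF directly and 17 channels when the load drops to 20 fF, which is the motivation for inserting a low-input-capacitance buffer.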
In addition, the input capacitance to the PA is not a constant. It can vary through the Miller multiplication of the gate-drain capacitance, $C_{gd}$, of the transistors in the PA (see Section V). If this capacitance is connected to the oscillator directly, it will modulate the center frequency.
B. Multiplexing With A Buffer
The above issues are solved by a different approach, shown in Fig. 4(b) . A buffer decouples the problem of multiplexing and large PA input capacitance. The buffer drives the large capacitance, while presenting a much reduced load to the multiplexing circuit, making it much easier to size the transmission gates and scale to a large number of channels.
However, if a simple inverter buffer chain is used, the switching power consumption, $C V_{DD}^2 f$, is 250 µW. This is prohibitive from an efficiency standpoint, considering that the target output power of the PA is 100 µW. In addition, the power consumption is sensitive to layout parasitics.
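The order of magnitude of this number follows from the usual dynamic-power expression. The sketch below assumes that the 200 fF PA input capacitance dominates the switched load and that the carrier is 2.45 GHz; the remaining difference from the quoted 250 µW would then come from the capacitance of the buffer chain itself, which is not modelled.

```python
def switching_power(c_farad, vdd, f_hz):
    """Dynamic power of a capacitance switched rail-to-rail once per cycle."""
    return c_farad * vdd ** 2 * f_hz


if __name__ == "__main__":
    p_buf = switching_power(200e-15, 0.7, 2.45e9)  # PA input capacitance only
    p_out = 100e-6                                 # target PA output power, 100 uW
    print(f"buffer switching power ~ {p_buf * 1e6:.0f} uW")
    print(f"ratio to PA output power ~ {p_buf / p_out:.1f}x")
```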
C. Resonant Buffer
This buffer power consumption problem is solved through the use of a resonant buffer topology, shown in Fig. 5. The load capacitance is resonated with an on-chip inductor, similar to techniques used in resonant clock distribution networks [16]. The inverter only needs to provide for the losses in the tank. Since it is driven by the rail-to-rail swinging output of the oscillator, no special biasing is required for the circuit. The power consumption of the resonant buffer is therefore set by the supply voltage and the equivalent resistive load of the tank, rather than by the load capacitance itself.
The equivalent resistive load is only a very weak function of the capacitance. The larger the value of the capacitance to be resonated, the smaller is the value of the inductor used, which limits the value of the effective load resistance, since $R_p = Q_L \omega L$ for an inductor of quality factor $Q_L$. Hence, the inductor with the largest possible impedance, after allowing for process variation of the inductor and layout parasitics, is chosen. In the process used, the value obtained was about 1 kΩ.
Since the inverter only needs to drive the effective load resistance, and not the total capacitance, the size of the inverter can be very small. A sizing ratio of 1:10 was achieved, with the input capacitance of the buffer being just 20 fF. With this, the maximum number of channels increases to 17 for a minimum effective loading resistance of 5 kΩ. The buffer is tristated with the enable signal. Since the resonant network is low-Q, limited by the on-chip inductor, the buffer turns on/off within a few RF cycles. The buffer circuit consumes about 100 µW from the 0.7 V supply, a 2.5× improvement over the inverter buffer chain.
D. Alternate Buffer Paths for BPSK Capability
Additional parallel paths in the buffer are designed to enable BPSK modulation. Matched but phase-inverted delays are used to generate BPSK modulation in an energy-efficient and digital fashion [17], [18]. The implementation is shown in Fig. 6. The delay of an inverter is nominally matched with the delay of the transmission gate. The final resonant inverter stages in this circuit share the resonant load with the buffer stage described previously in Fig. 5. The two path-enable signals, which are independent of the enable signal of Fig. 5, enable/disable the two paths separately. With this circuit, the input capacitance presented to the multiplexer is only 1 to 2 fF, and hence this scheme can extend to an even larger number of channels than the 17 for the buffer of the previous section. In this case, the performance will be limited by routing capacitance, which is not taken into account in the analysis of (4).
Tristating logic is used as opposed to the power supply cut-off method in order to improve speed, since the enable signals for the buffers switch at the data rate and need to settle well within a bit period. This leads to slightly higher power consumption because the 20 fF inverter is driven by logic and is not resonant with the oscillator tank. The overhead is about 30 µW.
A Monte-Carlo process variation simulation is performed to calculate the phase mismatch variations between the two paths. A standard deviation of 2° is observed. This accuracy is sufficient for BPSK modulation and for the phase error requirement of the IEEE 802.15.6 standard. However, in order to reduce the phase error further, a fine delay adjustment scheme is used. It is achieved through independent tunable resistive degeneration of the inverters and gates in the two paths. This finely adjusts the delay of the gate, and the final implementation achieves overall phase tuning in steps of 1-2°.
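To give a feel for what these phase numbers mean in terms of gate delay, the conversion between phase at the carrier and time delay can be sketched as follows. The 2.45 GHz carrier is an assumed mid-band value.

```python
def phase_to_delay(phase_deg, f_hz):
    """Time delay corresponding to a phase shift at frequency f_hz."""
    return phase_deg / 360.0 / f_hz


def delay_to_phase(delay_s, f_hz):
    """Phase shift in degrees produced by a time delay at frequency f_hz."""
    return 360.0 * delay_s * f_hz


if __name__ == "__main__":
    f0 = 2.45e9  # assumed carrier frequency
    for deg in (1.0, 2.0):
        print(f"{deg:.0f} deg at 2.45 GHz ~ {phase_to_delay(deg, f0) * 1e12:.2f} ps")
```

A 2° mismatch therefore corresponds to a delay difference of only a little over 2 ps between the two paths, which is why a fine, resistively tuned delay adjustment is useful.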
V. INTEGRATED PULSE-SHAPING POWER AMPLIFIER
The power amplifier must be optimized for high efficiency at an output power of -10 dBm, while operating from a 0.7 V power supply. On a single-ended 50 Ω antenna, this amounts to a swing of 200 mV peak-to-peak. An impedance transformation network is therefore necessary for a high-efficiency power amplifier. The rest of the section describes the choice of topology, device sizing, impedance transformation network type, biasing and pulse-shaping capability.
A. NMOS-Only versus Push-Pull Topology
Fig. 7(a) shows a typical NMOS-only power amplifier topology. The choke inductor could be replaced by a smaller inductor that is part of the matching network [6]. Rather than using it as a linear power amplifier with output proportional to the input, it is driven strongly by the rail-to-rail swinging output of the resonant buffer for maximum efficiency operation. In this mode of operation, the drain node has a swing of $2V_{DD}$. If $R_{PA}$ is the impedance provided by the matching network, the output power is given by
$$P_{out} = \frac{V_{DD}^2}{2 R_{PA}}$$
For the PA to operate at peak efficiency for -10 dBm, $R_{PA} = 2.45$ kΩ. The required impedance transformation ratio of about 50 is very high, and is not practical for on-chip implementations, with typical inductor Q values less than 10 [19].
On the other hand, for the push-pull PA shown in Fig. 7(b), the drain node swings only from 0 to $V_{DD}$ when operating at its peak efficiency, thus effectively reducing the output power delivered for the same load impedance. The output power is given by
$$P_{out} = \frac{V_{DD}^2}{8 R_{PA}} \qquad (6)$$
Hence, for -10 dBm output, $R_{PA} = 612$ Ω. This impedance transformation ratio of about 12 is amenable to on-chip implementations.
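The load impedances quoted for the two topologies can be reproduced with a short calculation. The sketch assumes a sinusoidal fundamental with amplitude equal to the supply voltage for the NMOS-only stage (drain swinging over twice the supply) and half the supply voltage for the push-pull stage (drain swinging rail-to-rail); function names are illustrative.

```python
import math


def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 1e-3 * 10.0 ** (p_dbm / 10.0)


def load_nmos_only(vdd, p_out_w):
    """Load resistance for an NMOS-only stage with fundamental amplitude vdd."""
    return vdd ** 2 / (2.0 * p_out_w)


def load_push_pull(vdd, p_out_w):
    """Load resistance for a push-pull stage with fundamental amplitude vdd / 2."""
    return vdd ** 2 / (8.0 * p_out_w)


def antenna_swing_pp(p_out_w, r_ant=50.0):
    """Peak-to-peak voltage across the antenna for a sinusoid of power p_out_w."""
    return 2.0 * math.sqrt(2.0 * p_out_w * r_ant)


if __name__ == "__main__":
    p = dbm_to_watts(-10.0)
    print(f"output power      = {p * 1e6:.0f} uW")
    print(f"antenna swing     = {antenna_swing_pp(p) * 1e3:.0f} mV peak-to-peak")
    print(f"NMOS-only load    = {load_nmos_only(0.7, p):.0f} Ohm "
          f"(ratio {load_nmos_only(0.7, p) / 50.0:.1f})")
    print(f"push-pull load    = {load_push_pull(0.7, p):.1f} Ohm "
          f"(ratio {load_push_pull(0.7, p) / 50.0:.2f})")
```

For a 0.7 V supply this gives roughly 2.45 kΩ and 612 Ω, i.e. transformation ratios of about 49 and 12 from a 50 Ω antenna.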
Typically, power amplifiers also include a series LC network resonant at the center frequency in order to provide dc-blocking, filter harmonics and improve efficiency. But, in a high-impedance system with a 612 Ω load, the parasitic capacitance of the inductor can provide a lower impedance path, affecting the overall efficiency. For example, 100 fF at 2.5 GHz is an impedance of about $-j637\ \Omega$. The series LC filter is hence replaced with a DC-blocking capacitor.
B. Tunable Impedance Transformation Networks
Once the supply voltage and load impedance are fixed, the power amplifier has a constant output power given by (6). In order to change the output power, either the supply voltage or the effective impedance seen by the PA must be changed. For pulse shaping using the former method, a supply modulator would be required. Since 1 Mbps data rates are considered, the bandwidth of the supply modulator would need to be greater than 1 MHz. Class-S supply modulators are ideally 100% efficient [20]; however, the use of external components and the overhead of the control circuits and other losses can be prohibitive, especially for low output power applications.
An alternate method of adjusting the effective impedance provided to the PA is explored in this work. For this, a tunable impedance transformation network is designed. The network must achieve resonance at multiple settings, while transforming the impedance to different values. The settings of the network are to be made digitally tunable in order to enable streamlined pulse shaping. At each of the settings, the PA operates at its peak efficiency, with the drain swinging rail-to-rail. Amplitude modulation is achieved even with a constant envelope input, thus simplifying the design of the previous stages.
Since on-chip inductors are low-Q and have a large area, it is undesirable to add switches in series with the inductor in order to select from a set of inductor values. Thus, fixed-inductor designs are explored. This eliminates a simple L-match, since a fixed-value inductor can only provide a fixed impedance transformation. Similarly, matching networks with two inductors are avoided because of the increased area. Thus, a pi-matching network and a tapped-capacitor matching network are considered and shown in Fig. 8.
The design equations for the tapped-capacitor match, (7)-(11), are derived through narrow-band parallel-to-series transformations [19]. They are written in terms of three quantities: the loaded quality factor of the capacitor that shunts the load, the series resistance seen by the effective series capacitance formed by the two capacitors of the network, and the loaded quality factor of this effective series capacitance, which is also the loaded quality factor of the inductance. Thus, for a desired value of the transformed impedance, the required values of the two capacitors can be calculated. A similar analysis can be done for the pi-match.
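The narrow-band design procedure can be sketched numerically. The example below is not the paper's equations (7)-(11); it is a standard textbook derivation under an assumed topology: a fixed shunt inductor at the PA side, a series capacitor `c1`, and a capacitor `c2` in parallel with the 50 Ω antenna. The 6.44 nH inductor value is the one quoted later in the text, while the 2.45 GHz design frequency and all names are assumptions. An exact complex-impedance calculation is included to confirm that the designed network presents the requested resistance at resonance.

```python
import math


def design_tapped_cap(r_in, l_h=6.44e-9, r_load=50.0, f_hz=2.45e9):
    """Return (c1, c2, q) so that the network presents r_in at resonance.

    q is the loaded quality factor of the inductor (and of the series branch).
    """
    w = 2.0 * math.pi * f_hz
    q = r_in / (w * l_h)                 # loaded Q fixed by r_in and the inductor
    r_s = r_in / (1.0 + q * q)           # series resistance of the capacitive branch
    if r_s >= r_load:
        raise ValueError("r_in too low for this inductor and load")
    q2 = math.sqrt(r_load / r_s - 1.0)   # loaded Q of c2 in parallel with the load
    c2 = q2 / (w * r_load)
    c2_series = c2 * (1.0 + q2 * q2) / (q2 * q2)  # series equivalent of c2
    c_series = 1.0 / (w * q * r_s)       # total series capacitance required
    if c_series >= c2_series:
        raise ValueError("no positive c1 satisfies the design")
    c1 = c_series * c2_series / (c2_series - c_series)
    return c1, c2, q


def input_impedance(c1, c2, l_h=6.44e-9, r_load=50.0, f_hz=2.45e9):
    """Exact impedance seen across the shunt inductor at f_hz."""
    w = 2.0 * math.pi * f_hz
    z_load = 1.0 / (1.0 / r_load + 1j * w * c2)   # c2 in parallel with the antenna
    z_branch = 1.0 / (1j * w * c1) + z_load       # c1 in series
    y_in = 1.0 / z_branch + 1.0 / (1j * w * l_h)  # shunt inductor
    return 1.0 / y_in


if __name__ == "__main__":
    for r_target in (300.0, 600.0, 1200.0):
        c1, c2, q = design_tapped_cap(r_target)
        z = input_impedance(c1, c2)
        print(f"R_in={r_target:6.0f} Ohm  c1={c1 * 1e12:.2f} pF  "
              f"c2={c2 * 1e12:.2f} pF  Q={q:.1f}  check={z.real:.1f} Ohm")
```

Under these assumptions the loaded Q at 1.2 kΩ comes out close to 12 and the load-side capacitor spans roughly a 2.7× range between 300 Ω and 1.2 kΩ, in line with the tuning range discussed in the following subsections.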
Since the analysis above assumes ideal passives, the "loaded quality factor" indicates the ratio of energy stored in the element to the energy radiated by the antenna per cycle [19]. But, in reality, the passives have a finite intrinsic quality factor ($Q_{int}$), which implies that a fraction of the stored energy is also dissipated by these losses every cycle. The power loss in a passive is:
$$P_{loss} = \frac{Q_{loaded}}{Q_{int}} P_{out} \qquad (12)$$
This indicates that for a given intrinsic Q, the efficiency of the matching network reduces as the loaded Q seen by the passive element increases. Thus, matching networks that minimize the loaded Q are preferable.
The results for the two matching networks using a fixed 6.44 nH on-chip inductor are shown in Tables I and II for impedance transformation from 300 Ω to 1.2 kΩ. The capacitance tuning required is similar in both cases. However, the loaded Q of the inductor in the pi-match is larger, leading to higher matching network losses. Fig. 9 plots the phase of the RF signal at the antenna relative to the phase of the RF input into the PA. The variation of this phase difference for the pi-matching network is much higher than for the tapped-capacitor one, making the pi-match once again the less desirable topology. Hence, the tapped-capacitor match is chosen for this design. It should be noted that even at the highest Q of 12 in the tapped-capacitor match, the network has a bandwidth of about 200 MHz, and hence, for each impedance transformation ratio, a single capacitor bank setting is sufficient over the entire ISM band.
C. Design of Capacitor Banks
The two capacitor banks of the matching network require a wide tuning range of about 2-3×, while also maintaining a high Q. They are designed with some fixed capacitance and a binary-weighted capacitor bank built with MIM capacitors. A 5-bit capacitor bank is designed for one of the capacitors, while the other is designed with 7 bits since it requires a larger range of values. The bottom plates of both the capacitors see a DC voltage of 0 V due to the 50 Ω antenna. Also, since the voltage swing is a few hundred mV, the bank design is identical for both and is presented below. Fig. 10(a) shows a capacitor with a switch and its parasitics, which are dependent on the capacitor value and switch size. The switch is sized based on (13)-(17). The parasitic part of the capacitance also scales with the capacitor value if the switch size is scaled along with it. This leads to a constant-Q switched capacitor. The off-state capacitance adds to the explicit fixed capacitor. The effective Q of the capacitor bank is thus highest at small digital codes and degrades at higher digital codes.
In order to obtain a wide tuning range, the off-state capacitance must be minimized for a given switch resistance, since it contributes to the fixed capacitance. A boosted voltage of 1 V is hence used to drive the switches, allowing smaller switches to be used. Fig. 10(b) shows the simulated capacitance and Q for one of the banks, with the other being quite similar. These capacitor banks, being digitally switched, can be changed at rates of 10 MHz, sufficient for up to 10× oversampled pulse shaping of 1 Mbps OOK and BPSK modulations.
In this work, the logic for driving the capacitor banks to apply pulse-shaping is implemented off-chip on an FPGA, as indicated in Fig. 3 and Section II. However, it is important to estimate the power overhead of this operation. In [7], the digital baseband including packet generation and raised-cosine pulse-shaping consumes only 62 µW. Based on this and power scaling in digital circuits [21], the pulse shaping logic required for this work is estimated to result in less than 2% degradation in overall system efficiency for -10 dBm operation.
D. Final PA Design
The final design of the proposed push-pull PA is illustrated in Fig. 11. Transistors M1 and M2 are biased at separate gate bias voltages, set with on-chip DACs, to trade off short-circuit current and on-resistance. This is done to make the most use of the 0.7 V swing from the resonant buffer. Resistive-divider DACs, designed with high-density poly resistors, have a total resistance of 1 MΩ, so that their power consumption is on the order of 1 µW or less. The DACs do not need to drive strongly since they are set only once and not changed at the data rate.
An ac-coupling capacitor connects the drain node of the PA to the tapped-capacitor matching network. Though a series resonant LC tank could be used instead to improve efficiency by presenting a high-impedance at the higher harmonics [19] , capacitive parasitics of the on-chip inductor would create shunt paths due to the high impedance of the subsequent matching network (as described in Section V-A).
The biasing resistor sets the DC voltage on the transistors, while a coupling capacitor connects the RF from the resonant buffer. The coupling capacitor and gate capacitance form a voltage divider, and the bias resistor acts as an additional load impedance to the resonant buffer. The coupling capacitor is sized to keep the voltage division close to 1. A total of 7.5 dB output power tuning range is achieved in the final design through the tunable impedance transformation network by implementing a slightly wider capacitance tuning range as compared to Table I. Further tuning of output power, if required, can be achieved by statically varying the supply voltage of the PA alone through high-efficiency DC-DC converters. Hence, in the most flexible system, a DC-DC converter sets the average power of the transmitter while the pulse shaping is provided by the matching network.
VI. MEASUREMENT RESULTS
The transmitter is fabricated in a 65 nm CMOS process and is co-packaged with three FBARs in a 40-pin QFN package, as shown in the package and chip photographs in Fig. 12. The TX core occupies an area of 0.324 mm². All the RF circuits are nominally powered from a 0.7 V supply, while the digital switches in the multiplexers and capacitor banks are powered with a 1 V supply. An FPGA is used to configure the serial interface of the chip, and to provide data and pulse shaping information. The measurements are done using an Agilent MXA N9020A Spectrum Analyzer and an Agilent DSO90254A Digital Storage Oscilloscope.
A. Oscillators
Each FBAR oscillator consumes 150 µW. The center frequencies of the three channels as defined by the FBARs in one of the measured chips are at 2.421 GHz, 2.480 GHz and 2.491 GHz, as shown in Fig. 16. Through the tuning capacitor bank, the oscillator center frequency can be tuned over a 600 kHz range with a 9.5 kHz step-size. The intrinsic phase noise of the oscillator is not measurable in this design. However, the phase noise at the output of the transmitter PA is measured, and plotted in Fig. 13(c). At a 1 MHz offset, the phase noise is measured to be -132 dBc/Hz.
After the oscillator enable signal is turned on, the RF output at the antenna stabilizes in under 4 µs, as shown in Fig. 13(a). The frequency accuracy at 4 µs is better than 20 ppm. The startup time corresponds to only 4 bit periods, ensuring efficient operation of the transmitter for even very short-length packets.
B. Power Amplifier
The power amplifier is characterized for a nominal supply voltage of 0.7 V as well as for 0.5 V and 1 V. When the PA supply voltage is changed, the rest of the circuits are still operated at 0.7 V. At each PA supply voltage, the DAC voltages are re-adjusted to optimize the short-circuit current versus on-resistance trade-off. The bias voltages and currents in the oscillator and buffer circuits are not changed. The output power is swept only by varying the impedance transformation network setting. Fig. 14 shows the efficiency of the PA alone. At 0.7 V, the output power tuning is centered around -10 dBm and the peak PA efficiency is 43% for an output power of -7 dBm. At 0.5 V, the peak efficiency increases to 44.4% at -9.5 dBm output power and at 1 V, the peak efficiency goes to 40% for -2.5 dBm.¹ The peak efficiencies remain approximately constant since the losses in the tank and switches scale with the square of supply voltage, which is the same as the scaling of output power with supply. Overall, with supply voltage adjustment and the tunable impedance transformation network, the PA achieves 14.5 dB tuning range from -17 dBm up to -2.5 dBm.
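The square-law dependence of output power on supply voltage translates into a simple decibel rule, sketched below. Only the supply values come from the text; the function name is illustrative, and the calculation assumes a fixed load impedance.

```python
import math


def supply_scaling_db(v_new, v_old):
    """Change in output power, in dB, when power scales with the square of the supply."""
    return 20.0 * math.log10(v_new / v_old)


if __name__ == "__main__":
    print(f"0.7 V -> 1.0 V: {supply_scaling_db(1.0, 0.7):+.2f} dB")
    print(f"0.7 V -> 0.5 V: {supply_scaling_db(0.5, 0.7):+.2f} dB")
```

The ideal shifts are about +3.1 dB and -2.9 dB, so any additional change in the measured peak output power has to come from other effects, such as re-adjusted bias voltages.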
In Fig. 15 , the overall efficiency is plotted as the ratio of radiated power to power consumption of the entire transmitter. At 0.7 V PA supply, peak TX efficiency is 28.6%. At 1 V PA supply, it increases to 33%, while at 0.5 V, it drops to 23%. Unlike the PA efficiency, the peak transmitter efficiency does not remain constant at each PA supply voltage because of the constant fixed power consumed by the oscillators and buffer stages.
The figure also compares the transmitter to other frequency-stable transmitters with sub-mW outputs operating at GHz frequencies. This work has the highest transmitter efficiency at each of the output power values considered. For the moderate data rates and simple modulation schemes considered in similar designs, this metric is a fair comparison because it eliminates other variables like the data rate and range of the communication.
¹ With the supply voltage increased to 1 V, the peak output power increased by 4 dB instead of the expected 3 dB. This is attributed to a change in PA bias voltages, which are adjusted to re-optimize short-circuit current and on-resistance.
C. Modulation and Pulse Shaping
1) OOK With Gaussian Pulse Shaping: OOK modulation is first considered. Gaussian pulse shaping with BT = 0.3 is used with an over-sampling rate of 10×. The spectrum of OOK data has a feed-through component, which results in a spur at the center frequency. When the pulse-shaping is done through the switches, this spur is also modulated to the harmonics of the 10× switching frequency. This is undesirable, since it falls in the adjacent channels.
In order to avoid this, the data is also phase-scrambled through the BPSK path in the buffer with a pseudo-random sequence [17], thus eliminating the spurs. The phase-scrambling, however, causes a reduction in the average output power since the instantaneous power goes to 0 during phase transitions. Fig. 16 shows the superimposed spectra for 1 Mbps OOK of the three channels measured on the chip. This shows the multi-channel capability of the architecture. Spurs from the 10× oversampling are all below -30 dBc. A zoomed-in version of the spectrum is shown in the inset, comparing it to the no-pulse-shaping case. Pulse-shaping reduces the first sidelobe by 6 dB and the second sidelobe by 9 dB. Fig. 17 shows the time domain waveform of the pulse-shaped OOK data, with the phase scrambling instances marked. Overall, for a -12.5 dBm average output power, the entire transmitter consumes 440 pJ/bit.
2) BPSK With SRRC Pulse Shaping: Square Root Raised Cosine (SRRC) pulse shaping with a roll-off factor of 0.3 and oversampling of 8× is used for BPSK modulation. The first side-lobe is reduced by 13 dB, effectively reducing the -20 dBc bandwidth of the signal from 6 MHz down to only 1.5 MHz. Fig. 18 shows the time-domain and frequency spectra. The transmitter consumes 530 pJ/bit at 1 Mbps while transmitting an average of -11 dBm output power in this mode.
A Vector Signal Analyzer (VSA) is used to measure the modulation characteristics of the transmitter while sending random data. The rms phase error is 4.36° for 1 Mbps BPSK without pulse shaping. When the SRRC pulse shaping is applied, the rms phase error is 5.3°. This shows the reliable phase modulation and pulse shaping capability of the TX.
3) MSK With Gaussian Pulse Shaping: 1 Mbps GMSK modulation requires a frequency deviation of ±250 kHz. The Gaussian pulse shaping (10× oversampled) is applied to the tuning capacitor banks of the oscillators. As shown in Fig. 19, the pulse shaping reduces the first sidelobe by 7 dB and the second sidelobe by 20 dB compared to plain MSK. In this mode, the TX consumes 550 pJ/bit while delivering -10 dBm output power.
The Error Vector Magnitude (EVM) for MSK modulation is measured on a VSA to be 2.14% rms. For GMSK modulation, the EVM is 5.94%. Fig. 19(b) shows the 1 Mbps GMSK eye diagram as measured by the VSA, illustrating a wide eye.
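The energy-per-bit figures for the three modes can be related to total power consumption and to an average transmitter efficiency with a short calculation. The sketch below takes the average output powers as -12.5 dBm (OOK), -11 dBm (BPSK) and -10 dBm (MSK) at 1 Mbps; the efficiency it prints is the ratio of average radiated power to total power, which is an illustrative figure derived here and not a number reported in the measurements.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 1e-3 * 10.0 ** (p_dbm / 10.0)


def total_power(energy_per_bit_j, bit_rate_hz):
    """Total power drawn when transmitting continuously at bit_rate_hz."""
    return energy_per_bit_j * bit_rate_hz


def average_efficiency(p_out_dbm, energy_per_bit_j, bit_rate_hz):
    """Average radiated power divided by total transmitter power."""
    return dbm_to_watts(p_out_dbm) / total_power(energy_per_bit_j, bit_rate_hz)


# (mode, average output power in dBm, energy per bit in joules)
MODES = [
    ("OOK", -12.5, 440e-12),
    ("BPSK", -11.0, 530e-12),
    ("MSK", -10.0, 550e-12),
]

if __name__ == "__main__":
    for name, p_dbm, e_bit in MODES:
        p_tot = total_power(e_bit, 1e6)
        eff = average_efficiency(p_dbm, e_bit, 1e6)
        print(f"{name:4s}: total {p_tot * 1e6:.0f} uW, "
              f"radiated {dbm_to_watts(p_dbm) * 1e6:.1f} uW, "
              f"average efficiency {eff * 100:.1f}%")
```

These average efficiencies come out lower than the peak TX efficiency because pulse shaping and phase scrambling reduce the average radiated power while the oscillator and buffer power stay fixed.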
D. Measurement Summary and Comparison
The performance of the chip is summarized in Table III . Fig. 20 plots an important metric of comparison, the energy-per-bit versus the output power of previous low power transmitter designs. The data rate of each design is annotated.
It can be noted that some of the resonator-based transmitters have had much higher energy-per-bit, while the overall efficiency is favorable in many cases. This is because most of the systems that used the resonators have implemented relatively high output powers ( 5 dBm) and low data rates ( 330 kbps) for achieving long distance communication, as opposed to the 1-2m Body Area Network scenario considered in this work. The transmitter has also been used to successfully send packetized ECG data to a commercial receiver, TI's CC2500 [22] . The packet structure of the CC2500 including the preamble, sync word and FEC are implemented on the FPGA driving the transmitter. 
VII. CONCLUSION
A transmitter architecture optimized for the short-distance link budgets of Body Area Networks has been presented. Specifically, the power amplifier has been optimized for operation at -10 dBm, and the tunable impedance transformation network results in efficient integrated pulse shaping, achieving high spectral efficiency. Further, high transmitter efficiency is achieved at these low output power levels by the use of a high-Q FBAR-based LO generation scheme. Multi-channel operation has been achieved using inherently single-channel resonators through efficient oscillator multiplexing. Future development of integrated on-chip resonators [23] will enable area-efficient expansion of this architecture to a larger number of channels. In addition, low voltage operation at 0.7 V and maximum use of the swing available at all RF nodes improves energy efficiency. Overall, the transmitter has been measured to consume 440 pJ/b for 1 Mbps Gaussian pulse-shaped OOK at an average output power of -12.5 dBm.
ACKNOWLEDGMENT
Chip fabrication was done through the TSMC University Shuttle Program and FBARs were provided by Avago Technologies. The authors would like to thank S. Bandyopadhyay, M. Qazi, and B. Ginsburg for useful discussion and feedback.
