I. INTRODUCTION
THE Achilles' heel of today's energy scavenged wireless sensor nodes (WSNs) is the unequal balance between the power consumed by the WSN and the power generated by the energy harvester. On one side of the scale, the power consumption of integrated circuits decreases over time with more advanced CMOS technologies. The energy harvester on the other side of the scale may also benefit from these improvements. It may thus be argued that today's unequal power balance can be solved with future technology improvements. This, however, might not be the case if the difference between the required power and the available power to the harvester simply is too big for a specific application. A second reason is that the scaling of CMOS devices is not necessarily beneficial for the performance of analog circuits [1]. Hence, innovations at both the energy harvester and the WSN are needed.
Radio frequency (RF)-powered WSNs have the distinct advantage over other energy harvesting systems that they are low cost and can operate wirelessly in a large variety of applications, even in cold, dark, and static environments [2].
Furthermore, a dedicated RF source can provide strong and reliable power and can even serve as a data and clock distribution hub such that the complexity and power consumption of the WSN can be greatly reduced [3]. Typical applications where these WSNs can be deployed are dynamic and fault tolerant self-organizing networks used for industrial manufacturing, agriculture, and inventory control [4].
The absence of a stable reference frequency in low-cost WSNs makes it very challenging to implement a low-power wireless communication architecture. Passive RF identification (RFID) backscattering, therefore, has been a popular choice because of its simplicity and low power consumption [5]. The functionality and operating range, however, are limited and the system can suffer from reader self-jamming [6]. Other solutions have been proposed that utilize a local oscillator for RF carrier generation [7]. This not only requires significant power, but also is very challenging to realize with sufficient accuracy over process-voltage-temperature variations. In [8], a crystal-less RF-powered transceiver has been implemented that uses the received 915-MHz carrier frequency as phase-locked loop (PLL) reference frequency in order to realize a 2.4-GHz RF carrier to be used for wireless data transmission. A similar architecture has been proposed in [9], where the received signal is fed to an injection-locked frequency multiplier to generate a 402-MHz carrier.
Some fully integrated solutions have been proposed using on-chip antennas. An RFID tag harvesting at 5.8 GHz with a 3.1-10.6-GHz ultra-wideband transmitter (TX) has been proposed in [10] . Although the sensitivity of the tag itself is −14.22 dBm, the on-chip antennas greatly limit the wireless operating range to 7 cm.
In this paper, we demonstrate an RF-powered delay locked loop (DLL)-based 2.4-GHz TX in 40-nm CMOS technology. Frequency synthesis is realized using a dedicated RF signal that serves as input to a DLL and logic XOR-based frequency multiplier, which thereby eliminates the need for inductors and enables a low-complexity, low-power, and area efficient solution.
A nanowatt power management unit is proposed to enable excellent sensitivity and long wireless range RF-powering. A system level description and its key design considerations are given in Section II followed by the circuit design in Section III. Experimental results are discussed in Section IV and the conclusions are given in Section V.
II. SYSTEM LEVEL
A system level description of the proposed RF-powered TX is shown in Fig. 1. The required energy of the system is supplied wirelessly by an RF energy harvester that converts the electromagnetic (EM) energy captured by the antenna into electrical dc (direct current) power. In this paper, a dedicated RF source is assumed within the vicinity of the WSN that provides strong and reliable power in the 902-928-MHz band. The effective isotropic radiated power in this unlicensed band is limited to 4 W by the Federal Communications Commission [11], therefore allowing for a long wireless range.
0018-9480 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
A. RF Energy Harvesting and Power Management
The harvested energy is first locally stored in a capacitor until enough energy is accumulated to initiate wireless data transmission. The voltage across the storage capacitor $C_{store}$ and the supply voltage $V_{DD}$ are sketched in Fig. 2 to illustrate the power management functionality. Here, $V_L$ and $V_H$ are defined as the low and high voltages that determine the amount of stored energy available to the system. When the system is in harvesting mode, the rectifier charges $C_{store}$ to $V_H$ while only the voltage reference and voltage detector are enabled. When $V_{store} \geq V_H$, the voltage detector enables the voltage regulator and voltage-to-current ($V$-$I$) converter, which provide a stable $V_{DD}$ and bias current $I_{bias}$ while $C_{store}$ is discharged from $V_H$ to $V_L$. The active time $T_{active}$ is determined by $C_{store}$, $V_H$, $V_L$, $V_{DD}$, and the total current drawn by the system. Once $V_{store} \leq V_L$, the detector disables the voltage regulator and the system returns to harvesting mode.
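The hysteresis between harvesting and active mode described above can be summarized as a two-state controller. The following is a minimal behavioral sketch only, not the circuit implementation; the names `next_mode` and `trace_modes` and the default threshold values are illustrative assumptions.

```python
HARVEST, ACTIVE = "harvest", "active"


def next_mode(mode, v_store, v_low=1.1, v_high=1.42):
    """Return the next operating mode of the power management unit.

    Behavioral model only: the regulator and V-I converter are enabled
    when v_store reaches v_high and disabled again when it falls to v_low.
    The default thresholds are illustrative assumptions.
    """
    if mode == HARVEST and v_store >= v_high:
        return ACTIVE    # detector enables regulator and V-I converter
    if mode == ACTIVE and v_store <= v_low:
        return HARVEST   # detector disables them; reference and detector stay on
    return mode          # hysteresis: no change between the two thresholds


def trace_modes(voltages, mode=HARVEST):
    """Apply next_mode to a sequence of storage-capacitor voltages."""
    modes = []
    for v in voltages:
        mode = next_mode(mode, v)
        modes.append(mode)
    return modes
```

Feeding a rising and then falling voltage sequence through `trace_modes` shows that the mode changes only at the two thresholds and is held in between.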
The charging time $T_{charge}$ is determined by the available RF power at the antenna, the RF rectifier efficiency, the storage capacitor, and the power consumption of the voltage reference and detector. To determine the system design variables, we first consider the voltage regulator efficiency of this system.
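The energy efficiency of an ideal linear voltage regulator (neglecting bias current) with a constant input and output voltage is simply given by $\eta = V_{DD}/V_H$. However, in this system, the input voltage of the voltage regulator decreases linearly with time when assuming a constant current sink $I_{load}$ as load. The energy at the input of the voltage regulator for a discharge from $V_H$ to $V_L$ during $T_{active}$ is given by
$$E_{in} = \int_0^{T_{active}} V_{store}(t)\, I_{load}\, dt = \frac{V_H + V_L}{2}\, I_{load}\, T_{active}.$$
The energy delivered to the regulator load is given by
$$E_{out} = V_{DD}\, I_{load}\, T_{active}.$$
The voltage regulator energy efficiency $\eta_{regulator} = E_{out}/E_{in}$ then can be expressed as
$$\eta_{regulator} = \frac{V_{DD}}{V_H} \cdot \frac{2 V_H}{V_H + V_L}. \tag{2}$$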
The second term in (2) indicates the relative improvement in efficiency compared with a voltage regulator with constant input and output voltage. The required values for $V_{DD}$ and $V_L$ are determined by the circuit's minimum supply voltage and the voltage regulator implementation. The value of $V_H$ on the other hand depends on the type of modulation and system parameters like power consumption, the amount of data that needs to be transmitted, and the value of $C_{store}$.
The required energy for wireless transmission is found by relating the amount of data to be sent, Data [bit], and the Bitrate [bit/s] to $\eta_{regulator}$ and the TX power consumption
$$E_{required} = \frac{\mathrm{Data}}{\mathrm{Bitrate}} \cdot \frac{P_{DC,Tx}}{\eta_{regulator}}. \tag{3}$$
ON-OFF keying (OOK) modulation is realized by enabling and disabling the power amplifier (PA) with a NAND gate; the remaining TX core circuits are always on during modulation to minimize the start-up time. The average dc power consumed by the TX during transmission thus is written as $P_{DC,Tx} = m P_{DC,PA} + P_{DC,core}$, where $m$ is the fraction of the transmission time during which the PA is enabled, set by the probability of transmitting a "0" and a "1" data bit, and $P_{DC,PA}$ and $P_{DC,core}$ are the dc power consumption of the PA and the remaining core circuits during transmission, respectively. Note that $P_{DC,Tx}$ is calculated with respect to $V_{DD}$ and not $V_H$, since the power loss due to the regulator voltage drop is already included in $\eta_{regulator}$.
The energy available from the storage capacitor is given by
$$E_{store} = \frac{1}{2} C_{store} \left( V_H^2 - V_L^2 \right). \tag{4}$$
To obtain a high efficiency, the difference between $V_H$, $V_L$, and $V_{DD}$ needs to be as small as possible as indicated by (2) while $V_{DD}$ sets the minimum voltage in order to meet the circuit supply specifications. Lowering $V_H$ increases the required $C_{store}$ as evident from (4), which in turn improves the system sensitivity (i.e., the minimum available power to reach $V_H$), since the rectifier now requires a lower minimum power at the antenna to charge $C_{store}$ to $V_H$. For a given value of $C_{store}$, the minimum $V_H$ is found by substituting (2) into (3) and equating this with (4). Rewriting for $V_H$ gives
$$V_H = V_L + \frac{\mathrm{Data} \cdot P_{DC,Tx}}{\mathrm{Bitrate} \cdot C_{store} \cdot V_{DD}}. \tag{5}$$
The active transmission time $T_{active}$ is given by
$$T_{active} = \frac{C_{store} \left( V_H - V_L \right)}{I_{load}} = \frac{C_{store} \left( V_H - V_L \right) V_{DD}}{P_{DC,Tx}}. \tag{6}$$
As an example, a WSN containing 128 b of information with $C_{store}$ = 1 μF, 500-kb/s data rate, $P_{DC,PA}$ = 1.5 mW, $P_{DC,core}$ = 0.5 mW, $m$ = 0.5, $V_{DD}$ = 1 V, and $V_L$ = 1.1 V requires a minimum $V_H$ of 1.42 V and allows for a transmission time of $T_{active}$ = 256 μs. Using (2), a theoretical voltage regulator efficiency of 79.3% is found. Higher efficiencies can be achieved by increasing $C_{store}$ such that $V_H$ can be lowered. A practical upper limit for the capacitor selection is set by the leakage current. Large (super) capacitors tend to have a relatively high leakage current compared with the current output of the RF energy harvester and thus reduce the harvesting efficiency. In this paper, the storage capacitor is implemented off-chip with a low leakage 1-μF polyester film capacitor.
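The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes a constant-current load $I_{load} = P_{DC,Tx}/V_{DD}$ and a linear discharge of the storage capacitor, so that charge balance gives $C_{store}(V_H - V_L) = I_{load} T$ and the regulator efficiency becomes $2V_{DD}/(V_H + V_L)$. The function names are illustrative and not taken from the paper.

```python
def tx_power(m, p_pa, p_core):
    """Average TX dc power for OOK: the PA is on for a fraction m of the time."""
    return m * p_pa + p_core


def min_v_high(data_bits, bitrate, c_store, v_dd, v_low, p_tx):
    """Minimum upper threshold V_H needed to send data_bits at the given bitrate.

    Assumes a constant load current p_tx / v_dd drawn for data_bits / bitrate
    seconds, so C * (V_H - V_L) equals the charge removed from the capacitor.
    """
    return v_low + data_bits * p_tx / (bitrate * c_store * v_dd)


def regulator_efficiency(v_dd, v_high, v_low):
    """Energy efficiency of an ideal linear regulator whose input ramps V_H -> V_L."""
    return 2.0 * v_dd / (v_high + v_low)


def active_time(c_store, v_high, v_low, v_dd, p_tx):
    """Time for which the load can be supplied while C_store discharges V_H -> V_L."""
    return c_store * (v_high - v_low) * v_dd / p_tx


# Values from the example in the text
p_tx = tx_power(m=0.5, p_pa=1.5e-3, p_core=0.5e-3)            # 1.25 mW
v_h = min_v_high(128, 500e3, 1e-6, v_dd=1.0, v_low=1.1, p_tx=p_tx)
eta = regulator_efficiency(1.0, v_h, 1.1)
t_act = active_time(1e-6, v_h, 1.1, 1.0, p_tx)
```

Under these assumptions the sketch returns a minimum $V_H$ of about 1.42 V, an active time of about 256 μs, and a regulator efficiency of roughly 79%, in line with the values quoted in the example.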
B. Frequency Synthesizer and Power Amplifier
Using the received dedicated RF signal as a reference frequency for frequency synthesis is a low cost and low complexity way of realizing an RF carrier for wireless data transmission when no other stable reference frequency (like a crystal resonator) is available to the WSN. By extracting the input frequency and applying a frequency multiplication of ratio 8/3, a TX carrier between 2.405-2.47 GHz can be realized that covers almost the entire unlicensed 2.4-GHz band [11] . The 8/3 architecture first has been proposed in [8] and has been implemented using a frequency divider and PLL. In the proposed RF-powered TX shown in Fig. 1 , the RF input signal frequency is first divided by three and subsequently used as DLL reference signal. The DLL consists of a phase detector, charge pump (CP) with low-pass filter (LPF), and a voltage controlled delay line (VCDL) that produces eight evenly spaced signals that are fed to an eight-time frequency multiplier. For a 915-MHz input, the majority of the circuits thus operate at 305 MHz while only the PA operates at 2.44 GHz. Since the DLL mainly consists of logic gates and does not require any inductors, it allows for a compact and area efficient solution. The DLL also is a single-pole system and thus inherently stable.
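The frequency plan implied by the divide-by-three and multiply-by-eight chain can be checked numerically. The following sketch only restates the 8/3 ratio given in the text; the function name `frequency_plan` is illustrative.

```python
def frequency_plan(f_in_mhz):
    """Return the DLL reference and TX carrier (in MHz) for an RF input frequency.

    The input is divided by three to form the DLL reference and the DLL
    output phases are combined into an eight-times multiplied carrier.
    """
    dll_ref = f_in_mhz / 3.0
    carrier = f_in_mhz * 8.0 / 3.0
    return {"dll_ref_mhz": dll_ref, "carrier_mhz": carrier}


center = frequency_plan(915)   # 305 MHz reference, 2440 MHz carrier
low = frequency_plan(902)      # lower band edge, carrier near 2405 MHz
high = frequency_plan(928)     # upper band edge, carrier near 2475 MHz
```

Sweeping the input across the 902-928-MHz band gives carriers from roughly 2.405 to 2.475 GHz, which is where the quoted TX carrier range comes from.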
The limited power budget and short-range application of a WSN usually require a small output power (<0 dBm). An RF PA with high global efficiency ($P_{out}/P_{DC,total}$) is therefore required, as the output power becomes comparable to the total power consumption of the WSN.
III. CIRCUIT DESIGN

A. RF Energy Harvesting and Power Management
The RF rectifier circuit design consists of a five-stage cross-connected differential rectifier, as shown in Fig. 3. The rectifier is optimized for a capacitive load and a high input impedance (12 − j263 Ω at < −20 dBm), which enables a large passive voltage boost. A 50:50-balun, a capacitor, and two high-Q off-chip inductors are utilized to provide an impedance transformation to the 50-Ω signal generator. Readers interested in the codesign principles of antenna-rectifier interfaces are referred to [12].
Only the voltage reference and the voltage detector are enabled in the harvesting mode and, therefore, sink a continuous current from the RF energy harvester. These two circuits together with the rectifier determine the overall system sensitivity and thus are designed for minimum power consumption. The CMOS voltage reference shown in Fig. 4 is utilized, since all its transistors can be biased in the subthreshold region to realize a low voltage and current design. Self-cascoded transistors (denoted by subscript a and b) are used to reduce sensitivity to supply voltage variations without requiring additional biasing or increased supply voltage. The reference voltage $V_{ref,1}$ can be approximated to the threshold voltage difference between $M_8$ and $M_6$ [13]. This difference is increased by using a high threshold voltage transistor for $M_8$ and a low threshold transistor for $M_6$. This allows for a stable reference voltage, since both transistors experience the same variation over process corners. Device mismatch is minimized by using large transistor sizes and ensuring identical orientation and surrounding environment. Transistor $M_9$ is an exact copy of $M_{10}$ and generates a second voltage $V_{ref,2} = 2V_{ref,1}$, which is used for various circuit blocks.
Transistor $M_{11}$ is biased in the triode region and creates an LPF with the node capacitance at its drain, which improves the power supply rejection ratio (PSRR) at high frequencies. Simulations indicate an additional 30-dB improvement in PSRR.
The bias currents for the various circuit blocks are generated with a self-biased negative feedback $V$-$I$ converter (not shown). The $V$-$I$ converter is disabled during harvesting mode and consumes 3 μA when enabled by the voltage detector.
B. Voltage Regulator
The circuit implementation of the voltage regulator is shown in Fig. 5. A two-stage opamp with pMOS input stage ($M_6$ and $M_7$) and pMOS pass transistor ($M_{12}$) is utilized to accommodate the input voltage and to provide a large dc gain. The system load impedance is represented by $C_L$ and $R_L$. Note that $C_L$ includes the decoupling capacitors of various circuit blocks and needs to be relatively small compared with $C_{store}$; otherwise, a significant amount of energy is lost by simply transferring charge from $C_{store}$ to $C_L$.
Frequency compensation is done by connecting $C_{comp}$ from the drain of $M_{11}$ to the low impedance intermediate node of the self-cascode transistor consisting of $M_{9,a}$ and $M_{9,b}$. Besides providing pole splitting, an additional left-half plane zero is created that is used to stabilize the amplifier [14]. Transistors $M_1$, $M_5$, and $M_{10}$ are used to minimize current leakage by defining critical floating nodes when Enable = "0" ($I_{bias}$ is also switched OFF for Enable = "0"). The total current consumption of the voltage regulator equals 53 μA. For a compensation capacitance of $C_{comp}$ = 5.2 pF, the dc gain, unity gain frequency, and phase margin are found to be 75.8 dB, 18.84 MHz, and 67°, respectively.
C. RF Extraction and Frequency Divider
The received RF signal is ac coupled and amplified by the first inverter-based amplifier consisting of $M_3$ and $M_4$, as shown in Fig. 6. Large transistors with high W/L ratios are used to obtain high gain and low noise. A small-scaled inverter replica ($M_1$ and $M_2$) is sized to provide a gate bias voltage for the first amplifier while consuming low static current. The input impedance is made high enough not to degrade the RF rectifier performance. The second amplifier ($M_5$ and $M_6$) is also ac coupled to avoid duty cycle distortion and provides additional driving capability with fast edge transitions that ensure a rail-to-rail input for the frequency divider. Simulations show that a minimum input power of approximately −26 dBm is required for RF extraction when including the passive voltage boosting obtained in the antenna-rectifier interface.
The frequency divide-by-three circuit converts the 915-MHz rail-to-rail signal into a 305-MHz signal and is based on a digital logic divider to obtain a large locking range [15]. The top and bottom transistors are controlled by the input signal and determine when $M_7$-$M_{12}$ can go to the next transition. Inverters $I_1$ and $I_2$ are dummy cells to ensure equal capacitive loading. The frequency divider is followed by a single-to-differential converter (not shown) based on [16] combined with an edge aligner and pseudodifferential buffer for minimum skew and duty cycle distortion.
D. Delay Locked Loop and Frequency Multiplier
The DLL shown in Fig. 7 consists of an eight-stage VCDL with dummy delay cells at the input and output for equal loading. A phase detector [17] detects the phase difference between the delayed signal at the last stage and the signal at the first stage and subsequently drives a single-ended source-switching CP [18]. The CP output is filtered and controls the delay cells with $V_{cntrl}$ such that eight evenly spaced signals are produced. Each delay cell is implemented with a current-starved pseudodifferential inverter and an additional [...]
The operating principle of the frequency multiplier is based on an XOR logic gate with two 90° out-of-phase signals applied to its input that generate an output signal at twice the frequency. By distributing the eight different phases available from the VCDL as shown in Fig. 8, an eight-time frequency multiplier can be realized. Monte Carlo simulations show that duty cycle distortion increases with each frequency multiplication. This distortion can be reduced by using duty cycle correction circuits after each frequency multiplication [19]. This, however, significantly increases the power consumption and is considered not to be a feasible solution for the limited power budget. Therefore, both the frequency multiplier and XOR implementation [20] are designed fully symmetrical to reduce mismatch between the different signal paths. Moreover, the 2.44-GHz output signal does not require exactly 50% duty cycle, since the proposed PA uses a duty cycle calibration loop to obtain high drain efficiency, which will be discussed next.
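The XOR-based multiplication principle can be illustrated with ideal square waves. The sketch below is a behavioral model under two stated assumptions that are not taken from the paper: the eight phases are spaced by 1/16 of the reference period (with the complementary phases implied by the pseudodifferential cells), and they are paired in a three-level XOR tree. The actual phase distribution is defined by Fig. 8, and all function names are illustrative.

```python
def square_wave(n_samples, period, shift):
    """Ideal 50% duty-cycle square wave on an integer grid, delayed by `shift` samples."""
    return [1 if ((n - shift) % period) < period // 2 else 0 for n in range(n_samples)]


def xor(a, b):
    """Sample-wise XOR of two equally long binary sequences."""
    return [x ^ y for x, y in zip(a, b)]


def make_phases(period=160):
    """Eight phases spaced by period/16 (assumed spacing, see lead-in)."""
    step = period // 16
    return [square_wave(period, period, k * step) for k in range(8)]


def multiply_by_eight(phases):
    """Combine eight phases in an XOR tree (assumed pairing) to obtain 8x the input frequency."""
    # Stage 1: inputs a quarter reference period apart -> four signals at 2f
    s1 = [xor(phases[k], phases[k + 4]) for k in range(4)]
    # Stage 2: 2f signals that are 90 degrees apart at 2f -> two signals at 4f
    s2 = [xor(s1[0], s1[2]), xor(s1[1], s1[3])]
    # Stage 3: the two 4f signals are 90 degrees apart at 4f -> one signal at 8f
    return xor(s2[0], s2[1])


def rising_edges(sig):
    """Count 0->1 transitions, treating the sequence as one full period (circular)."""
    return sum(1 for n in range(len(sig)) if sig[n - 1] == 0 and sig[n] == 1)


out = multiply_by_eight(make_phases())
cycles_per_reference_period = rising_edges(out)
```

With ideal, evenly spaced phases the output toggles sixteen times per reference period, i.e. eight output cycles at 50% duty cycle. Any phase spacing error shifts individual edges, which is the duty cycle distortion discussed in the text.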
E. Power Amplifier
The PA in this paper is based on the tuned switching PA, as shown in Fig. 9(a). The transistor acts as a switch with resistance $R_{SW}$ that is controlled by the input signal with duty cycle $d = \alpha/2\pi$. The dc supply voltage is fed through a choke inductor $L_{DC}$ and the load is dc-blocked with $C_{DC}$. A high-Q harmonic tank filter at the output filters the fundamental RF signal. The load resistance $R_{L,eff}$ represents the effective resistance seen from the PA.
For the following analysis, it is assumed that the voltage across the switch is a sinusoidal signal oscillating around $V_{DD}$ due to the high-Q harmonic tank filter [21]. This is a fairly good approximation for high load resistances and duty cycles below 0.5. Furthermore, the current waveform through the switch is assumed to be a square wave with peak current $I_p$.
By inspecting Fig. 9(b), the dc current $I_{DC}$ for a given duty cycle is simply given by $I_{DC} = d I_p$. The fundamental current component for $R_{SW} = 0$ is given by
$$I_1 = \frac{2 I_p}{\pi} \sin(\pi d). \tag{7}$$
Given that the fundamental voltage component is given by $V_1 = V_{DD}$ and that the fundamental RF power equals $P_{RFout} = (1/2) V_1 I_1$, then the drain efficiency is given by
$$\eta_{drain,ideal} = \frac{P_{RFout}}{V_{DD} I_{DC}} = \frac{\sin(\pi d)}{\pi d}. \tag{8}$$
The reduction in drain efficiency due to the power loss in the switch resistance $R_{SW}$ can be estimated, as described in [22]
$$\eta_{drain} \approx \eta_{drain,ideal} \cdot \frac{P_{RFout}}{P_{RFout} + P_{SW}}. \tag{9}$$
The average dissipated power in the switch resistance is given by
$$P_{SW} = d\, I_p^2 R_{SW} \tag{10}$$
while the fundamental RF output power can be expressed as
$$P_{RFout} = \frac{1}{2} I_1^2 R_{L,eff}. \tag{11}$$
Substituting (7) into (11), rewriting for $I_p^2$ and substituting in (10) yields
$$P_{SW} = \frac{\pi^2 d}{2 \sin^2(\pi d)} \cdot \frac{R_{SW}}{R_{L,eff}} \cdot P_{RFout}. \tag{12}$$
Finally, the drain efficiency is found by substituting (12) and (8) into (9)
$$\eta_{drain} \approx \frac{\sin(\pi d)}{\pi d} \cdot \frac{1}{1 + \dfrac{\pi^2 d}{2 \sin^2(\pi d)} \cdot \dfrac{R_{SW}}{R_{L,eff}}}. \tag{13}$$
Fig. 10 shows a surface plot of (13) for $R_{L,eff}$ = 1 kΩ where the duty cycle is varied between 0 and 1 and the switch resistance varies between 0 and 500 Ω. For $R_{SW} = 0$, the drain efficiency always increases for lower duty cycles. However, this is no longer the case when $R_{SW} \neq 0$. When the duty cycle decreases, the power loss due to $R_{SW}$ becomes relatively large, because the fundamental RF output power approaches zero. Therefore, for a non-ideal switch, an optimum duty cycle for maximum drain efficiency is found roughly between $0.2 \leq d \leq 0.3$, depending on $R_{SW}$ and $R_{L,eff}$. Simulation results (not shown) indicate that this simplified model describes the drain efficiency with an inaccuracy of 12% or better for duty cycles below 0.5 and $R_{SW}$ below 500 Ω.
Since no higher frequency signals are available to derive a reduced duty cycle from, an on-chip duty cycle calibration loop is used, as shown in Fig. 11. The PA driver consists of a cascade of tapered inverters, where the first inverter has an additional pMOS transistor that controls the inverter's rise time. The voltage at the PA input ($M_1$) is sensed with a small inverter that drives an LPF to obtain the dc voltage, which is an indication of the duty cycle. An opamp subsequently compares the dc voltage to a reference voltage and regulates the gate of the pMOS transistor. The reference voltage is set to $V_{REF} = (3/4) V_{DD}$ to obtain a 25% duty cycle. The PA transistors $M_1$ and $M_2$ are sized to obtain $R_{SW} \approx 220$ Ω and a total gate capacitance of approximately 36 fF. When using $R_{L,eff}$ = 1 kΩ, it follows from (13) that $\eta_{drain} \approx 58\%$ for $P_{RFout}$ = −3 dBm.
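The trade-off between duty cycle and switch loss can be explored numerically. The sketch below assumes the simplified model in which a rectangular switch current gives an ideal efficiency of $\sin(\pi d)/(\pi d)$ and the switch dissipates $d I_p^2 R_{SW}$ on average. It also assumes that the sensing inverter in the calibration loop inverts the PA input, so that the filtered dc level equals $(1 - d) V_{DD}$. Function names are illustrative.

```python
import math


def drain_efficiency(d, r_sw, r_load):
    """Simplified drain efficiency of the tuned switching PA for duty cycle 0 < d < 1."""
    ideal = math.sin(math.pi * d) / (math.pi * d)
    # Ratio of switch loss to fundamental output power
    loss_ratio = (math.pi ** 2) * d * r_sw / (2.0 * math.sin(math.pi * d) ** 2 * r_load)
    return ideal / (1.0 + loss_ratio)


def best_duty_cycle(r_sw, r_load, steps=999):
    """Grid search for the duty cycle that maximizes the modeled drain efficiency."""
    candidates = [(k + 1) / (steps + 1) for k in range(steps)]  # 0.001 ... 0.999
    return max(candidates, key=lambda d: drain_efficiency(d, r_sw, r_load))


def duty_cycle_reference(d, v_dd):
    """DC reference for the calibration loop, assuming an inverting sense stage."""
    return (1.0 - d) * v_dd


eta_25 = drain_efficiency(0.25, r_sw=220.0, r_load=1000.0)
d_opt = best_duty_cycle(r_sw=220.0, r_load=1000.0)
v_ref = duty_cycle_reference(0.25, v_dd=1.0)
```

For $R_{SW}$ = 220 Ω and $R_{L,eff}$ = 1 kΩ this model gives an efficiency of about 58% at a 25% duty cycle and places the optimum duty cycle between 0.2 and 0.3, consistent with the values quoted in the text.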
IV. EXPERIMENTAL RESULTS
The TX is fabricated in TSMC 40-nm CMOS technology and is bond wired to a 24-lead QFN package, which is mounted on a PCB for testing. The active area occupies 0.16 mm², as shown in Fig. 12.
The RF rectifier performance is first evaluated by disconnecting the power management circuits and measuring the steady-state dc output voltage for different load conditions and available power. An off-chip picoampere input bias opamp is used to minimize the resistive loading effect of the measurement equipment. A 50-Ω signal generator is used to supply a 915-MHz RF continuous wave. The losses of the off-chip matching network (Fig. 3) are included while the insertion loss of the off-chip balun is excluded, since a balun is not required when replacing the 50-Ω source with a differential antenna. Fig. 13(a) shows the measured dc output voltage versus $P_{av}$ for different load conditions. For the highest load resistance (a purely capacitive load, $R_{load} = \infty$ and $C_{store}$ = 1 μF), the load current is the lowest and thus the rectifier requires only −22.6 and −18.6 dBm to generate 1 and 1.5 V, respectively. The power conversion efficiency (PCE) is determined by measuring the output voltage versus $P_{av}$ for different load resistances and subsequently calculating PCE = $V_{out}^2/(R_{load} P_{av})$. The PCE presented in Fig. 13(b) peaks around −11.47 dBm with a maximum of 36.83% for a load resistance of $R_{load}$ = 88 kΩ. The optimum load resistance is a function of the antenna Q-factor and the number of rectifying stages, but also varies with $P_{av}$ due to the nonlinear input impedance of the rectifier [12]. The measured −3-dB bandwidth at $P_{av}$ = −18 dBm equals approximately 60 MHz.
The effect of temperature variations on the voltage reference is shown in Fig. 14(b). The voltage reference changes 9.6 mV between −20 °C and 120 °C, resulting in a temperature coefficient of 104.5 ppm/°C. Over the same temperature range, the total current drawn by the voltage reference and detector changes from 80 to 500 nA.
Although the current consumption increases at high temperatures, the rectifier output current also increases, because the transistor threshold voltage decreases with increasing temperature and thus compensates for this effect. Note that there is a very small difference for both the voltage and the current characteristics when $V_{store}$ is changed from 1.2 to 1. [...] after which the detector again disables the circuit blocks. The charging time to reach $V_H$ from $V_L$ for $P_{av}$ = −18.4 dBm and $C_{store}$ = 1 μF equals 936 ms. For higher power levels, $T_{charge}$ decreases such that a higher sensor update rate can be realized.
Fig. 15(b) shows a zoomed-in view of the active period where the TX is enabled continuously. While the storage capacitor is being discharged from $V_H$ to $V_L$, the voltage regulator stabilizes $V_{DD}$ and the TX outputs a continuous 2.44-GHz RF signal at −2.57 dBm for $T_{active}$ = 128 μs, which corresponds well with (6) for $m$ = 1. During continuous transmission (no modulation), the voltage regulator consumes 54 μW, the frequency synthesizer 637 μW, and the PA driver 105 μW. The PA drain current equals 1516 μA, resulting in a drain efficiency of 36.5% from a 1-V regulated supply. This is lower than the theoretical 58% predicted by (13). The main loss mechanisms not included in the analysis are the resistive and mismatch losses of the matching network and the nonideal square wave voltage at the PA input.
The total power consumed by the TX is 2.312 mW, resulting in a global efficiency $P_{out}/P_{DC,total}$ of 23.9%. The regulator efficiency is measured indirectly by calculating $\eta_{regulator}$ from the measured quantities in (3) and (4). It was found that the measured $\eta_{regulator}$ was 81.3%, which is 5.7% higher than predicted by (2). This discrepancy is caused by the fact that the analysis in Section II assumes a constant current sink as load, whereas the obtained measurement results include the complete radio with start-up and settling effects. Measurements at $P_{av}$ = −18.4 dBm show a residual rms jitter of 0.9 ps at 2.44 GHz and a phase noise of −112.5 dBc/Hz at 1-MHz offset. The measured second and third harmonic tones are both 47 dB below the fundamental tone, while the closest spurious tone is −23 dBc and is located 305 MHz away from the carrier.
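The drain and global efficiency figures reported for continuous transmission can be cross-checked from the measured power breakdown. The sketch below is simple bookkeeping on the reported numbers, taking the PA dc power as the 1516-μA drain current multiplied by the 1-V regulated supply; the function names are illustrative.

```python
def dbm_to_uw(p_dbm):
    """Convert a power level in dBm to microwatts."""
    return 1000.0 * 10.0 ** (p_dbm / 10.0)


def tx_efficiencies(p_out_dbm, pa_uw, regulator_uw, synth_uw, driver_uw):
    """Return total dc power (uW), PA drain efficiency and global efficiency."""
    p_out_uw = dbm_to_uw(p_out_dbm)
    total_uw = pa_uw + regulator_uw + synth_uw + driver_uw
    return {
        "total_uw": total_uw,
        "drain": p_out_uw / pa_uw,      # RF output power over PA dc power
        "global": p_out_uw / total_uw,  # RF output power over total dc power
    }


# Reported values: -2.57 dBm output, 1516 uA x 1 V for the PA,
# 54 uW regulator, 637 uW frequency synthesizer, 105 uW PA driver
budget = tx_efficiencies(-2.57, pa_uw=1516.0, regulator_uw=54.0,
                         synth_uw=637.0, driver_uw=105.0)
```

The individual contributions add up to the reported 2.312 mW, and the resulting ratios are close to the quoted 36.5% drain efficiency and 23.9% global efficiency.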
The start-up behavior of the complete TX at $P_{av}$ = −18.4 dBm input is shown in Fig. 15(c). Note that this includes all cold-start settling effects like voltage and current biasing and internal and external capacitor charging effects. The DLL is activated after approximately 500 ns, and after 800 ns the complete TX is settled with a regulated supply. The minimum measured power for RF extraction is −25.47 dBm across the entire 902-928-MHz unlicensed band.
The measured TX output spectrum for a 0.5-Mb/s pseudorandom OOK modulated signal is shown in Fig. 16(a) . The measured power consumption of the complete TX during OOK modulation equals 1.46 mW. Measurement results showed, however, that the large PA start-up current causes a small voltage dip at the output of the voltage regulator. Since all circuit blocks share the same supply voltage, this voltage dip also detunes the DLL and causes a dip in the RF output during transmission of a logic "1," as shown in Fig. 16(b) . As the duration of this voltage dip is approximately 75 ns, this limits the attainable bitrate. An external low dropout regulator was used to test this hypothesis, which indeed eliminated the dip in the RF signal and allowed for higher bitrates. Fig. 16(c) gives an indication of the required receiver (RX) SNR when a continuous interference signal is present near the desired dedicated RF signal, which is utilized for frequency synthesis. This continuous interference signal can, for example, be considered to belong to a neighboring dedicated RF source that extends the area of wireless powering. The TX output signal is measured for a consecutive "0101" series at the RF input while a continuous wave interferer is added at a given frequency offset from the dedicated RF signal. The dedicated RF signal is set to the minimum start-up power of −18.4 dBm. Subsequently, the required SNR for a 0.1% bit error rate (BER) at the RX is determined using an optimum threshold noncoherent demodulator, which is implemented in MATLAB. When the interferer power level is below the dedicated RF power level, no significant performance reduction at the RX is observed. When the interferer power level is increased further, it shows that a higher SNR at the RX is required to maintain a 0.1% BER. 
A higher power interferer with small frequency offset does not degrade the required SNR, because the stronger continuous wave interferer simply determines the reference signal for the frequency synthesizer. Note that this does not hold in case of, for example, a frequency modulated interference signal. A high power interferer with larger frequency offset will more frequently generate a destructive signal at the input of the RF extraction circuit and hence results in more edge misalignments at the DLL and frequency multiplier and also impacts the fundamental signal frequency and amplitude. In order to demodulate the desired signal with a −10-dBm continuous wave interferer at 13-MHz offset, the RX requires an increase of 5 dB in SNR, thereby limiting the maximum wireless range that can be achieved. More filtering at the antenna input could be added in this case to prevent this degradation.
Table I summarizes the measured experimental results and compares them with prior art. This paper shows a very small active area of only 0.16 mm² due to the compact DLL and frequency multiplier circuits. Only the work in [5] shows a smaller die area, but it is based on backscattering and thus is much more limited in output power and also requires 12.5 dB more RF power for start-up. Also, the TX global efficiency of 23.9% compares favorably with prior work. The minimum required start-up power shows excellent sensitivity to enable long wireless range operation.
V. CONCLUSION
A compact RF-powered DLL-based 2.4-GHz CMOS TX has been presented. The received dedicated RF signal is used for both RF energy harvesting as well as frequency synthesis by using a nanowatt power management circuit combined with a DLL and XOR-based frequency multiplier. A tuned switching RF PA with 25% duty cycle input is utilized in order to obtain high global efficiency for <0-dBm output power.
Experimental results of a 0.16-mm² 40-nm CMOS prototype show a maximum rectifier efficiency of 36.83% and a power management circuit with 120-nA current consumption during harvesting mode. For a 1-μF storage capacitor and −18.4-dBm minimum available power at 915-MHz RF input, the TX outputs a continuous 2.44-GHz RF signal at −2.57 dBm for 128 μs with 36.5% PA drain efficiency and 23.9% global efficiency. The complete TX consumes 1.46 mW during OOK modulation at 0.5 Mb/s.
