Abstract-This paper presents an FSK wake-up receiver with a -80 dBm sensitivity using a packet structure and a duty-cycling scheme compliant with the Bluetooth Low Energy (BLE) protocol, trading off power with latency. Event-driven applications achieve power lower than 240nW from a 0.75V supply, while latency-critical systems wake up in almost 200μs while consuming 230μW. A within-bit LC oscillator duty-cycling scheme is proposed to provide an extra 24% power reduction. Additionally, a custom FSK transmitter can trigger wake-up at only 17nW with an average latency of 5 seconds.
I. INTRODUCTION
With billions of devices expected to be connected together through the internet of things (IoT), ultra low power communication becomes the main challenge for such energy constrained devices. Ubiquitous IoT nodes usually employ a duty-cycling scheme in order to increase their life expectancy to tens of years through long periods of ultra low power sleep with short power hungry active intervals. Wake-up receivers (WuRx) act as the interface between the IoT nodes and the users where they monitor the medium for on-demand wake-up commands while the IoT node is asleep.
A simple energy-detection architecture appears to be the best candidate for low-power wake-up receivers, especially in the nanowatt power range. However, Bluetooth Low Energy (BLE)-compliant wake-up systems utilizing energy detection, such as [1] and [2], are limited in sensitivity. The work in [1] achieves 236nW consumption while the sensitivity is limited to -56dBm. It targets latency on the order of seconds, where each packet constitutes only one symbol of information. On the other hand, the work in [2] achieves around 100μs latency for a sensitivity of -58dBm. It consumes 164μW with a 2-dimensional wake-up scheme. Several other designs achieve sub-μW consumption but require custom on-off keying (OOK) transmitters to wake up the sleeping nodes [3], [4].
This work shows a BLE-compliant wake-up receiver with a -80dBm sensitivity consuming power down to 240nW and latency as low as 200μs depending on system configuration. This is achieved through a) a receiver employing a passive mixer-first architecture, low-power bandpass filter-based FSK demodulators and digital correlators consuming a total of 230μW for -80dBm sensitivity, b) a duty-cycling scheme and packet structure built around BLE advertising channels, trading off latency (up to 12s) and power (down to 240nW) while maintaining low false alarm rates below 1 in 1000 seconds, and c) bit-level duty cycling of the local oscillator to further save 24% of the active power. If operated without BLE standard compliance constraints, the average power drops to 17 nW and is almost limited by the 10 nW leakage power of the design.
Section II describes the system architecture and the duty-cycling protocol. Section III provides the circuits of the main building blocks of the receiver. Section IV illustrates the measurement results, while Section V provides a conclusion.
II. PROPOSED WAKE-UP RECEIVER
This section describes the system architecture as well as the system-level duty cycling scheme.
A. Receiver Architecture
The wake-up receiver adopts a mixer-first architecture where a passive mixer filters and downconverts the input signal from the off-chip antenna and matching network, as shown in Fig. 1, to the intermediate frequency (IF). The IF signal then undergoes filtering, amplification and demodulation in order to produce the wake-up (WU) signal to the sleeping IoT node upon the reception of a pre-defined wake-up pattern (WuP).
An off-chip calibration engine initially measures the local oscillator (LO) frequency and performs a one-time calibration to tune the receiver to the ISM band. In addition, an off-chip security engine can be used to provide a one-time WuP for the receiver correlator after each wake-up.
Fig. 2 shows the format of the BLE undirected advertising packet, indicating the possibility of embedding a wake-up sequence of up to 31 octets inside the advertising data (AdvData) of the packet. Although BLE uses a deterministic whitening pattern on the payload, a user can counteract this by pre-coding the AdvData as shown in Fig. 2. Thus, there is full control over 31 octets modulated with ±250 kHz Frequency Shift Keying (FSK) at 1 Mbps. Since wake-up radios are triggered on a correlation with a full sequence of symbols, bit repetition can be used to effectively slow down the rate of individual symbols, thereby easing circuit design and improving sensitivity. This comes at the cost of a higher false-alarm rate (FAR), because the finite payload length limits the number of WuP symbols. For instance, 40 kbps allows only 10 bits of WuP and leads to unacceptable FAR, while 80 kbps works well in duty-cycled operation and 166/333 kbps work well in low-latency operation.
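The relationship between bit repetition, effective symbol rate and WuP length can be checked with a short calculation. The sketch below is illustrative only: it assumes the whole 31-octet AdvData carries the repeated WuP, an integer repetition factor, and an exact-match correlator fed with uniformly random bits; the function names are hypothetical and not part of the design.

```python
ADV_DATA_BITS = 31 * 8       # AdvData bits available in one advertising packet
BLE_RATE_BPS = 1_000_000     # BLE on-air bit rate (1 Mbps)


def wup_capacity(repetition: int) -> tuple[float, int]:
    """Return (effective symbol rate in bps, whole WuP symbols that fit)."""
    rate = BLE_RATE_BPS / repetition
    symbols = ADV_DATA_BITS // repetition
    return rate, symbols


def random_match_probability(wup_bits: int) -> float:
    """Chance that uniformly random bits match an N-bit WuP exactly.

    Assumption for illustration: exact match, independent random input bits.
    """
    return 2.0 ** -wup_bits


if __name__ == "__main__":
    for rep in (12, 6, 3):
        rate, symbols = wup_capacity(rep)
        print(f"repeat x{rep}: {rate / 1e3:.0f} kbps, {symbols} WuP bits, "
              f"P(random match) = {random_match_probability(symbols):.2e}")
```

Under these assumptions, repetition factors of 12 and 3 give the 20-bit and 82-bit WuP lengths quoted for the 83 kbps and 333 kbps rates in Section IV, and each halving of the rate roughly halves the exponent of the random-match probability.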
B. BLE-compliance and Duty-cycling
Different duty-cycling schemes can be employed for the FSK wake-up receiver. For a custom transmitter that continuously repeats the WuP, the receiver has to be on for a time (TON) equal to at least two packets to guarantee correlating to the full sequence. In contrast, a standard BLE transmitter only sends advertisements periodically, with intervals of 20ms up to 10.24s. The WuRx then needs to be on for a TON of at least the advertising interval to catch the signal. Therefore, both schemes allow duty cycling to trade off power with latency, as shown in Fig. 2.
III. CIRCUIT DESCRIPTION
Fig. 3 shows the system architecture with detailed circuit diagrams. A free-running on-chip LC oscillator drives the mixer switches to downconvert the BLE advertising packets (at channel 37) to the IF blocks, where the signal undergoes amplification and 1 MHz bandpass filtering. The filter is a 4-path switched-capacitor filter in which each path is driven by a single phase of four non-overlapping clocks generated by an 8-bit digitally controlled ring oscillator that is fine-tuned according to the system IF, nominally at 4 MHz. An FSK demodulator then decodes the input bits, while a comparator with an oversampling ratio of 3 feeds three banks of correlators that search for the device WuP. If the input matches the sequence, the correlator output exceeds its threshold and produces a WU signal to the sleeping IoT node. The 3x oversampling ratio accounts for baseband synchronization with the transmitter. In order to ease the design and absorb the LO uncertainty, the WuRx employs bit repetition and de-whitening to transform the Gaussian FSK (GFSK) packet bits into 83 kbps signals with a frequency deviation of 500 kHz. The receiver chain then simply incorporates two narrow bandpass filters centered at ±250 kHz from the IF.
The N-path filters include Gm-cells along their four phases [5] to alter the impedance seen at the switching frequency and hence move the center frequency of the filter upwards or downwards according to the Gm polarity, as illustrated in the simulation results in Fig. 4. A positive shift is used for the bit '1' bandpass filter (BPF) while a negative shift is used for the '0' BPF where these shifts are given by (1) where Gm is the shifting transconductance and CFSK is the FSK filter capacitance. The center frequency of each BPF can be independently fine-tuned using a 4-bit control over the transconductance of the corresponding Gm cell.
The blocks of the WuRx receiver chain are highly programmable. Therefore, all the IF blocks can be tuned to track the input signal frequency with a simple free-running LC oscillator that does not employ a phase-locked loop to control its frequency. However, a one-time calibration is performed at startup to coarsely tune the frequency of the IF signal within the ISM band of BLE transmission. Then, the ring oscillator and the Gm-cells perform the fine tuning of the N-path IF filter and the FSK demodulation filters, respectively. Measurements show that the local oscillator exhibits a stable frequency over hours of operation. Such tunability allows for an IF bandwidth of only 1 MHz, offering almost 1000x frequency selectivity over other wideband energy detection architectures, which translates to a lower noise bandwidth and higher receiver sensitivity.
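The oversampled correlation described above can be sketched behaviorally as follows. This is a software model under stated assumptions, not the circuit implementation: the comparator output is taken as a stream of 0/1 samples at three samples per symbol, each sampling phase feeds its own shift register, and the threshold is counted as a number of matching bits. The function name and arguments are hypothetical.

```python
from collections import deque


def detect_wakeup(samples, wup, threshold, osr=3):
    """Return the sample index at which any correlator bank fires, else None.

    samples   -- iterable of 0/1 comparator outputs, `osr` samples per symbol
    wup       -- list of 0/1 wake-up pattern bits
    threshold -- minimum number of matching bits that triggers a wake-up
    osr       -- oversampling ratio, one correlator bank per sampling phase
    """
    banks = [deque(maxlen=len(wup)) for _ in range(osr)]
    for index, sample in enumerate(samples):
        bank = banks[index % osr]      # each sampling phase has its own bank
        bank.append(sample)            # oldest bit drops out once the bank is full
        if len(bank) == len(wup):
            matches = sum(a == b for a, b in zip(bank, wup))
            if matches >= threshold:
                return index
    return None


if __name__ == "__main__":
    wup = [1, 0, 1, 1, 0, 0, 1, 0]
    # Four idle samples give an arbitrary timing offset, then each WuP bit
    # is held for three samples.
    samples = [0] * 4 + [bit for bit in wup for _ in range(3)]
    print(detect_wakeup(samples, wup, threshold=len(wup)))
```

Because one bank per phase is kept, at least one bank sees a full WuP regardless of where the symbol boundaries fall relative to the sampling clock, which is the synchronization role the text assigns to the 3x oversampling.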
B. Within-bit duty-cycling
Operating at the 2.4 GHz band, the WuRx consumes an active power of 230 μW from a 0.75 V supply. The LC oscillator and its buffers consume 192 μW which corresponds to about 83% of the total active power of the receiver. One merit of using a lower data rate is that each bit is transmitted for a longer time. In this system, each bit can be as long as 12 μs for a data rate of 83 kbps. At such low data rates, the LC oscillator has enough time to be turned off and on during one bit transmission while still resolving the correct output at the correct sampling instances.
The within-bit duty-cycling technique is illustrated in the measured waveforms of Fig. 5, where the oscillator is turned on for 66% of the cycle in order to guarantee it has enough time to settle. This technique provides 33% savings in the oscillator's average power consumption when the receiver is active and reduces the total active power to 175.2 μW. The output is unreliable during the oscillator off-time within each bit, where it might toggle randomly, as shown in the RXout trace in Fig. 5 during some of the '0' transmissions. However, the 3x oversampling guarantees that two samples lie within the active time of the oscillator. Hence, one of the sampled outputs (Dout in Fig. 5) correctly follows the input bits while the overall average power of the receiver is reduced.
As depicted in Fig. 3, the IF ring oscillator generates the required enable signals for the LC oscillator in order to control the interval and duty ratio of the within-bit duty-cycling scheme. This technique trades off sensitivity for power consumption: a 24% power saving is attained at a sensitivity of -74dBm. Hence, it proves useful in high signal-to-noise ratio environments.
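The power figures quoted for within-bit duty cycling can be cross-checked with a few lines of arithmetic. The sketch below only re-derives ratios from numbers stated in the text (230 μW active power, 192 μW for the LC oscillator and buffers, 175.2 μW with duty cycling, 83 kbps, 66% on-time); the function names are hypothetical and no circuit behavior is modeled.

```python
ACTIVE_POWER_UW = 230.0        # total active power from a 0.75 V supply
LO_POWER_UW = 192.0            # LC oscillator and its buffers
DUTY_CYCLED_POWER_UW = 175.2   # active power with within-bit duty cycling
DATA_RATE_BPS = 83_000         # effective symbol rate after bit repetition
LO_ON_FRACTION = 0.66          # fraction of each bit the oscillator is on


def lo_share() -> float:
    """Fraction of the active power spent in the LO and its buffers."""
    return LO_POWER_UW / ACTIVE_POWER_UW


def total_saving() -> float:
    """Fractional reduction in total active power from within-bit duty cycling."""
    return 1.0 - DUTY_CYCLED_POWER_UW / ACTIVE_POWER_UW


def bit_period_us() -> float:
    """Duration of one repeated symbol in microseconds."""
    return 1e6 / DATA_RATE_BPS


def lo_on_time_us() -> float:
    """Time per bit during which the oscillator is enabled, in microseconds."""
    return LO_ON_FRACTION * bit_period_us()


if __name__ == "__main__":
    print(f"LO share of active power: {lo_share():.1%}")
    print(f"Total active power saving: {total_saving():.1%}")
    print(f"Bit period: {bit_period_us():.2f} us, LO on for {lo_on_time_us():.2f} us")
```

The computed LO share and total saving agree with the roughly 83% and 24% figures given in the text, and the bit period comes out at about 12 μs.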
IV. MEASUREMENT RESULTS
The BLE WuRx chip was fabricated in a 65 nm LP CMOS technology and tested using a smartphone's BLE advertising packets. Fig. 6 shows the measured waveforms of the duty-cycled operation for both BLE and custom FSK packets. For BLE packets, the WuRx is active only for a 25ms interval in a duty-cycle period of 25s, yielding an average power of 240nW. During its active time, it detects the BLE packet from an Android cell phone about 14ms after turning on. This packet contains a predetermined WuP embedded in its AdvData. Then, as shown in the zoomed-in waveform of the received packet, once a 31-bit correlation is detected from the highlighted sequence 0x0C4E 4EA1, a wake-up signal is generated by the receiver to trigger the sleeping IoT node.
With a chip leakage power of only 10 nW, a custom FSK transmitter unconstrained by the BLE advertising interval can bring the average power down to only 17nW while maintaining an average latency of 5 seconds, as shown in Fig. 6, where the WuRx is turned on for only 300μs in a 10s interval.
Fig. 7 shows that the measured raw sensitivity using continuous transmission is -80 dBm at a bit error rate of 0.1%. This translates to a miss rate of less than 1%, demonstrating reliable operation. Since the BLE standard specification requires better than -70 dBm operation, this design will work well in typical IoT environments. Because the 83 kbps rate can only accommodate a 20-bit WuP while the 333 kbps rate can correlate to an 82-bit sequence, different data rates can be used to trade off sensitivity with FAR, which decays exponentially with the sequence length. Additionally, the correlator threshold can be reduced, trading off sensitivity with false alarms. For instance, Fig. 7 shows that for a 166kbps rate, a 38-bit correlation threshold yields a sensitivity of -82dBm for a miss rate of 1% and a FAR of about 1 per 30 minutes in always-on operation.
Fig. 8 shows the power-latency trade-off, where the average power can be scaled down to 240nW at an average latency of 12.5s for an advertising interval of 20ms. On the other hand, a critical application can continuously operate the receiver at its active power of 230μW to trigger a WU signal within a BLE packet-size latency of almost 200μs. The receiver can be programmed to move from one mode to the other according to the demand of the ongoing application. In addition, Fig. 8 illustrates that a custom FSK transmitter can scale the WuRx power to the 10nW leakage limit at a few seconds of latency.
Fig. 9 shows the die photo of the chip, with a receiver active area of 0.48 mm² in a QFN-56 package. Table I summarizes the performance and provides a comparison with other low-power BLE-compliant wake-up radios.
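The duty-cycled power and latency figures above follow from a simple averaging model, sketched below. It assumes that the leakage power simply adds to the time-averaged active power and that a wake-up request arrives uniformly at random within the duty-cycle period, so the average latency is half the period; both are modeling assumptions for illustration, and the function names are hypothetical.

```python
def average_power_nw(active_uw: float, on_time_s: float,
                     period_s: float, leakage_nw: float) -> float:
    """Leakage plus active power weighted by the on-time fraction, in nW."""
    return leakage_nw + active_uw * 1e3 * on_time_s / period_s


def average_latency_s(period_s: float) -> float:
    """Mean wait for the next listening window, assuming uniform arrivals."""
    return period_s / 2.0


if __name__ == "__main__":
    # BLE-compliant mode: 25 ms on-time every 25 s.
    ble = average_power_nw(active_uw=230.0, on_time_s=25e-3,
                           period_s=25.0, leakage_nw=10.0)
    # Custom FSK transmitter: 300 us on-time every 10 s.
    custom = average_power_nw(active_uw=230.0, on_time_s=300e-6,
                              period_s=10.0, leakage_nw=10.0)
    print(f"BLE mode:    {ble:.1f} nW, {average_latency_s(25.0):.1f} s average latency")
    print(f"Custom mode: {custom:.1f} nW, {average_latency_s(10.0):.1f} s average latency")
```

With the measured 230 μW active power and 10 nW leakage, this model gives about 240 nW for the BLE case and about 17 nW for the custom-transmitter case, in line with the reported values.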
V. CONCLUSION
In conclusion, a scalable 2.4 GHz wake-up receiver was designed and fabricated in 65 nm LP CMOS process. The wake-up receiver achieves a sensitivity of -80dBm with an active power of 230μW from a 0.75V supply and at a latency of 200μs. A BLE-compliant advertising and system-level duty cycling allows power scaling down to 240nW with latency of 12.5s. A custom FSK transmitter allows 17nW operation with a latency of 5 seconds. In addition, the receiver employs several system-level knobs such as data rate, WuP length and threshold, duty-cycling interval, and within-bit duty cycling in 
