I. INTRODUCTION
Remote wireless sensor nodes for the Internet of Things (IoT) rely on duty-cycling to achieve extremely low average power consumption, but this requires an accurate wakeup timer. Such a timer must avoid off-chip components, such as quartz crystals, and occupy minimum area to save cost and module size. It must consume ultralow power (<1 μW), since it is continuously active, while operating at a low supply voltage for compatibility with a wide range of energy sources (e.g., button batteries and energy scavengers) and to simplify power management [1], [2]. Because of size and power limitations, RC oscillators are a preferred choice. However, the frequency stability of conventional RC relaxation oscillators is limited by the delay of power-hungry continuous-time comparators, which are vulnerable to PVT variations [3]-[5]. Oscillators based on frequency-locked loops (FLLs) circumvent such limitations, but they rely heavily on analog-intensive circuits, which require significant power, area, and a high supply voltage [6], [7]. Hence, they do not scale well with technology in terms of area and required supply voltage.
Alternatively, this letter presents a wakeup timer employing a digital-intensive FLL (DFLL) architecture to fully exploit the advantages of advanced CMOS processes, thus allowing low area, low power, and low supply voltage. The proposed timer achieves the best energy efficiency (0.43 pJ/cycle) at the lowest supply voltage (0.7 V) over the state of the art [2], [3], [5], [8]-[11], while maintaining excellent on-par long-term stability (Allan deviation floor below 20 ppm) in a small area (0.07 mm² in 40-nm CMOS). These advances are enabled by the use of a bang-bang DFLL architecture employing a chopped dynamic comparator and a low-power high-resolution self-biased digitally controlled oscillator (DCO). The rest of this letter is organized as follows: the architecture of the proposed oscillator and the circuit implementation are introduced in Sections II and III, respectively; Section IV describes the measurement results; the conclusions are drawn in Section V.
II. TIMER ARCHITECTURE
The architecture of the proposed DFLL and its timing diagram are shown in Figs. 1 [7]. At the end of phase 2, the output of the The only analog components in the DFLL are a switching passive RC network for the FD, a comparator, and a DCO. As shown in Section III, such analog circuits can be implemented using switches and inverter-based structures, so that they can be easily integrated in a nanometer CMOS process with low power consumption, a low supply voltage, and a small area. The DCO output frequency is fed into a multiphase clock divider to provide all the clocks required in this self-clocked FLL (Fig. 2). The large adopted frequency division factor (32×) is advantageous: a long comparator delay (≈ 4.8 μs) can be allowed compared to the ∼ns delay of continuous-time comparators [3], thus enabling the comparator to be optimized for power instead of speed. A longer comparator delay is allowed in this architecture, since $f_{\mathrm{osc}}$ only depends on the duration of phase 2. The main drawback of running the loop filter at a lower frequency is an increase in the loop settling time.
Unlike traditional RC relaxation oscillators requiring continuous-time comparators, the comparator is implemented as a dynamic StrongARM latch. However, the offset of the comparator may degrade the accuracy of the wakeup timer and introduce a temperature-dependent frequency error, while its flicker noise directly affects the long-term stability of the timer. To suppress the effect of both offset and flicker noise, the dynamic comparator is chopped at a frequency of $f_{\mathrm{osc}}/256$ by means of an analog and a digital modulator at the comparator input and output, respectively.
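The offset cancellation obtained by chopping can be illustrated with a minimal behavioral sketch. It assumes an ideal single-bit comparator with a static offset and Gaussian input noise; the offset and noise values, and the names `mean_decision` and `balance_point`, are hypothetical and are not taken from the design. The input modulator flips the sign of the signal and the output modulator flips the decision back, so the offset appears with alternating sign and the average decision becomes an odd function of the input.

```python
import math


def mean_decision(vin, vos, sigma, chopped):
    """Average +1/-1 comparator decision for Gaussian input noise (assumed model)."""
    def avg(v):
        # Mean of sign(v + noise) for zero-mean Gaussian noise with std sigma.
        return math.erf(v / (sigma * math.sqrt(2)))

    if not chopped:
        return avg(vin + vos)
    # Chop phase +1 sees +vos; chop phase -1, after output demodulation,
    # is equivalent to a comparator with offset -vos.
    return 0.5 * (avg(vin + vos) + avg(vin - vos))


def balance_point(vos, sigma, chopped, lo=-0.1, hi=0.1):
    """Bisection for the input at which up and down decisions balance."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mean_decision(mid, vos, sigma, chopped) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


VOS = 5e-3    # hypothetical comparator offset, volts
SIGMA = 2e-3  # hypothetical input-referred noise, volts rms

print("balance point without chopping:", balance_point(VOS, SIGMA, chopped=False))
print("balance point with chopping:   ", balance_point(VOS, SIGMA, chopped=True))
```

In this model a bang-bang loop settles where the average decision is zero, so the unchopped balance point sits at minus the offset, whereas the chopped one returns to zero input.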
The digital loop filter (Fig. 1) comprises a configurable gain ($K_{\mathrm{DLF}}$ in Fig. 1) and a digital accumulator which, thanks to the comparator output being single-bit, are implemented in a compact and low-power form by a bit-shifter and an up/down counter, respectively. By changing the digital filter gain, the overall bandwidth of the DFLL can be easily configured and more reliably predicted than in conventional analog FLLs, which are more vulnerable to PVT variations. This feature allows the DFLL to flexibly trade off bandwidth and noise for different IoT scenarios. For example, applications dealing with fast temperature or supply changes prefer a higher loop gain, which results in a wider loop bandwidth; instead, applications requiring a lower period jitter need a lower loop gain to minimize the DCO step that would otherwise show as additional jitter. Due to the bang-bang operation of the DFLL, the DCO output frequency will continuously toggle in the steady state. If the random noise in the loop is neglected, the DCO control word will toggle between two consecutive values corresponding to the frequencies $f_1$ and $f_2$ that straddle $f_{\mathrm{nom}}$, as shown in Fig. 3. Since such a locking condition is satisfied for any $f_{\mathrm{nom}}$ between $f_1$ and $f_2$, this would result in a worst-case frequency offset $|f_{\mathrm{os}}| < |f_1 - f_2|/2 = f_{\mathrm{res}}/2$, where $f_{\mathrm{res}}$ is the DCO resolution. Although this source of inaccuracy is partially mitigated by the dithering effect of random noise, care has been taken to maximize the DCO resolution so as not to degrade the timer accuracy. Moreover, although the DCO noise is high-pass filtered by the loop and hence does not affect the timer long-term stability, the long-term stability is also affected by the DCO resolution. Fig. 4 shows the simulated Allan deviation for different DCO resolutions, thus demonstrating that a smaller $f_{\mathrm{res}}$ (i.e., a finer DCO step) leads to a lower Allan deviation floor, i.e., a better long-term stability. This effect can be intuitively explained as follows.
A higher DCO resolution causes less quantization noise to be injected in the loop. In the equivalent linear model of the loop, this directly affects the equivalent gain of the single-bit comparator. A decrease in noise implies a smaller signal at the comparator input and, hence, a higher equivalent comparator gain. The higher comparator gain reduces the output jitter due to noise in both the comparator and the DCO. Consequently, $f_{\mathrm{res}}$ = 250 Hz was chosen for the DCO. Meanwhile, sufficient tuning range for the DCO is required to tackle its frequency drift over PVT. Therefore, a high-resolution DCO is required, which is challenging with the very limited power budget in the wakeup timer (<1 μW). To address this challenge, two techniques are employed (Fig. 5): temperature compensation facilitated by a local proportional-to-absolute-temperature (PTAT) current bias, and a ΔΣ DAC to improve the DCO resolution. A 4-stage differential ring oscillator employing an ultralow-power leakage-based delay cell is adopted to keep the oscillator power below 60 nW (Fig. 5) [8]. A subthreshold PTAT current bias is used to lower the DCO temperature drift while exploiting a nW oscillator topology. This effectively reduces the oscillator temperature drift by 5×, thus relaxing the DAC design. The self-clocked DAC consists of 255 + 7 = 262 unary-coded elements driven by an 8-b integer thermometric DAC clocked at $f_{\mathrm{osc}}/32$ and a 3-b fractional DAC processed by a third-order digital ΔΣ modulator. Thanks to the feedback loop, the DAC has no strict linearity requirements other than the monotonicity necessary for loop stability. Monotonicity is ensured by the unary nature of the DAC. The ΔΣ modulator is clocked at $f_{\mathrm{osc}}/2$ (16× oversampling ratio) to further improve the DCO resolution from 2 kHz to below 250 Hz. The enhancement in resolution given by the ΔΣ operation improves the Allan deviation floor in the same way as a standard DCO with the same equivalent resolution, as illustrated in Fig. 4.
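The steady-state toggling of the bang-bang loop and the effect of refining the DCO step from 2 kHz to 250 Hz can be sketched with a noise-free behavioral model. This is only an illustration under simplifying assumptions: the DCO is taken as ideal and linear, the loop-filter gain is one LSB per decision, and the target frequency `f_nom` and the starting frequency `f_min` are hypothetical values chosen for the example, not measured ones.

```python
def simulate_bang_bang(f_nom, f_res, f_min=400e3, steps=2000, gain=1):
    """Noise-free bang-bang loop: 1-bit comparator driving an up/down counter."""
    code = 0
    history = []
    for _ in range(steps):
        f_dco = f_min + code * f_res            # ideal linear DCO (assumption)
        decision = 1 if f_dco < f_nom else -1   # single-bit frequency comparison
        code += gain * decision                 # accumulator (up/down counter)
        history.append(f_dco)
    return history


def steady_state_offset(f_nom, f_res):
    """Offset of the average steady-state frequency from f_nom."""
    tail = simulate_bang_bang(f_nom, f_res)[-100:]  # well after settling
    return abs(sum(tail) / len(tail) - f_nom)


F_NOM = 417.3e3  # hypothetical target frequency, Hz

for f_res in (2e3, 250.0):
    offset = steady_state_offset(F_NOM, f_res)
    print(f"f_res = {f_res:6.0f} Hz: offset = {offset:5.1f} Hz, bound = {f_res / 2:5.1f} Hz")
```

In both cases the control word ends up alternating between the two codes that straddle `f_nom`, and the residual offset of the average frequency stays below half the DCO step, so the finer step tightens the worst-case error accordingly.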
IV. MEASUREMENTS
The 0.07-mm² wakeup timer was fabricated in a standard TSMC 1P5M 40-nm CMOS process (Fig. 6) and draws 259 nA from a single 0.7-V supply (power breakdown: 32% FD/comparator, 38% digital, 30% DCO). This corresponds to a state-of-the-art energy efficiency of 0.43 pJ/cycle.
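The reported figures can be cross-checked with simple arithmetic. The sketch below uses only the supply voltage, supply current, energy per cycle, and power breakdown quoted above; the output frequency it derives is implied by those numbers rather than quoted in this section.

```python
SUPPLY_V = 0.7                 # supply voltage, volts
SUPPLY_A = 259e-9              # supply current, amperes
ENERGY_PER_CYCLE_J = 0.43e-12  # reported energy efficiency, joules per cycle
BREAKDOWN = {"FD/comparator": 0.32, "digital": 0.38, "DCO": 0.30}

total_power_w = SUPPLY_V * SUPPLY_A
# Frequency implied by power and energy per cycle (derived, not a quoted value).
implied_freq_hz = total_power_w / ENERGY_PER_CYCLE_J
block_power_nw = {name: share * total_power_w * 1e9 for name, share in BREAKDOWN.items()}

print(f"total power: {total_power_w * 1e9:.1f} nW")
print(f"implied output frequency: {implied_freq_hz / 1e3:.1f} kHz")
for name, power in block_power_nw.items():
    print(f"{name}: {power:.1f} nW")
```

The total comes to about 181 nW, and the 30% DCO share stays below the 60-nW oscillator budget mentioned for the ring oscillator.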
Once enabled, due to the bang-bang operation, the frequency of the DFLL increments or decrements toward the steady-state frequency [Fig. 7(a)]. The locking of the FLL can be observed in Fig. 7(a), in which the DCO output frequencies in the open-loop and closed-loop configurations are compared. Although long-term stability is one of the critical performance metrics for wakeup timers, it is interesting to observe that, thanks to the digital-intensive nature of the architecture, the settling time can be easily configured and traded off for jitter if required by the target application. This can be accomplished by tuning the digital loop-filter gain $K_{\mathrm{DLF}}$ (Fig. 8). The period jitter is 15.2 ns rms and improves slightly (to 14.5 ns rms) when the ΔΣ modulation, and hence its quantization noise, is disabled. Thanks to the self-clocked ΔΣ modulation and the chopped comparator, the long-term stability (Allan deviation floor) improves by 10× down to 12 ppm for integration times beyond 100 s [Fig. 9(a)]. The long-term stability varies little with temperature and supply voltage [Fig. 10]. The temperature sensitivity of the output frequency improves from 134 ppm/°C to 106 ppm/°C when activating the chopping and the ΔΣ modulation, thanks to smaller errors due to a smaller DCO step and the mitigation of the comparator offset (Fig. 11). The timer operates over the 0.65-0.8-V supply range with a deviation of ±0.6% (Fig. 11). Although such temperature and supply sensitivities are sufficient for typical IoT applications and are on par with state-of-the-art designs (see Table I), simulations show that they are limited by the on-resistance of the FD switches at such a low supply and can be improved by proper redesign.
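For reference, the Allan deviation used as the long-term stability metric can be estimated from a record of fractional frequency samples. The sketch below implements the standard non-overlapping estimator; it is an assumption that this form is representative, since the estimator used for the reported curves is not stated here, and the function name and the test sequence are purely illustrative.

```python
import math


def allan_deviation(y, m):
    """Non-overlapping Allan deviation of fractional-frequency samples y.

    m is the averaging factor, i.e. the integration time is m times the
    basic sampling interval of y.
    """
    n_groups = len(y) // m
    if n_groups < 2:
        raise ValueError("need at least two averaged groups")
    # Average the samples over consecutive, non-overlapping windows.
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_groups)]
    # Half the mean squared difference of adjacent window averages.
    sq_diffs = [(b - a) ** 2 for a, b in zip(means, means[1:])]
    return math.sqrt(0.5 * sum(sq_diffs) / len(sq_diffs))


# Illustrative record: a frequency toggling by +/-100 ppm every sample,
# loosely mimicking noise-free bang-bang toggling between two DCO codes.
toggling = [1e-4, -1e-4] * 50

print(allan_deviation(toggling, 1))  # dominated by the toggling itself
print(allan_deviation(toggling, 2))  # toggling averages out within each window
```

For this idealized toggling sequence the deviation equals the toggle amplitude times √2 at the shortest integration time and vanishes once each averaging window spans a full toggle period, which is why the floor at long integration times is set by slower noise sources rather than by the toggling itself.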
The performance is summarized and compared with other sub-μW state-of-the-art designs in Table I. Integrated in the most advanced CMOS process (40 nm) among nW timers, which demonstrates its scaling advantages, the presented timer achieves the best power efficiency at the lowest operating supply voltage among state-of-the-art sub-μW timing references.
V. CONCLUSION
An ultralow-power wakeup timer employing a bang-bang DFLL has been integrated in a 40-nm CMOS process. Thanks to the highly digital architecture, this timer achieves the best power efficiency (0.43 pJ/cycle) at an extremely low supply voltage and in a small area, while keeping on-par long-term stability and on-par stability over supply and temperature variations. This demonstrates that the proposed architecture is suitable for IoT applications requiring accurate ultralow-power timers integrated in advanced CMOS processes.
