ABSTRACT This paper proposes a low-power fractional-N all-digital PLL (ADPLL) for narrow-band Internet-of-Things (NB-IoT) applications. Multi-step lock control (MSLC) and a coarse oscillator tuning word (OTW_C) prediction algorithm accelerate the locking process to less than 20 µs. A digital-to-time converter (DTC) is used with a phase prediction algorithm to minimize the detection range of the time-to-digital converter (TDC) for low power consumption. A dither block is designed to mitigate the nonlinearity of the DTC, resulting in −40-dBc in-band spur suppression. The digitally controlled oscillator (DCO) operates from a low 0.6-V supply and covers 1.5-2.05 GHz. Fabricated in 55-nm CMOS, the ADPLL occupies an active area of 0.88 mm² and consumes 4 mW at 1.8 GHz. With a 24-MHz reference clock, the measurement results show that this work achieves better than −94-dBc/Hz in-band phase noise and −120.5 dBc/Hz at 1-MHz offset from a 1.83-GHz carrier. The FoM is −233.4 dB, and the reference spur is −76.3 dBc.
I. INTRODUCTION
NB-IoT is a new protocol intended to support low-power devices in wide area networks (WANs) over cellular data connections, and it has been included as an important part of future fifth-generation (5G) mobile communications [1] - [2]. The NB-IoT frequency bands follow the LTE definitions: the downlink range is 729 MHz to 2200 MHz, and the uplink range is 699 MHz to 1980 MHz. In this work, the ADPLL supports the low-band range from 699 MHz to 915 MHz. According to [10], the LO phase noise and spur at 1-MHz offset should be below −110 dBc/Hz and −57 dBc, respectively, for NB-IoT transceivers. A high-performance frequency synthesizer with low power consumption is therefore essential for NB-IoT wireless transceivers. ADPLLs have gradually begun to replace conventional charge-pump phase-locked loops (CPPLLs) in CMOS processes owing to their small area, flexible reconfigurability, and amenability to technology porting [3] - [5]. A counter-based ADPLL provides a phase estimate with an accuracy of one oscillator period, so the dynamic range of the TDC only needs to cover the residual phase error within one cycle of the output clock [6], [7]. Moreover, DTC-assisted TDCs and snapshot TDCs are widely used to reduce power consumption [4], [8], [9]. Several correlation-based spur mitigation techniques have been introduced to pre-distort the nonlinearity in the PLL [11], [12]. However, because of their relatively high power consumption and long correlation time, these solutions are not well suited to NB-IoT applications. Fast locking is another key specification for the frequency synthesizer in an NB-IoT transceiver.
This paper implements a low-power ADPLL with in-band fractional spur suppression for NB-IoT applications. Section II explains the proposed ADPLL architecture, including the MSLC and OTW_C prediction algorithms and the spur suppression technique. Section III presents the detailed circuit implementation. Section IV gives the measurement results, and Section V concludes this work.
II. THE PROPOSED ADPLL
The proposed low-power ADPLL architecture is shown in Fig. 1. The frequency command word (FCW) is accumulated by the reference phase accumulator (RPA) to generate the integer and fractional parts of the reference phase, PHR_I and PHR_F, respectively. The phase prediction algorithm of [13] is adopted, and the DTC gain is estimated by the gain estimation algorithm of [14] to dynamically track delay estimation errors caused by process-voltage-temperature (PVT) variations and to reduce the influence of DTC gain mismatch. The DTC is cascaded with a dither block to suppress the spurs induced by DTC quantization. A self-gating snapshot block sub-samples the DCO output edges (CKV) with the low-frequency reference clock, reducing the sampling rate from the GHz range down to the reference frequency FREF, which is 24 MHz here. The snapshot block produces the signal CKR, which serves as the synchronous sampling clock of the ADPLL loop (at the reference clock rate) and resamples the PHV signal. PHV is the variable phase, i.e., the output of a synchronous 10-b counter (VPA) acting as the variable accumulator. The delayed reference signal FREF_dly and CKVS are fed into the TDC for quantization. A time amplifier (TA) improves the quantization resolution of the TDC to optimize the in-band phase noise of the ADPLL, and dynamic calibration of the TA keeps its gain stable. The digitized TDC output represents the fractional part of the phase error, Eps, which is combined with PHR_I and PHV to obtain the total phase error, PHE. A reconfigurable proportional-integral controller is followed by a fourth-order IIR filter to optimize the phase noise. The normalized tuning word (NTW) controls the DCO capacitor arrays.
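The reference phase accumulation described above can be illustrated with a minimal sketch. The function name `reference_phase` and the 10-bit wrap of the integer part (chosen here to match the width of the variable phase counter) are assumptions for illustration, not details taken from the design.

```python
from fractions import Fraction


def reference_phase(fcw, n_cycles, int_bits=10):
    """Accumulate FCW once per reference cycle.

    Returns a list of (PHR_I, PHR_F) pairs. PHR_I wraps modulo 2**int_bits
    (assumed width), and PHR_F is the exact fractional part in [0, 1).
    """
    phr = Fraction(0)
    samples = []
    for _ in range(n_cycles):
        phr += fcw
        whole = phr.numerator // phr.denominator  # integer part of the phase
        phr_f = phr - whole                       # fractional part
        phr_i = whole % (1 << int_bits)           # wrapped integer part
        samples.append((phr_i, phr_f))
    return samples


# Example: a 1827-MHz output from a 24-MHz reference gives FCW = 76.125.
seq = reference_phase(Fraction(1827, 24), 16)
print(seq[0])   # first reference cycle
print(seq[7])   # fractional part returns to zero after eight cycles
```

With FCW = 76.125, the fractional part PHR_F steps through multiples of 1/8 and repeats every eight reference cycles, which is the periodic pattern that the DTC code follows in fractional-N operation.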
The digital loop includes MSLC, OTW prediction and gear-shifting techniques.
A. MSLC AND OTW_C PREDICTION ALGORITHM
In this work, the ADPLL is programmable for the different periods of the locking process, and the locking time is dramatically shortened by the proposed MSLC algorithm. The closed-loop transfer function from the reference phase input of the ADPLL to the variable phase output is
where ω_n is the natural frequency, ζ is the damping factor, and f_R is the frequency of the reference clock. The locking time of a constant-loop-bandwidth ADPLL can be expressed as
where f_e is the frequency error and f is the output frequency change of the ADPLL.
To speed up the locking process, different loop bandwidths are employed at different periods of the process. There are three steps: coarse, medium and fine tuning. When a frequency error of f occurs, ω_n is set to ω_c, ω_m and ω_f (ω_c > ω_m > ω_f) in turn to decrease the locking time according to equation (3). These bandwidths are chosen according to the frequency error and the frequency resolution of the corresponding capacitor arrays.
Therefore, the total locking time can be calculated as shown in equation (4). ω_c and ω_m are set wider than ω_n to achieve fast locking, while ω_f is set lower than ω_n to reduce the phase noise and the reference spurs. Fig. 2 compares simulation results for a constant loop bandwidth and a diminishing loop bandwidth, showing that the diminishing loop bandwidth locks faster. In this design, ω_c is set to 1.5 MHz, ω_n to 1 MHz, and ω_f to 250 kHz. The MSLC and OTW_C prediction algorithm is shown in Fig. 3, where ϕ_E is the loop phase error and the suffixes _C, _M and _F denote the coarse, medium and fine bands, respectively.
In this work, the DCO has a constant frequency tuning step of 24 MHz/LSB, which is the same as the FREF clock frequency. During the coarse tuning period, the oscillator output frequency is given by equation (5), where CFW is the center frequency word.
Hence, according to the equation (5), OTW_C is given by equation (6) 
The predicted OTW_C is very close to the final OTW_C, with an error of less than 1 LSB, so the coarse tuning period is very short; that is, the first term in equation (4) is very small. The settling time is thus further decreased.
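As a rough illustration of why the prediction lands within 1 LSB, the sketch below assumes a linear coarse-band model in which the output frequency equals (CFW + OTW_C) × F_REF, so that one coarse LSB equals one unit of FCW. This model, the function name `predict_otw_c`, and the example CFW value are assumptions for illustration; they are not a reproduction of equations (5) and (6).

```python
def predict_otw_c(fcw, cfw):
    """Predict the coarse tuning word under the assumed linear model.

    fcw: frequency command word (target frequency / F_REF), may be fractional
    cfw: center frequency word (center frequency / F_REF), an integer
    """
    return round(fcw) - cfw


F_REF_HZ = 24e6          # reference clock, equal to the coarse step per LSB
fcw = 1827e6 / F_REF_HZ  # 76.125 for a 1827-MHz target
cfw = 74                 # hypothetical center word (1776 MHz)

otw_c = predict_otw_c(fcw, cfw)
residual_lsb = fcw - (cfw + otw_c)  # error left for the medium and fine banks
print(otw_c, residual_lsb)
```

Because the coarse step equals the reference frequency in this model, rounding FCW leaves a residual of at most half a coarse LSB, which the medium and fine banks then absorb.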
After OTW_C is predicted, the coarse tuning period starts and the MSLC begins to monitor the absolute value of the phase error, |ϕ_E|. When |ϕ_E| stays below the coarse tuning threshold |ϕ_E_C| for k_C periods, OTW_C is latched and the coarse tuning period ends. Here k_C is the number of periods required to declare stable locking. The medium tuning period then begins. The locking processes of medium tuning and fine tuning are similar to that of coarse tuning. When the tuning period is switched, the gear-shifting technique [6] is used to maintain the NTW while the loop parameters change, so that no frequency perturbation occurs in the oscillator.
During the locking process, the MSLC also monitors the absolute value of NTW to judge whether the loop is locked. If |NTW| is less than |NTW_M| and this condition persists for a certain number of periods, which means the loop is unlocked during the medium tuning period, the loop returns to the coarse tuning period to modify OTW_C.
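The threshold-and-hold rule that advances the loop from coarse to medium to fine tuning can be sketched as a small state machine. The function name, the dictionary-based thresholds, and the numeric values below are hypothetical; the fallback from medium to coarse tuning and the gear-shifting of loop parameters are deliberately not modelled.

```python
def mslc_latch_cycles(phase_errors, thresholds, hold_counts):
    """Return the cycle index at which each tuning stage is latched.

    A stage ends once |phase error| has stayed below that stage's threshold
    for hold_counts[stage] consecutive reference cycles.
    """
    stages = ("coarse", "medium", "fine")
    stage = 0
    run = 0            # consecutive cycles below the current threshold
    latched = {}
    for n, phe in enumerate(phase_errors):
        if stage >= len(stages):
            break      # all three stages have been latched
        name = stages[stage]
        run = run + 1 if abs(phe) < thresholds[name] else 0
        if run >= hold_counts[name]:
            latched[name] = n
            stage += 1
            run = 0
    return latched


# Hypothetical settling trace of the phase error (arbitrary units).
trace = [0.9, 0.4, 0.3, 0.2, 0.05, 0.04, 0.03, 0.005, 0.004, 0.003]
thresholds = {"coarse": 0.5, "medium": 0.1, "fine": 0.01}
hold = {"coarse": 2, "medium": 2, "fine": 2}
print(mslc_latch_cycles(trace, thresholds, hold))
```

Requiring several consecutive in-threshold cycles, rather than a single one, avoids latching a tuning word on a transient zero crossing of the phase error.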
B. SPUR ANALYSIS AND SUPPRESSION
Spurs are generated in PLLs by FM signals in the system, as shown in Fig. 4. They degrade the adjacent-channel interference (ACI) tolerance of the receiver and introduce unwanted emissions in the transmitter. For the proposed ADPLL in Fig. 1, the spurs include the reference spur from the reference clock and the fractional spurs introduced by the TDC and DTC blocks. The reference spur can be suppressed by the loop characteristic of the ADPLL because it lies out of band. However, it is very hard to suppress the in-band fractional spur with the loop. Here, the integral nonlinearity (INL) of the DTC is the dominant source of fractional spurs in the proposed circuit. The resolution of the DTC is set to 5.1 ps to make the contribution of the DTC quantization error to the fractional spur as low as possible.
During fractional operation, the DTC control code ramps up or down with a period of 1/f_frac, where f_frac = FCW_frac × F_ref. The nonlinearity of the DTC introduces fractional spurs at the offset frequency f_frac and its harmonics in the output spectrum. Equation (7) is used to estimate the in-band fractional spur level of each harmonic [11], where t_INL,i is the INL error of the DTC.
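The spur offsets follow directly from the relation f_frac = FCW_frac × F_ref. The sketch below computes the fundamental and the first harmonics; the function name and the example fractional word of 1/1024 are hypothetical, chosen only because it places the fundamental close to the 23.4-kHz offset reported in Section IV.

```python
from fractions import Fraction


def fractional_spur_offsets_hz(fcw, f_ref_hz, harmonics=3):
    """Return the offsets of the fractional spur and its harmonics in Hz.

    fcw must be a Fraction so that the fractional part is exact.
    """
    fcw_frac = fcw - (fcw.numerator // fcw.denominator)  # FCW_frac in [0, 1)
    f_frac = fcw_frac * f_ref_hz
    return [k * f_frac for k in range(1, harmonics + 1)]


# Hypothetical channel: integer part 76, fractional part 1/1024, F_ref = 24 MHz.
offsets = fractional_spur_offsets_hz(Fraction(76) + Fraction(1, 1024), 24_000_000)
print([float(f) for f in offsets])
```

For small FCW_frac the spur falls well inside the loop bandwidth, which is why the loop filter alone cannot attenuate it.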
As shown in Fig. 4, the periodic phase increment leads to spurs in the output spectrum. To break the periodic pattern of the DTC nonlinearity, an efficient DTC linearization technique with a dither block is proposed. Inspired by reference spur suppression, if the in-band fractional spur can be pushed outside the ADPLL loop bandwidth, it can be suppressed easily. As shown in Fig. 5, the periodic low-frequency sawtooth waveform becomes a randomized waveform after the dither block is turned on, and the spectrum changes from a low-frequency single tone to spread high-frequency components, which are easily filtered by the loop filter. This means the INL is decreased and the spur is suppressed. The dither block is realized by a 3-bit pseudo-random number controller, which increases the overall phase noise of the system. However, the influence is negligible because the quantization resolution of -DTC is relatively high. The final time quantization resolutions of DTC and -DTC are 5.1 ps and 0.45 ps, respectively.
III. CIRCUIT IMPLEMENTATION
A. DTC-ASSISTED TDC
The block diagram of the proposed DTC-assisted TDC is shown in Fig. 6. It consists of the DTC, the snapshot circuit and the TA-TDC. By changing the control word with the phase-prediction technique [8], [13], the DTC delays FREF so that the delayed reference clock FREF_DLY is almost aligned with the DCO output variable clock CKVD, which reduces the dynamic range of the TDC. The snapshot circuit is triggered by FREF_EN to reduce the sampling rate of the TDC. An offset compensation circuit guarantees the proper timing relationship in the snapshot circuit. A time amplifier (TA), whose architecture follows [16], is used to improve the TDC resolution. The DTC must have a total range of at least one CKVD period. In addition, the DTC needs high resolution in order to decrease the range of the TA-TDC. It may seem that the challenge of a high-performance TDC is simply moved to the DTC design. Fortunately, it is much easier to achieve both wide range and high resolution at low power in a DTC than in a TDC. An 8-bit DTC is implemented using a cascade of eight identical DTC cells, as shown in Fig. 7 [15]. Each delay cell consists of a CMOS inverter loaded with a tunable 32-unit capacitor bank. A second inverter restores fast rise and fall times. The DTC is designed to cover the period of CKVD across PVT variations. The gain calibration circuit of the DTC is shown in Fig. 8. Because the error of K_DTC appears as a low-frequency sawtooth wave, the fractional part of the reference phase, PHR_F, is subtracted from 0.5 and rectified by a sign function [14]. After being multiplied by PHE_F, which is the phase error detected by the TDC, the estimation error is filtered by an IIR filter and integrated to obtain the correct K_DTC. Here, K_DTC can be fed back into the system for iterative operation [4]. The circuit is triggered by CKR and, when the estimation error is treated as a small signal, has the open-loop transfer function given by equation (8) in the s domain:
Here µ is the integral factor and λ is the filter factor. In this feedback loop, a larger µ and a smaller λ provide a wider bandwidth and a faster locking process but degrade the stability, so the two factors have been carefully balanced.
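The calibration loop described above (sign of 0.5 − PHR_F, multiplication by PHE_F, IIR filtering, integration) can be illustrated with a behavioural sketch. Several modelling choices here are assumptions rather than details of the circuit: the TDC residual is modelled as a zero-mean sawtooth (0.5 − PHR_F) scaled by the relative gain error, the IIR filter is a single pole of the form y[n] = λ·y[n−1] + (1 − λ)·x[n], and all numeric values are hypothetical.

```python
def calibrate_k_dtc(k_true, k_init, fcw_frac, mu, lam, n_cycles):
    """Sign-correlation estimate of the DTC gain (behavioural sketch only)."""
    k_est = k_init
    phr_f = 0.0   # fractional part of the reference phase
    filt = 0.0    # state of the one-pole IIR filter
    for _ in range(n_cycles):
        phr_f = (phr_f + fcw_frac) % 1.0
        gain_err = 1.0 - k_est / k_true
        phe_f = (0.5 - phr_f) * gain_err          # assumed TDC residual model
        sign = 1.0 if phr_f < 0.5 else -1.0       # sign(0.5 - PHR_F)
        corr = sign * phe_f                       # rectified error estimate
        filt = lam * filt + (1.0 - lam) * corr    # IIR filtering (factor lam)
        k_est += mu * filt                        # integration (factor mu)
    return k_est


# Hypothetical numbers: about 107.3 DTC steps of 5.1 ps per 547-ps CKV period.
k_final = calibrate_k_dtc(k_true=107.3, k_init=90.0, fcw_frac=0.1234,
                          mu=20.0, lam=0.5, n_cycles=2000)
print(round(k_final, 3))
```

In this model the correlation term is proportional to |0.5 − PHR_F| times the gain error, so its sign always pushes the estimate toward the true gain; raising µ or lowering λ speeds up convergence at the cost of a smaller stability margin, in line with the trade-off noted above.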
Because the high-resolution DTC has relaxed the dynamic range required of the TDC, an input linear range of 20 ps is large enough to cover the jitter of the reference and CKVD and the non-zero DTC INL.
To achieve both high gain and easy calibration, an 8× time amplifier is designed, as shown in Fig. 9. The two 5-bit TDCs are implemented using a conventional delay-line architecture. Sense-amplifier-based DFFs are used to minimize metastability effects.
The gain of the TA changes with PVT variations, which degrades the noise of the system. To achieve a stable gain, dynamic calibration of the TA is used in this design. Fig. 9 also shows the dynamic calibration scheme. The gain of the TA is adjusted by changing the load capacitance, which is implemented as a 4-bit MOS varactor bank. A mirror TA, designed to maintain a gain of 8, is placed to track the PVT variation of the gain, as shown in Fig. 9. FREF_Cal and FREF_Cal', which is delayed by a single time unit, are the two inputs of the mirror TA. After amplification, there is a delay of 8 time units between FREF_TA and FREF_TA', so an 8-time-unit delay module is added to align the two signals. If they are not aligned, a negative feedback loop is started. The code VCAL[0:15], which is generated by a counter, is fed back to the main TA to keep the gain stable.
B. LOW-SUPPLY DCO
The DCO is implemented as an NMOS LC-DCO, as shown in Fig. 10, and is supplied with 0.6 V for low power. The 0.6-V supply also provides a common-mode voltage for the oscillation output because the output buffer is supplied with 1.2 V. A 7.3-nH inductor is chosen to offer a high R_in. The DCO includes coarse, medium and fine varactor arrays to cover the frequency tuning range and improve the frequency resolution. The coarse bank is designed with a unit frequency step of 24 MHz/LSB, which is used in the OTW_C prediction to speed up the coarse tuning period. The frequency tuning ranges (FTR) of the medium and fine banks cover several LSBs of the coarse and medium banks, respectively, so that the OTW of the current bank does not overflow under process, voltage and temperature (PVT) variations. The phase noise of the ADPLL output is affected by the finite frequency tuning resolution of the DCO, so a small tuning step is always desired. However, the smallest frequency tuning step is limited by the smallest capacitance available in the 55-nm process. A MASH 1-1-1 10-bit SDM is therefore used to dither the 6-bit fractional capacitor bank to improve the frequency resolution and decrease the noise contribution.
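A behavioural sketch of a generic MASH 1-1-1 modulator with a 10-bit input shows how dithering yields an average value finer than one capacitor step. This is the textbook accumulator-based structure, not a description of the fabricated circuit; the function name is hypothetical, and the mapping of the modulator output onto the 6-bit fractional capacitor bank is not modelled.

```python
def mash111(x, n_samples, bits=10):
    """Generic MASH 1-1-1: three cascaded accumulators with carry recombination.

    x is the fractional input word (0 <= x < 2**bits). The returned integer
    sequence has a long-term mean of approximately x / 2**bits.
    """
    mod = 1 << bits
    acc1 = acc2 = acc3 = 0
    c2_d1 = 0             # previous carry of stage 2
    c3_d1 = c3_d2 = 0     # two previous carries of stage 3
    out = []
    for _ in range(n_samples):
        c1, acc1 = divmod(acc1 + x, mod)      # stage 1 integrates the input
        c2, acc2 = divmod(acc2 + acc1, mod)   # stage 2 integrates stage-1 residue
        c3, acc3 = divmod(acc3 + acc2, mod)   # stage 3 integrates stage-2 residue
        # Noise cancellation: c1 + (1 - z^-1) c2 + (1 - z^-1)^2 c3
        y = c1 + (c2 - c2_d1) + (c3 - 2 * c3_d1 + c3_d2)
        out.append(y)
        c2_d1 = c2
        c3_d2, c3_d1 = c3_d1, c3
    return out


seq = mash111(x=300, n_samples=4096)
print(sum(seq) / len(seq), 300 / 1024)   # the mean tracks the fractional input
print(min(seq), max(seq))
```

The output takes integer values between −3 and 4, and the third-order shaping pushes the quantization noise toward high offset frequencies where the LC tank and loop attenuate it.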
IV. MEASUREMENTS
The proposed fractional-N ADPLL has been designed and fabricated in a 55-nm CMOS process. The chip micrograph is shown in Fig. 11. The chip area is 1.1 × 0.8 mm², and the chip consumes 4 mW at 1.8 GHz from a 1.2-V supply, of which the DCO consumes 0.84 mW, the DTC-assisted TDC consumes 0.8 mW and the digital loop consumes 2.36 mW. The measured DCO tuning range is from 1.5 GHz to 2.05 GHz, which is wide enough for sub-GHz NB-IoT applications. Fig. 12 shows the phase noise measurement results. The phase noise is −120.5 dBc/Hz at 1-MHz offset from a 1.827-GHz carrier. Since the LO frequency is only half of the frequency at the test point, the actual LO phase noise will be around 6 dB better. The measured integrated rms jitter is 1.07 ps from 1 kHz to 100 MHz.
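The headline numbers can be cross-checked with a few lines of arithmetic. The sketch assumes the commonly used jitter-power figure of merit, FoM = 10·log10[(σ_t / 1 s)² · (P / 1 mW)], which the text does not state explicitly, and an ideal divide-by-2 that improves phase noise by 20·log10(2). The function names are hypothetical.

```python
import math


def pll_fom_db(rms_jitter_s, power_w):
    """Jitter-power FoM in dB (assumed definition, see the lead-in)."""
    return 10 * math.log10(rms_jitter_s ** 2 * (power_w / 1e-3))


def divider_phase_noise_gain_db(ratio):
    """Phase-noise improvement of an ideal frequency divider."""
    return 20 * math.log10(ratio)


print(round(pll_fom_db(1.07e-12, 4e-3), 1))          # 1.07 ps rms, 4 mW
print(round(divider_phase_noise_gain_db(2), 2))      # divide-by-2 for the LO
print(round(0.84 + 0.8 + 2.36, 2))                   # block powers in mW
```

Under this definition, 1.07 ps of rms jitter at 4 mW reproduces the reported FoM of −233.4 dB, the divide-by-2 accounts for the roughly 6-dB improvement at the LO, and the three block powers add up to the 4-mW total.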
The measured close-in spur suppression is shown in Fig. 13; the in-band spur suppression is about −40 dBc at 23.4 kHz. Compared with the dither block off, the in-band spur is suppressed by a further 20.9 dB with the dither block on, which is consistent with the theoretical calculation in Section II. As Fig. 14 shows, when the DCO output frequency is 1.851 GHz, the measured reference spur is −76.3 dBc at 24 MHz, and the out-of-band fractional spur is −60.1 dBc at 6 MHz over a 50-MHz span. The locking time is about 20 µs, as illustrated in Fig. 15, measured with an Agilent Technologies E5052B Signal Source Analyzer. The piecewise-linear shape of the curve is due to factors such as the limited measurement accuracy of the instrument. However, the shape of the frequency transition curve is not the focus here; what matters is the settling time from one frequency point to another. Table 1 compares this work with other ADPLLs. It shows that this work achieves a phase noise of −120.5 dBc/Hz at 1-MHz offset, which is about 10 dB lower than Ref. [18], and a reference spur of −76.3 dBc.
V. CONCLUSION
A low-power all-digital PLL (ADPLL) with high in-band spur suppression is implemented for IoT applications in this paper. The combination of a snapshot TDC and a DTC is used to reduce the power consumption. The locking time is reduced by the MSLC and OTW_C prediction algorithms. A dither block cascaded with the DTC improves the in-band spur suppression by 21 dB. The ADPLL occupies an active area of 0.88 mm² and consumes 4 mW at 1.8 GHz. With a 24-MHz reference clock, a 1.8-GHz output RF clock, and a loop bandwidth of 100 kHz, this design achieves 1-ps rms jitter, integrated from 1-kHz to 10-MHz offset, and a FoM of −233.4 dB.
