Index Terms - Clock generation, clock multiplier, frequency multiplication, frequency synthesizer, phase-locked loop, low jitter, low phase noise, low power, sub-sampling phase detector, sub-sampling PLL, PLL FOM.
I. INTRODUCTION
Timing generation is an indispensable function in electronic systems, and the phase-locked loop (PLL) is a ubiquitous component in modern ICs due to its versatility. It can be used, for instance, for clock generation, frequency synthesis, frequency modulation and demodulation, clock and data recovery, synchronization, and spread-spectrum signal generation.
[...] which we call the "classical PLL" architecture. It consists of [...] "loop components": a phase detector (PD), a charge pump [...]

A PLL's jitter performance for a given power can be evaluated with the PLL Figure-Of-Merit (FOM) [1].
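As a quick numerical illustration of the FOM mentioned above, the sketch below assumes the widely used definition FOM = 10·log10[(σ_t / 1 s)^2 · (P / 1 mW)], with σ_t the rms output jitter and P the PLL power consumption. The function name and the two operating points are hypothetical and are not taken from any measured design.

```python
import math

def pll_fom_db(jitter_rms_s: float, power_w: float) -> float:
    """PLL FOM in dB, assuming FOM = 10*log10((sigma_t / 1 s)^2 * (P / 1 mW))."""
    if jitter_rms_s <= 0 or power_w <= 0:
        raise ValueError("jitter and power must be positive")
    return 10.0 * math.log10((jitter_rms_s / 1.0) ** 2 * (power_w / 1e-3))

# Hypothetical operating points (illustrative values only).
fom_a = pll_fom_db(1e-12, 1e-3)    # 1 ps rms jitter at 1 mW
fom_b = pll_fom_db(0.5e-12, 4e-3)  # half the jitter at four times the power
print(f"FOM A = {fom_a:.1f} dB, FOM B = {fom_b:.1f} dB")
```

Under this definition both points evaluate to about -240 dB, which reflects how the FOM normalizes the trade-off between jitter variance and power.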
In the classical PLL, the PD and CP noise is multiplied by $N^2$ and often dominates the in-band phase noise, thus limiting the achievable PLL FOM. The sub-sampling PLL (SSPLL) proposed in [2] uses a PD that sub-samples the high-frequency VCO output with the reference clock. The PD and CP noise is shown not to be multiplied by $N^2$ and to be greatly attenuated by the high phase detection gain, leading to lower in-band phase noise and a better PLL FOM.

This article reviews the development of sub-sampling PLL techniques and their applications in recent PLL architectures [3]-[21]. Section II discusses the classical charge pump PLL. Section III reviews the development of the PLL FOM and Section IV the SSPLL architecture. Power and spur reduction techniques for the SSPLL are discussed in Section V. Finally, Section VII draws conclusions and discusses recent applications of sub-sampling PLL techniques.
II. CLASSICAL CHARGE PUMP PLL 
Defining a CP feedback gain $\beta_{CP}$ as the gain from the PLL output to the CP output current, the closed-loop CP noise transfer function can be calculated as:
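Under standard linear feedback analysis, a transfer function consistent with these definitions can be written as follows. The symbols $H_{CP}(s)$, $\phi_{out}$ (PLL output phase) and $i_{CP,n}$ (CP noise current) are assumed names.

```latex
H_{CP}(s) = \frac{\phi_{out}(s)}{i_{CP,n}(s)} = \frac{1}{\beta_{CP}} \cdot \frac{G(s)}{1 + G(s)} \tag{1}
```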
where $G(s)$ is the PLL open-loop transfer function. The in-band phase noise due to the CP can be approximated as:
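Within the loop bandwidth $|G(s)| \gg 1$, so the closed-loop factor $G(s)/(1+G(s))$ approaches unity. Assuming the usual single-sideband convention, in which phase noise is half the phase power spectral density, and using $\mathcal{L}_{in\text{-}band,CP}$ as an assumed symbol, the approximation takes the form:

```latex
\mathcal{L}_{in\text{-}band,CP} \approx \frac{S_{iCP,n}}{2\,\beta_{CP}^{2}} \tag{2}
```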
with $S_{iCP,n}$ the power spectral density of the CP current noise. Equation (2) indicates that the CP noise is suppressed by $\beta_{CP}^2$ when transferred to the PLL output. A larger $\beta_{CP}$ is therefore desired for more CP noise suppression. In a classical PLL design, the 3-state PFD/CP in Fig. 2 [...]

[...] In other words, it will be harder in practice to achieve the same FOM with a lower $f_{ref}/f_{VCO}$.

[...] into current, so that a traditional current-driven loop filter can still be used. This $g_m$ can be implemented in a time-continuous way, unlike a duty-cycled CP. Here we still call it "CP" to simplify the notation, and implementing the $g_m$ as a real CP is actually useful for gain control, as we will see later.
In contrast to a traditional CP, the output current is not proportional to $\Delta t/T_{ref}$ but is amplitude-controlled by the difference between $V_{sam}$ and $V_{DC,VCO}$. The SSPD/CP transfer characteristic has the same shape as the waveform to be sampled, see Fig. 4(c). The ideal locking point is the zero crossing, where the SSPD/CP gain can be calculated as:
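A form consistent with this description follows from noting that a VCO phase error $\Delta\phi_{VCO}$ corresponds to a time shift $\Delta t = \Delta\phi_{VCO}/(2\pi f_{VCO})$ of the sampled waveform, which changes the sampled voltage by $SR_{sam}\,\Delta t$ near the zero crossing. The sketch below assumes single-ended sampling and a linear transconductance $g_m$ that converts this voltage into current; $\Delta i_{CP}$ is an assumed symbol name.

```latex
\beta_{CP,SS} = \frac{\Delta i_{CP}}{\Delta\phi_{VCO}} = \frac{SR_{sam}}{2\pi f_{VCO}} \cdot g_m
```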
with $SR_{sam}$ the slew rate of the waveform to be sampled at the zero-crossing locking point. In the case of an LC VCO, $SR_{sam} = A_{VCO} \cdot 2\pi f_{VCO}$, usually a well-defined value since $f_{VCO}$ is known and VCO amplitude calibration over corners is often performed in practice. We can thus re-write (10) [...]

Viewed from a different angle, the factor-of-$N$ difference between the SSPLL and the classical PLL can be understood if we look at the PLL as a simple loop-back transceiver. The 'signal' is now the VCO phase noise, and the function of the PLL loop is to receive this signal, process it and apply it to the VCO to cancel or suppress the VCO phase noise. In an SSPLL, as shown in Fig. 5(a), the SSPD with an LO (the Ref clock) acts as a down-converter that directly aliases the VCO phase noise back to around DC. The loop filter acts as a baseband circuit that processes the received signal and applies it to the VCO input. In other words, the SSPLL is analogous to a direct-conversion receiver. The SSPD down-converter has no loss but a gain of 1. Therefore, there is no amplification of the SSPD/CP noise. The aliasing of high-frequency VCO noise to low frequency is not a problem because the VCO noise has a steep roll-off.

In a classical PLL, the receiver chain consists of a divider and a PFD/CP, as shown in Fig. 5(b). The divider first down-converts the signal to an intermediate frequency $f_{VCO}/N$. The PFD/CP together with the Ref clock acts as a second down-converter and converts the signal to around DC. In other words, the classical PLL is analogous to a superheterodyne receiver. The divider acts like a very lossy down-converter with a gain of $1/N$, and thus a noise figure of $20\log N$ even if it is itself noiseless. As a result, any noise in the receiver chain after the divider, such as the PFD/CP noise, is amplified by $N^2$ when referred to the PLL output.
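To put numbers on the gain and noise arguments above, the following sketch compares the two detection gains under stated assumptions: a sinusoidal VCO waveform so that the SSPD/CP gain reduces to A_VCO·g_m, a classical 3-state PFD/CP with gain I_CP/(2πN) referred to the VCO phase, and a square-law transistor so that I_CP = g_m·V_gs,eff/2. The function names and all component values are hypothetical.

```python
import math

def beta_ss(a_vco_v: float, gm_s: float) -> float:
    """SSPD/CP gain in A/rad at the zero crossing of a sine wave.

    SR_sam = A_VCO * 2*pi*f_VCO, so SR_sam * gm / (2*pi*f_VCO) = A_VCO * gm.
    """
    return a_vco_v * gm_s

def beta_pfd(gm_s: float, vgs_eff_v: float, n: int) -> float:
    """Classical PFD/CP gain in A/rad referred to the VCO phase.

    Assumes I_CP = gm * V_gs,eff / 2 (square-law device) and a divide-by-N.
    """
    i_cp = gm_s * vgs_eff_v / 2.0
    return i_cp / (2.0 * math.pi * n)

def divider_noise_penalty_db(n: int) -> float:
    """Noise multiplication of N^2 expressed in dB, i.e. 20*log10(N)."""
    return 20.0 * math.log10(n)

# Hypothetical example values, chosen only for illustration.
A_VCO, GM, VGS_EFF, N = 0.4, 1e-3, 0.2, 40

ratio = beta_ss(A_VCO, GM) / beta_pfd(GM, VGS_EFF, N)
print(f"gain ratio = {ratio:.0f} (= 4*pi*N*A_VCO/V_gs,eff)")
print(f"N^2 penalty = {divider_noise_penalty_db(N):.1f} dB")
```

Under these assumptions the gain ratio is 4πN·A_VCO/V_gs,eff, so it scales with N and grows further whenever the VCO amplitude exceeds the CP overdrive voltage.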
The architecture of an SSPLL utilizing the SSPD/CP is shown in Fig. 6(a). It works without a divider as long as the ratio $f_{VCO}/f_{ref}$ is an integer. A linear phase-domain model for the SSPLL is shown in Fig. 6(b). There is no classical divide-by-$N$ on the feedback path but instead a virtual frequency multiplier "$\times N$" on the reference clock [...]
The advantage of the SSPLL in terms of detection gain can be more than just a factor of $N$, as $4\pi \gg 1$ and usually $A_{VCO} > V_{gs,eff}$. Thus, the SSPLL has a much larger $\beta_{CP}$ than the classical PLL and much more suppression of the CP noise. Thanks to the high $\beta_{CP}$, the CP noise in an SSPLL is greatly suppressed and has a negligible contribution to the total loop noise. In such a case, having an "unnecessarily high" $\beta_{CP}$ does not further improve the loop noise but does require a large filter capacitor to stabilize the loop. Fig. 7 shows the SSPD/CP with gain control [2]. Differential VCO outputs and differential sampling are used so the differential zero-crossing can [...]

[...] A classical PLL with divider is thus used as a frequency-locked loop (FLL) to ensure correct locking. The key for the FLL is to dominate the loop control when the phase/frequency error is large but to avoid adding noise once the phase is locked. This can be achieved, for example, by intentionally adding a large dead zone (DZ) to the FLL PFD/CP (see [2] for the implementation), so that it injects no current once the phase error is small and falls within the DZ. However, the work in [3] shows that Fig. 8 would also work with no DZ because around the locking point $\beta_{CP,SS}$ can be much larger than $\beta_{CP,PFD}$ anyway. The overall characteristic of the combined SSPD/CP and PFD/CP is shown in Fig. 9. With no DZ, the PFD/CP in the FLL will inject noise even in the locked condition, but its contribution can be small as it is attenuated by $(\beta_{CP,SS} + \beta_{CP,PFD})^2$. After locking, the FLL can be disabled to save power, or it can remain on to constantly monitor the phase/frequency error and improve the SSPLL's robustness against disturbances [3].

[...], a majority of power, as much as 90% [2], could be wasted by the "short-circuit" current due to simultaneous conduction of the NMOS and PMOS during switching. [...] [4] that can largely eliminate the short-circuit current and drastically reduce the buffer power.
The core is an inverter with an NMOS N1 and a PMOS P1. N1 is directly connected to XO, while a timing control circuit (TCC) is inserted between P1 and XO. The TCC generates a narrow pulse $V_{GP}$ from the XO and controls the gate of P1. The delays $\Delta t_1$ and $\Delta t_2$ are set such that the time when $V_{GP}$ is low (P1 conducts) and the time when XO is high (N1 conducts) do not overlap. Since $f_{ref}$ is low, this timing plan can be easily met. In this way, N1 and P1 never conduct simultaneously, which eliminates the short-circuit current. Moreover, for low-noise sampling, only the Ref sampling edge (the rising edge in this example) needs to be clean, while the noise on the other edge is not relevant. Therefore, N1 can be sized large to maintain low noise, while P1 and the TCC can be sized small to save power. This buffer thus greatly reduces power while maintaining the noise performance of the critical edge. It also offers the flexibility of tuning the Ref duty cycle without impacting the critical edge.
One straightforward way of reducing the VCO sampling buffer power is buffer-less direct VCO sampling, as shown in Fig. 11, which eliminates the sampling buffer power altogether. However, the concern is that the sampling process disturbs the VCO operation, causing reference spurs. Several spur mechanisms can be identified, namely charge injection, charge sharing and VCO load/frequency modulation when the sampling switch is turned on and off. The load modulation can be alleviated by adding a complementary switched dummy sampler as shown in Fig. 11, so [...]

[...] sampling point is at the peak of a sine wave, the gain would even be zero. One way to handle this issue, as shown in [16], is to first convert the sine wave into a more linear waveform like a sawtooth and then use an SSTDC to digitize the entire waveform. Digital background calibration can then be applied to linearize it. An alternative is adding a digital-to-time converter (DTC) to the reference clock path [17]-[19]. This effectively adds a frac-N multiplier before the SSPD/CP so that the sampling point stays around the zero crossings, as if the loop were still in int-N mode. Frac-N SSPLLs using these concepts have achieved good FOM among frac-N PLLs, as shown in Fig. 3. The development of very linear DTCs [19], [25] should help to further improve the performance.
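The loss of detection gain away from the zero crossing, noted above for frac-N operation, can be illustrated numerically. The sketch assumes an ideal sinusoidal waveform A·sin(φ), for which the local sampling gain is the slope A·cos(φ); the function name and the amplitude value are hypothetical.

```python
import math

def sampling_gain(a_vco_v: float, phase_rad: float) -> float:
    """Local slope (V/rad) of A*sin(phase) at the given sampling phase."""
    return a_vco_v * math.cos(phase_rad)

A_VCO = 0.4  # hypothetical VCO amplitude in volts

# Sweep the sampling point from the zero crossing (0 deg) to the peak (90 deg).
for deg in (0, 30, 60, 90):
    g = sampling_gain(A_VCO, math.radians(deg))
    print(f"sampling phase {deg:2d} deg: gain = {g:.3f} V/rad")
```

The gain is largest at the zero crossing and falls to essentially zero at the peak, which is why both approaches described above aim either to linearize the sampled waveform or to keep the sampling point near the zero crossing.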
