Abstract-Generation of a low-jitter, high-frequency clock from a low-frequency reference clock using classical analog phase-locked loops (PLLs) requires a large loop filter capacitor and a power-hungry oscillator. Digital PLLs can help reduce area, but their jitter performance is severely degraded by quantization error. Specifically, their deterministic jitter (DJ), which is proportional to the loop update period, becomes prohibitively large at low reference clock frequencies. We propose a scrambling TDC (STDC) to improve DJ performance and a cascaded architecture, with a digital multiplying delay-locked loop as the first stage and a hybrid analog/digital PLL as the second stage, to achieve low random jitter in a power-efficient manner. Fabricated in a 90 nm CMOS process, the prototype frequency synthesizer consumes 4.
I. INTRODUCTION
HIGH frequency clock generation from a low frequency reference clock is needed in many general purpose application specific integrated circuits (ASICs). Phase locked loops (PLLs) that take an off-chip crystal oscillator as a reference clock and generate a high frequency on-chip clock are most commonly used in such frequency multiplication applications. The design of fully integrated PLLs with acceptable noise performance is a significant challenge. To elucidate the main challenges, consider the design of a classical Type-II analog charge-pump PLL shown in Fig. 1(a) [1] that generates a 2 GHz output from a 1 MHz reference clock. Since the PLL bandwidth, F BW, must be less than 1/10th of the reference frequency (F BW < F REF /10) for stable operation, the maximum PLL bandwidth is 100 kHz. This small bandwidth is not sufficient to adequately suppress the phase noise from a ring oscillator. Additionally, achieving this bandwidth with a reasonable phase margin requires a loop stabilizing zero frequency, F Z, of about 10 kHz (≈ F BW /10) [2]. With a loop filter resistor of 15 kΩ, the loop filter capacitor needs to be at least 1 nF, which is prohibitively large to implement on-chip.
1549-8328 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
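The loop filter sizing argument in the introduction can be checked with a few lines of arithmetic. The sketch below assumes the stabilizing zero is set by a series RC branch, so that F Z = 1/(2πRC); the function name and the two ratio parameters are illustrative and do not come from the paper.

```python
import math

def loop_filter_capacitor(f_ref_hz, r_ohm, bw_ratio=10.0, zero_ratio=10.0):
    """Return (F_BW, F_Z, C) for a Type-II charge-pump PLL.

    Assumes F_BW = F_REF / bw_ratio, F_Z = F_BW / zero_ratio, and a
    series-RC zero at F_Z = 1 / (2*pi*R*C). These are rule-of-thumb
    assumptions, not a full loop design.
    """
    f_bw = f_ref_hz / bw_ratio
    f_z = f_bw / zero_ratio
    c_farad = 1.0 / (2.0 * math.pi * r_ohm * f_z)
    return f_bw, f_z, c_farad

f_bw, f_z, c = loop_filter_capacitor(1e6, 15e3)
print(f"F_BW = {f_bw / 1e3:.0f} kHz, F_Z = {f_z / 1e3:.0f} kHz, C = {c * 1e9:.2f} nF")
```

With a 1 MHz reference and a 15 kΩ resistor this gives a capacitor slightly above 1 nF, in line with the "at least 1 nF" figure quoted in the text.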
A digital PLL (DPLL), shown in Fig. 1(b), offers an attractive alternative that reduces the area of a traditional charge-pump based PLL [3]-[5]. A DPLL is composed mostly of digital circuits such as the time-to-digital converter (TDC), digital proportional-integral loop filter, digitally controlled oscillator (DCO), and a feedback divider. Replacing the analog loop filter with a digital loop filter obviates the need for a large loop filter capacitor, resulting in significant area savings. However, DPLLs suffer from a conflicting bandwidth trade-off when suppressing the TDC quantization error and DCO phase noise simultaneously. A low PLL bandwidth is needed to suppress the TDC quantization error, while a wide bandwidth is desirable to suppress the DCO phase noise [6]. As a result, the DPLL bandwidth is typically much lower than F REF /10. This constraint further exacerbates the random jitter (RJ) issue present in analog PLLs.
Additionally, the DPLLs also suffer from a unique trade-off between the reference frequency and deterministic jitter (DJ) resulting from the accumulation of phase/frequency quantization error [7] . Both of these issues are elaborated in Section II. The focus of the paper is to address RJ and DJ performance issues associated with a low frequency input reference clock. To this end, we propose a cascaded digital MDLL and digital PLL frequency synthesizer architecture with a scrambling TDC (STDC). The proposed STDC alleviates the trade-off between the deterministic jitter accumulation and the input reference frequency and achieves reference frequency independent deterministic jitter.
Fabricated in a 90 nm CMOS process, the prototype frequency synthesizer consumes 4.76 mW power from a 1.0 V supply and generates 2.56 GHz output clock with a long term absolute jitter of 4.18 ps rms from a 1.25 MHz crystal reference frequency.
The rest of the paper is organized as follows. After describing the RJ and DJ issues in Section II, we evaluate different possible architecture options in Section III and present the proposed architecture in Section IV. The implementation details of key building blocks are discussed in Sections IV and V. Experimental results from a prototype implementation fabricated in 90 nm CMOS process are shown in Section VI, followed by conclusions in Section VII.
II. JITTER IN LOW REFERENCE FREQUENCY DPLLS
The DPLL output clock jitter can be decomposed into two categories: i) random jitter (RJ) resulting from thermal/flicker noise sources, and ii) deterministic jitter (DJ) caused by quantization error sources. RJ is typically dominated by the phase noise of the ring oscillator, which can be reduced only by increasing the PLL bandwidth or by increasing the power consumption of the oscillator. Since the phase noise improves by only 3 dB when the power is doubled, it is more power efficient to suppress the phase noise by increasing the PLL bandwidth to the extent possible. Therefore, for a given power budget, RJ is dictated by the maximum allowable bandwidth, which is F REF /10, where F REF is the PLL reference update rate [8].
To quantify this, consider a 2.5 GHz VCO designed with a power budget of 3 mW to achieve a spot phase noise of −98 dBc/Hz at 1 MHz offset frequency. When this VCO is embedded in a PLL, the resulting RJ is plotted as a function of the PLL bandwidth in Fig. 2. To achieve less than 1% UI rms of RJ, the PLL bandwidth must be at least 700 kHz, which translates to a lower limit on F REF of about 7 MHz. Alternatively, the plot also reveals that RJ can be at best 5% UI rms when F REF = 1 MHz (100 kHz bandwidth). This issue of large RJ is further exacerbated in DPLLs because the PLL bandwidth has to be further lowered to suppress the TDC quantization error. Assuming the PLL bandwidth is reduced to F REF /20 (50 kHz in this example), RJ due to the VCO phase noise alone increases to about 10% UI rms (see Fig. 2), which is unacceptable for most applications.
Fig. 2. Effect of PLL loop bandwidth on the oscillator phase noise suppression.
In addition to the RJ issues discussed thus far, DPLLs also suffer from a large DJ. The hard non-linearity of the TDC around the lock point (zero phase error) makes the DPLL behave like a non-linear system even in the steady state. As a result, the steady state of a DPLL is a bounded limit cycle whose frequency and magnitude are governed by noise and the loop delay. The DJ resulting from such limit cycle behavior has been calculated in [9]. However, to highlight how a low frequency reference exacerbates the limit cycle induced DJ, we simply assume that the TDC output dithers only between two states (±1) at F REF /2, as depicted in Fig. 3(a). Under this assumption, the DCO output frequency dithers between ±(K P + K I ) · K DCO, where K DCO is the DCO gain and K P and K I are the proportional and integral path gains, respectively. The output phase accumulates as illustrated in Fig. 3(b), which appears as DJ in the time domain or as a large spur at F REF /2 in the frequency spectrum. The magnitude of DJ is proportional to the limit cycle period, 2T D, the proportional path gain, K P, the integral path gain, K I, and the DCO gain, K DCO. Because K P ≫ K I to ensure an overdamped response, output DJ is dominated by the proportional path. In other words, the TDC quantization error appears at the output mostly through the proportional path, while a majority of it is suppressed in the integral path. Thus, DJ is proportional to K P K DCO and the limit cycle period, 2T D. As a result, lowering K P K DCO can reduce DJ, but it also reduces the loop bandwidth and exacerbates RJ caused by the DCO phase noise. Therefore, new techniques are needed to eliminate the DJ contribution from the proportional path without degrading the RJ performance and to reduce the jitter accumulation time to much less than 2T D.
A hybrid-PLL reported in [10] provides a means to eliminate the limit cycle induced DJ in the proportional path. By using an analog proportional path, the TDC quantization error and its associated DJ are eliminated. The integral control, however, is implemented in the digital domain using a simple TDC to obviate the need for a large loop filter capacitor. Because the proportional path governs PLL loop dynamics, the hybrid-PLL exhibits linear loop dynamics and its DJ performance is greatly improved compared to a DPLL implemented using a TDC in the proportional path. However, much like in an analog PLL, the bandwidth is limited to F REF /10, which was shown to be inadequate to meet the RJ performance using a low power DCO (see Fig. 2).
Multiplying delay locked loops (MDLLs) have been shown to achieve superior RJ performance compared to PLLs [11]-[14]. By replacing the noisy VCO edge with a clean reference clock edge, MDLLs periodically reset the VCO jitter accumulation at a rate faster than in PLLs. In the frequency domain, this translates to an equivalent noise suppression bandwidth of about F REF /4, which is 2.5 times larger than the maximum allowable PLL bandwidth. As a result, MDLLs exhibit nearly 8 dB superior phase noise compared to a PLL using the same oscillator [14]. Referring to Fig. 2, the RJ reduces to about 2% UI rms with F REF = 1 MHz, which represents an improvement of 4.5× and 2.5× compared to a DPLL and an analog PLL, respectively.
More importantly, resetting of the DCO phase by a periodic reference injection obviates the need for an explicit proportional path in an MDLL. As a result, digital MDLLs also exhibit superior DJ performance because jitter accumulation is proportional to K I K DCO instead of K P K DCO as in DPLLs (see Fig. 3(c)). These features make digital MDLLs particularly suitable for high frequency clock generation using very low frequency reference clocks. Next, we take a closer look at the DJ performance limitations of a digital MDLL. Assuming the output frequency F OUT dithers between two quantization levels, the deterministic jitter (DJ) generated by a digital MDLL can be calculated as [10]:
When the integral path step size K I K DCO ≪ F OUT, (1) can be simplified to
From (2) we can see that the DJ of the output clock generated by a digital MDLL is proportional to K I K DCO and the reference clock period T REF. Hence, lowering DJ requires reducing either K I K DCO or the limit cycle period (or both). Reducing K I K DCO requires a very high-resolution DCO. Fig. 4 shows the effect of DCO frequency resolution on DJ with a limit cycle period of 2 μs (equivalent to a 1 MHz reference clock) and N = 2048. Keeping DJ below 0.25 UI of the output clock requires a DCO resolution better than 150 ppm/LSB. This translates to a very small current step size for the DAC used in the DCO (<1 nA/LSB). On the other hand, the limit cycle period depends upon the loop latency and the input clock period and cannot be shorter than 2T REF. Thus, for a given DCO resolution, the reference frequency sets the lower bound on DJ. In view of this, we present a cascaded frequency synthesizer that uses the proposed scrambling TDC to decouple DJ from the reference clock period. The proposed scrambling TDC facilitates the design of a frequency synthesizer with excellent DJ performance even when operating with a very low frequency reference clock.
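The stated proportionality of DJ to K I K DCO and T REF can be sketched as follows. This is an illustrative derivation, assuming the DCO runs offset from F OUT by one integral-path step, K I K DCO, for one reference period T REF = N/F OUT before the injected reference edge resets the accumulated phase. Constant factors (for example, a limit cycle spanning 2T REF) may differ from the exact forms of (1) and (2).

```latex
% Timing error accumulated over N output cycles when the DCO frequency
% is offset by one integral-path step K_I K_DCO (illustrative sketch):
\begin{align}
\mathrm{DJ} &\approx N\left(\frac{1}{F_{OUT}} - \frac{1}{F_{OUT} + K_I K_{DCO}}\right)
             = \frac{N\,K_I K_{DCO}}{F_{OUT}\left(F_{OUT} + K_I K_{DCO}\right)} \\
% For K_I K_DCO << F_OUT, and using T_REF = N / F_OUT:
\mathrm{DJ} &\approx \frac{N\,K_I K_{DCO}}{F_{OUT}^{2}}
             = \frac{K_I K_{DCO}}{F_{OUT}}\,T_{REF}
\end{align}
```

In this form the fractional DCO step K I K DCO /F OUT (the resolution in ppm/LSB) multiplies the accumulation time directly, which is why shortening the accumulation time is as effective as refining the DCO resolution.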
III. EVALUATION OF FREQUENCY SYNTHESIZER ARCHITECTURE OPTIONS
MDLLs offer superior jitter performance compared to PLLs, but they are susceptible to non-idealities of the circuit responsible for injecting a clean reference edge (select logic) in place of a noisy VCO edge [11] . Typically, such imperfections manifest as increased DJ, negating some of the lower DJ benefits of MDLLs. The strict waveform shape matching and timing requirements to reduce select logic induced DJ get exacerbated at low reference frequencies and large divide ratios, thus limiting MDLLs to low to moderate output frequencies and small division ratios (<50) [11] , [14] .
A cascaded frequency synthesizer in which the multiplication factor is split between two stages reduces the division ratio in each stage and eases the design of the low frequency first stage. For comparison, we consider the noise performance of four possible combinations, namely DPLL-only, DMDLL-only, cascaded DMDLL+DPLL, and cascaded DMDLL+DMDLL, as shown in Fig. 5. We assume that the VCO is the dominant source of noise and keep the total VCO power the same in all four architectures at about 3 mW. In the cascaded architectures, the VCO power is split between the two stages. Since the first stage dominates the overall noise performance, the majority of the power is used in the first stage VCO. The second stage VCO (LP-VCO) is designed to guarantee sustained oscillation. With these considerations, the first stage VCO is designed with 2.7 mW, while the second stage VCO consumes only 325 μW.
Partitioning of multiplication factors between the two stages is an important design consideration. Ideally, a large multiplication ratio in the first stage helps minimize noise from the second stage. However, a large multiplication factor in the first stage makes the MDLL select logic design more complex and increases power consumption. Mismatch in the rise/fall times of the selection MUX inputs increases the reference spur and degrades deterministic jitter. Further, the second stage loop components such as the PFD and digital loop filter also consume more power because they operate at a higher frequency. In view of these power/jitter trade-offs, we evaluate different design choices. The simulated output phase noise plots shown in Fig. 6 indicate that cascaded architectures with a digital MDLL (DMDLL) as the first stage show superior RJ performance compared to both DMDLL-only and DPLL-only topologies. Between the two cascaded architectures, using a DMDLL in the second stage offers slightly better phase noise performance than using a DPLL. However, as discussed earlier, additional power and design effort are needed to design high speed select logic for the second stage DMDLL. Thus, the DPLL second stage is chosen in the proposed architecture for simplicity. This configuration has previously been used in fractional-N frequency synthesizers to reduce the fractional divider quantization noise [15], [16]. In our work, it is used to improve the power-jitter (RJ) trade-off. For the cascaded architecture to be effective compared to a single-stage topology, the first stage must have lower phase noise and deterministic jitter. Alternate architectures such as cascaded PLL+DLL [17] provide higher order suppression of the VCO noise. However, because both the PLL and DLL are updated at the reference frequency, they suffer from large deterministic jitter when a low frequency reference clock is used (see Section II for a detailed discussion of this trade-off).
IV. PROPOSED ARCHITECTURE
The block diagram of the proposed frequency synthesizer is shown in Fig. 7 [18] . It is composed of a cascade of a digital MDLL (DMDLL) and a Type-II DPLL. The DMDLL operates with a 1.25 MHz input reference clock and provides a 160 MHz output, OUT 1 , which is then multiplied using a DPLL to generate the 2.56 GHz output, OUT 2 . Fig. 8 shows the simulated phase noise contributions of the different stages of the proposed frequency synthesizer architecture.
As discussed earlier, because of the absence of the proportional path, the DMDLL DJ is dictated by the integral path gain and the jitter accumulation period, which is about 1 μs. In this work, the proposed scrambling TDC (STDC) minimizes DJ by reducing the jitter accumulation time as described later. An accumulator (ACC) integrates the STDC output and drives the multiplexed ring oscillator through a current-mode digital to analog converter (DAC). The control signal to the edge replacement multiplexer is generated by the select logic. Details of each of these building blocks are provided next starting with the STDC.
A. Scrambling TDC
The block diagram of the proposed scrambling TDC is shown in Fig. 9. It consists of a sub-sampling bang-bang phase detector (BBPD) followed by a gain-scaling second-order delta-sigma (ΔΣ) modulator. The BBPD is implemented using a cascade of two flip-flops, FF1 and FF2. The flip-flops are designed using the symmetric sense amplifier architecture reported in [19]. FF1 sub-samples the VCO output with the reference clock and detects the sign of the phase error. FF2 re-samples FF1 on the negative edge of the reference clock to reduce the output state dependent hysteresis [14]. The output of FF2, denoted as FF2 OUT, is a single bit representation of the sign of the phase error and is directly used to drive the loop filter accumulator in conventional DMDLLs. Because FF2 OUT is updated only once every reference clock period T REF, the jitter accumulates in proportion to T REF, as illustrated in Fig. 10.
The STDC reduces the jitter accumulation period by scrambling FF2 OUT such that the output rate is increased (to 64F REF in our implementation) without altering the mean of the BBPD output. A digital ΔΣ modulator is used to perform scrambling as shown in Fig. 9. The magnitude of FF2 OUT is first scaled by a factor of 2^−12 and the resulting 12 bits are truncated to 1 bit using a second-order ΔΣ modulator, implemented using an error-feedback architecture.
Clocked at a frequency of 64F REF, STDC OUT is a 1-bit sequence whose mean is equal to that of a conventional BBPD while the gain and output rate are scaled by factors of 2^−12 and 64, respectively. As a result of the increased output rate, the jitter accumulation time is reduced, resulting in a much smaller DJ as illustrated in Fig. 10.
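The scaling and 1-bit truncation described above can be sketched behaviorally. The code below is a minimal integer model, not the authors' implementation: it assumes the BBPD sample (±1) is held for 64 modulator clocks, represents the 2^−12 scaling by giving the 1-bit quantizer a full scale of 2^12, and uses the standard error-feedback form with noise transfer function (1 − z^−1)^2. The function and variable names are hypothetical.

```python
def stdc_second_order(bbpd_samples, oversampling=64, scale_bits=12):
    """Behavioral sketch of a second-order error-feedback 1-bit modulator.

    bbpd_samples: iterable of +1/-1 values, one per reference period.
    Returns (bits, (e1, e2)), where bits is the +1/-1 output stream at
    oversampling x the reference rate and (e1, e2) are the last two
    quantization errors, in units where the quantizer output is +/-2**scale_bits.
    """
    full_scale = 1 << scale_bits
    e1 = e2 = 0                      # e[n-1] and e[n-2]
    bits = []
    for sample in bbpd_samples:
        for _ in range(oversampling):            # input held for one reference period
            v = sample - 2 * e1 + e2             # quantizer input with error feedback
            y = full_scale if v >= 0 else -full_scale
            e2, e1 = e1, y - v                   # shift in the new quantization error
            bits.append(1 if y > 0 else -1)
    return bits, (e1, e2)

samples = [1, 1, -1, 1, -1, -1, 1, 1]
bits, (e1, e2) = stdc_second_order(samples)
# The running sum of the output tracks the scaled input exactly, up to the
# residual e[N-1] - e[N-2] left in the error-feedback registers.
print(4096 * sum(bits) - 64 * sum(samples) == e1 - e2)
```

Because the noise transfer function has two zeros at DC, the accumulated difference between output and scaled input telescopes to the last two error terms, which is the sense in which the mean of the BBPD output is preserved.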
Note that simply clocking the STDC at a higher frequency (without scaling the BBPD output) reduces the jitter accumulation period but also increases the overall TDC gain, potentially increasing the DJ to an unacceptable value. To mitigate this undesirable effect, the BBPD output FF2 OUT is scaled by 2^−12 before feeding it to the ΔΣ modulator. Fig. 11 shows the block diagram of a first-order STDC. It consists of a BBPD followed by a gain-scaling first-order delta-sigma modulator. Unlike the second-order modulator, the error-feedback architecture of this first-order STDC consists of a simple digital accumulator. While the first-order STDC also provides gain scaling, it introduces limit cycles of its own and degrades jitter performance. Fig. 12 shows the time domain waveforms of the first-order and second-order STDC. The limit cycle behavior of the first-order STDC can be clearly seen in Fig. 12(a) even when it is clocked at 64F REF. On the other hand, the limit cycle behavior is greatly suppressed in the second-order STDC as shown in Fig. 12(b).
Based on these observations, a second-order (as opposed to first-order) ΔΣ modulator is used in the STDC to minimize jitter accumulation caused by limit cycles of the modulator itself. The simulated spectrum of the STDC output, STDC OUT, shown in Fig. 13, reveals large spectral peaks when a first-order modulator is used. These spectral peaks indicate the presence of low-frequency limit cycles in the MDLL loop, which increase the jitter accumulation time and degrade DJ. When a second-order modulator is used, no spectral peaks are observed, resulting in limit cycle free behavior of the MDLL loop.
Fig. 13. Scrambling TDC output spectrum using 1st and 2nd order modulator.
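The tendency of a first-order modulator to produce low-frequency patterns can be illustrated with a small open-loop experiment. The sketch below is a simplification: it drives a first-order error-feedback modulator with a constant BBPD value of +1 (in the real loop the input toggles), and it uses the same hypothetical integer scaling as before, with a quantizer full scale of 2^12.

```python
def first_order_stdc(bbpd_value, n_clocks, scale_bits=12):
    """First-order error-feedback 1-bit modulator with a constant input.

    Returns a list of +1/-1 output bits. The single error register
    plays the role of the digital accumulator.
    """
    full_scale = 1 << scale_bits
    err = 0                                   # previous quantization error e[n-1]
    out = []
    for _ in range(n_clocks):
        v = bbpd_value - err                  # NTF = 1 - z^-1
        y = full_scale if v >= 0 else -full_scale
        err = y - v
        out.append(1 if y > 0 else -1)
    return out

out = first_order_stdc(1, 2 * 8192)
# Positions in one 8192-clock window where the +1/-1 alternation is broken.
repeats = [n for n in range(8192) if out[n] == out[n + 1]]
print(len(repeats), out[:8192] == out[8192:])
```

Under these assumptions the output alternates between +1 and −1 almost everywhere and breaks the alternation only twice per 8192 clocks, so the pattern repeats exactly with that period. At a 64F REF clock the breaks occur once every 4096 clocks, which corresponds to a tone near F REF /64 and is an example of the kind of low-frequency pattern that a first-order loop cannot scramble.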
The shaped truncation error at the output of the STDC is filtered by the DMDLL's low-pass jitter transfer characteristic. Because the oscillator phase noise suppression comes mainly from the feed-forward reference edge injection, the DMDLL input jitter transfer bandwidth can be reduced to adequately suppress the STDC quantization error. This is in contrast to a DPLL, which suffers from a conflicting noise bandwidth trade-off to simultaneously suppress the TDC quantization error and oscillator phase noise [10]. It is also important to note that ΔΣ modulators are commonly used in PLLs to truncate the digital loop filter output to fewer bits [10], [14]. However, this approach needs a larger accumulator and larger modulators. For example, to achieve 2^−12 gain scaling with an 8-bit DAC, the conventional architecture needs at least a 20-bit accumulator and a 20-to-8 bit modulator. On the other hand, with the STDC based architecture the same gain scaling can be achieved with a 12-to-1 bit modulator followed by an 8-bit accumulator. Additionally, to achieve high DAC resolution such modulators are often followed by a low-pass post filter [10], [14]. However, this approach is not suitable for applications where the reference frequency is low, i.e., a few MHz, because suppressing the limit cycle requires a prohibitively large post filter. In contrast, the proposed STDC uses a ΔΣ modulator as an efficient means of scaling the BBPD gain by 2^−12 (as shown in Fig. 10).
B. Accumulator and DAC
The block diagram of the integral path accumulator along with the DAC schematic is shown in Fig. 14. It consists of an 8-bit accumulator, a binary to thermometer converter, and a current-mode DAC built from 255 unit elements. The fully synthesized accumulator integrates the single-bit STDC output and generates an 8-bit output. Since the STDC output is updated at 64F REF, the accumulator and the subsequent DAC are also updated at 64F REF. Clocking them at a lower rate is equivalent to down sampling the STDC output, which corrupts the noise shaping and severely degrades the DJ performance benefits offered by the STDC. The 8-bit output of the accumulator is passed through a binary to thermometer converter whose output drives an 8-bit current-mode DAC consisting of 255 identical PMOS unit current sources (see Fig. 14). The output current of the DAC, I OUT, directly drives the multiplexed ring oscillator.
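The accumulator and binary to thermometer conversion described above can be modeled in a few lines. This is a behavioral sketch only: the saturation at the ends of the 8-bit range is an assumption (the text does not say how overflow is handled), and the function names are illustrative.

```python
def accumulate(state, stdc_bit, n_bits=8):
    """Add a +1/-1 STDC bit to the accumulator, saturating at 0 and 2**n_bits - 1.

    Saturation is an assumption of this sketch, not a detail from the paper.
    """
    max_code = (1 << n_bits) - 1
    return min(max(state + stdc_bit, 0), max_code)

def binary_to_thermometer(code, n_bits=8):
    """Return 2**n_bits - 1 unit-element enables; exactly `code` of them are 1."""
    n_units = (1 << n_bits) - 1
    return [1 if i < code else 0 for i in range(n_units)]

state = 128
for bit in (+1, +1, -1, +1):        # a short run of STDC output bits
    state = accumulate(state, bit)
enables = binary_to_thermometer(state)
print(state, len(enables), sum(enables))
```

With thermometer coding, a one-LSB change in the accumulator switches exactly one unit current source, which keeps the DAC monotonic by construction.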
C. Multiplexed Ring Oscillator
The schematic of the multiplexed ring oscillator along with the select logic is shown in Fig. 15. The core ring oscillator is implemented using 44 voltage-controlled CMOS inverter based delay stages. An external bias (I BIAS ) is used to bring the oscillator frequency within the MDLL pull-in range. In practice, a frequency locked loop (FLL) as demonstrated in [14] can be used to achieve frequency acquisition automatically. Since the oscillation frequency of the ring is relatively low, a large number of inverter stages is used to achieve sharp rise and fall times, which helps minimize pattern jitter during reference injection. A divider along with the select logic is used for periodically injecting a reference clock edge into the ring oscillator. The select logic is similar to the one employed in [13] and is implemented using standard cells. Using the divider output and the output of an intermediate delay stage in the oscillator, the select logic generates a select signal, SEL, for the multiplexer. A NAND-gate based multiplexer is used for selecting between the buffered reference clock, REF B, and the oscillator output, OUT B. To minimize clock feedthrough, both REF B and OUT B signals are connected farthest away from the multiplexer output (see Fig. 15).
The static phase offset (SPO) in the DMDLL must be minimized as it directly appears as pattern jitter at the output [14] . The three main sources of SPO are: a) the voltage offset of FF1 in the STDC (see Fig. 9 ), b) the rise/fall time mismatch between REF B and OUT B signals, and c) the periodic voltage ripple on the DCO supply node (V CC in Fig. 15 ). The voltage offset of FF1 is reduced by up-sizing the transistors that contribute to voltage offset and the impact of the voltage offset on SPO is minimized by reducing the rise/fall times of both the reference clock and the VCO inputs of FF1. Monte-Carlo simulations indicate that the standard deviation of the input referred voltage offset of FF1 is 3.7 mV, which translates to an SPO of only 0.14 ps with a rise/fall time of 30 ps as shown in Fig. 16 . The rise/fall time mismatch between REF B and OUT B is minimized by buffering the reference clock using replica delay stages. Using a large number of stages in the oscillator lowers the rise/fall time, which helps to reduce the impact of the rise/fall time mismatch on the static phase offset. Fig. 17 shows the effect of reference injection on the VCO supply node V CC . Because the reference clock swing is from 0 to V DD while that of the VCO output is only 0 to V CC , reference injection causes periodic ripple on V CC . Fig. 17(a) shows the effect of reference injection on the VCO supply node V CC when the MDLL ring and the multiplexer supply are connected to the same node (V CC ). In this case, the voltage swing difference between the injected reference REF B (V CC1 ) and the ring output OUT B (V CC ) causes a periodic ripple on the VCO supply node V CC . This ripple changes VCO output clock period (T SHORT ) and appears as DJ at the DMDLL output. To minimize DJ, the ring VCO control is split into two parts: variable control voltage V CC and a fixed voltage V DD . 
The multiplexer, along with the delay stages connected to its input and output, is connected to V DD, while the rest of the delay stages are connected to the control node V CC as shown in Fig. 15. Separating the VCO control voltage from the circuitry responsible for reference injection minimizes reference clock feedthrough via the supply node and helps reduce DJ (see Fig. 17(b)).
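The relation between the FF1 input-referred offset and the resulting static phase offset can be estimated to first order as the offset voltage divided by the edge slew rate. The sketch below assumes a 1.0 V input swing and interprets the 30 ps rise/fall time as a 10% to 90% transition; both are assumptions of this example rather than details stated in the text, and the quoted 0.14 ps figure itself comes from the simulation shown in Fig. 16.

```python
def static_phase_offset(v_offset, v_swing, t_rise, swing_fraction=0.8):
    """First-order SPO estimate: offset voltage divided by edge slew rate.

    swing_fraction = 0.8 assumes t_rise is measured from 10% to 90%
    of the swing (an assumption of this sketch).
    """
    slew_rate = swing_fraction * v_swing / t_rise   # volts per second
    return v_offset / slew_rate                     # seconds

spo = static_phase_offset(v_offset=3.7e-3, v_swing=1.0, t_rise=30e-12)
print(f"SPO estimate: {spo * 1e12:.2f} ps")
```

Under these assumptions the estimate lands close to the quoted 0.14 ps, and it shows directly why sharper edges reduce the impact of a given voltage offset.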
V. SECOND-STAGE DPLL
The second-stage of the proposed frequency synthesizer is implemented using a hybrid-PLL as shown in Fig. 18 [10] . It uses a 160 MHz DMDLL output as the reference clock and generates a 2.56 GHz output signal (multiplication factor of M = 16). Classical TDC-based digital PLLs suffer from a coupled noise bandwidth trade-off, which severely limits their jitter performance. In other words, the need to suppress the TDC quantization by low-pass filtering and the DCO phase noise by high-pass filtering mandates either a high-resolution TDC or a low noise oscillator both of which increase power dissipation. By using an analog proportional path, a hybrid-PLL eliminates the TDC quantization error and alleviates the coupled noise bandwidth trade-off of digital PLLs. Furthermore, an analog proportional path also helps ease the resolution requirements of the digital integral path. As a result, a hybrid-PLL is capable of achieving a low jitter with a low power consumption. Other than the analog proportional path, the hybrid-PLL resembles a conventional digital PLL with the digital integral path controlling the current controlled oscillator (CCO) through a digital to analog converter.
Analog proportional control is implemented using a 3-state PFD that directly drives the CCO through a 3-level current-mode DAC, denoted as PDAC in Fig. 18 [10]. Because the PFD produces an output proportional to the input phase error without any quantization error, the hybrid-PLL behaves like a linear system in steady state and exhibits well controlled loop dynamics. Integral control is implemented by digitally accumulating the error generated by detecting the sign of the phase error between the UP/DN outputs of the PFD. A flip-flop (FF) performs sign detection and its output is integrated in an 18-bit accumulator. The four least significant bits (LSBs) are dropped to reduce the gain of the integral path and the remaining 14 bits are passed on toward the DAC. A second-order digital ΔΣ modulator operating at twice the reference frequency truncates the 14 bits to 8 bits, and the integral control is generated using an 8-bit current-mode DAC.
The schematic of the current controlled ring oscillator used in the DPLL is shown in Fig. 19 . The delay cells are implemented using current starved pseudo differential CMOS inverters coupled in a feed-forward manner using transmission gates. The oscillation frequency is controlled by varying the input current, I CTRL , with the help of a current source. An ac coupled inverter biased with a replica inverter is used as an output buffer to achieve rail-to-rail output swing with low power and minimal duty cycle distortion.
VI. EXPERIMENTAL RESULTS
A complete block diagram of the proposed frequency synthesizer is shown in Fig. 20. Using a 1.25 MHz reference clock, the first stage DMDLL and the overall DMDLL+DPLL provide 160 MHz, OUT 1 (multiplication factor of N = 128), and 2.56 GHz, OUT 2 (multiplication factor of N × M = 2048), outputs, respectively. A prototype of this frequency synthesizer is implemented in a 90 nm CMOS logic process and occupies an active area of 0.16 mm². In the prototype chip, the divide ratio is fixed to the maximum. However, different multiplication factors can be achieved by changing either the first stage multiplication ratio (N) or the second stage multiplication ratio (M) or both. The die micrograph of the prototype is shown in Fig. 21. The die is packaged in a standard 48-pin plastic QFN package and characterized using a four-layer FR-4 printed circuit board. The reference clock is provided by a 1.25 MHz crystal oscillator. Operating from a 1.0 V supply, the prototype chip consumes 4.76 mW of power, of which 3.56 mW is consumed by the DMDLL and the rest (1.2 mW) by the second-stage DPLL.
Fig. 19. Schematic of the ring oscillator used in the second-stage DPLL.
The measured voltage spectrum of the DMDLL output when the STDC is not enabled is depicted in Fig. 22. In this case, the 1-bit TDC output, FF2 OUT, is directly fed to the accumulator. As expected, large limit cycles appear as spectral peaks at various locations in the output clock spectrum. When the proposed STDC is enabled, the jitter accumulation period reduces and the magnitude of the limit cycles is greatly suppressed, as shown in Fig. 23. In this case, the dominant spurious tone is the reference spur of magnitude −53 dBc caused by the residual static phase offset. Fig. 24 shows the effect of reducing the STDC update rate from 64F REF to F REF. In-band quantization error increases with lower sampling frequency, resulting in poor in-band noise performance.
Time domain measurement of the long-term absolute jitter of the DMDLL output at 160 MHz is performed using a digital sampling oscilloscope (Tektronix DSA8300). When the STDC is OFF, the long-term jitter is 20.07 ps rms and 99 ps peak-to-peak as shown in Fig. 25(a). When the STDC is turned ON, the jitter reduces to 2.4 ps rms (22.1 ps peak-to-peak), limited only by the random noise component (see Fig. 25(b)). This represents approximately an 8× improvement in rms jitter and a 4× improvement in peak-to-peak jitter.
To quantify the performance of the stand-alone second-stage DPLL, which provides a 2.56 GHz output, OUT 2, its jitter performance is measured by providing a 160 MHz external reference clock generated using an arbitrary waveform generator (Tektronix AWG70002A). The measured long-term absolute jitter is 3.4 ps rms and 26.1 ps peak-to-peak, as shown in Fig. 26. Fig. 27 shows the time domain measurement of the complete 1.25 MHz to 2.56 GHz frequency synthesizer in the cascaded DMDLL+DPLL configuration. When the STDC is OFF, the final 2.56 GHz output, OUT 2, has a long-term absolute jitter of 30.2 ps rms and 154.8 ps peak-to-peak (see Fig. 27(a)). The poor jitter performance is mainly attributed to the large deterministic jitter introduced in the DMDLL. This large peak-to-peak jitter is approximately 40% of the output time period (≈ 390 ps), which makes the 2.56 GHz clock output, OUT 2, unusable for any practical application. When the STDC is enabled, the jitter reduces to 4.18 ps rms and 35.2 ps peak-to-peak, as shown in Fig. 27(b), which represents approximately a 7× improvement in rms jitter and a 4× improvement in peak-to-peak jitter. Fig. 28 shows the phase noise plots of the DMDLL and cascaded DMDLL+DPLL outputs measured using an Agilent E4440A spectrum analyzer. At a 100 kHz offset frequency, the DMDLL and DMDLL+DPLL achieve an in-band phase noise floor of −106.2 dBc/Hz and −81.8 dBc/Hz, respectively. The difference between the two noise floors is 24.4 dB, which is only 0.4 dB larger than the ideal value of 24 dB (10 log10(M^2) with M = 16). Thus the in-band phase noise contribution at the output is dominated by the DMDLL phase noise. Table I shows the performance summary of the proposed frequency synthesizer at both the 160 MHz MDLL and 2.56 GHz MDLL+DPLL outputs. The performance of state-of-the-art frequency synthesizers operating with a low frequency reference is also presented in this table.
In comparison to the existing frequency synthesizers, the proposed architecture achieves the best power efficiency of 1.86 mW/GHz and lowest long-term absolute jitter of 4.18 ps rms and 35.2 ps peak-to-peak.
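The headline figures in this section can be cross-checked from the individual measurements quoted above. The short script below uses only numbers stated in the text; the variable names are illustrative.

```python
import math

f_ref = 1.25e6                 # Hz, crystal reference
n, m = 128, 16                 # first-stage and second-stage multiplication factors
f_out1 = f_ref * n             # DMDLL output
f_out2 = f_out1 * m            # DMDLL+DPLL output

p_total_mw = 3.56 + 1.2                      # DMDLL + DPLL power
efficiency = 4.76 / (f_out2 / 1e9)           # mW/GHz
ideal_noise_gap_db = 20 * math.log10(m)      # equals 10*log10(M^2)
measured_noise_gap_db = -81.8 - (-106.2)

period_ps = 1e12 / f_out2
ratios = {
    "DMDLL rms": 20.07 / 2.4,
    "DMDLL pk-pk": 99 / 22.1,
    "cascade rms": 30.2 / 4.18,
    "cascade pk-pk": 154.8 / 35.2,
}

print(f"OUT1 = {f_out1 / 1e6:.0f} MHz, OUT2 = {f_out2 / 1e9:.2f} GHz")
print(f"power = {p_total_mw:.2f} mW, efficiency = {efficiency:.2f} mW/GHz")
print(f"noise gap: ideal {ideal_noise_gap_db:.1f} dB, measured {measured_noise_gap_db:.1f} dB")
print(f"STDC-off pk-pk jitter = {154.8 / period_ps:.0%} of the {period_ps:.0f} ps period")
for name, value in ratios.items():
    print(f"{name} improvement: {value:.1f}x")
```

The computed ratios (about 8.4×, 4.5×, 7.2×, and 4.4×) agree with the rounded 8×, 4×, 7×, and 4× improvements reported for the STDC.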
VII. CONCLUSIONS
Synthesizing a high frequency clock from a very low frequency reference clock using a classical phase-locked loop has two main drawbacks. First, because the loop bandwidth cannot be larger than 1/10th of the reference frequency, the oscillator phase noise is not adequately suppressed, thus mandating a power hungry low noise oscillator. Second, the loop filter capacitor needed to realize the loop-stabilizing zero is prohibitively large and, therefore, difficult to integrate on-chip. While a DPLL offers an attractive alternative for implementing fully integrated clock multipliers with significant area savings, it also cannot suppress the oscillator phase noise adequately due to the low frequency reference. More importantly, due to its bang-bang behavior the DPLL suffers from an additional trade-off between reference clock frequency and deterministic jitter accumulation. This trade-off results in severe deterministic jitter degradation.
In this paper, a cascaded digital MDLL and digital PLL frequency synthesizer with a scrambling TDC (STDC) is presented to achieve optimal random jitter and deterministic jitter performance with a low frequency input reference clock. The proposed STDC alleviates the trade-off between the deterministic jitter accumulation and the input reference frequency and achieves reference frequency independent deterministic jitter. The cascaded architecture, with a digital MDLL as the first stage and a digital PLL as the second stage, provides wide bandwidth for improved VCO phase noise suppression and achieves optimal random jitter. Fabricated in a 90 nm CMOS process, the prototype frequency synthesizer consumes 4.76 mW from a 1.0 V supply and generates 160 MHz and 2.56 GHz output clocks from a 1.25 MHz crystal reference. The measured results show that, with the new scrambling TDC, the long-term absolute jitter of the 160 MHz digital MDLL and 2.56 GHz digital PLL outputs is 2.4 ps rms and 4.18 ps rms, while the peak-to-peak jitter is 22.1 ps and 35.2 ps, respectively. Enabling the STDC gives the MDLL output an 8× improvement in rms jitter and a 4× improvement in peak-to-peak jitter, and the cascaded output a 7× improvement in rms jitter and a 4× improvement in peak-to-peak jitter. The proposed frequency synthesizer occupies an active die area of 0.16 mm² and achieves a power efficiency of 1.86 mW/GHz.
