Abstract-A monolithic and self-referenced radio frequency (RF) LC clock generator that is compliant with USB 2.0 is demonstrated in a system-on-chip (SoC). This work presents the first successful approach to replacing an external crystal (XTAL), the crystal oscillator (XO) and the phase-locked loop for clock generation in an IC supporting USB 2.0 using a standard CMOS fabrication process. It is shown that the primary design challenges with the implemented approach involve maintaining high frequency accuracy and low jitter. Techniques for addressing both are shown. In particular, the presented architecture exploits the effects of frequency division and low far-from-carrier phase noise to achieve low jitter. 
I. INTRODUCTION

Much recent work has focused on eliminating crystal (XTAL) frequency references in clock and frequency generators. The advantages of doing so include cost, form factor and component count reduction, along with potentially increased reliability. The most formidable challenge in such efforts is achieving the high accuracy and stability that are intrinsic to high quality ($Q$) factor XTALs in crystal oscillators (XOs). RF MEMS microresonators [1] and film bulk acoustic resonators (FBARs) [2] are alternative high-frequency references that have received attention recently and have been targeted at RF frequency synthesis. However, these approaches present integration challenges with standard microelectronic technologies, such as CMOS, thus compromising the utility of the approach for many applications.
The CMOS process incompatibility of RF MEMS microresonators and FBARs has stimulated a resurgence of interest in alternative clock generation approaches such as RC oscillators (RCOs) which can be implemented in a CMOS process. In recent work, the minimum achievable phase noise in RCOs has been investigated [3]. However, related work has shown that even with frequency trimming and temperature compensation, the typical total frequency inaccuracy of RCOs is 2% [4]. Another well-known approach to CMOS clock generation includes the use of temperature compensated relaxation oscillators, such as those reported in [5] and [6]. The temperature and voltage sensitivity of the relaxation oscillator reported in [5] are 65 ppm/°C and 0.4%/V, respectively, and 60 ppm/°C and 0.5%/V in [6]. Neither approach achieves frequency accuracy that rivals XTALs or presents sufficient accuracy for most clock generation applications, even though the frequency accuracy requirements in many such applications are substantially looser than the accuracy achieved with an XO. In such cases, frequency accuracy requirements can be as loose as 1.25% while the accuracy achieved with an XO is typically within the range of 50 ppm to 200 ppm. Consequently, XTAL frequency references continue to be utilized in applications requiring low to moderately high accuracy because an alternative approach to clock generation that can achieve the required performance does not exist. This paper presents a CMOS approach that can meet the specifications for many such applications while not requiring a XTAL frequency reference.
One particularly relevant application is the serial wire data transfer protocol USB 2.0 which supports low-speed (LS), full-speed (FS) and high-speed (HS) data rates at 1.5 MHz, 12 MHz and 480 MHz, respectively. The master USB node must maintain 500 ppm total frequency accuracy for all modes and over all operating conditions (supply voltage variation of 10% and 10 °C to 85 °C), while the accuracy required for the hub node must be 1.25%, 2500 ppm and 500 ppm for LS, FS and HS rates, respectively [7]. In this work, we demonstrate a clock generator suitable for the master node at any USB data rate and which has been integrated into an SoC including an FS-USB PHY.
In addition to frequency accuracy requirements for FS-USB, and referring to Fig. 1, the time between any set of data transitions must be within $N T_P \pm J$, where $N$ is the number of bits between the transitions, $T_P$ is the period of the 12 MHz clock and $J$ is the peak-to-peak jitter. Referring to the nomenclature illustrated in Fig. 1, the maximum jitter for any consecutive (or cycle-to-cycle) differential data transition must be less than 2 ns and less than 1 ns for any set of paired differential data transitions. This data jitter specification includes timing variations due to differential buffer delay, rise and fall time mismatches, internal clock source jitter, noise and other random or deterministic effects [7]. Thus, the clock jitter can account for only a fraction of the total data jitter budget, which is 1 ns at its minimum. Consequently, a typical specification for the peak-to-peak clock jitter is 200 ps, which is difficult to achieve without a high-frequency reference.
In this work, we build on the results presented in [8] and [9] where monolithic and self-referenced RF temperature-compensated LC, or harmonic, oscillators (RF-TCLCOs or RF-TCHOs; RF-TCHO is the nomenclature used in this paper) were explored for digital clock generation. Particular attention is focused on the challenges of achieving both the required frequency accuracy and jitter. This work presents the first demonstrated approach to replacing the external XTAL, XO and PLL for clock generation in an SoC that supports USB 2.0 while using only devices that are available in a standard CMOS fabrication process.
II. BACKGROUND
The two most significant challenges in replacing an XO with an RF-TCHO are achieving high frequency accuracy (low drift) and high frequency stability (low jitter). To justify the approach taken in this work, we first present the factors contributing to frequency drift and review the definitions of phase noise and period jitter. Next, the effects of frequency multiplication and division and the relationship between phase noise and period jitter are explored. Finally, the performance that can be achieved with the demonstrated approach is described qualitatively in relation to a typical XO-referenced and phase-locked clock generator.
A. Temperature- and Bias-Dependent Self-Oscillation Frequency Drift of an LCO
The self-oscillation frequency of an LCO drifts in a predictable manner due to variations in temperature and bias conditions, the latter of which includes variations in the power supply voltage and bias current. Consider the generalized schematic for a negative transconductance LCO as shown in Fig. 2. The natural resonant frequency of the LC network is $\omega_0 = 1/\sqrt{LC}$. Considering the parasitic losses of the coil, $R_L$, and of the tank capacitance, $R_C$, the oscillation frequency must be redetermined by solving for the zero phase of the lossy network, which yields
$$\omega_{osc} = \omega_0 \sqrt{\frac{L - C R_L^2}{L - C R_C^2}}. \qquad (1)$$
In most integrated LCOs, the coil loss is substantially larger than the loss in the equivalent tank capacitance, so (1) can be approximated by
$$\omega_{osc} \approx \omega_0 \sqrt{1 - \frac{C R_L^2}{L}}. \qquad (2)$$
For the purpose of predicting the self-oscillation frequency drift, both $L$ and $C$ exhibit negligible bias and temperature coefficients. The former was shown in [10], and the latter is true as long as thin-film capacitors are utilized. With $R_L$ as the only temperature-dependent parameter in (2), integrated LCOs exhibit a substantially linear negative temperature coefficient (TC), as illustrated in Fig. 3(a). To stabilize the TC, a method of reactive compensation of the temperature-dependent coil loss must be introduced. Additionally, bias-dependent drift, shown in Fig. 3(b) and Fig. 3(c), must be addressed.
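As a rough numerical illustration of the temperature dependence described above, the following sketch evaluates the coil-loss-dominated approximation f = f0·sqrt(1 − C·R_L²/L) while the coil resistance rises with temperature. The 4 nH inductance and the 1.536 GHz frequency are taken from the prototype described later in the paper. The 7.7 Ω coil resistance (a Q of about 5) and its 0.39 %/°C temperature coefficient are assumptions chosen only to show the trend, not measured values.

```python
import math

L_COIL = 4e-9         # coil inductance in henries (4 nH, from the prototype)
F_LOSSLESS = 1.536e9  # lossless resonance used to size the tank capacitance (Hz)
R_L_25C = 7.7         # ASSUMED coil series resistance at 25 C, in ohms
TC_R = 3.9e-3         # ASSUMED resistance temperature coefficient, per deg C

# Capacitance that resonates with L_COIL at F_LOSSLESS when losses are ignored.
C_TANK = 1.0 / ((2.0 * math.pi * F_LOSSLESS) ** 2 * L_COIL)


def coil_resistance(temp_c):
    """Linear model of coil resistance versus temperature (an assumption)."""
    return R_L_25C * (1.0 + TC_R * (temp_c - 25.0))


def osc_freq(r_l, l=L_COIL, c=C_TANK):
    """Oscillation frequency (Hz) of a tank whose loss is dominated by the coil."""
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(l * c))
    return f0 * math.sqrt(1.0 - c * r_l ** 2 / l)


def drift_ppm(temp_c, ref_c=25.0):
    """Frequency drift in ppm relative to the frequency at ref_c."""
    f_ref = osc_freq(coil_resistance(ref_c))
    return 1e6 * (osc_freq(coil_resistance(temp_c)) - f_ref) / f_ref


if __name__ == "__main__":
    for t in (10.0, 25.0, 85.0):
        f_ghz = osc_freq(coil_resistance(t)) / 1e9
        print(f"{t:5.1f} C: {f_ghz:.4f} GHz, drift {drift_ppm(t):+.0f} ppm")
```

Under these assumed values the frequency falls as temperature rises, which is the negative temperature coefficient the text attributes to the coil loss.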
Expressions (1) and (2) are valid only for a $-g_m$ amplifier that exhibits a linear current-voltage working characteristic, a response illustrated in Fig. 4(a). Practical amplifiers exhibit nonlinear behavior at the extremes of the voltage excursions. A treatment of the effects of a nonlinear negative transconductor working characteristic was presented in [11], where it was shown that an increase in the harmonic content of the current, $i(t)$, driven into the LC network results in a decrease in the oscillation frequency relative to the natural resonant frequency $\omega_0$. Physically, this effect arises from a work imbalance between the inductor and capacitor in the presence of a driving signal with high harmonic content [11]. Because the capacitor presents a substantially lower impedance at high frequencies than the inductor, it absorbs the majority of the harmonic content of $i(t)$. Consequently, the oscillation frequency is reduced to the point at which the work between the two balances. Considering this work imbalance, a method of reactive power balance of harmonics can be employed to determine the oscillation frequency [11]. For the LCO shown in Fig. 2, if the working characteristic is loopless, as shown in Fig. 4(b), the oscillation frequency is given by (3), which depends on the harmonic content of $i(t)$ through its Fourier coefficients. A loopless working characteristic indicates that the path integral around the working characteristic for a single and complete oscillation cycle is 0. Referring to Fig. 4(b), this implies that the integral around the traced path is 0. As shown in Fig. 4(c), if the working characteristic is loop-shaped, then the path integral will be nonzero. Physically, a loop-shaped working characteristic indicates that the transconductor stores energy. For simplicity, it is assumed that the working characteristic of the negative transconductor in this work is loopless, which is valid if the reactance in the transconductor is negligible or is absorbed into the LC network.
Expression (3) enables prediction of the self-oscillation frequency response as a function of bias conditions. As the bias changes to increase the harmonic content of the current $i(t)$ driven into the LC network, the oscillation frequency decreases. However, the response will saturate as changes in the bias eventually do not substantially increase the harmonic power injected into the LC network. The bias current in the amplifier has a substantial effect on the drift of the self-oscillation frequency. Its response, shown in Fig. 3(b), corresponds to the description just presented. Similarly, the power supply voltage, $V_{DD}$, will affect the harmonic content. As $V_{DD}$ is lowered, the available oscillation headroom is decreased, thus less harmonic power can be injected into the network and the frequency increases, as shown in Fig. 3(c). Lastly, in both Fig. 3(b) and Fig. 3(c) there exists a region for which no oscillation occurs. In these regions, insufficient current is present for start-up.
This qualitative analysis illustrates that the self-oscillation frequency of an integrated free-running and uncompensated LCO typically drifts with a linear negative frequency temperature coefficient, $TC_f$, an effect which must be compensated to ensure sufficient frequency accuracy over temperature. Additionally, it shows that both the bias current and the power supply voltage modulate the harmonic power injected into the LC network, causing self-oscillation frequency drift. Although the latter effects could potentially be utilized to self-compensate $TC_f$, a phase noise penalty exists at higher bias currents [12]. In terms of performance, it is best to bias the oscillator so that the spectral content remains constant over variation of environmental conditions and to stabilize $TC_f$ with a method of reactive compensation of the temperature-dependent coil loss.
B. Phase Noise and Period Jitter
Oscillators exhibit both phase noise and jitter due to noise and other environmental effects. To motivate the approach taken in this work, these phenomena are defined and discussed here briefly. In [13] a simple model to predict the single sideband (SSB) phase-noise power spectral density (PSD) of a generalized resonant oscillator was presented and is given by (4), where $F$ is the circuit noise factor, $k$ is Boltzmann's constant, $T$ is temperature, $P_0$ is the oscillation power, $Q$ is the resonant device quality factor, $f_0$ is the oscillator fundamental frequency and $\Delta f$ is the frequency offset from the fundamental. Expression (4) is useful as it models the "white of frequency" or $1/f^2$ region of the SSB phase-noise PSD, which is typically the broadest. One of the most important insights obtained from (4) is the inverse-square relationship between phase noise and $Q$, an observation which will be discussed subsequently.
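For orientation, a commonly quoted form of this kind of resonant-oscillator model in the inverse-square region is sketched below. It is offered as an assumption about the general shape of (4) rather than as the expression printed in the paper; the constant prefactor in particular varies between references. The symbols follow the list given in the text: noise factor $F$, Boltzmann's constant $k$, temperature $T$, oscillation power $P_0$, quality factor $Q$, fundamental frequency $f_0$ and offset $\Delta f$.

```latex
\mathcal{L}(\Delta f) \approx 10 \log_{10}\!\left[ \frac{2 F k T}{P_0}
  \left( \frac{f_0}{2 Q \, \Delta f} \right)^{2} \right]
  \quad \text{[dBc/Hz]}
```

Whatever the prefactor, the phase-noise power scales as $1/Q^2$ and as $1/\Delta f^2$, which is the dependence the surrounding discussion relies on.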
In digital systems, clock jitter metrics are of primary interest in contrast to communication systems where the stability metric of interest is phase noise. Jitter metrics quantify the time domain uncertainty in the oscillation period. The most common metrics for digital clock generation include period and cycle-to-cycle jitter. Define the ideal oscillation period as $T_0$, the absolute instant in time of the $n$th positive voltage transition of the clock signal as $t_n$ and the period of the $n$th cycle as $T_n$ where $T_n = t_{n+1} - t_n$. Assuming that the phase noise is a zero-mean stochastic process, the expected value of the discrete random sequence $\{T_n\}$ is $T_0$. Period jitter is defined in relationship to any $n$th and subsequent edge as
$$\sigma_P = \sqrt{E\left[(T_n - T_0)^2\right]} \qquad (5)$$
where $\sigma_P$ is the standard deviation of a single period, which is equivalent to the root mean square period jitter, $J_{rms}$, for a large sample size. Cycle-to-cycle jitter is defined in relationship to adjacent cycles as
$$\sigma_{CC} = \sqrt{E\left[(T_{n+1} - T_n)^2\right]} \qquad (6)$$
and if each cycle is independent, then by definition it can be shown that $\sigma_{CC} = \sqrt{2}\,\sigma_P$. Typically, the timing budget for the clock signal is determined with respect to the peak-to-peak period or cycle-to-cycle jitter. These metrics vary over the observation interval because the distribution of the clock period for a signal exhibiting exclusively random jitter yields a zero-mean distribution with unbounded tails. Consequently, and for example, the peak-to-peak period jitter, $J_{pp}$, is determined from $J_{rms}$ by selecting a bounding bit-error rate (BER) and multiplying by the appropriate scale factor, $\alpha$. A BER used commonly for clock specifications is $10^{-12}$, indicating that statistically only 1 in $10^{12}$ edges will exceed $J_{pp}$. Assuming a zero-mean Gaussian distribution, the scale factor is determined by solving for $\alpha$ given the BER, where
$$\frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\alpha}{2\sqrt{2}}\right) = \mathrm{BER}. \qquad (7)$$
For a BER of $10^{-12}$, $\alpha = 14.069$ and for a BER of $10^{-16}$, $\alpha = 16.444$.
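The scale factors quoted above can be checked numerically. The sketch below assumes the Gaussian relation (1/2)·erfc(α/(2·sqrt(2))) = BER, with α denoting the peak-to-peak scale factor, and solves it by bisection using only the standard library. The function name and the search bounds are illustrative choices, not part of the paper.

```python
import math


def jitter_scale_factor(ber, lo=0.0, hi=40.0, iterations=200):
    """Solve 0.5 * erfc(alpha / (2*sqrt(2))) = ber for alpha by bisection.

    The Gaussian tail probability decreases monotonically with alpha, so a
    simple bracket between lo and hi is enough for any practical BER.
    """
    def tail(alpha):
        return 0.5 * math.erfc(alpha / (2.0 * math.sqrt(2.0)))

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if tail(mid) > ber:
            lo = mid   # tail still too large: alpha must grow
        else:
            hi = mid   # tail small enough: alpha can shrink
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for ber in (1e-12, 1e-16):
        print(f"BER {ber:.0e}: alpha = {jitter_scale_factor(ber):.3f}")
```

Multiplying an rms jitter figure by the returned factor gives the peak-to-peak bound at the chosen BER.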
C. Frequency Multiplication and Division Effects on Phase Noise
It has been shown in [14] that linear frequency multiplication and division results in a quadratic change in phase noise power. If a signal at frequency $f_0$ is multiplied by $N$ to a frequency $N f_0$, then the phase noise power of the signal at $N f_0$ is $N^2$, or $20\log_{10}N$ dB on a logarithmic scale, greater than the phase noise power at $f_0$. Similarly, if the signal is divided in frequency by $N$, the phase noise power decreases by $N^2$ or $20\log_{10}N$ dB. Both results are illustrated in Fig. 5(a). Because both frequency multiplication/division and the quality factor $Q$ have quadratic relationships to the SSB phase-noise PSD far-from-carrier, it can be considered that frequency multiplication and division effectively degrade and enhance the $Q$ of the reference oscillator, respectively, yielding an effective quality factor $Q_{eff}$ at the new frequency [15]. Therefore, it is possible for a low-frequency reference oscillator with very high $Q$ ($Q_{XO}$) multiplied in frequency by the factor $N$ to have the same or similar SSB phase-noise PSD at the new frequency as a high-frequency reference oscillator with a very low $Q$ ($Q_{LC}$) divided in frequency by the factor $M$. $Q_{eff}$ for each case is given by
$$Q_{eff} = \frac{Q_{XO}}{N} \quad \text{and} \quad Q_{eff} = M\,Q_{LC}. \qquad (8)$$
The relationships in (8) ignore the noise contribution of additional circuit components and, referring to (4), assume that the noise factor and the oscillation power are the same for both reference oscillators, though it is a matter of simple algebra to account for initial discrepancies in these two parameters between two different circuits. Consider the following quantitative and practical example. Gigahertz clocks in microprocessors are typically derived from reference clocks at approximately 100 MHz. If the XTAL for these reference clocks is 1 MHz and the loaded quality factor is $10^4$, then to synthesize 100 MHz the multiplication factor is 100 and, by (8), the effective quality factor is $10^4/100 = 100$. Now consider a 2 GHz LCO with a loaded quality factor of 5. To generate 100 MHz, the division factor is 20 and the effective quality factor is $20 \times 5 = 100$. This suggests that the SSB phase-noise PSD of both clock signals at 100 MHz would be similar. The analysis presents an opportunity to utilize the effects of frequency multiplication and division to address the issue of substantially differing $Q$-factors between LC and XTAL resonators, thus motivating the approach presented in this work.
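The arithmetic of this example can be written out as a short sketch. It assumes that multiplication by a factor divides the effective quality factor and that division by a factor multiplies it, as the discussion above describes. The crystal's loaded Q of 10^4 is an assumed value chosen so that the two cases match, and the function names are illustrative.

```python
import math


def q_eff_multiplied(q_ref, factor):
    """Effective Q after multiplying the reference frequency by `factor`."""
    return q_ref / factor


def q_eff_divided(q_ref, factor):
    """Effective Q after dividing the reference frequency by `factor`."""
    return q_ref * factor


def phase_noise_shift_db(factor):
    """Magnitude of the phase-noise change for a frequency ratio `factor`."""
    return 20.0 * math.log10(factor)


if __name__ == "__main__":
    # 1 MHz crystal (assumed loaded Q of 1e4) multiplied up to 100 MHz.
    print("XO path :", q_eff_multiplied(1e4, 100))
    # 2 GHz LC oscillator (loaded Q of 5) divided down to 100 MHz.
    print("LC path :", q_eff_divided(5, 20))
    # 1.536 GHz LC oscillator divided by 128 down to 12 MHz.
    print("12 MHz  :", q_eff_divided(5, 128),
          f"({phase_noise_shift_db(128):.1f} dB lower phase noise)")
```

The last line corresponds to the prototype's divide-by-128 chain, for which the paper later quotes an effective quality factor of 640.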
D. Frequency Multiplication and Division Effects on Period Jitter
The effects of frequency multiplication and division on period jitter can be considered by analyzing the case of frequency division. Asynchronous frequency division of a periodic signal causes the magnitude of the period jitter to increase. A divider circuit will output one pulse for $N$ input pulses, thus the variance of the output period is the sum of the variances of the $N$ input periods due to the statistical independence of the position of each pulse in time. Because the period jitter is the standard deviation of the period, the period jitter of a signal divided in frequency by $N$ is given by $J_N = \sqrt{N}\,J$ where $J$ is the period jitter before frequency division. A converse argument can be presented for the case of frequency multiplication which, in contrast, reduces the period jitter by the same factor. This may seem counterintuitive as the SSB phase-noise PSD is reduced by frequency division. Consider the fact that although the period jitter has increased by $\sqrt{N}$, the period has also increased by $N$. Therefore, it is instructive to describe the fractional period jitter in ppm as given by $J_{frac} = 10^6\,J/T$ where $T$ is the period and $J$ is either the rms or the peak-to-peak period jitter. Thus, the fractional period jitter for a signal divided in frequency by $N$ is given by
$$J_{frac,N} = 10^6\,\frac{\sqrt{N}\,J}{N\,T} = \frac{J_{frac}}{\sqrt{N}} \qquad (9)$$
which is reduced by frequency division. The fractional period jitter is increased by the same factor for the case of frequency multiplication. However, if a divider stage is resynchronized by a flip-flop and a faster clock, then the jitter at the output of the stage will match that of the synchronizing clock signal. Both asynchronous and synchronous frequency division are utilized in this work.
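The square-root scaling described above can be illustrated with a small Monte Carlo sketch. It assumes that each input period deviates from its nominal value by an independent Gaussian error, which is the independence assumption the argument rests on. The 0.5 ps input jitter and the 651.04 ps nominal period (1.536 GHz) are illustrative values, and all names are hypothetical.

```python
import math
import random


def std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def simulate_division(n_div, t0_ps=651.04, sigma_ps=0.5, out_cycles=20000, seed=1):
    """Asynchronously divide a jittery clock by n_div (times in picoseconds)."""
    rng = random.Random(seed)
    # Deviation of each input period from t0_ps (ASSUMED independent Gaussian).
    dev_in = [rng.gauss(0.0, sigma_ps) for _ in range(n_div * out_cycles)]
    # One output period spans n_div input periods, so its deviation is their sum.
    dev_out = [sum(dev_in[i:i + n_div]) for i in range(0, len(dev_in), n_div)]
    j_in, j_out = std(dev_in), std(dev_out)
    return {
        "jitter_ratio": j_out / j_in,                   # expected near sqrt(n_div)
        "frac_in_ppm": 1e6 * j_in / t0_ps,
        "frac_out_ppm": 1e6 * j_out / (n_div * t0_ps),  # expected lower by sqrt(n_div)
    }


if __name__ == "__main__":
    result = simulate_division(16)
    for key, value in result.items():
        print(f"{key}: {value:.3f}")
```

For a divide-by-16 the absolute period jitter grows by roughly a factor of 4 while the fractional jitter shrinks by roughly the same factor. A resynchronized (synchronous) divider stage is not modeled here.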
E. Period Jitter and the Relationship to Phase Noise
It has been shown in [14] and [16] that the period jitter can be related to the phase-noise PSD by expression (10), where $f_0$ is the oscillation frequency, $\mathcal{L}(f)$ is the phase-noise PSD at frequency offset $f$ from the carrier frequency, and $H(f)$ is the transfer function with equivalent cut-off frequency $f_c$. The expression shows that the period jitter is determined by the integral of the projection of $\mathcal{L}(f)$ onto a periodic trigonometric function, the latter of which can be considered as a mask of $\mathcal{L}(f)$. Equation (10) shows that the trigonometric mask reduces the close-to-carrier contributions of the phase-noise PSD to the period jitter. As the offset frequency $f$ increases, the trigonometric mask also increases and the phase-noise PSD contributes more substantially to the period jitter. In RF systems, the close-to-carrier phase noise is often of substantial concern as it can cause reciprocal mixing in receivers that can mask downconversion of the desired signal [17]. However, when considering the period jitter for digital clock signals, the close-to-carrier phase noise is less significant. In fact, as shown in (10), the far-from-carrier phase noise contributes more significantly to the period jitter, which is another observation that motivates the approach taken in this work.
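A widely used form of the relationship between rms period jitter and phase noise is sketched below for orientation. It is an assumption about the general shape of (10), not the expression printed in the paper: it omits the transfer-function term mentioned in the text, and the symbol names ($\sigma_P$ for rms period jitter, $\mathcal{L}(f)$ for the SSB phase-noise PSD on a linear scale, $f_0$ for the oscillation frequency) are chosen here for illustration.

```latex
\sigma_P^{2} \;\approx\; \frac{8}{(2\pi f_0)^{2}}
  \int_{0}^{\infty} \mathcal{L}(f)\,
  \sin^{2}\!\left(\frac{\pi f}{f_0}\right) \mathrm{d}f
```

In this form the $\sin^{2}$ term is the mask: it is close to zero for offsets much smaller than $f_0$ and reaches its first maximum at $f = f_0/2$, so close-to-carrier noise is suppressed while far-from-carrier noise passes almost unattenuated.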
Consider these observations further and in relationship to a typical XO-referenced PLL clock synthesizer with a divider ratio of $N$. For such a clock synthesizer, the output phase noise path will track the reference oscillator for offset frequencies within the loop bandwidth (BW) of the PLL but shifted by $20\log_{10}N$ dB due to frequency multiplication, as shown in Fig. 5(b). Outside the loop BW of the PLL, the output phase noise path will track the VCO, which is typically a ring oscillator exhibiting relatively high phase noise, as shown in Fig. 5(b). Now consider the total output phase noise path in relationship to the period jitter integration mask. As shown in Fig. 5(b), not only do frequency multiplication and division affect the phase-noise PSD and period jitter, but so does the high far-from-carrier phase noise due to tracking the ring VCO outside of the PLL loop BW. In fact, the difference in the far-from-carrier phase-noise PSD between phase-locked ring oscillators and LCOs can be as high as 50 dB [18]. In comparison, a free-running LCO with a high frequency division ratio may exhibit lower period jitter than an XO-referenced PLL due to differences in both frequency multiplication/division and far-from-carrier phase noise. These observations further motivate the approach taken in this work and justify the claim that self-referenced and relatively low-$Q$ RF LC resonators can be used to generate very low-jitter clock signals which are likely to achieve the performance required for a broad range of applications. The analysis also illustrates that high-frequency references for ring-PLLs are unlikely to achieve low jitter at the output, thus nullifying the benefits of a high-frequency reference.
III. RF-TCHO CLOCK GENERATOR IMPLEMENTATION
A. RF-TCHO Clock Generator Architecture
The architecture of the developed RF-TCHO clock generator macro is shown in Fig. 6, where each functional block is shown in simplified form as a module. The entire macro is powered by a 3.3-V rail which is derived from a 5-V bandgap referenced voltage regulator. The LC reference oscillator operates at 1.536 GHz, which facilitates frequency division by powers of 2, thus simplifying the divider implementation and reducing the likelihood of introducing deterministic jitter. A divide-by-2 current-mode logic (CML) stage buffers the oscillator from the frequency divider stages while providing a constant and state-independent load. The next frequency divider is also a CML divider which is followed by a CML-to-CMOS converter. The next five divider stages are implemented with static D-flip-flops which ensure 50/50 duty-cycle. The first two static dividers are asynchronous and the remaining three are synchronous. Frequency division by $2^4$ yields 96 MHz for the SoC logic and an additional division by $2^3$ yields 12 MHz for the FS-USB. Clock signals at 48 MHz (serving as a reference for an off-chip HS-USB PHY) and 12 MHz (for the on-chip FS-USB PHY) are the only clock signals that are driven off-chip. A temperature- and supply-independent current reference biases the amplifier and maintains constant harmonic content in the LC network, thus minimizing bias-induced frequency drift. Temperature compensation is achieved open-loop with a reactive compensation method using a programmable array of accumulation-mode MOS (A-MOS) varactors that are biased by either a temperature-dependent voltage or by a temperature-independent voltage, both of which are derived from temperature-dependent current sources along with resistors. The uncompensated frequency temperature coefficient is expected to be negative due to the coil loss. Consequently, the temperature-dependent varactor bias voltage must increase over temperature to reduce the net tank capacitance and maintain the self-oscillation frequency.
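The divider chain described above can be tabulated with a few lines of code. The sketch assumes seven cascaded divide-by-2 stages (two CML stages followed by five static flip-flop stages) starting from 1.536 GHz; the function name is illustrative.

```python
def divider_chain(f_in_mhz=1536.0, stages=7):
    """Return the output frequency in MHz after each divide-by-2 stage."""
    freqs = []
    f = f_in_mhz
    for _ in range(stages):
        f /= 2.0
        freqs.append(f)
    return freqs


if __name__ == "__main__":
    labels = ["CML", "CML", "static async", "static async",
              "static sync", "static sync", "static sync"]
    for label, f in zip(labels, divider_chain()):
        print(f"{label:13s} -> {f:7.1f} MHz")
```

The fourth, fifth and seventh outputs are the 96 MHz, 48 MHz and 12 MHz clocks referred to in the text, and the overall ratio is 128.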
To account for process variation, programmable discrete frequency calibration was implemented as a switched-capacitor array with poly-insulator-poly (PiP) capacitors and nMOS switches, as shown in Fig. 6 . The physical design of the nMOS switches employs a ring structure, identical to that used in [19] , which minimizes the parasitic device capacitance across the switch.
A schematic of the 1.536 GHz reference oscillator is shown in Fig. 7. It is a complementary and cross-coupled topology where the pMOS and nMOS devices are sized such that the rise and fall times of the oscillation cycle are nearly equal, a design objective which is known to reduce the phase noise [12]. Additionally, transistors were sized to provide sufficient transconductance for start-up and to ensure oscillation over design corners. The oscillator is biased from a 500 µA temperature- and bias-independent reference current that is scaled by a factor of 11 through a cascoded current mirror, providing 5.5 mA to the amplifier and yielding an overdesign factor of approximately 5. Two transistors serve as reset devices that disable the reference oscillator and place it into a minimum current standby state.
The bias current generation circuitry is shown in Fig. 8. Both proportional to absolute temperature (PTAT) and complementary to absolute temperature (CTAT) current references are generated. The former is derived from a -referenced self-biased topology that includes nMOS devices in weak inversion. The latter is derived from a standard -referenced self-biased topology. Both reference circuits contain start-up circuitry and reset transistors, the latter of which are required to place the bias generators in a minimum current standby state. From these current references, the temperature- and bias-independent reference current is derived by summing appropriate weights of the PTAT and CTAT references, providing constant spectral content operation in the amplifier.
The PTAT and CTAT currents are used to generate the temperature-dependent and temperature-independent varactor bias voltages, respectively. Fig. 9 illustrates the programmable binary-weighted PiP capacitor array for discrete and monotonic frequency calibration. Eight calibration bits were implemented, enabling a ±2.56% frequency calibration range with 200 ppm increments, yielding ±100 ppm as the worst case initial frequency inaccuracy. The MSB switches 1.2 pF and the LSB switches 25 fF into the tank. As shown in Fig. 9, dummy capacitor structures, which were constructed of minimum size PiP capacitors with a nominal capacitance of 10 fF, were required because it was determined that, without dummies, differences in the number of active switches slightly modified the oscillator's behavior. The dummy structures ensure that the same number of switches are active for all calibration states and that the difference between any two states constitutes differences in the net capacitance exclusively. The calibration byte (CB) is determined post-fabrication with a module that is described in the next section. Once determined, CB is stored in one-time-programmable (OTP) nonvolatile memory (NVM) and then loaded thereafter upon power-on reset (POR).
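The calibration range and resolution figures follow from simple arithmetic, sketched below. The sketch assumes an 8-bit code with uniform 200 ppm steps centered at mid-scale and treats the span as 256 steps, which is how the 2.56% figure arises. The helper `nearest_code` is hypothetical and only illustrates how a measured frequency error would map to a code, using the convention that a larger code adds capacitance and lowers the frequency.

```python
def calibration_span(bits=8, step_ppm=200.0):
    """Summarize the range and resolution of a uniform calibration code."""
    states = 2 ** bits
    span_ppm = states * step_ppm
    return {
        "states": states,
        "range_pct": span_ppm / 2.0 / 1e4,  # half-span on each side, in percent
        "residual_ppm": step_ppm / 2.0,     # worst-case error after rounding
    }


def nearest_code(error_ppm, bits=8, step_ppm=200.0):
    """Hypothetical helper: code that best cancels a frequency error.

    A positive error (oscillator too fast) calls for a larger code, because
    a larger code switches more capacitance into the tank.
    """
    mid_scale = 2 ** (bits - 1)
    code = mid_scale + round(error_ppm / step_ppm)
    return max(0, min(2 ** bits - 1, code))


if __name__ == "__main__":
    print(calibration_span())
    print(nearest_code(1000.0), nearest_code(-1000.0))
```

With these assumptions the span is plus or minus 2.56% and the residual error after choosing the nearest code is at most 100 ppm in magnitude.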
Fig. 9. Programmable binary-weighted capacitor array for discrete and monotonic self-oscillation frequency calibration.
The open-loop reactive temperature compensation circuitry is shown in Fig. 10. The PTAT current is driven into one of three selectable resistor banks, all of the same magnitude, but of differing TCs depending on the type of resistors in the bank. The programmable resistor bank includes n-well, n-diffusion and poly-Si resistors which exhibit positive TCs with decreasing magnitude, respectively. The PTAT voltage is filtered across a large PiP capacitor and sets the back-gate voltage on a programmable array of A-MOS varactors. Three binary-weighted programmable varactor states exist. When a bit is set, the back-gate of the corresponding varactor is connected to the PTAT voltage and when it is cleared, the back-gate is connected to the temperature-independent voltage, which is derived by sinking the CTAT current into an n-well resistor with a high positive TC. The array allows the oscillation frequency to remain constant while the ratio of the temperature-dependent to temperature-independent capacitance is programmed, thus modifying the frequency temperature coefficient. Six bits program the open-loop TC circuitry, three of which control the resistor bank and three of which select the size of the A-MOS varactor. These bits are determined post-fabrication via a one-time full temperature characterization effort, stored in NVM for all devices, and loaded thereafter upon POR. TC trimming for each device is not required. Results reported subsequently utilized the state where the net resistor included 80% n-diffusion and 20% unsalicided poly-Si resistors.
B. Nominal Self-Oscillation Frequency Calibration Module
The nominal self-oscillation frequency of the clock generator will inevitably be insufficiently accurate due to process variation. Consequently, a digital and synthesizable frequency-locked loop (FLL) was developed to determine the CB required to recenter the self-oscillation frequency automatically on the tester. This calibration routine is executed for every device. Once CB is determined, the coefficient is stored in NVM and loaded thereafter upon POR.
A functional diagram of the FLL calibration circuit is shown in Fig. 11. The FLL inputs include a 48 MHz output of the RF-TCHO clock generator, CLK_IN, and an external 48 MHz reference clock, CLK_REF, which is included on the load board of the tester. EC enables the calibration routine. 12-bit and 11-bit counters are utilized to determine which input signal is at a lower frequency by evaluating timing races between the two signals. The counter size determines the minimum frequency difference that can be resolved. With an 11-bit counter, the worst case minimum frequency discrepancy that can be resolved is given by $1/2^{11}$ or 488 ppm. This worst case occurs only when the frequencies of the two signals are very close, the faster clock edge occurs immediately before the calibration routine is initiated, and the slower clock edge occurs immediately after it is initiated. In such a case, it will take all $2^{11}$ cycles for the phases of the two signals to drift past each other before the minimum frequency discrepancy can be resolved. Away from this boundary condition, much finer tolerance can be resolved. Nevertheless, to ensure convergence to the optimal state, five calibration routines are run per device and the statistical mode of the results is selected. The results of the timing races are determined by the time order of the transitions of the MSB of the reference clock counter (REF_MSB) and the terminal count of the RF-TCHO clock counter (CLK_TC). These signals drive an up/down counter and state machine that determines whether to increase or decrease the frequency of the RF-TCHO clock by decrementing or incrementing the current CB value, respectively. The calibration module can be bypassed and data can be written directly into the CB register either from NVM or externally. Fig. 12 illustrates the state diagram for the calibration module. The default state is IDLE. Once calibration is enabled by setting EC, the calibration routine is initiated.
The routine stalls until the REF_MSB is set before beginning to track the counters, a technique which provides stabilization time between the last CB change and the next race. Once a race begins, if the RF-TCHO clock counter reaches its highest state (CLK_TC is set) and REF_MSB has not been reset, then the RF-TCHO clock is faster and CB is incremented to add more capacitance into the LC network and reduce the frequency. Conversely, if REF_MSB is reset and CLK_TC is still low, then the reference clock is faster and CB is decremented. When both conditions occur, the routine completes. Two flip-flops, as shown in Fig. 12, form the synchronization boundary between the two asynchronous clock domains. One flip-flop synchronizes REF_MSB with CLK_IN. The resulting signal, SYNC, has known timing with respect to CLK_IN which is used to clock the state machine. The other flip-flop is used to generate a synchronized signal for initializing the reference clock counter properly. The reference clock counter is initialized when the state machine is in states other than START or TRACK. The RF-TCHO clock counter is initialized when the state machine is in states other than TRACK. Each increment or decrement requires up to $2^{11}$ cycles of the 48 MHz reference clock or approximately 42.7 µs. At most, 128 changes to CB are required, yielding a maximum calibration latency of 5.47 ms per routine. Thus, the maximum calibration latency for each device is 27.35 ms because five routines are executed per device. Typical measured calibration latency was less than 10 ms.
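The resolution and latency figures of the calibration loop can be reproduced with a short sketch. It is an idealized model under stated assumptions: each race lasts 2^11 cycles of the 48 MHz reference, phase offsets and synchronizer delays are ignored, and two clocks are treated as tied when their relative difference is below 1/2^11. The function names and the tie rule are illustrative, not taken from the paper's state machine.

```python
REF_HZ = 48e6       # external reference clock on the tester load board
COUNTER_BITS = 11   # width of the RF-TCHO clock counter


def step_time_s():
    """Upper bound on the duration of one race, in seconds."""
    return 2 ** COUNTER_BITS / REF_HZ


def resolution_ppm():
    """Worst-case frequency difference that one race can resolve, in ppm."""
    return 1e6 / 2 ** COUNTER_BITS


def worst_case_latency_s(max_steps=128, routines=5):
    """Upper bound on total calibration time for one device, in seconds."""
    return max_steps * routines * step_time_s()


def race_decision(f_clk_hz, f_ref_hz=REF_HZ):
    """Idealized race: +1 increments CB, -1 decrements CB, 0 means done."""
    rel = (f_clk_hz - f_ref_hz) / f_ref_hz
    if abs(rel) < 2.0 ** -COUNTER_BITS:
        return 0    # both events land together: calibration complete
    return 1 if rel > 0 else -1   # faster oscillator needs more capacitance


if __name__ == "__main__":
    print(f"step time  : {step_time_s() * 1e6:.2f} us")
    print(f"resolution : {resolution_ppm():.1f} ppm")
    print(f"worst case : {worst_case_latency_s() * 1e3:.2f} ms")
```

The computed step time of about 42.7 µs and worst-case total of about 27.3 ms agree with the figures quoted in the text to within rounding.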
IV. MEASUREMENT RESULTS
A micrograph of the RF-TCHO clock generator macro, which occupies 0.22 mm² in a dual-gate (DG) two-poly four-metal (2P4M) 0.35 µm CMOS process, is shown in Fig. 13. The calibration FLL is not shown as it was synthesized with the remainder of the logic in the SoC. The DG option was required to support 5-V I/O and power for the USB, while the second poly-Si layer was required to realize the PiP capacitors. A 4-nH spiral hollow-core inductor was realized by interconnecting metal-3 and metal-4 in parallel, a topology which is known to provide a high $Q$-factor for planar inductors [20]. The estimated loaded $Q$-factor for this structure is 5, resulting, per (8), in an effective quality factor of 640 for the 12 MHz clock due to frequency division by a factor of 128. The clock generator draws 9.5 mA from the regulated 3.3-V power supply. Although the current is high for a low frequency clock generator, it is acceptable considering that, for the given application, power is provided via the PC USB port and that the macro replaced an off-chip discrete XO that dissipated a similar amount of power. Nevertheless, a trade-off between performance and power does exist. In future revisions of the device, the power dissipation can be reduced, as will be discussed.
The SoC into which the clock generator was embedded includes an FS-USB PHY interface. Thus, the macro can be controlled and programmed completely by any PC that supports USB 2.0. The SoC was packaged in a plastic SSOP and assembled onto a PCB for environmental test. First, the nominal frequency of each device was calibrated post-fabrication with the embedded calibration FLL described previously and an external 48 MHz crystal clock reference which was included on the load board for the tester. Once the correct CB was determined, it was written into NVM. A sample time-domain waveform of the post-calibration clock signal is shown in Fig. 14 where the self-oscillation frequency is 11.9991 MHz.
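As a quick check on the post-calibration result, the residual frequency error can be expressed in ppm. The sketch below assumes that the calibration target is the nominal 12 MHz output; the function name is illustrative.

```python
def frequency_error_ppm(measured_hz, nominal_hz):
    """Fractional frequency error relative to the nominal value, in parts per million."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


# Post-calibration self-oscillation frequency from Fig. 14 versus the 12 MHz target.
error_ppm = frequency_error_ppm(11.9991e6, 12e6)
```

For the measured 11.9991 MHz, the residual error is about 75 ppm below nominal.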
The nominal period jitter was measured with two techniques. First, a 12.5 GSa/s digital sampling oscilloscope (DSO) and a self-synchronization technique were used as illustrated in Fig. 15(a) and described in [21]. Jitter was measured using the infinite persistence mode of the DSO to collect approximately 20k samples of the first positive clock edge that occurred after the trigger edge, where the trigger was set at 50% of the signal amplitude for a rising edge. The delay element shown in Fig. 15(a) was not required because the period of the 12 MHz clock signal is larger than the internal delay of the instrument. Ignoring the embedded trigger jitter, the measured jitter of the first edge following the trigger event was 4.732 ps RMS as shown in Fig. 15(b), corresponding to 66.57 ps peak-to-peak assuming a BER of $10^{-12}$.
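The conversion from RMS to peak-to-peak jitter at a given BER can be reproduced with a short calculation. The sketch below assumes purely Gaussian (random) jitter and the common convention that each tail beyond the peak-to-peak window holds a probability equal to the BER; the function name is illustrative.

```python
from statistics import NormalDist


def peak_to_peak_jitter(rms, ber):
    """Peak-to-peak jitter bound for Gaussian jitter with the given RMS value.

    Each tail outside the +/- z*sigma window has probability `ber`,
    so the window width is 2*z*sigma with z = Q^-1(ber).
    """
    z = -NormalDist().inv_cdf(ber)
    return 2.0 * z * rms


# 4.732 ps RMS at a BER of 1e-12, result in ps.
jitter_pp_ps = peak_to_peak_jitter(4.732, 1e-12)
```

At a BER of $10^{-12}$ the multiplier is approximately 14.07, which reproduces the 66.57 ps figure from the 4.732 ps RMS measurement.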
Although this measurement approach represents a standard technique by which the peak-to-peak jitter can be determined, it is subject to errors due to variation in the test set-up, including the load, trigger threshold and trigger jitter. Consequently, a 20 GSa/s real-time oscilloscope was utilized to capture 30 kSa of the oscillation period, from which statistics could be computed. This approach differs substantially from the former as the data captured are a collection of consecutive period measurements over time, thus enabling cycle-to-cycle jitter measurements. Fig. 16(a) shows the distribution of the period measurements, which appears Gaussian and is centered around 83.33958 ns with a standard deviation of 6.78 ps, slightly higher than that measured with the DSO approach. Additionally, the measured cycle-to-cycle jitter is 8.96 ps. In Fig. 16(b), the computed peak-to-peak jitter is shown as a function of the BER. The last data point, $10^{-16}$, corresponds to 26.438 years. As can be seen, even at this extreme, the peak-to-peak jitter is less than 111 ps, corresponding to approximately 1/10th of the total data jitter budget of 1 ns. Lastly, the software statistics package included with the real-time oscilloscope indicated that no deterministic jitter was detected. These time-domain measurements are summarized in Tables I and II and indicate that very low period and cycle-to-cycle jitter were achieved with this implementation.
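The statistics extracted from a record of consecutive periods can be sketched as follows. The definitions here are assumptions: period jitter is taken as the standard deviation of the periods, and cycle-to-cycle jitter as the RMS of the difference between adjacent periods. The sample values are synthetic and are not measured data.

```python
from math import sqrt
from statistics import pstdev


def period_jitter(periods):
    """RMS period jitter: standard deviation of the captured periods."""
    return pstdev(periods)


def cycle_to_cycle_jitter(periods):
    """RMS of the difference between each pair of adjacent periods."""
    diffs = [b - a for a, b in zip(periods, periods[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def observation_time_years(period_s, ber):
    """Time needed to observe 1/ber consecutive periods, in years of 365.25 days."""
    return period_s / ber / (365.25 * 86400.0)


# Synthetic record of four consecutive periods in ns (illustrative only).
sample_ns = [83.340, 83.346, 83.334, 83.340]
pj_ns = period_jitter(sample_ns)
c2c_ns = cycle_to_cycle_jitter(sample_ns)
years = observation_time_years(83.33958e-9, 1e-16)
```

The last function shows why the final BER point is an extrapolation rather than a measurement: observing $10^{16}$ periods of an 83.3 ns clock would take roughly 26 years.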
The SSB phase-noise PSD was measured for the 48 MHz and 12 MHz clock signals using a phase noise measurement system, and results are shown in Fig. 17(a) and (b), respectively. The two PSDs differ by approximately $20\log_{10}(4)$, or 12.04 dB, thus indicating that the dividers do not add substantially to the phase noise. At a 100 kHz offset, the SSB phase-noise PSD of the 48 MHz clock was −112.87 dBc/Hz. Assuming each divide-by-2 stage yields approximately 6 dB reduction in the SSB phase-noise PSD, then the 1.536 GHz reference oscillator phase noise must be approximately −112.87 dBc/Hz + 24 dB, or −88.87 dBc/Hz at 100 kHz. From these estimates (4) can be employed to solve for .
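The expected phase-noise change across a divider follows from the ratio of the input and output frequencies. The sketch below assumes an ideal, noiseless divider; the function name is illustrative.

```python
import math


def divider_phase_noise_change_db(n):
    """Ideal change in SSB phase-noise PSD, in dB, after dividing the frequency by n."""
    return -20.0 * math.log10(n)


# 48 MHz to 12 MHz is a division by 4; a single stage divides by 2.
delta_48_to_12_db = divider_phase_noise_change_db(4)
delta_per_stage_db = divider_phase_noise_change_db(2)
```

A division by 4 gives a reduction of about 12.04 dB, and each divide-by-2 stage contributes about 6.02 dB, which is the per-stage figure used in the estimate above.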
Frequency drift over temperature and power supply voltage was measured with the real-time oscilloscope. Results without temperature compensation are shown in Fig. 18(a), where a linear negative TC was observed as predicted. Results with temperature compensation are shown in Fig. 18(b). The total measured frequency accuracy was −397 to +401 ppm for a ±10% variation in the external 5-V unregulated power supply and over a temperature range of −10 to 85 °C. Additionally, the measured temperature coefficient of the compensated response is substantially linear, as the least-squares linear fit of the measured data yields a correlation coefficient of 0.9351. However, the temperature response is clearly overcompensated. The number of programmable states was insufficient to set a response with a less positive coefficient, though the effort to add that functionality into future revisions is trivial.
For comparative purposes, period jitter and phase noise measurements were performed on a standard commercial 12 MHz XO using identical measurement techniques. The measured performance is summarized in Table III and compared to the performance achieved with the RF-TCHO. Both approaches achieve compliance with the specification for USB 2.0, but the RF-TCHO design does not require a XTAL. However, the power dissipation is higher with the RF-TCHO. Though the XO output driver is included in the total power measurement, it is worth noting that the load was 4 pF and high impedance, thus output current draw was small. Power can be addressed in the RF-TCHO by minimizing the overdesign in the LCO and reducing the power in the CML divider stages. By adding programmable temperature compensation states, it is quite likely that the total frequency drift can be decreased significantly, thus making the demonstrated approach directly competitive with XO-referenced and phase-locked digital clock synthesizers.
In particular, results indicate that the clock frequency accuracy required for other bus protocols such as PCI (±300 ppm) or IEEE1394 (±100 ppm) may be achievable.
V. CONCLUSION
The first implementation of an RF-TCHO clock generator in lieu of a XTAL, XO and PLL was demonstrated successfully for USB 2.0. It was shown that the primary design challenges with the implemented self-referenced approach involved maintaining high frequency accuracy and low jitter with a low-Q LC frequency reference and that the primary performance penalty is power. An analysis was presented illustrating the effects that give rise to self-oscillation frequency drift and jitter, the latter of which includes frequency multiplication and division as well as far-from-carrier phase noise. The measured performance indicates that very low jitter can indeed be achieved with an RF-TCHO. Additionally, results indicate that substantially better frequency accuracy can be achieved with these devices than has been reported in this paper, making the approach suitable for a broad range of digital clock generation applications. In continuing work, the authors intend to investigate aging effects, address the issue of power and focus on the development of self-referenced RF-TCHO clock generators that achieve better than 100 ppm total frequency inaccuracy.
