Abstract - Two circuits for performing on-chip subnanosecond signal measurements are presented. The first is an on-chip digitizer capable of capturing high-bandwidth arbitrary periodic signals. The second is a specialized jitter measurement structure based on a Time-to-Digital Converter (TDC). Both circuits were successfully implemented in a 0.35 µm CMOS process. The digitizer is capable of capturing signals at an effective sampling rate of 1.6 GHz, while the jitter measurement device can measure jitter with an 18 ps resolution.
Introduction
Given the high degree of integration in present-day System-On-Chip (SOC) devices, testing becomes an ever more critical component of product development. Higher operating speeds and the mixed-mode nature of these devices have made testing a difficult and expensive task that stretches the limits of conventional Automatic Test Equipment (ATE).
A first generation of on-chip mixed-signal tester/characterization systems (comprising an AWG and a digitizer) has been developed in response to the need for a high-performance, affordable solution [1]. However, the proposed system is limited in the speed of the input signal it can capture. The first section of this paper presents circuits and algorithms for extending the use of this test core to the capture of arbitrary high-bandwidth signals. These circuits, though, do not provide the sampling rates required for the characterization of jitter in high-speed digital signals, a parameter of prime importance in many SOC devices. As such, alternative circuitry designed specifically for this purpose is presented in the second section.
Experimental results demonstrating the functionality and performance of the high-bandwidth signal capture circuitry and the specialized jitter measurement device are presented in the final section of this paper.
High-Bandwidth Signal Capture

A. Digitizer architecture
A system-level schematic of the on-chip digitizer architecture proposed in [1] is presented in Fig. 1. The digitizer consists of three components: a comparator, a track-and-hold (T/H) stage followed by a buffer, and a programmable DC reference generator [2] (not shown).
[Fig. 1 labels: Input Signal, Comparator, Digital Output, DC level sweep]
One of the shortcomings of the circuitry presented in [1] is that the system does not provide for rail-to-rail operation and has a limited front-end bandwidth. These shortcomings are due to the size of the input sampling capacitor and to the limited voltage swings of the buffer and comparator. A modified version of this circuitry that provides high-speed, rail-to-rail operation while utilizing the same components and circuits is presented in Fig. 2.
The T/H stage is replaced with a two-stage sample-and-hold (S/H) circuit. The front-end bandwidth is determined predominantly by the RC bandwidth resulting from the on-resistance of switch S1 and the capacitance C1. The value of C1 can be chosen to be small, and S1 may be sized to produce an RC bandwidth in the range of several GHz. Also, by ratioing C1 and C2, the S/H may be designed to have a gain under 1 V/V due to charge sharing between the two sampling capacitors on the φ1 sampling phase. This gain relaxes the constraints on the voltage swings of the buffer and comparator. The function of the buffer then becomes to isolate the sampled voltage from the input nodes of the comparator and to level-shift it into the comparator's operating range. This allows the system to sample input voltages over the entire supply range.
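As a rough numerical illustration of the two design knobs in this paragraph, the sketch below computes a first-order RC bandwidth and a charge-sharing gain. The component values are hypothetical, and the gain expression C1/(C1 + C2) assumes ideal charge redistribution from C1 onto an initially discharged C2, which the text does not state explicitly.

```python
import math

def rc_bandwidth_hz(r_on_ohm, c_farad):
    """-3 dB bandwidth of a first-order RC network: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_on_ohm * c_farad)

def charge_sharing_gain(c1_farad, c2_farad):
    """Gain when C1, charged to Vin, is shared with an initially empty C2.

    Assumption: ideal switches, no parasitics, C2 fully discharged beforehand.
    """
    return c1_farad / (c1_farad + c2_farad)

# Hypothetical component values, chosen only for illustration.
R_ON = 500.0   # on-resistance of switch S1, ohms
C1 = 100e-15   # first sampling capacitor, farads
C2 = 100e-15   # second sampling capacitor, farads

bandwidth = rc_bandwidth_hz(R_ON, C1)
gain = charge_sharing_gain(C1, C2)
print(f"front-end bandwidth: {bandwidth / 1e9:.2f} GHz")
print(f"S/H gain: {gain:.2f} V/V")
```

With these assumed values the bandwidth is roughly 3.2 GHz and the gain is 0.5 V/V, so a full-supply input swing would be halved before it reaches the buffer.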
B. Undersampling Techniques
The aforementioned digitizer architecture employs an oversampling algorithm to capture signals, implying that the bandwidth of the input signal can be at most $F_s/2$, where $F_s$ is the sampling frequency of the system. Using undersampling algorithms, though, signals with a bandwidth that extends well beyond the sampling frequency may be captured [3].
The undersampling algorithm deemed best suited for implementation with the digitizer involves consecutively sampling a periodic signal with a series of phase-delayed clocks. One period of the input signal is initially sampled with a clock of period $T$ and zero phase delay. On the next pass of the input signal, the sampling clock is delayed by an incremental amount, $\Delta t_{\mathrm{SHIFT}}$, and the input signal is resampled. The sampling clock is then delayed by this incremental amount again for each pass of the input signal, until the total phase delay is equal to $T - \Delta t_{\mathrm{SHIFT}}$. Using this methodology, the input signal is captured with an equivalent sampling rate of $F_{\mathrm{SAMP\text{-}EQUIV}} = 1/\Delta t_{\mathrm{SHIFT}}$.
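The multi-pass procedure can be mimicked in software. The following is a minimal sketch, not the paper's implementation: it assumes the signal period is an integer multiple of the clock period and the clock period is an integer multiple of the delay increment. The function name, the test waveform, and all numeric values are hypothetical.

```python
import math

def capture_equivalent_time(signal, signal_period, clock_period, dt_shift):
    """Sample a periodic signal over several passes with a phase-stepped clock.

    Assumes signal_period is an integer multiple of clock_period and
    clock_period is an integer multiple of dt_shift.
    """
    n_shifts = round(clock_period / dt_shift)       # delays 0 .. T - dt_shift
    per_pass = round(signal_period / clock_period)  # real-time samples per pass
    samples = [0.0] * (n_shifts * per_pass)
    for m in range(n_shifts):        # one pass of the input per delay setting
        for k in range(per_pass):    # clock edges within that pass
            t = k * clock_period + m * dt_shift
            samples[k * n_shifts + m] = signal(t)   # interleave into time order
    return samples

# Hypothetical values: 10 MHz input, 40 MHz clock, 3.125 ns delay step.
SIGNAL_PERIOD = 100e-9
CLOCK_PERIOD = 25e-9
DT_SHIFT = 3.125e-9

def input_signal(t):
    """Fundamental plus a 5th harmonic that lies above half the clock rate."""
    w = 2.0 * math.pi / SIGNAL_PERIOD
    return math.sin(w * t) + 0.3 * math.sin(5.0 * w * t)

samples = capture_equivalent_time(input_signal, SIGNAL_PERIOD, CLOCK_PERIOD, DT_SHIFT)
equivalent_rate = 1.0 / DT_SHIFT
print(f"{len(samples)} samples per period, {equivalent_rate / 1e6:.0f} MS/s equivalent")
```

Each pass alone takes only four samples per period, yet the interleaved record is a uniform grid with spacing equal to the delay increment, which is what allows the harmonic above half the clock rate to be reconstructed.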
C. Generating Accurate, Tunable Phase Delays
To implement the undersampling algorithm described above, a circuit capable of producing variable, accurate and predictable clock phase delays is required. A tapped voltage-controlled delay line with a DLL tuning mechanism, Fig. 3, is capable of performing such a function.
This circuit tunes a delay equivalent to one period of the input clock across the $N$ voltage-controlled buffers. When the loop is in lock, each buffer provides a delay equivalent to $\Delta t_{\mathrm{SHIFT}} = T/N$, resulting in an equivalent sampling rate of $F_{\mathrm{SAMP\text{-}EQUIV}} = N/T$. A phase selection block that selects the appropriate tap and routes it to the digitizer is also required. This block can be composed of a series of multiplexing stages, or a series of phase interpolators capable of further subdividing the phases [4] and providing higher effective sampling resolutions.
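The relationship between clock frequency, tap count, and equivalent sampling rate is simple arithmetic, sketched below. The 25 MHz clock and 64 effective taps are the values reported in the experimental results; the `interp_factor` parameter is a hypothetical stand-in for the optional phase interpolation and is not a quantity given in the paper.

```python
def tap_delay_s(clock_hz, n_taps):
    """Delay per tap when the DLL spreads one clock period over n_taps."""
    return 1.0 / (clock_hz * n_taps)

def equivalent_rate_hz(clock_hz, n_taps, interp_factor=1):
    """Equivalent sampling rate N/T, optionally subdivided by interpolation.

    interp_factor is a hypothetical parameter modelling phase interpolators
    that split each tap delay into interp_factor equal steps.
    """
    return clock_hz * n_taps * interp_factor

CLOCK_HZ = 25e6   # input clock used in the reported DLL measurements
N_TAPS = 64       # effective taps reported for the prototype

dt = tap_delay_s(CLOCK_HZ, N_TAPS)
rate = equivalent_rate_hz(CLOCK_HZ, N_TAPS)
print(f"tap delay: {dt * 1e12:.0f} ps, equivalent rate: {rate / 1e9:.1f} GHz")
```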
The design of such a unit is subject to power supply sensitivity, noise, and matching constraints that affect the overall performance of the device in terms of jitter and the linearity of the obtained measurements. As such, careful consideration of the circuits used in the design of the DLL and the tap selection circuitry is required.
For the design of the DLL, a self-biased architecture [4] that provides high noise immunity was implemented. The delay cell used in this architecture is presented in Fig. 4(a). The two control voltages, Vp and Vn, are used to modify the delay of the cell and provide high power supply rejection through a feedback mechanism.
D. Calibration Methods
Given the dependency of the sub-sampling algorithm on the generation of precise, consistent phase delays in order to guarantee the quality of the captured waveform, matching becomes an issue that plays a vital role in the physical implementation of the phase delay generation block. Calibration methods may be utilized to circumvent these mismatch issues and obtain high linearity measurements.
One such method involves characterizing the linearity of the voltage-controlled delay line using external measurement equipment for a fixed control voltage. Using a software-based approach, the measured non-idealities may be compensated for by altering the control voltage for each tap using an on-chip DC generator [2]. These modifications may be stored in a calibration RAM that provides the appropriate control bits when the circuit is in operation.
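The software side of such a calibration might look like the sketch below. It computes DNL and INL (in LSB) from measured per-tap delays and derives a per-tap control-voltage offset. The linear delay-versus-voltage model, the sensitivity value, and the measured delays are all assumptions made for illustration; the paper does not specify how the corrections are computed.

```python
def dnl_inl(tap_delays, lsb):
    """Return (DNL, INL) in LSB for a list of measured per-tap delays."""
    dnl = [(d - lsb) / lsb for d in tap_delays]
    inl = []
    running = 0.0
    for step in dnl:          # INL is the running sum of the DNL
        running += step
        inl.append(running)
    return dnl, inl

def calibration_offsets(tap_delays, lsb, sensitivity_s_per_v):
    """Per-tap control-voltage offsets, assuming delay varies linearly with voltage."""
    return [(lsb - d) / sensitivity_s_per_v for d in tap_delays]

# Hypothetical measurements for a 4-tap line; real data would come from the ATE.
measured = [630e-12, 615e-12, 640e-12, 615e-12]
lsb = sum(measured) / len(measured)   # ideal step is the average tap delay
SENSITIVITY = 1e-9                    # assumed 1 ns of delay change per volt

dnl, inl = dnl_inl(measured, lsb)
offsets = calibration_offsets(measured, lsb, SENSITIVITY)  # calibration table contents
corrected = [d + SENSITIVITY * dv for d, dv in zip(measured, offsets)]
print([f"{dv * 1e3:+.1f} mV" for dv in offsets])
```

In hardware, the `offsets` list plays the role of the calibration RAM contents. A real delay cell is not linear in its control voltage, so in practice the measure-and-correct loop would likely need to be iterated.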
On-Chip Jitter Measurement
E. A jitter measurement device based on a TDC
Time-to-digital converters are analogous to analog-to-digital converters, where the analog quantity to be converted into a digital word is a time interval. A jitter measurement may be obtained from such a device by adding counters at the output of the D-latches that count the occurrence of logical levels at each of the sample instants. A jitter cumulative distribution function (CDF) is produced by combining the counts of the individual counters when the sampler is used to capture a signal with jitter.
Fig. 5. A Jitter Measurement Device based on a TDC
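The counter-based CDF construction described above can be modelled behaviourally. The sketch below is a simplified software model, not the circuit: it assumes ideal, evenly spaced sample instants and a latch that reads high whenever the edge has arrived by its sample instant. The 18 ps LSB and 16384 cycles come from the reported experiment; the stage count, the jitter sigma, and the random seed are hypothetical.

```python
import random

def capture_jitter_cdf(edge_times, sample_instants):
    """Count, per sample instant, how often the edge had already arrived."""
    counts = [0] * len(sample_instants)      # one counter per D-latch
    for t_edge in edge_times:
        for i, t_sample in enumerate(sample_instants):
            if t_edge <= t_sample:           # edge arrived: latch reads high
                counts[i] += 1
    n_cycles = len(edge_times)
    return [c / n_cycles for c in counts]    # normalized counts form the CDF

LSB = 18e-12      # TDC resolution reported in the paper
N_CYCLES = 16384  # number of captured clock cycles reported in the paper
N_STAGES = 32     # hypothetical number of sampler stages
SIGMA = 30e-12    # hypothetical rms jitter of the input edge

rng = random.Random(0)                       # fixed seed for repeatability
edges = [rng.gauss(0.0, SIGMA) for _ in range(N_CYCLES)]
instants = [(i - N_STAGES // 2) * LSB for i in range(N_STAGES)]
cdf = capture_jitter_cdf(edges, instants)
print([round(p, 3) for p in cdf[12:21]])
```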
The delay cell used in both the DATA and CLK delay lines [6] is presented in Fig. 4(b). Transistors M3 and M7 are the voltage-controlled elements, and adjust the current through the inverters, providing a variable delay.
As with the DLL, matching constraints play a considerable role in determining the accuracy and performance of the device. Large transistors must be used in the design of the delay cell to ensure a reasonable degree of matching. Careful layout of the Vernier delay line (VDL) sampler and the placement of dummy devices can also alleviate device mismatch effects.
Experimental Results
A prototype IC containing the test core digitizer, the DLL, and the TDC was implemented in a 0.35 µm CMOS process. A chip micrograph is presented in Fig. 6.
The digitizer was implemented using the proposed modifications, and occupies an area of 0.045 mm² (the on-chip DC generator was omitted from the prototype due to space limitations). Fig. 7(a) displays the digitizer's output versus input transfer characteristics. A closer examination of the gain of the digitizer, presented in Fig. 7(b), indicates that a gain of approximately 1 V/V can be guaranteed for input voltages greater than 10% of the supply.
A 32-stage self-biased DLL [4] that locks onto a 180° phase shift, with an optional inversion in the tap selection stage, was implemented. This architecture provides 64 effective taps and occupies an area of 0.8 mm². The tap selection stage was implemented as a series of multiplexers, also based on the self-biased architecture [5]. The measured delay per tap ranges from 180 ps/tap to 5 ns/tap, allowing for effective sampling rates in the range of 200 MHz to 5.6 GHz, subject to the constraints of the lock mechanism.
The DLL's DNL and INL as a function of output tap for an input clock of 25 MHz (or $\Delta t_{\mathrm{SHIFT}}$ = 625 ps), before and after calibration, are presented in Fig. 8(a) and Fig. 8(b), respectively. The calibration voltages were stored in the ATE's memory and produced by an off-chip DAC. Note that both the DNL and INL are an order of magnitude smaller after calibration. The maximum DNL and INL after calibration are 21 ps and 47 ps, respectively, for an LSB of 625 ps.
The measurement equipment utilized limits the characterization of the TDC to resolutions no finer than 78 ps. Using an indirect measurement scheme that uses signals with a known deterministic jitter to extrapolate the performance of the device, the TDC was demonstrated to provide a resolution of 18 ps. This LSB represents a ×10 improvement in resolution over the 180 ps/tap fundamental limit of the DLL.
With the LSB set to 18 ps, 16384 cycles of a clock signal with a Gaussian jitter distribution were captured. The resulting on-chip jitter CDF is presented in Fig. 11(a), and the resulting histogram in Fig. 11(b). The measured rms and peak-to-peak jitter were 27.61 ps and 324 ps, respectively.
These results correlate with those taken with a Tektronix 11801C Digital Sampling Oscilloscope, which measured rms and peak-to-peak values of 30.41 ps and 318 ps.
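One plausible way to go from a measured CDF to a histogram and to rms and peak-to-peak figures is sketched below. This post-processing is an assumption: the paper does not describe how its histogram or jitter statistics were derived. Here the histogram is the first difference of the CDF, and peak-to-peak is taken as the distance between the first and last occupied bins. The example CDF is made up.

```python
import math

def cdf_to_histogram(cdf):
    """First difference of the CDF gives the probability mass per bin."""
    return [cdf[0]] + [b - a for a, b in zip(cdf, cdf[1:])]

def jitter_stats(cdf, lsb):
    """Return (rms, peak_to_peak) in seconds from a CDF sampled every lsb."""
    hist = cdf_to_histogram(cdf)
    total = sum(hist)
    times = [i * lsb for i in range(len(hist))]
    mean = sum(p * t for p, t in zip(hist, times)) / total
    variance = sum(p * (t - mean) ** 2 for p, t in zip(hist, times)) / total
    occupied = [i for i, p in enumerate(hist) if p > 0.0]
    peak_to_peak = (occupied[-1] - occupied[0]) * lsb   # span of occupied bins
    return math.sqrt(variance), peak_to_peak

# Made-up 4-point CDF with the 18 ps LSB used in the experiment.
example_cdf = [0.0, 0.25, 0.75, 1.0]
rms, pk_pk = jitter_stats(example_cdf, 18e-12)
print(f"rms = {rms * 1e12:.2f} ps, peak-to-peak = {pk_pk * 1e12:.0f} ps")
```

Both figures are quantized by the LSB, which is one reason an on-chip measurement and an oscilloscope reading may differ by a few picoseconds.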
The same experiment was repeated with a clock signal that 
Conclusions
Two circuits that perform sub-nanosecond resolution sampling were described. The resolution of the first is limited by the intrinsic gate delay of the technology in which it is implemented. The second circuit is limited by the minimum attainable difference between gate delays, and provides a ×10 improvement in resolution. Both devices were successfully implemented in a 0.35 µm CMOS process.
