A novel all-digital clock and data recovery (CDR) circuit for source-synchronous links and its implementation in 90 nm CMOS are presented. A phase alignment technique with ping-pong action between two clock phases is used. The system is implemented in static CMOS logic, occupies 0.234 mm² and dissipates 16.6 mW at 6 Gb/s, demonstrating BER < 10⁻¹³.
the next is about half the available delay range.
To make match/mismatch (M/MM) decisions, the mismatch counter detects transitions in the incoming data and compares the search and data samples when an appropriate transition occurs. Each phase position is observed for 32 transitions before a final M/MM decision is made. These decisions are then AND/OR filtered to prevent noise and transient events from corrupting the signature (Fig. 2).
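The decision and filtering steps described above can be sketched in a few lines of Python. This is a behavioral illustration only, not the paper's logic: the function names, the zero-mismatch threshold, and the particular AND-then-OR filter (a two-tap opening that removes isolated match decisions) are all assumptions, since the text does not specify them.

```python
from typing import List, Sequence

TRANSITIONS_PER_PHASE = 32  # from the text: 32 transitions per phase position


def phase_decision(data: Sequence[int], search: Sequence[int],
                   n_transitions: int = TRANSITIONS_PER_PHASE,
                   max_mismatches: int = 0) -> bool:
    """Return True (match) or False (mismatch) for one search-phase position.

    Samples are compared only at data transitions. The threshold
    max_mismatches is a hypothetical parameter, not taken from the paper.
    """
    seen = 0
    mismatches = 0
    for k in range(1, len(data)):
        if data[k] != data[k - 1]:        # a transition in the incoming data
            seen += 1
            if search[k] != data[k]:      # search sample disagrees with data sample
                mismatches += 1
            if seen == n_transitions:
                return mismatches <= max_mismatches
    raise ValueError("fewer than n_transitions data transitions observed")


def and_or_filter(signature: Sequence[bool]) -> List[bool]:
    """AND adjacent decisions, then OR adjacent results (assumed filter form).

    Positions outside the signature are treated as mismatch (False), so an
    isolated match is removed while runs of two or more matches survive.
    """
    n = len(signature)
    anded = [signature[i] and (signature[i + 1] if i + 1 < n else False)
             for i in range(n)]
    return [anded[i] or (anded[i - 1] if i > 0 else False) for i in range(n)]
```

Swapping the order of the two passes (OR first, then AND) would instead remove isolated mismatch decisions; which ordering the hardware uses is not stated in the text.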
Only two transitions from match to mismatch (marked 1 and 2 in Fig. 2) are required to detect the eye opening. In normal operation, signature collection is stopped after these are found to speed up operation. A more complete search is only conducted when the data eye begins to drift off the extent of the delay line, or delay line calibration is required. For example, in Fig. 3 the phase offset between clock and data is gradually increasing. As data drifts to the right, the CDR tracks this shift and updates the data phase accordingly. However, when data drifts far enough, the edge of the current eye moves off the end of the delay line and cannot be found. The CDR then searches for M/MM transitions 3 and 4 to acquire the preceding eye opening. It places the updated data phase in this eye opening and trades the positive and negative edges of the clock, completing the UI swap.
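A minimal sketch of the eye search follows, assuming the signature is available as a list of booleans (True for match) indexed by delay-line tap. The function name `locate_eye` and the choice to return `None` when a boundary falls off the delay line are hypothetical; in the described system that case would trigger the wider search and the UI swap.

```python
from typing import Optional, Sequence, Tuple


def locate_eye(signature: Sequence[bool]) -> Optional[Tuple[int, int, int]]:
    """Find the first complete run of matches bounded by mismatches.

    Returns (left, right, center) tap indices, where center is the midpoint
    of the two M/MM boundaries, or None if no complete eye is visible.
    """
    left = None
    for i in range(1, len(signature)):
        prev, cur = signature[i - 1], signature[i]
        if not prev and cur and left is None:
            left = i                      # mismatch -> match: first tap inside the eye
        elif prev and not cur and left is not None:
            right = i - 1                 # match -> mismatch: last tap inside the eye
            return left, right, (left + right) // 2
    return None                           # an eye edge lies beyond the delay line


F, T = False, True
print(locate_eye([F, F, T, T, T, T, T, F, F]))  # complete eye inside the line
print(locate_eye([F, F, F, T, T, T]))           # right edge has drifted off the end
```

The search stops as soon as both boundaries are found, mirroring the early termination of signature collection described above.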
The complete system was implemented in Matlab for performance characterization. The simulated sinusoidal jitter (SJ) tolerance using PRBS-31 input with 1 ps RMS random jitter at 6 Gbps is shown in Fig. 5.
System Implementation
The delay line used to generate the 2 UI delay was implemented as a differential inverter chain with tri-state buffer calibration (Fig. 4) [2]. By turning tri-states on or off using signals en[7] to en[0], the drive strength of each stage can be changed, thus calibrating overall delay. Weak cross-coupled inverters are used to maintain phase alignment between the two paths. This scheme has the advantage of allowing the direct, digital modulation of delay in pure static CMOS. The output of the calibration stages is fed forward, allowing a large delay range without using large tri-state buffers. Pre-buffer delay cells are used to equalize rise and fall times of the clock signal before the delay line core, to ensure the linearity of the output phases. The phase interpolator is a pair of inverters with shorted outputs.
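The idea of digitally calibrating delay through drive strength can be illustrated with a toy model. Everything numeric here is an assumption made for illustration: the stage count, the unit delay, the equal weighting of the eight tri-states, and the simple rule that stage delay is inversely proportional to total drive. The feed-forward path and the real transistor-level behavior are not modeled.

```python
UI_PS = 1e12 / 6e9          # one unit interval at 6 Gb/s, about 166.7 ps
TARGET_PS = 2 * UI_PS       # the delay line should span 2 UI


def line_delay_ps(n_enabled: int, n_stages: int = 16,
                  unit_delay_ps: float = 30.0,
                  tristate_weight: float = 0.125) -> float:
    """Toy delay model: more enabled tri-states -> more drive -> less delay.

    n_stages, unit_delay_ps and tristate_weight are hypothetical values.
    """
    drive = 1.0 + tristate_weight * n_enabled
    return n_stages * unit_delay_ps / drive


def calibrate(target_ps: float = TARGET_PS) -> int:
    """Pick how many of the 8 tri-states to enable to best approximate the target."""
    return min(range(9), key=lambda n: abs(line_delay_ps(n) - target_ps))


n = calibrate()
en = [1 if i < n else 0 for i in range(8)]   # en[0]..en[7], thermometer-coded (assumed)
print(n, en, round(line_delay_ps(n), 1))
```

In this model the delay can be tuned between the all-off and all-on extremes in eight steps; the actual step sizes and coding of en[7:0] in the fabricated design are not given in the text.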
To multiplex the top and bottom sample and clock paths into the data and search paths, the multiplexer must switch between the two clocks without adding or dropping a rising edge. Changes in the multiplexer control signal are therefore delayed until both clocks are high, so that the swap occurs when neither clock is transitioning. Timing of this path is critical. If the delay line is properly calibrated, the two incoming clocks will be at most 90° out of phase; consequently, the switch must take place within a quarter of a clock period. To meet this timing requirement, the output of the NAND gate (Fig. 4) is optimized for rise time.
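The effect of delaying the select change until both clocks are high can be seen in a small discrete-time simulation. This is a behavioral sketch with ideal 50% duty-cycle clocks and hypothetical names and parameters; it does not model the NAND-gate timing or the metastability behavior of the real circuit.

```python
from typing import List


def clock(t: int, period: int, offset: int) -> int:
    """Ideal square wave: high for the first half of each period."""
    return 1 if (t - offset) % period < period // 2 else 0


def mux_output(n_steps: int, period: int, skew: int,
               request_time: int, guarded: bool) -> List[int]:
    """Switch from clock A to clock B (B lags A by `skew` steps).

    With guarded=True the select only changes while both clocks are high.
    """
    sel = 0
    out = []
    for t in range(n_steps):
        a = clock(t, period, 0)
        b = clock(t, period, skew)
        want = 1 if t >= request_time else 0
        if want != sel and (not guarded or (a and b)):
            sel = want
        out.append(b if sel else a)
    return out


def shortest_pulse(wave: List[int]) -> int:
    """Length of the shortest complete high or low run (first and last runs ignored)."""
    runs = []
    length = 1
    for prev, cur in zip(wave, wave[1:]):
        if cur == prev:
            length += 1
        else:
            runs.append(length)
            length = 1
    return min(runs[1:])


# Period of 8 steps, 90 degree skew (2 steps), swap requested while A is low and B is high.
naive = mux_output(40, 8, 2, 13, guarded=False)
safe = mux_output(40, 8, 2, 13, guarded=True)
print(shortest_pulse(naive), shortest_pulse(safe))
```

With these parameters the unguarded multiplexer produces a one-step runt pulse, which downstream logic could count as an extra edge, while the guarded version only stretches one high phase and never produces a pulse shorter than half a period.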
Measurement Results
The system was implemented in 90 nm CMOS, occupying a core area of 550 μm × 425 μm. CDR functionality was tested at 6 Gbps with PRBS-7 data, achieving a BER < 10⁻¹³. The CDR successfully corrects for an unlimited range of delay between the forwarded clock and the data. At 6 Gbps, net power consumption was 16.6 mW. A breakdown of performance is included in Table I. The clock-swapping multiplexer (Fig. 4) showed an unforeseen metastability problem, occasionally resulting in errors when swapping between the two clocks. This can be resolved by adding an extra flip-flop on the data_top input of the multiplexer.
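As a quick cross-check of the reported figures, the core area and the energy per bit can be derived from the numbers above. The energy-per-bit value is not quoted in the text; it is simply the measured power divided by the data rate, and the constant and function names are illustrative.

```python
CORE_WIDTH_UM = 550.0     # core width from the measurement results
CORE_HEIGHT_UM = 425.0    # core height from the measurement results
POWER_W = 16.6e-3         # net power at 6 Gbps
DATA_RATE_BPS = 6e9       # 6 Gbps


def core_area_mm2(width_um: float = CORE_WIDTH_UM,
                  height_um: float = CORE_HEIGHT_UM) -> float:
    """Core area in square millimetres (1 mm^2 = 1e6 um^2)."""
    return width_um * height_um / 1e6


def energy_per_bit_pj(power_w: float = POWER_W,
                      rate_bps: float = DATA_RATE_BPS) -> float:
    """Energy per received bit in picojoules."""
    return power_w / rate_bps * 1e12


print(f"area = {core_area_mm2():.3f} mm^2")
print(f"energy = {energy_per_bit_pj():.2f} pJ/bit")
```

The computed area agrees with the 0.234 mm² stated in the abstract, and the derived efficiency comes to roughly 2.8 pJ/bit.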
As in most CDRs, phase generator linearity affects the overall performance of the system. The DNL, measured to be 0.44 LSB, affects M/MM transition detection. The data phase position is calculated as the average of two M/MM transitions and is therefore affected by the INL, measured to be 1.6 LSB. Adding the two contributions gives an overall error in the data phase position of 2.04 LSB (Fig. 7).
Conclusion and Acknowledgments
An all-static-CMOS implementation of a novel delay-line-based CDR scheme for source-synchronous links has been presented. It is capable of realizing an infinite delay range and has low power consumption and small area characteristics.
