A truly monolithic clock and data recovery (CDR) circuit for low-cost, low-end data communication systems has been realized in 0.6 µm CMOS. The implemented CDR comprises a phase- and frequency-locked loop, which uses an I/Q ring VCO to recover the clock from the incoming non-return-to-zero (NRZ) data stream, and a data decision circuit that retimes the received data. The novelty of this design is that silicon-saving active inductors are used to raise the transmitted bit rate, improve compatibility with digital circuits for monolithic integration, and reduce silicon area, while the excessive noise is suppressed by a fully differential topology. The tested CDR IC achieves a locking range from 400 MHz to 950 MHz and an RMS jitter of 0.008 UI for a 622 Mb/s pseudorandom bit sequence (PRBS) of length 2^31 - 1.
For most applications, timing jitter and phase noise are two important design criteria of a PLL. Unfortunately, the switching activity of digital modules in mixed-signal systems introduces power-supply and substrate noise, which greatly disturbs the noise-sensitive blocks in a PLL. In particular, noise injected into the voltage-controlled oscillator (VCO) is the dominant jitter source of a PLL.
In this work, novel differential delay cells based on the negative conductance (NC) configuration and loaded with active inductors are used to lower the phase noise and improve the linearity of the proposed ring VCO. Moreover, submicron CMOS technology is attractive for developing commercial low-end chips because of its high reliability and low manufacturing cost. A low-jitter, low-phase-noise, low-cost CDR has therefore been implemented in 0.6 µm CMOS technology for use in OC-12/STM-4 systems.
CIRCUIT DESIGN
As shown in Fig. 1, the proposed CDR circuit is composed of an I/Q VCO, a PFD, a loop filter, and a data retimer based on a master-slave flip-flop. This CDR does not need any edge-detection circuit to preprocess the incoming NRZ data stream. The VCO generates a 0° in-phase (I) clock and a 90° quadrature-phase (Q) clock. The PFD has three functional blocks: a phase detector (PD), a quadrature phase detector (QPD), and a frequency detector (FD). All of the main blocks in the CDR are fully differential. A differential VCO structure reduces the effects of common-mode noise, the magnitude of the current spikes injected into the power supply and substrate, and ultimately the generated clock jitter. Similarly, the differential architectures adopted in the PFD and the loop filter improve the performance of the CDR in the presence of a noisy supply and substrate. Input, output, and inter-stage buffers provide input matching, DC level shifting, and impedance transformation, and decouple the CDR core from the external 50 Ω environment. The subcircuits are described below.
Voltage Controlled Ring Oscillator
As a pivotal building block of a PLL, a high-frequency or RF VCO can be implemented monolithically as either an LC oscillator or a ring oscillator. Monolithic high-Q LC oscillators have lower phase noise, whereas ring VCOs offer wider tuning ranges and occupy less die area. The realized I/Q ring VCO comprises a differential control circuit for VCO tuning, four-stage delay cells, and two buffering amplifiers, as shown in Fig. 2(a). The novel differential variable-delay gain stage with active inductor loads, and the control circuit that generates the differential control voltages V_con+ and V_con- for the gain stages of the VCO, are shown in Fig. 2(b) and Fig. 2(c), respectively. Since a high-performance ring VCO can be readily obtained using the negative conductance technique [3], the VCO used here is also based on this technique. As shown in Fig. 2(b), the variable-delay gain stage consists of the transistor pair M_AP, M_AN, the cross-coupled pair M_FP, M_FN, the active inductor load pair M_LP/R_g, M_LN/R_g, and other transistors used as current sources, source followers, and biasing blocks.
Firstly, a gain stage with MOS loads is difficult to operate at a high data rate because of the large time constant of the load capacitance, whereas a gain stage with inductive loads can provide a much larger gain-bandwidth product. With inductive loads, the capacitive loading can be partly tuned out and the pole of each gain stage is pushed toward higher frequencies; this is the so-called shunt-peaking technique. In general, inductive loads can be implemented with on-chip spiral inductors or active inductors. It is very difficult to realize a high-inductance, high-Q on-chip spiral inductor in a small die area. In contrast, active inductors are compact and offer adequately high operating speed [4]. The active inductor pair M_LP/R_g, M_LN/R_g is therefore introduced as the load of the transistor pair M_AP, M_AN to maximize the operating frequency of the proposed VCO.
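The bandwidth benefit of shunt peaking can be illustrated with the textbook passive model, in which a series R-L load drives a capacitance C. The sketch below is not taken from this design: it assumes an ideal inductor (an active inductor only approximates one over a limited band), and the function name and the ratio k = L/(R^2 C) are illustrative choices. It returns the -3 dB bandwidth normalized to the unpeaked value 1/(RC).

```python
import math


def shunt_peaked_bandwidth(k: float) -> float:
    """-3 dB bandwidth of a shunt-peaked load, normalised to 1/(R*C).

    Assumed model (ideal inductor): Z(s) = (R + sL) / (1 + sRC + s^2 LC),
    with k = L / (R^2 * C). Setting |Z/R|^2 = 1/2 with x = w*R*C gives
        k^2 * x^4 + (1 - 2k - 2k^2) * x^2 - 1 = 0,
    which has exactly one positive root in x^2.
    """
    if k < 0.0:
        raise ValueError("k must be non-negative")
    b = 1.0 - 2.0 * k - 2.0 * k * k
    s = math.sqrt(b * b + 4.0 * k * k)
    # Pick the algebraically equivalent form that avoids cancellation.
    if b >= 0.0:
        y = 2.0 / (b + s)
    else:
        y = (s - b) / (2.0 * k * k)
    return math.sqrt(y)


if __name__ == "__main__":
    for k in (0.0, math.sqrt(2.0) - 1.0, 0.7):
        print(f"k = {k:.3f}: bandwidth = {shunt_peaked_bandwidth(k):.3f} x 1/(RC)")
```

With k = 0 the function reduces to the plain RC pole, and a moderate amount of inductance extends the bandwidth by well over 50% in this idealized model.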
Secondly, the cross-coupled pair M_FP, M_FN introduces a negative average conductance that reduces the overall output conductance, which is equivalent to increasing the output impedance and hence the delay. Thus, this VCO can operate in the expected frequency range with fewer gain stages, and the phase noise is greatly lowered. Thirdly, to keep the output voltage swing constant, the differential control circuit shown in Fig. 2(c) is employed.
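The effect of the negative conductance on the oscillation frequency can be sketched with a first-order linear model. This is an illustrative approximation rather than the authors' analysis: each stage is assumed to be a single-pole amplifier with time constant tau = C / (G_load - G_neg), and an N-stage differential ring with one wire inversion is assumed to oscillate where each stage contributes pi/N of phase lag, i.e. at w = tan(pi/N) / tau. All component values are hypothetical. In such a four-stage ring, outputs taken two stages apart are in quadrature.

```python
import math


def stage_time_constant(c_load: float, g_load: float, g_neg: float) -> float:
    """Time constant of one delay stage (assumed single-pole model).

    g_neg is the magnitude of the negative conductance contributed by the
    cross-coupled pair; it subtracts from the load conductance.
    """
    g_net = g_load - g_neg
    if g_net <= 0.0:
        raise ValueError("net conductance must stay positive (stage would latch)")
    return c_load / g_net


def ring_frequency(n_stages: int, tau: float) -> float:
    """Oscillation frequency of an N-stage differential ring, linear model."""
    if n_stages < 3:
        raise ValueError("the single-pole model needs at least three stages")
    return math.tan(math.pi / n_stages) / (2.0 * math.pi * tau)


if __name__ == "__main__":
    C_LOAD = 200e-15   # hypothetical load capacitance, farads
    G_LOAD = 1.0e-3    # hypothetical load conductance, siemens
    for g_neg in (0.0, 0.2e-3):  # without and with negative conductance
        tau = stage_time_constant(C_LOAD, G_LOAD, g_neg)
        f = ring_frequency(4, tau)
        print(f"G_neg = {g_neg * 1e3:.1f} mS: tau = {tau * 1e12:.0f} ps, f = {f / 1e6:.1f} MHz")
```

In this model, increasing the negative conductance lengthens the stage delay, so a given target frequency can be reached with fewer stages than a ring without the cross-coupled pair would need.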
Phase Detector and Frequency Detector
Compared with a conventional PLL that uses a PD only, a phase- and frequency-locked loop (PFLL) can significantly increase the acquisition range and reduce the locking time. To optimize the operating speed and avoid problems caused by internal crosstalk, the proposed subcircuits are all based on differential current-mode logic (CML). Fig. 3(a) shows the proposed CMOS version of the Pottbäcker PD [4]. At every transition of the input data, the I and Q clocks are sampled directly by the input NRZ data, without any preprocessing circuit. This operation generates beat notes with 50% duty cycle at the PD and QPD outputs when the VCO frequency f_OSC and the bit-rate frequency (data rate) f_b differ. The FD is a differential logic circuit that receives the PD and QPD outputs and generates the frequency-difference signal at its output Q_3. As shown in Fig. 4(a), when f_OSC < f_b, the PD output Q_1 lags the QPD output Q_2 and the superposition of Q_1 and Q_3 is positive. Conversely, when f_OSC > f_b, Q_1 leads Q_2 and the superposition of Q_1 and Q_3 is negative, as shown in Fig. 4(b). The superposition of Q_1 and Q_3 thus contains a clear DC component that drives the loop towards lock.
Loop Filter
Fig. 5 shows the schematic of the employed loop filter. Unlike other designs [6], this loop filter is integrated on chip without any off-chip components. Q_1 and Q_3 are first summed and then low-pass filtered, and the resulting DC component drives the loop towards lock. The transfer function of the loop filter is dominated by C_0, 2R_1, and R_2.
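The frequency-detection principle described above (sampling the I and Q clocks at data transitions, deriving Q_3 from the beat notes, then summing Q_1 and Q_3 and extracting the DC component) can be illustrated with a simplified behavioral model. The sketch below is not the CML circuit of this work. It assumes a data transition in every bit period, ideal hard-limited clocks, a two-state Q_3 that holds its last value, and an ideal averaging low-pass filter. The clock polarities are chosen so that the sign matches the convention stated in the text, positive for f_OSC < f_b.

```python
import math


def sign(x: float) -> int:
    """Hard limiter: map a sampled clock level to +1 or -1."""
    return 1 if x >= 0.0 else -1


def pfd_dc_component(f_osc: float, f_b: float, n_edges: int = 2000) -> float:
    """Average of Q1 + Q3 over n_edges data transitions (behavioral model)."""
    total = 0.0
    q1_prev = None
    q3 = 0  # FD output stays idle until the first beat-note edge
    for k in range(n_edges):
        # Clock phase seen at the k-th data transition (one per bit, assumed).
        phase = 2.0 * math.pi * f_osc * k / f_b
        q1 = sign(math.sin(phase))  # PD: I clock sampled by the data
        q2 = sign(math.cos(phase))  # QPD: Q clock sampled by the data
        if q1_prev is not None and q1 != q1_prev:
            # At each Q1 beat-note edge, Q2 reveals the direction of the
            # phase drift. The minus sign is an assumed polarity choice.
            q3 = -q1 * q2
        q1_prev = q1
        total += q1 + q3  # summation performed ahead of the loop filter
    return total / n_edges  # ideal averaging stands in for the low-pass


if __name__ == "__main__":
    for f_osc in (600e6, 650e6):
        dc = pfd_dc_component(f_osc, 622e6)
        print(f"f_osc = {f_osc / 1e6:.0f} MHz: DC component = {dc:+.2f}")
```

Because the beat note Q_1 has a 50% duty cycle, it averages to nearly zero, and the sign of the DC component is set by Q_3, which indicates the direction in which the VCO frequency must move.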
Loop Design
The bandwidth of the PLL affects the stability, the suppression of VCO phase noise, the suppression of spurious modulation, and the pull-in time. The amount of long-term jitter depends on the sensitivity of the VCO to noise, and low-Q VCOs based on RC oscillators, such as relaxation or ring oscillators, are very sensitive to noise. Low-Q VCOs can therefore achieve low long-term jitter only by maximizing the loop bandwidth and tracking the input frequency as closely as possible. In this design, the loop bandwidth is set to 50 MHz. Analysis of the closed-loop and open-loop responses gives a phase margin of 65°.
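As a rough illustration of how a phase margin is obtained from the open-loop response, the sketch below evaluates a generic second-order PLL model. It is not the loop of this design: the open-loop gain G(s) = K (1 + s*tau2) / (s (1 + s*tau1)) assumes a passive lag-lead filter, the mapping of tau1 and tau2 onto C_0, 2R_1, and R_2 is not given here, and the numerical values are hypothetical, so the printed result is not expected to reproduce the reported 65°.

```python
import cmath
import math


def open_loop_gain(w: float, k: float, tau1: float, tau2: float) -> complex:
    """Assumed open-loop gain K (1 + s*tau2) / (s (1 + s*tau1)) at s = jw."""
    s = 1j * w
    return k * (1.0 + s * tau2) / (s * (1.0 + s * tau1))


def crossover_and_phase_margin(k: float, tau1: float, tau2: float):
    """Return (unity-gain frequency in rad/s, phase margin in degrees)."""
    if not (k > 0.0 and 0.0 <= tau2 <= tau1):
        raise ValueError("expects k > 0 and 0 <= tau2 <= tau1")
    # |G| falls monotonically for tau2 <= tau1, so bisect in log frequency.
    # The bracket assumes the crossover lies between 1e-3 and 1e15 rad/s.
    lo, hi = 1e-3, 1e15
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if abs(open_loop_gain(mid, k, tau1, tau2)) > 1.0:
            lo = mid
        else:
            hi = mid
    w_c = math.sqrt(lo * hi)
    pm = 180.0 + math.degrees(cmath.phase(open_loop_gain(w_c, k, tau1, tau2)))
    return w_c, pm


if __name__ == "__main__":
    # Hypothetical loop gain and time constants, for illustration only.
    w_c, pm = crossover_and_phase_margin(k=1.0e9, tau1=2.0e-8, tau2=5.0e-9)
    print(f"crossover = {w_c / (2 * math.pi) / 1e6:.1f} MHz, phase margin = {pm:.1f} deg")
```

The zero at 1/tau2 is what restores phase near crossover; moving it closer to the unity-gain frequency trades phase margin against high-frequency filtering.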
CHIP FABRICATION
The designed CDR circuit was fabricated by CSMC Semiconductor Co., Ltd. A microphotograph of the die is shown in Fig. 6. The chip dimensions, including bonding pads, are 1.3 mm × 1.3 mm. Each pad measures 0.1 mm × 0.1 mm. The bonding pads and on-chip capacitors are the major consumers of layout area.
MEASUREMENT RESULTS
The performance of the fabricated chips has been evaluated via on-wafer probing on uncut wafers, employing instruments including a CASCADE MICROTECH probe station, an ADVANTEST D3186 Pulse Pattern Generator, and an ADVANTEST R6142. Firstly, the performance of the VCO was evaluated in open-loop mode. The tuning range of the VCO is from 360 MHz to 1060 MHz, as displayed in Fig. 7(a), which is wide enough to cover large PVT variations. Then the loop was closed, and differential 622 Mb/s 2^31 - 1 PRBS data streams were applied as input. Fig. 7(b) shows the spectrum of the recovered clock signal measured from the locked VCO, which exhibits a phase noise of -92.95 dBc/Hz at a 10 kHz offset. Fig. 8(a) gives the eye diagram of the retimed 622 Mb/s NRZ data, and the measured jitter histogram of the locked VCO at 622 MHz is shown in Fig. 8(b). The circuit is able to acquire lock over a frequency range from 398 MHz to 960 MHz. The CDR core consumes about 200 mW, and more than 160 mW is dissipated by the output data and clock buffers, which drive the 50 Ω external loads of the test instruments. The test results of the realized monolithic CDR are summarized in Table I.
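For comparison with specifications quoted in time units, jitter expressed in unit intervals (UI) can be converted to seconds by dividing by the bit rate. The short sketch below applies this to the RMS jitter of 0.008 UI at 622 Mb/s quoted in the abstract; the function name is illustrative.

```python
def ui_to_seconds(jitter_ui: float, bit_rate: float) -> float:
    """Convert jitter in unit intervals to seconds (1 UI = 1 / bit_rate)."""
    if bit_rate <= 0.0:
        raise ValueError("bit_rate must be positive")
    return jitter_ui / bit_rate


if __name__ == "__main__":
    jitter_s = ui_to_seconds(0.008, 622e6)
    print(f"RMS jitter = {jitter_s * 1e12:.2f} ps")
```

At 622 Mb/s one unit interval is about 1.61 ns, so 0.008 UI corresponds to roughly 12.9 ps RMS.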
CONCLUSIONS
A low-cost CDR circuit for data communication systems has been monolithically implemented in CSMC 0.6 µm CMOS. The realized CDR is based on a PFLL using an I/Q ring VCO. The main contribution of this work is the use of on-chip active inductors to design compact chips with high bit rate and low current dissipation. A CML-based fully differential topology is introduced to suppress the excessive noise. The validity of the adopted design strategies and circuit techniques is confirmed by the measured data.
