Abstract. This design combines the advantages of two CDR circuit structures, PLL-based and PI-based, in a clock and data recovery system built on a semi-digital dual-loop structure. The PLL is implemented in a TSMC 0.25 μm CMOS process, operates over a frequency range of 1.6-2.7 GHz, and has been successfully applied in a SERDES chip. The ring VCO occupies a small area, has a wide frequency tuning range, and readily produces the multi-phase clocks required by the SERDES CDR.
I. INTRODUCTION
In high-speed SERDES applications, the clock information is embedded in the serial data stream so that data can be transmitted at high speed. The main function of the clock and data recovery (CDR) circuit in a SERDES interface chip is therefore to receive the input signal, determine the phase of the high-speed serial signal, and extract the clock information [1]. In a real system, when a high-speed signal travels over a PCB trace or a cable, the path length, parasitic capacitance, and other factors introduce a random delay between the transmitter and the receiver. Together with the jitter introduced during transmission and the inherent frequency offset between the transmitter and receiver clocks, this produces a phase error between the received data and the sampling clock at the receiver [2].
A block diagram of a simple clock and data recovery circuit is shown in the figure. Clock and data recovery performs two tasks: clock recovery, and data retiming, which restores the data. Clock recovery extracts the clock information embedded in the received NRZ (non-return-to-zero) data. A CDR is generally a feedback loop built around an oscillator; the loop adjusts the oscillator phase to track the clock embedded in the input data [3]. Analysis of the NRZ code shows that the spectral density of random binary data contains no spectral line at the data rate, so the information needed for clock extraction is not directly available. The usual way to recover the clock information is edge detection. To determine the final sampling clock phase, the CDR must also include a phase error detection circuit [4].
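The edge-detection idea can be illustrated with a minimal sketch. It assumes an ideal, noise-free NRZ waveform and a delay-and-XOR detector; the function names, the oversampling factor, and the delay are illustrative choices, not taken from the design described here.

```python
def nrz_samples(bits, samples_per_bit):
    """Expand a bit sequence into an oversampled NRZ waveform (levels 0/1)."""
    return [b for b in bits for _ in range(samples_per_bit)]


def edge_pulses(wave, delay):
    """XOR the waveform with a copy of itself delayed by `delay` samples.

    Every data transition produces a pulse that is `delay` samples wide,
    so the pulse train carries the timing information that the raw NRZ
    data lacks.
    """
    return [wave[i] ^ wave[i - delay] if i >= delay else 0
            for i in range(len(wave))]


bits = [0, 1, 1, 0, 1, 0, 0, 1]
wave = nrz_samples(bits, samples_per_bit=8)   # hypothetical 8x oversampling
pulses = edge_pulses(wave, delay=4)           # delay of half a bit period

# Rising edges of the pulse train mark the data transitions.
edge_positions = [i for i in range(1, len(pulses))
                  if pulses[i] and not pulses[i - 1]]
print(edge_positions)
```

Every detected edge falls on a multiple of the bit period, which is the periodic component a clock recovery loop can lock to.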
II. CIRCUIT DESIGN

A. Overview of CDR implementations
To achieve synchronization, the recovery circuit in the receiver must generate a sampling clock that is synchronous with the data. To maximize the timing margin, the receiver should sample each data bit at the center of the data eye. The position of the clock edge relative to the data bit determines the performance of the link.
There are typically three ways to implement clock and data recovery. The first and most common uses a phase-locked loop, which detects and adjusts the position of the sampling clock edge through a feedback loop. The second is phase picking based on data oversampling. The third is phase interpolation.
1) PLL-based CDR. In the PLL-based structure, the phase detector compares the input data with the VCO output and generates output pulses related to the phase error. These pulses drive the charge pump, which converts the phase error into a control current. After the loop filter, this current becomes the VCO control voltage Vctrl, which adjusts the VCO oscillation frequency until the two signals match, i.e. the loop is locked.
2) Oversampling CDR [36] [5]. This technique samples each bit of the data stream at several phase points, i.e. it oversamples the data stream. Transition information is extracted from the samples, the bit center is estimated from the transition positions, and the sample closest to the bit center is taken as the received data. Because phase picking requires the data to be oversampled, it is better suited to lower data rate applications.
3) Phase-interpolator-based (PI-based) CDR. The phase interpolation CDR is a structure proposed by Rambus Inc. It uses a PLL or DLL as a reference loop to generate a multi-phase high-speed reference clock. These reference phases are fed to the CDR loop, whose control circuit selects among them and interpolates; the interpolated result is used as the recovered clock for the data.
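The phase-picking step of the oversampling approach in item 2) can be sketched as follows. This is a minimal illustration that assumes a fixed oversampling ratio, a noise-free input, and a simple "most frequent transition phase" rule; the function names and the 4x ratio are hypothetical and not taken from the cited works.

```python
from collections import Counter


def pick_phase(samples, osr):
    """Choose the sampling phase (0..osr-1) half a bit away from the edges.

    Transitions are histogrammed by their position modulo the oversampling
    ratio; the most frequent position is taken as the bit boundary.
    """
    edges = Counter()
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            edges[i % osr] += 1
    if not edges:                      # no transitions seen: keep a default
        return osr // 2
    edge_phase = edges.most_common(1)[0][0]
    return (edge_phase + osr // 2) % osr


def recover_bits(samples, osr):
    """Keep one sample per bit, taken at the picked phase."""
    phase = pick_phase(samples, osr)
    return [samples[i] for i in range(phase, len(samples), osr)]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
osr = 4                                             # hypothetical 4x oversampling
ideal = [b for b in bits for _ in range(osr)]
skewed = [bits[0]] + ideal                          # one sample of unknown delay
recovered = recover_bits(skewed, osr)
print(pick_phase(skewed, osr), recovered)
```

Because several samples are needed for every bit, the sampling hardware must run at a multiple of the data rate, which is why the text notes that the method suits lower data rates.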
Many clock and data recovery structures exist. The choice among them depends on the speed limit and the supply voltage under the given conditions, and is also constrained by the power and jitter requirements of the system. The phase interpolation and oversampling structures align clock and data digitally and therefore belong to the class of digital CDRs [6]. This type of CDR has a simple structure and low cost, but its ability to track frequency changes is limited and its jitter performance is inferior to that of a PLL-based CDR. On this basis, this design uses a PI-based semi-digital dual-loop CDR.
B. The overall structure of the clock data recovery circuit
The CDR circuit in this design is based on a semi-digital dual-loop clock and data recovery structure. The overall structure is shown in Figure 1.
Figure 1. Overall block diagram of the CDR
Compared with a clock and data recovery system based on a traditional analog phase-locked loop, this circuit structure has the following advantages [7]:
(1) Most of the phase-locked loop is implemented with digital circuits, which reduces the sensitivity of the overall circuit to the process and improves reusability.
(2) The dual-loop structure makes the frequency-locked loop and the phase-locked loop largely independent, so the bandwidth of each loop can be chosen to satisfy its own stability and bandwidth requirements.
(3) In multi-channel serial communication, the channels can share one frequency-locked loop, which reduces the overall power consumption and area of the system.
The system consists of a frequency-locked loop and a phase-locked loop. The frequency-locked loop is a mixed-signal PLL whose reference clock is supplied externally; its main role is to produce a set of low-jitter multi-phase clocks. The phase-locked loop tracks the phase of the received data and retimes it, restoring the data [8]. Because this structure keeps frequency locking and phase locking relatively independent, the PLL loop bandwidth can be made wider to suppress loop noise more effectively. To keep the phase-locked loop from disturbing the operation of the PLL, four identical buffers isolate the oscillator from the phase-locked loop and prevent interaction between the two loops [9].
The phase-locked loop consists mainly of a bang-bang phase detector, a digital low-pass filter, a phase interpolation unit, and a digital control unit, the FSM (finite state machine). The phase detector compares the edges of the input data with the multi-phase clock and outputs a phase-lead (up) or phase-lag (dn) decision. The digital filter filters the phase detector output and produces the control signals of the phase interpolator, which change the weighting applied to the clock signals. The FSM is the state machine of the phase-locked loop; it selects the input clocks of the phase interpolator and thereby the phase zone in which interpolation takes place. To filter out noise effectively, the bandwidth of the phase-locked loop can be designed to be narrow. The dual-loop approach thus resolves the bandwidth trade-off of a single loop, and in multi-channel data communication the frequency-locked loop can be shared among the channels, reducing the overall power consumption and circuit area.
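The detector and filter behavior described above can be sketched in a few lines. The sketch assumes an Alexander-style (early/late) bang-bang detector and a saturating up/down counter as the digital filter; the text does not specify either implementation, so the function, the class, the threshold, and the sign convention (+1 for an early clock) are all illustrative assumptions.

```python
def bang_bang_pd(d_prev, edge, d_curr):
    """Early/late decision from two data samples and the edge sample between them.

    Returns +1 if the clock is early, -1 if it is late, and 0 when there is
    no data transition (and hence no phase information).
    """
    if d_prev == d_curr:
        return 0
    # An early clock takes the edge sample before the transition,
    # so the edge sample still equals the previous data bit.
    return 1 if edge == d_prev else -1


class UpDownFilter:
    """Accumulate decisions and emit one step when a threshold is reached."""

    def __init__(self, threshold):
        self.threshold = threshold   # hypothetical filter depth
        self.count = 0

    def update(self, decision):
        self.count += decision
        if self.count >= self.threshold:
            self.count = 0
            return 1                 # request one interpolator step
        if self.count <= -self.threshold:
            self.count = 0
            return -1                # request one step the other way
        return 0


loop_filter = UpDownFilter(threshold=4)
decisions = [1, 1, -1, 1, 1, 1]
steps = [loop_filter.update(d) for d in decisions]
print(steps)
```

A larger threshold averages more decisions before the interpolator moves, which corresponds to the narrower loop bandwidth mentioned in the text.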
C. PI circuit design
The previously designed CPPLL uses a four-stage ring oscillator whose output is approximately sinusoidal. After selection by the phase selector, these signals serve as the input clocks of the PI. Because their rise time is long and the time constant at the PI output node is small, the output voltage follows the input voltage well, which gives the PI good linearity.
For the PI output to be linear, the phase interval between its two input clocks must be small. In this design, when the four-stage ring oscillator operates at 1.25 GHz, the interval between adjacent phases is 45°, i.e. Δt = 100 ps. The interpolation weight is controlled by 8 bits, i.e. w = 1, 2, ..., N (N = 16). In the ideal case, the minimum step of the interpolated output clock is 100 ps / 8 = 12.5 ps.
As shown in Figure 2, the PI consists of a clock input stage, a current weight control stage, and a load. The four PMOS transistors at the top again form a symmetric load, biased by the bias voltage generation circuit of the VCO. The differential inputs ω1_a, ω1_b and ω2_a, ω2_b are a pair of adjacent differential clocks generated by the CPPLL. w0~w7 are the 8-bit interpolation weight control signals. They come from the external digital control section, namely the outputs of an 8-bit shift register, and steer the tail currents in the circuit; each internal cell is a simple differential pair.
When the control bits w0~w7 are all 0, the PI output clock phase follows ω2_a, ω2_b. When w0~w7 are all 1, the PI output clock follows ω1_a, ω1_b. When the number of 1s equals the number of 0s, for example a control word of (0000,1111), (0101,0101), or (0110,0110), the PI output clock phase lies midway between the two input clocks, because every weight control bit carries the same weight. In practice, because this design uses a bidirectional shift register for control, 0s and 1s do not appear interleaved in the weight control word; however, as long as the number of control bits set to 1 (or to 0) is the same, the phase of the interpolated clock is the same.
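The arithmetic above can be checked with a short sketch. It uses an idealized linear model of the interpolator, in which the output delay moves from ω1 toward ω2 in proportion to the number of control bits at 0; the linearity, the assumption that ω2 lags ω1 by one phase interval, and all names are illustrative and not taken from the circuit itself.

```python
F_OSC_HZ = 1.25e9        # ring oscillator frequency quoted in the text
PHASE_STEP_DEG = 45.0    # spacing between adjacent oscillator phases
N_BITS = 8               # number of weight control bits w0..w7


def phase_interval_ps(f_hz, step_deg):
    """Time between adjacent phases, in picoseconds."""
    return step_deg / 360.0 * 1e12 / f_hz


def pi_offset_ps(bits, dt_ps):
    """Ideal output delay relative to the ω1 input (assumed linear model).

    All bits 1 -> output follows ω1 (offset 0).
    All bits 0 -> output follows ω2 (offset dt_ps).
    Only the count of 1s matters, since every bit has equal weight.
    """
    return (len(bits) - sum(bits)) / len(bits) * dt_ps


def shift(bits, direction):
    """One step of a bidirectional shift register holding a thermometer code."""
    if direction > 0:
        return [1] + bits[:-1]       # shift a 1 in: more weight on ω1
    if direction < 0:
        return bits[1:] + [0]        # shift a 0 in: more weight on ω2
    return bits


dt = phase_interval_ps(F_OSC_HZ, PHASE_STEP_DEG)
step = dt / N_BITS
word = shift(shift([0] * N_BITS, +1), +1)
print(dt, step, word, pi_offset_ps(word, dt))
```

The thermometer code produced by the shift register never interleaves 0s and 1s, matching the remark in the text, while interleaved words with the same number of 1s give the same offset in this model.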
D. Design of the phase selection circuit
The input clocks of the phase interpolator are derived from the ring oscillator of the frequency-locked loop. To keep the phase-locked loop from affecting the VCO oscillation frequency, a set of identical buffer units isolates the VCO from the phase-locked loop. The phase interpolator takes a pair of adjacent differential clocks as its input. Because the interpolation must cover the full 360° range, and to make it easy to generate quadrature output clocks, a phase selection circuit is needed.
The ring oscillator consists of four delay cells. The first through fourth stages together produce a 180° phase shift, so the phase difference between adjacent cells is 45°. The interpolator input clocks are therefore VCO output waveforms spaced 45° apart. If the interpolated clock must lie between 180° and 225°, inverting the 0° and 45° signals yields clocks at 180° and 225°, respectively. The other required phases are obtained in the same way.
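The selection rule described above can be written down as a small lookup. The sketch assumes the four stage outputs sit at 0°, 45°, 90°, and 135°, and that each interpolator input can be inverted independently; the tap numbering and the function names are hypothetical.

```python
STEP_DEG = 45       # phase spacing between adjacent delay cells
N_STAGES = 4        # delay cells in the ring oscillator


def tap_for(phase_index):
    """Map one of the 8 phase indices (0..7) to (stage tap, inverted?)."""
    return phase_index % N_STAGES, phase_index >= N_STAGES


def select_inputs(target_deg):
    """Return the two (tap, inverted) inputs that bracket the target phase."""
    zone = int(target_deg % 360 // STEP_DEG)        # 45-degree zone, 0..7
    lower = tap_for(zone)
    upper = tap_for((zone + 1) % (2 * N_STAGES))
    return lower, upper


def phase_of(tap, inverted):
    """Phase in degrees delivered by a tap, with optional inversion."""
    return (tap * STEP_DEG + (180 if inverted else 0)) % 360


lower, upper = select_inputs(200)
print(lower, upper, phase_of(*lower), phase_of(*upper))
```

For a target of 200° the lookup returns the 0° and 45° taps, both inverted, which reproduces the 180° and 225° example in the text.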
The transistor-level implementation of the phase selection circuit is shown in Figure 3. The top of the circuit again consists of four PMOS transistors forming a symmetric load. ω1_a, ω1_b and ω2_a, ω2_b are the oscillation clock inputs from the VCO, and VBN is the bias voltage input for the tail current source. S1~S4 are four control signal inputs; under any normal operating condition, exactly one of them is high and the others are low. For example, when S1 is high and S2~S4 are low, the circuit is equivalent to one pair of differential clocks from the VCO passing through a differential amplifier; when S2 is high and S1, S3, S4 are low, the output is equivalent to the inverted version of that clock after a differential amplifier. By controlling the selection signals S1~S4, the input clock can easily be inverted, so that the input clocks of the phase interpolator cover the full phase range of 0~360°.
The SERDES is laid out in full-custom style, and the overall layout complies with the TSMC 0.13 μm / 0.11 μm CMOS Logic Design Rule (C013G 1P4M + L-MARK). Layout details are given in Table 1.
IV. CONCLUSIONS
The wide-input-range phase-locked loop in this design adopts a self-biased technique, which suppresses noise well. The ring VCO occupies a small area, has a wide frequency tuning range, and readily generates the multi-phase clocks required by the SERDES CDR, meeting the design requirements. The high-speed SERDES chip die measures 2433 μm × 2505 μm; the layout is shown in Figure 4.
