Consumer-market applications demand high bandwidth and robust performance. An ADC-based transceiver satisfies these demands and enables power and area scaling with process [1, 2]. We developed and tested a spread-spectrum-clocking (SSC) compliant 5-Gb/s transceiver in 65-nm CMOS. The receiver uses an ADC-based front-end that samples the incoming signal without adjusting the phase relation between the sampling clock and the signal, eliminating the need for phase control of the sampling clock (Fig. 8.7.1). Phase tracking of the incoming signal and the data decision are performed entirely in the numerical domain, without generating physical sampling-clock phases. An adaptive digital FFE (feed-forward equalizer) compensates for a channel loss of up to 15 dB at 2.5 GHz, using an on-chip adaptation controller based on CMA (constant-modulus algorithm). The CDR operated with a BER below 1E-12 when the transmitter and receiver clock signals were independently SSC-modulated at a modulation frequency of 30 kHz with a frequency deviation of 0 to -5000 ppm.
The transceiver consists of three blocks: a clock generator, a transmitter, and a receiver (Fig. 8.7.1). The clock generator uses a 4-phase 2.5-GHz SSC-compliant PLL, and the resulting 4-phase 2.5-GHz clock is distributed to the transmitter and the receiver. The transmitter performs 16-to-1 multiplexing to generate the 5-Gb/s serial data from the 312.5-MHz 16-bit parallel data. The output driver has a 2-tap FIR filter, achieving a nominal de-emphasis level of 3.5 dB.
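As a rough illustration of what a 3.5-dB de-emphasis level means for a 2-tap FIR driver, the sketch below computes tap weights under assumptions that are not stated in the text: the taps are UI-spaced, the filter has the form y[n] = c0*x[n] - c1*x[n-1] with c0 + c1 = 1, and the de-emphasis level is the ratio of the repeated-bit amplitude to the transition-bit amplitude. The function names are hypothetical.

```python
import math

def deemphasis_taps(level_db):
    """Return (c0, c1) for y[n] = c0*x[n] - c1*x[n-1], normalized so c0 + c1 = 1.

    Assumption: for x = +/-1, a transition bit has amplitude c0 + c1 and a
    repeated bit has amplitude c0 - c1; level_db is the ratio between them.
    """
    if level_db < 0:
        raise ValueError("de-emphasis level must be non-negative")
    ratio = 10 ** (-level_db / 20.0)  # repeated-bit / transition-bit amplitude
    c1 = (1.0 - ratio) / 2.0          # post-cursor tap weight
    c0 = 1.0 - c1                     # main tap weight
    return c0, c1

def deemphasis_db(c0, c1):
    """Inverse of deemphasis_taps: de-emphasis level in dB for given taps."""
    return -20.0 * math.log10((c0 - c1) / (c0 + c1))

c0, c1 = deemphasis_taps(3.5)
print(f"c0 = {c0:.4f}, c1 = {c1:.4f}, check = {deemphasis_db(c0, c1):.2f} dB")
```

Under these assumptions, roughly one sixth of the total drive strength goes to the post-cursor tap.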
The receiver analog front-end is a continuous-time analog equalizer followed by a 4-way interleaved 10-GS/s 5-bit flash ADC (Fig. 8.7.1). The analog equalizer uses an RC-degenerated differential pair to give the incoming signal a nominal gain boost of 6 dB at 2.5 GHz. The ADC converts the equalizer output into a 5-bit data stream at 10 GS/s, or two samples per UI. The binary data are then demultiplexed into 625-Mb/s 16-parallel 5-bit words and transferred to the digital back-end block. The digital back-end further equalizes the signal with an FFE, which is a half-UI-spaced 2-tap FIR filter. The filter tap coefficients are controlled adaptively by control logic implemented in the digital back-end. After the analog and digital equalization, the signal is sent to a digital CDR that operates at 625 MHz.
The digital CDR tracks the center of the data eye and makes a binary decision by slicing the data at the eye center. Unlike conventional phase-tracking CDRs, the phase tracking and the data slicing are done entirely in the numerical domain (Fig. 8.7.2). The CDR first extracts the data's zero-crossing timing, or instantaneous phase, $\mathrm{ph}_i$, by linear interpolation. The instantaneous phase represents where in a one-UI period the zero crossing happens, and is expressed as a three-bit code. An averaged version of the instantaneous phase, $\mathrm{ph}_{av}$, is generated by a second-order filter that consists of a modulo-1UI error subtractor followed by two integrators. The eye-center phase is estimated by adding a 0.5-UI phase code to the averaged zero-crossing phase. Once the eye-center phase and the instantaneous phase (when applicable) are known, the data decision is straightforward: we pick the sliced value of the sample that is on the same side of the zero crossing as the eye-center phase.
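The phase-extraction and averaging steps can be sketched in a few lines. This is a floating-point illustration under stated assumptions, not the fixed-point logic of the chip: the loop gains `kp` and `ki`, the proportional-plus-integral arrangement of the two integrators, the `slot` argument (which half of the UI the sample pair occupies), and all function names are hypothetical. The data-decision step is not modeled.

```python
PHASE_STEPS = 8  # three-bit phase code: one UI is divided into 8 steps

def zero_crossing_phase(s0, s1, slot):
    """Instantaneous phase code (0..7) of a zero crossing between two
    consecutive half-UI-spaced samples, or None if there is no sign change.
    slot is 0 or 1: which half of the UI the interval s0 -> s1 covers."""
    if (s0 >= 0) == (s1 >= 0):
        return None                        # same sign: no crossing here
    frac = s0 / (s0 - s1)                  # linear interpolation, 0..1
    phase_ui = (slot + frac) / 2.0         # position within the UI, 0..1
    return int(phase_ui * PHASE_STEPS) % PHASE_STEPS

def wrap_error(delta):
    """Modulo-1UI subtraction: map a phase difference into [-4, 4)."""
    half = PHASE_STEPS / 2
    return (delta + half) % PHASE_STEPS - half

class PhaseAverager:
    """Second-order averaging loop: a wrapped error feeds two integrators.
    Gains are illustrative assumptions, not values from the paper."""
    def __init__(self, kp=0.05, ki=0.001):
        self.kp, self.ki = kp, ki
        self.freq = 0.0    # first integrator: tracks frequency offset
        self.phase = 0.0   # second integrator: averaged phase ph_av

    def update(self, ph_i):
        err = wrap_error(ph_i - self.phase)
        self.freq += self.ki * err
        self.phase = (self.phase + self.kp * err + self.freq) % PHASE_STEPS
        return self.phase

def eye_center(ph_av):
    """Eye-center phase: averaged zero-crossing phase plus 0.5 UI."""
    return (ph_av + PHASE_STEPS / 2) % PHASE_STEPS

avg = PhaseAverager()
for _ in range(2000):
    avg.update(zero_crossing_phase(-2, 2, 1))  # crossing at 0.75 UI
print(avg.phase, eye_center(avg.phase))
```

The wrapped subtraction is what lets the averaged phase follow a drifting zero crossing across the UI boundary without a discontinuity in the error.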
To handle the frequency difference between the incoming data and the sampling clock, the CDR outputs variable-width data through a data-width controller. The width of a valid data word depends on phase-slip events, which are detected by observing the averaged zero-crossing phase, $\mathrm{ph}_{av}$, crossing a one-UI-period boundary. If no boundary crossing is detected, the CDR outputs an 8-bit data word. If a boundary crossing occurs in the direction that indicates the incoming data is running faster than the receiver, a 9-bit data word is issued. If the boundary crossing occurs in the opposite direction, the data word is 7 bits wide. The variable-width data is first converted into 15/16/17-bit-wide 312.5-MHz data by a demultiplexer and then written into a FIFO placed in the PHY logic to generate a fixed-width, 16-bit output. The PHY layer performs flow control so that the FIFO does not overflow.
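A minimal sketch of the data-width control follows. It rests on assumptions not given in the text: the phase code runs from 0 to 8 per UI, a downward wrap of the averaged phase (from near 0 to near 8) is taken to mean that the incoming data is faster than the receiver, and the demultiplexer and FIFO are merged into one bit buffer. `word_width` and `WidthRepacker` are hypothetical names.

```python
def word_width(prev_phase, new_phase, steps=8):
    """Choose the CDR output width from a phase-slip event.

    Assumed sign convention: when the data runs faster than the receiver,
    the zero crossing drifts earlier, so the averaged phase wraps downward
    through 0 and appears as a large positive jump."""
    jump = new_phase - prev_phase
    if jump > steps / 2:
        return 9   # extra bit: data faster than receiver
    if jump < -steps / 2:
        return 7   # missing bit: data slower than receiver
    return 8       # no boundary crossing

class WidthRepacker:
    """Collect 7/8/9-bit words and emit fixed 16-bit words."""
    def __init__(self, out_width=16):
        self.out_width = out_width
        self.bits = []  # pending bits, oldest first

    def push(self, word_bits):
        if len(word_bits) not in (7, 8, 9):
            raise ValueError("CDR words must be 7, 8, or 9 bits wide")
        self.bits.extend(word_bits)
        out = []
        while len(self.bits) >= self.out_width:
            out.append(self.bits[:self.out_width])
            del self.bits[:self.out_width]
        return out

repacker = WidthRepacker()
print(repacker.push([1] * 8))   # not enough bits yet
print(repacker.push([0] * 9))   # one 16-bit word, one bit left over
```

In hardware the leftover bits accumulate in the FIFO, which is why the text assigns flow control to the PHY layer.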
The 5-bit ADC has 17 regenerative amplifiers at its front (Fig. 8.7.3). Each amplifier uses the StrongArm-latch topology to obtain a narrow aperture time. The number of quantization levels is doubled by a resistor interpolation technique [3] to reduce the input capacitance. A regenerative amplifier is inserted in each interpolated signal path to enhance the gain. A resistor ladder generates the comparator reference levels. The ADC quantization range is 800 mVpp differential. After the subsequent amplifiers convert the outputs of the comparator array to full-swing CMOS levels, an encoder converts the resulting 32-level thermometer code into a 5-bit binary code via a 5-bit Gray code.
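The thermometer-to-Gray-to-binary encoding can be illustrated as below. This is a behavioral sketch, assuming 31 thresholds for 32 levels and a direct XOR mapping from thermometer taps to Gray bits; the actual gate-level encoder is not described in the text, and the function names are hypothetical.

```python
N_BITS = 5
N_THRESHOLDS = 2 ** N_BITS - 1  # 31 thresholds give 32 levels

def thermometer_to_gray(thermo):
    """Map a 31-element thermometer code (thermo[k-1] is 1 when the input
    exceeds threshold k) directly to a 5-bit Gray code.

    Gray bit j is the XOR of the taps at odd multiples of 2**j, so each
    tap toggles exactly one Gray bit as the level rises."""
    if len(thermo) != N_THRESHOLDS:
        raise ValueError("expected 31 thermometer bits")
    gray = 0
    for j in range(N_BITS):
        step = 1 << j
        bit = 0
        for k in range(step, N_THRESHOLDS + 1, 2 * step):
            bit ^= thermo[k - 1]
        gray |= bit << j
    return gray

def gray_to_binary(gray):
    """Convert a Gray code to binary by XOR-ing all right shifts."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value

def encode(thermo):
    """Thermometer code -> Gray code -> binary level."""
    return gray_to_binary(thermometer_to_gray(thermo))

level = 19
thermo = [1] * level + [0] * (N_THRESHOLDS - level)
print(encode(thermo))
```

Going through a Gray code means adjacent levels differ in a single bit, which limits the damage when a comparator near the threshold resolves late.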
The tap coefficients of the FIR filter are adjusted by control logic operating in the 625-MHz digital back-end (Fig. 8.7.4). The adjustment is based on CMA (constant-modulus algorithm) [4], in which a stochastic gradient method is used to minimize the CMA cost function $E_f = E\{(y_n^2 - d^2)^2\}$, where $y_n$ is the FIR-filter output at the $n$-th cycle, $d$ is the desired output level, and $E\{\cdot\}$ denotes the expectation value. The incremental change in the tap coefficients is proportional to $(y_n^2 - d^2)\,y_n$. To reduce the area and the power consumption, the sign of $(y_n^2 - d^2)$ and that of $y_n$ are used instead of the original multi-bit values. The CMA uses only the blind samples from the ADC, so it works independently of the CDR. Simulations showed that the receiver provides robust communication through various lossy transmission lines whose loss ranges from 0 to 15 dB (Fig. 8.7.5).
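One sign-based CMA update for a 2-tap FFE might look like the sketch below. It assumes the standard stochastic-gradient form, in which the update for each tap is also scaled by the corresponding input sample; the text does not say how the input sample enters the on-chip update, so that factor, the step size `mu`, and the function names are assumptions.

```python
def sign(v):
    """Return -1, 0, or +1."""
    return (v > 0) - (v < 0)

def ffe_output(coeffs, x_hist):
    """FIR output: coeffs[k] multiplies the sample delayed by k half-UIs."""
    return sum(c * x for c, x in zip(coeffs, x_hist))

def cma_step(coeffs, x_hist, d=1.0, mu=0.01):
    """One sign-based CMA update.

    The multi-bit terms (y^2 - d^2) and y are replaced by their signs,
    as the text describes; the input sample x stays multi-bit here
    (an assumption of this sketch)."""
    y = ffe_output(coeffs, x_hist)
    direction = sign(y * y - d * d) * sign(y)
    new_coeffs = [c - mu * direction * x for c, x in zip(coeffs, x_hist)]
    return new_coeffs, y

# Output too large (|y| > d): the main tap is reduced.
coeffs, y = cma_step([1.0, 0.0], [2.0, 0.0])
print(coeffs, y)
# Output too small (|y| < d): the main tap is increased.
coeffs, y = cma_step([0.2, 0.0], [1.0, 0.0])
print(coeffs, y)
```

Because the update needs only the filter output and the raw samples, it can run without any timing information from the CDR.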
The transceiver was implemented in 65-nm CMOS (Fig. 8.7.7). The transmitter eye diagram is shown in Fig. 8.7.6(a). We confirmed that the tap coefficients of the digital FFE converged to the optimum values for data transmission through a cable with a signal loss of 15 dB at 2.5 GHz, resulting in a BER below 1E-12. The jitter tolerance curve measured for the feed-forward CDR is shown in Fig. 8.7.6(b). In the jitter tolerance measurement, the clock signals for the transmitter and receiver were modulated independently at 30 kHz with a modulation depth of 0 to -5000 ppm. The transceiver occupies 1,000 × 1,000 µm², and the total power consumption is 280 mW at a 1.2-V supply.
