A 33 Gbit/s equalizer chip fabricated in 0.13 µm BiCMOS technology is presented. The proposed equalizer prototype includes an adaptive continuous-time linear equalizer (CTLE) with middle-frequency compensation and an adaptive half-rate look-ahead decision feedback equalizer (DFE). The slope-detection-based CTLE employs a two-path amplifier to adaptively adjust the ratio of high-frequency to low-frequency gain, and a middle-frequency amplifier dedicated to providing appropriate compensation in the intermediate frequency range. For the half-rate DFE, a look-ahead structure and an analog LMS algorithm circuit improve speed and reduce area. Measurement results show that the equalizer chip effectively compensates a lossy channel with 26 dB of loss at 20 GHz, and that the data rate can reach 33 Gb/s under a 3.3 V power supply. The total power consumption is about 726 mW at 33 Gb/s.
Introduction
Recently, growing demand for high-performance data communication and versatile multimedia interface applications has called for data rates of 25 Gb/s or above. The received signal, however, suffers from serious inter-symbol interference (ISI) due to channel imperfections such as reflection, crosstalk and limited bandwidth, which makes signal integrity a major challenge for 25+ Gb/s serial links. To mitigate ISI, various equalization strategies such as the feed-forward equalizer (FFE), continuous-time linear equalizer (CTLE) and decision feedback equalizer (DFE) have been widely employed at the transmitter and/or receiver for high-speed serial communication [1, 2]. For example, a transmitter can use a fractionally spaced FFE to cancel both pre-cursor and post-cursor ISI. The precision of the FFE's analog delay line, however, is easily affected by process, voltage and temperature (PVT) variations [2], which greatly degrades performance. Furthermore, it is difficult for the transmitter to compensate the channel loss adaptively.
As a kind of analog equalizer, the CTLE can combat pre-cursor and post-cursor ISI effectively by amplifying high-frequency components around the Nyquist frequency of the transmitted data. For instance, a spectrum-balancing method based on comparing high-frequency and low-frequency power is explored in [3] to make the CTLE adaptive, at the cost of increased power consumption for power detection and V/I conversion. Similarly, [4] introduces an adaptive CTLE based on comparing the slopes of the slicer's input and output signals. Although the ratio of high-frequency to low-frequency components can be adjusted adaptively, the frequency compensation range cannot be extended to the intermediate range, which limits the effectiveness of equalization.
This paper focuses on receiver equalization, namely the CTLE and DFE, for serial communication. To improve equalization performance, an adaptive CTLE with middle-frequency compensation is proposed. By placing a zero-pole pair near the middle frequency, the frequency compensation range is extended, so that the proposed CTLE significantly outperforms the traditional slope-detection CTLE. In addition, a half-rate look-ahead DFE with analog adaptation is implemented to eliminate the post-cursor ISI, which is mainly due to reflection, crosstalk and the bandwidth-limited channel. Unlike the DFEs in [5, 6], whose adaptation is realized either fully digitally or in mixed-signal form, both of which result in large area and power consumption, the adaptive circuit in our DFE is analog, offering high performance with low power consumption.
Architecture
It is well known that a CTLE can eliminate both pre-cursor and post-cursor ISI to a certain extent at low cost, while a DFE can further improve the eye diagram. To compensate the heavy channel loss at high frequency, our work combines an adaptive CTLE with an adaptive half-rate DFE, as shown in Fig. 1, in which the CTLE and DFE are adapted by slope detection and an analog least mean squares (LMS) algorithm, respectively. In addition, the DFE uses a look-ahead structure to achieve processing speeds of 25 Gb/s or above.
a. CTLE
Fig. 2 gives the block diagram of the proposed adaptive CTLE. It consists of four parts: 1) a high-pass & low-pass path, which compensates the ISI induced by dielectric loss and skin effect by adjusting the ratio of high-frequency to low-frequency components; 2) a middle-pass compensation path, which extends the frequency compensation range to intermediate frequencies and makes the CTLE transfer function match the channel characteristic more closely; 3) a slicer, which shapes the output of the middle-frequency amplifier and generates a fixed-swing signal; and 4) a slope detector & integrator, which detects the slopes of the slicer input and output and generates the feedback control voltage. Fig. 3 compares the performance of the traditional CTLE, whose peaking position is near the Nyquist frequency, with that of the proposed CTLE, which contains not only a high-pass & low-pass path but also a middle-pass path. The proposed CTLE performs better owing to the middle-frequency compensation.
b. DFE
As mentioned above, the adaptive DFE in our work is a half-rate look-ahead structure, as shown in Fig. 4. With this structure, the timing constraint of the feedback path, indicated by the dotted line, is relaxed from $t_{cq} + t_{setup} + t_{FB}$ to $t_{cq} + t_{setup} + t_{mux}$, where $t_{cq}$ and $t_{setup}$ are the clock-to-output delay and setup time of the D flip-flop, respectively, $t_{FB}$ is the propagation delay of the traditional feedback path including a multiplier, and $t_{mux}$ is that of a MUX. Since $t_{mux}$ is smaller than $t_{FB}$, the design of a high-speed DFE becomes easier.
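The look-ahead idea can be illustrated with a minimal behavioural sketch, assuming a single-tap, full-rate DFE (the half-rate interleaving of Fig. 4 is not modelled). All names and values here (`direct_dfe`, `lookahead_dfe`, `h1`, the noise level) are hypothetical and chosen only for illustration. The look-ahead version slices the input under both hypotheses for the previous bit and lets a MUX pick the result, so only the MUX remains in the critical feedback loop.

```python
import random

def direct_dfe(samples, c1):
    """1-tap direct-feedback DFE: subtract c1 * previous decision, then slice."""
    prev, out = 1, []
    for x in samples:
        d = 1 if x - c1 * prev >= 0 else -1
        out.append(d)
        prev = d
    return out

def lookahead_dfe(samples, c1):
    """1-tap look-ahead DFE: slice both hypotheses, then select with a MUX."""
    prev, out = 1, []
    for x in samples:
        d_if_plus = 1 if x - c1 >= 0 else -1   # assumes previous bit was +1
        d_if_minus = 1 if x + c1 >= 0 else -1  # assumes previous bit was -1
        d = d_if_plus if prev == 1 else d_if_minus  # MUX driven by previous decision
        out.append(d)
        prev = d
    return out

# Hypothetical channel: main cursor 1.0, one post-cursor h1, small Gaussian noise.
random.seed(0)
h1 = 0.4
bits = [random.choice((-1, 1)) for _ in range(2000)]
rx = [bits[i] + h1 * (bits[i - 1] if i else 1) + random.gauss(0, 0.05)
      for i in range(len(bits))]

same = direct_dfe(rx, h1) == lookahead_dfe(rx, h1)
print("identical decisions:", same)
```

Both structures make the same decisions; the difference lies only in which operations sit inside the feedback loop.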
Another important module in the DFE is the analog LMS algorithm circuit, which updates the adaptive coefficients $C_i(n)$ according to Eq. (1), where $x(n)$ and $e(n)$ are the slicer input and the error signal, respectively, and $\tau_{int}$ is the integration time constant, which affects the stability and convergence of the LMS algorithm [7]. The circuit realization of Eq. (1) is also illustrated in Fig. 4. It comprises four parts: 1) a slicer, which generates the desired data $d(n)$; 2) a summer, which produces the difference signal $e(n)$ between the desired data $d(n)$ and the slicer input $x(n)$; 3) a multiplier, which forms the product of $e(n)$ and $x(n)$; and 4) an integrator, which performs the integration and generates the tap coefficients $C_i(n)$. The analog LMS circuit is more efficient in speed and area than its digital and mixed-signal counterparts.
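The adaptation loop can be sketched in discrete time as follows. This is a textbook single-tap LMS update written under stated assumptions, not the paper's Eq. (1): the error is correlated with the previous decision, the step size `mu` stands in for the ratio of the sampling interval to $\tau_{int}$, and the channel (`h1`) is hypothetical.

```python
import random

def adapt_dfe_tap(rx, mu=0.01, c0=0.0):
    """Adapt one DFE tap with an LMS-style update; returns (final tap, history)."""
    c, prev, history = c0, 1, []
    for r in rx:
        x = r - c * prev           # slicer input after feedback subtraction
        d = 1 if x >= 0 else -1    # slicer output (desired data)
        e = d - x                  # error: desired data minus slicer input
        c -= mu * e * prev         # integrate the error/decision product
        history.append(c)
        prev = d
    return c, history

# Hypothetical noiseless channel with a single post-cursor of 0.4.
random.seed(1)
h1 = 0.4
bits = [random.choice((-1, 1)) for _ in range(3000)]
rx = [bits[i] + h1 * (bits[i - 1] if i else 1) for i in range(len(bits))]

c_final, history = adapt_dfe_tap(rx)
print(f"converged tap: {c_final:.4f}")
```

With the sign convention used here ($e = d - x$), the tap moves toward the post-cursor value; a larger `mu` (a smaller integration time constant) converges faster but tolerates less noise.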
Circuit design
Two-path amplifier
In the CTLE, the high-pass and all-pass paths together are referred to as the two-path amplifier, shown in Fig. 5. By changing the bias currents through $V_{ctrl}$, the ratio of high-frequency to low-frequency components can be adjusted adaptively to fit the channel characteristic. Additionally, a peaking inductor $L$ is used to extend the bandwidth of the high-frequency amplifier. Similarly, a zero-pole cancellation technique is employed to increase the bandwidth of the unity-gain amplifier. The transfer function of the two-path amplifier is given by Eq. (2), where $g_{m1,2}$ and $g_{m3,4}$ are the transconductances of $Q_1$ ($Q_2$) and $Q_3$ ($Q_4$), respectively, $R_L$ is the load resistor, and $R_{E1}$ and $C_E$ are the negative-feedback resistor and capacitor, respectively. $R_{E2}$, another negative-feedback resistor, is used to produce the zero. Fig. 6 gives the frequency response of the two-path amplifier under different control voltages $V_{ctrl}$. At low frequency, the gain is determined by $R_{E2}$ and $R_L$ and can be considered constant. At high frequency, however, the gain rises monotonically with the transconductance $g_{m1,2}$ ($g_{m3,4}$): the larger the transconductance set by the control voltage $V_{ctrl}$, the higher the high-frequency gain.
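The qualitative behaviour described for Fig. 6 can be reproduced with a generic behavioural model. This is not the paper's Eq. (2): it simply sums a flat path and a first-order high-pass path whose weight `k_high` stands in for the $V_{ctrl}$-controlled transconductance. The corner frequency and all weights are assumptions.

```python
import math

def two_path_gain_db(f_hz, k_high, a_low=1.0, f_corner_hz=5e9):
    """Gain in dB of a flat path plus a weighted first-order high-pass path."""
    s = 1j * f_hz / f_corner_hz
    h = a_low + k_high * s / (1 + s)
    return 20 * math.log10(abs(h))

weights = [0.5, 1.0, 2.0, 4.0]   # hypothetical high-path weights
lows = [two_path_gain_db(10e6, k) for k in weights]     # far below the corner
highs = [two_path_gain_db(16.5e9, k) for k in weights]  # near Nyquist at 33 Gb/s

for k, lo, hi in zip(weights, lows, highs):
    print(f"k_high={k:3.1f}: {lo:6.3f} dB at 10 MHz, {hi:6.2f} dB at 16.5 GHz")
```

In this model the low-frequency gain stays essentially fixed while the high-frequency gain grows with the high-path weight, which is the adjustment mechanism the text attributes to $V_{ctrl}$.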
Middle frequency amplifier
Another important module in the CTLE is the middle-frequency amplifier, which is designed to provide moderate equalization in the middle-frequency range and to improve the overall performance of the CTLE. Fig. 7 gives its schematic, which includes a negative-feedback topology formed by $Q_3$ to $Q_6$ and a low-pass filter formed by $R_{fb}$ and $C_{fb}$. Eqs. (3) and (4) describe the zero and poles generated by the middle-frequency amplifier, where $R_{fb}$ and $C_{fb}$ are the feedback resistor and capacitor of the low-pass filter, respectively, $g_{m2}$, $g_{m3}$ and $g_{m5}$ are the transconductances of $Q_2$, $Q_3$ and $Q_5$, respectively, $R_{L1}$ is the load resistance of the front stage, and $R_{L2}$ and $C_{L2}$ are the load resistor and capacitor, respectively. Fig. 8 shows the zero and pole distribution of the whole CTLE: the first zero-pole pair, $f_{z1}$ and $f_{p1}$, is located in the intermediate-frequency section (500 MHz to 700 MHz), and $f_{p2}$ is located in the high-frequency section. In our design, $f_{z1}$ is fixed near 500 MHz and $f_{p1}$ is tunable over a small frequency range by changing the resistance of $R_{s3}$. Another zero-pole pair, $f_{z2}$ and $f_{p3}$, is produced by the two-path amplifier. Fig. 9 shows the time-domain waveforms of the two-path amplifier and the middle-frequency amplifier. The impulse response shows that more residual ISI is eliminated when both amplifiers are applied than when only the two-path amplifier is used.
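To get a feel for how much boost a closely spaced zero-pole pair provides, the sketch below evaluates a generic first-order zero-pole section. The zero at 500 MHz follows the text; the pole at 700 MHz is an assumed setting at the upper end of the stated range, and the function name is hypothetical. This is not the paper's Eq. (3) or (4).

```python
import math

def zero_pole_gain_db(f_hz, f_zero_hz=500e6, f_pole_hz=700e6):
    """Gain in dB of H(f) = (1 + j f/f_zero) / (1 + j f/f_pole)."""
    h = (1 + 1j * f_hz / f_zero_hz) / (1 + 1j * f_hz / f_pole_hz)
    return 20 * math.log10(abs(h))

# The boost approaches 20*log10(f_pole / f_zero) well above the pole.
asymptote_db = 20 * math.log10(700e6 / 500e6)
for f in (100e6, 500e6, 700e6, 2e9, 10e9):
    print(f"{f / 1e9:5.2f} GHz: {zero_pole_gain_db(f):5.2f} dB")
print(f"asymptote: {asymptote_db:.2f} dB")
```

With these assumed values the pair contributes a gentle step of about 2.9 dB, which is consistent with the text's description of a moderate correction in the middle band rather than strong peaking.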
Slope detector and integrator
In the CTLE, the slope detector detects the energy of the input signal and converts it into a negative pulse signal, while the integrator generates the control voltage $V_{ctrl}$, which reflects the energy difference between the input signals of the slope detector. To optimize circuit performance, the slope detector and integrator are merged into one circuit in our work, as shown in Fig. 10, in which the components inside the dashed line belong to the slope detector and the others belong to the integrator. The drain currents of the source-coupled differential pair $M_1$ and $M_2$ are described by Eqs. (5) and (6), where $I_{ds1}$ and $I_{ds2}$ are the drain currents of transistors $M_1$ and $M_2$, respectively, and $V_{in1,cm}$ and $V_{in1,dm}$ are the common-mode and differential-mode components of the input $V_{in1}$, respectively. Eq. (6) shows that the energy of the input signal $V_{in1}$ can be measured by $I_{out1}$. Similarly, the source-coupled differential pair $M_3$ and $M_4$, which generates the current $I_{out2}$, can be used to detect the energy of the input signal $V_{in2}$. $M_5$, $M_6$ and $M_7$, $M_8$ are mirror transistors, and $M_6$ and $M_7$ have the same size. Thus, the output control voltage $V_{ctrl}$ of the integrator can be expressed as Eq. (7), which indicates that $V_{ctrl}$ is proportional to the energy difference between the two inputs of the slope detector and can be used to adjust the output of the CTLE adaptively. Fig. 11(a) and Fig. 11(b) give the two 25 Gb/s input signals of the slope detector, $V_{in1}$ and $V_{in2}$, and the output of the integrator, respectively. The slope of $V_{in1}$ is greater than that of $V_{in2}$, and the output $V_{ctrl}$, which is determined by the slope difference between $V_{in1}$ and $V_{in2}$, converges to 22 mV after 7 ns.
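Why a source-coupled pair can act as an energy detector can be seen from a short worked derivation. It assumes the ideal square-law MOSFET model with both devices in saturation; the symbols $\beta$, $V_s$, $V_{th}$ and $V_{ov}$ are introduced here for illustration, and the result is not claimed to match the paper's Eqs. (5) and (6) term by term.

```latex
% Assumption: square-law devices, gates at V_{in1,cm} \pm V_{in1,dm}/2, common source at V_s.
\begin{aligned}
V_{ov} &= V_{in1,cm} - V_s - V_{th}, \\
I_{ds1} &= \tfrac{\beta}{2}\left(V_{ov} + \tfrac{V_{in1,dm}}{2}\right)^2, \qquad
I_{ds2} = \tfrac{\beta}{2}\left(V_{ov} - \tfrac{V_{in1,dm}}{2}\right)^2, \\
I_{ds1} + I_{ds2} &= \beta V_{ov}^2 + \tfrac{\beta}{4} V_{in1,dm}^2 .
\end{aligned}
```

Under these assumptions the summed current contains a term proportional to the square of the differential input, so its average tracks the signal energy; the first term is an offset that depends only on the common-mode overdrive.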
Analog integrator
In the LMS algorithm circuit, the analog integrator is built as a Gm-C structure, shown in Fig. 12. The first stage of the transconductance unit is a differential amplifier with diode-connected transistors as the load, and the second stage is a common-source amplifier with a load capacitance $C_L$. Its output $V_{tap}$, the bias voltage of the tail current source of the DFE, is used to adjust the tap coefficients and minimize post-cursor ISI. Eq. (8) gives the transfer function of the analog integrator, where $A_0 = (g_{m2} g_{m6} r_{out})/g_{m4}$ is the DC gain of the integrator and $\tau_{int} = (g_{m4} C_L)/(g_{m2} g_{m6})$ is its time constant. It is known that the LMS algorithm is sufficiently stable when $\tau_{int} \geq 1$ ns. We also find that when $A_0 \geq 65$ dB, $V_{tap}$ converges to a bias voltage suitable for a valid DFE tap coefficient. As a result, we set $\tau_{int} \approx 1.3$ ns and $A_0 \approx 70$ dB in our design. Fig. 13 shows the convergence curve of the DFE feedback control voltage $V_{tap}$, which converges to about 700 mV after 5 ns. The design of the D flip-flop and summer follows [7].
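The two definitions above fix the form of a single-pole integrator response. The following is a sketch consistent with those definitions, assuming the only significant pole is the one at the output node ($r_{out}$, $C_L$); it is offered as a plausible reading of Eq. (8), not a reproduction of it.

```latex
% Assumption: one dominant pole at the output node, so A_0 \tau_{int} = r_{out} C_L.
H(s) = \frac{g_{m2}}{g_{m4}} \cdot \frac{g_{m6}\, r_{out}}{1 + s\, r_{out} C_L}
     = \frac{A_0}{1 + s A_0 \tau_{int}}
     \;\approx\; \frac{1}{s\, \tau_{int}} \quad \text{for } |s| \gg \frac{1}{A_0 \tau_{int}} .
```

With the design values $A_0 \approx 70$ dB (about 3162) and $\tau_{int} \approx 1.3$ ns, the product $A_0 \tau_{int}$ is about 4.1 µs, which places the corner near 39 kHz. Above that corner the stage behaves as an ideal integrator with time constant $\tau_{int}$.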
Measurement results
The equalizer chip was fabricated in 0.13 µm BiCMOS technology with 7 metal layers. The chip photograph is shown in Fig. 14, and the total area is 1.12 mm² including I/O pads. On-chip measurement is carried out with the setup shown in Fig. 15, in which an Agilent N4974A PRBS generator is used to generate the 20∼40 GHz differential PRBS31 pattern, a Rohde & Schwarz generator provides the half-rate clock signal, and the eye diagram of the equalized signal is captured using a Keysight DCA-X 86100D oscilloscope. In our measurement, the 12-cm Rogers backplane shown in Fig. 16(a) is tested. Fig. 16(b) gives the frequency response of the Rogers channel, which indicates that the insertion loss is about 26 dB at 20 GHz. Fig. 17 gives the measured eye diagrams before and after equalization. Fig. 17(a) shows that the eye diagram at the end of the lossy channel is nearly closed at a data rate of 28 Gb/s, illustrating that the signal integrity has been severely degraded. Fig. 17(b) shows the output of the equalizer: the closed eye is successfully opened by the adaptive CTLE and the adaptive look-ahead half-rate DFE. Fig. 18 is another equalized eye diagram, indicating that the proposed equalizer works properly up to 33 Gb/s. The performance of the designed equalizer chip and its comparison with other published works are summarized in Table I. References [8, 9, 10] use more advanced technologies with much higher characteristic frequency $f_T$. With a structure of TX FFE + RX CTLE & DFE, [8] compensates the highest channel loss at the Nyquist frequency of a 28 Gb/s data rate, based on advanced 32 nm SOI CMOS technology, while [10] consumes the lowest power because of its simpler structure. On the other hand, our work compensates a larger loss than [9] and [10], and its processing speed is the highest among all the works.
Although the power consumption of our work is somewhat large, its adaptation capability exceeds that of the other works, since both the CTLE and the DFE are adaptive in our design.
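For a rate-normalized view of the power figure, the energy per bit can be worked out from the numbers stated in the abstract (726 mW at 33 Gb/s). The symbols $P$, $R_b$ and $E_{bit}$ are introduced here only for this calculation.

```latex
E_{bit} = \frac{P}{R_b} = \frac{726\ \mathrm{mW}}{33\ \mathrm{Gb/s}} = 22\ \mathrm{pJ/bit}
```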
Conclusion
In this brief, we have demonstrated a 33 Gb/s receiver equalization chip fabricated in 0.13 µm BiCMOS technology for a Rogers backplane. The novel adaptive CTLE structure with middle-frequency compensation matches the channel characteristics well and effectively reduces the burden on the DFE. The half-rate look-ahead DFE with analog adaptation relaxes the critical timing and achieves robust adaptation at low cost. Measurement results show that the equalizer achieves a clear eye diagram at up to 33 Gb/s over a 12 cm Rogers backplane whose loss is up to 26 dB at 20 GHz.
