# Tradeoffs in Design of Low-Power Gated-Oscillator CDR Circuits

Armin Tajalli<sup>\*</sup>, Paul Muller, and Yusuf Leblebici

The authors are with Microelectronic Systems Lab., Swiss Federal Institute of Technology (EPFL), 1015 Lausanne, Switzerland, {armin.tajalli, paul.muller, yusuf.leblebici}@epfl.ch

Corresponding author: Armin Tajalli

Address:

EPFL STI IMM LSM Station 11 Bâtiment ELD 335 CH-1015 Lausanne, Switzerland

*Phone*: 0041-21-693-6927 *Fax*: 0041-21-693-6955 *Email*: armin.tajalli@epfl.ch



*Abstract-* This article describes techniques for implementing low-power clock and data recovery (CDR) circuits based on the gated-oscillator (GO) topology for short-distance applications. The main tradeoffs in the design of a high-performance, power-efficient GO CDR are studied, and on that basis a top-down design methodology is introduced such that the jitter tolerance (JTOL) and frequency tolerance (FTOL) requirements of the system are satisfied simultaneously. A test chip has been implemented in a standard digital 0.18 $\mu$m CMOS technology; the proposed CDR circuit consumes only 10.5 mW and occupies 0.045 mm<sup>2</sup> of silicon area at a data rate of 2.5 Gbps. Measurement results show good agreement with the analysis and demonstrate the capability of the proposed approach for implementing low-power GO CDRs.

*Key Words* – CMOS analog integrated circuits, clock and data recovery circuit, gated-oscillator CDR, power-aware design, chip-to-chip interconnection.

#### **1 INTRODUCTION**

Multi-channel data transceivers offer a very good solution for increasing the total data communication speed [1]-[4]. Optical links can further help to provide a reliable, high-speed environment for data transmission [5], and they offer a medium that is robust against electromagnetic coupling in short-haul applications [6]. Integrated serial data transceivers with very low power consumption are key components for implementing high-performance, low-cost multi-channel serial data transceivers (Fig. 1(a)). The cost efficiency and high level of integration offered by CMOS technology make it a good candidate for implementing multi-channel transceivers. However, low power consumption generally conflicts with system performance, and this tradeoff makes low-power circuit design for this application very challenging.

Figure 1 shows the conceptual block diagram of the proposed serial data receiver [5]. In the proposed topology, an integrated photo-detector (PD) converts the optical signal to an electrical current [5], [7]. This electrical signal is amplified by transimpedance and limiting amplifiers (TIA and LA) [5]-[8] and then retimed by the CDR block [9], [10]. Clock and data recovery (CDR) circuits play a very important role in serial receivers, and the overall performance of the system is directly related to the performance of this building block [5]. This article studies the tradeoffs in the design of low-power gated-oscillator (GO)-based CDRs. As will be shown later, the main performance parameters of this kind of CDR, such as jitter generation (JG), jitter tolerance (JTOL), and frequency tolerance (FTOL), are directly related to the power dissipation of the circuit. Therefore, a careful design methodology is required to implement a power-efficient GO CDR. Combined with the already demonstrated pure-silicon photo-detection and amplification front-end [6]-[8], the goal is to realize a completely integrated multi-channel receiver.

In this work, the GO topology has been selected to implement a low-power CDR. Because of their simple topology, GO CDRs are well suited for low-power and small-area applications [11]-[14]. In this type of CDR, retiming can take place very quickly; hence, they have been widely used in burst-mode applications [11]. This topology is also suitable for high-frequency applications. Using advanced technologies, some very high-frequency GO-based CDRs have been reported in the literature [11], [13]. In [13], using shunt-peaking and capacitive coupling techniques, a 10 Gbps CDR has been designed in 0.13 µm CMOS technology. On the other hand, [11] uses a half-rate ring oscillator to operate at a 10 Gbps data rate in 0.15 µm CMOS technology. Unlike the previously published reports, this paper proposes a structural methodology for implementing low-power GO CDRs. This approach is mainly based on investigating the basic properties of GO CDRs.

In short distance data receivers, JTOL and FTOL of the CDR are the two main design parameters that can affect the performance of the system. While GO CDRs are sensitive to any frequency offset between received data and sampling clock, they show relatively good JTOL performance [15]. In this paper, FTOL and JTOL in GO CDRs and their dependence on power consumption will be analyzed.

JTOL and FTOL of the proposed CDR are analyzed in Section 2. Section 3 describes how the restrictions imposed by FTOL and JTOL guide the implementation of a low-power circuit. Measurement results are presented in Section 4.

#### **2 GATED-OSCILLATOR-BASED CDR SPECIFICATIONS**

#### 2.1. Clock Recovery

In a GO CDR, the sampling clock is produced by a ring oscillator. As depicted in Fig. 2(a), an edge detector keeps this clock synchronous with the received data by producing a retiming signal at each data transition [14], [16]. The clock generator can be a current-controlled ring oscillator (CCO) whose phase is controlled by the edge detector. As shown in Fig. 2(b), at each received data edge the edge detector generates a synchronization signal (EDET) that is applied directly to the CCO. This signal stops the CCO from oscillating and freezes the output clock (Ckout) at the HIGH level via the first stage of the ring oscillator. At the rising edge of EDET, the oscillator is released and returns to its free-running mode, at a frequency determined by the controlling current and in phase with the last received data edge. Sampling the delayed data (DD<sub>in</sub>) instead of the input data (D<sub>in</sub>) eliminates the effect of the delay introduced by the delay line. Because DD<sub>in</sub> carries almost the same delay-line jitter as EDET, this sampling scheme also reduces the effect of delay-line jitter. Meanwhile, parasitic delays due to the XNOR gate, or the delay mismatch between the two inputs of the NAND gate in the oscillator, should be compensated by proper dummy gates (as briefly shown in Fig. 2(a)). In this way, synchronization between clock and data can take place within only a few transitions of the input data. However, this fast synchronization comes at the expense of poor jitter transfer (JTRAN) characteristics: any jitter on the received data or on the delay line is transferred to the output without attenuation. Since JTRAN is not the first priority in short-haul applications, the GO topology can be applied in this case.
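The gating behavior described above can be illustrated with a minimal behavioral sketch. It assumes an idealized oscillator that restarts exactly at each data edge and samples at (k + 1/2) clock periods after the restart; the function names and the sampling-instant convention are illustrative assumptions, not taken from the implemented circuit. The sketch counts how many samples fall inside a run of identical bits before the next data edge re-synchronizes the clock.

```python
def samples_in_run(n_cid, f_ratio):
    """Count the clock samples that fall inside a run of n_cid identical bits.

    Time is measured in unit intervals (UI) from the data edge that restarts
    the oscillator. f_ratio = f_ck / f_0 is the oscillator-to-data frequency
    ratio. Sampling at (k + 0.5) * T_ck after the restart is an assumption
    of this sketch.
    """
    t_ck = 1.0 / f_ratio        # oscillator period in UI
    run_length = float(n_cid)   # the next data edge arrives after n_cid UI
    count = 0
    while (count + 0.5) * t_ck < run_length:
        count += 1
    return count


def run_recovered_correctly(n_cid, f_ratio):
    """True when exactly one sample is taken per bit of the run."""
    return samples_in_run(n_cid, f_ratio) == n_cid


# Sweep a few frequency ratios for a run of five identical bits.
for ratio in (0.85, 0.95, 1.00, 1.05, 1.15):
    print(ratio, samples_in_run(5, ratio), run_recovered_correctly(5, ratio))
```

In this idealized model a slow oscillator drops a bit and a fast oscillator inserts one once the frequency error is large enough, which is the failure mechanism that the frequency tolerance analysis quantifies.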

As shown in Fig. 3, a phase-locked loop (PLL) generates a local high-frequency clock (*HFCK*) from a reference input clock (*LFCK*), such that the *HFCK* frequency is exactly equal to the baud rate of the received data. The proposed PLL uses an oscillator matched to the oscillator used in the proposed GO CDR. In multi-channel applications, this PLL can generate the controlling signal for all the CDRs. In this case, to obtain better matching between each channel and the PLL, current-controlled oscillators (CCO) are used instead of voltage-controlled oscillators (VCO). To tolerate any frequency mismatch between the CDR and the PLL, it is desirable to design the proposed CDR with a high frequency tolerance so that this mismatch does not cause incorrect sampling.

In the following, the performance of a GO CDR in presence of frequency offset and also input jitter will be analyzed to explore the main limitations for reducing the power dissipation of this circuit.

#### 2.2. Frequency Offset

Since in a GO CDR the oscillation frequency of the CCO is not controlled directly through a phase-locked loop (PLL), a frequency difference can exist between the GO CDR and the incoming data stream. The frequency tolerance (FTOL) is defined as the maximum frequency difference at which the BER remains lower than a specified value (usually, $BER < 10^{-12}$) [5]. For correct sampling in ideal conditions, when there is no jitter on data or clock, the frequency error must satisfy $|f_{ck} - f_0| < f_0/2n$, where $f_0 = \omega_0/2\pi$ is the nominal data frequency, $f_{ck} = 1/T_{ck}$ is the oscillator frequency, and *n* is the number of consecutive identical digits (CID). With 8B10B coding, CID is limited to five, *i.e.*, $n \le 5$ [5]. Hence, based on this bound, FTOL < 10%. In practice, however, FTOL is smaller than this value, mainly because of jitter on the sampling clock or on the input data. While the jitter on the received data is not under control, it is possible to reduce the jitter on the sampling clock. Figure 4(a) depicts how jitter on the sampling clock or on the data can reduce the FTOL. Figure 4(b) shows the achievable FTOL based on behavioral modeling in the presence of random jitter on the received data and the recovered clock [15], [17]. As can be seen, an increase in clock jitter results in FTOL degradation. Here, deterministic jitter (DJ) has also been included on the received data to model the practical condition more accurately [16], [17]. Based on this approach, to have an acceptable frequency tolerance, jitter generation in the oscillator must be very small. The main source of jitter on the sampling clock in this configuration is the jitter accumulated while the gated oscillator is free running. The accumulated jitter increases with the free-running time interval of the oscillator and can be expressed as [18], [19]:

$$\sigma_{ck} = \kappa \sqrt{\Delta T} \tag{1}$$

in which $\sigma_{ck}$ is the *rms* (root mean square) jitter accumulated on the clock during the time interval $\Delta T$, and $\kappa$ is a proportionality factor that depends on the topology and power consumption of the delay stages in the ring oscillator, as well as on technology parameters [18], [19]. In a GO CDR, $\Delta T$ depends on the number of CIDs [15]. Therefore, according to Fig. 4(b) and using (1), it is possible to estimate the maximum acceptable $\kappa$ for the desired FTOL. In the next step, this criterion can be translated into circuit parameters such as the biasing conditions and hence the size of the devices in each delay cell. As will be shown later, this criterion is one of the main constraints that prevent further lowering of the power dissipation of the proposed CDR. As depicted in Fig. 5, this leads to a simple methodology for designing a power-efficient GO CDR: the main limitations on oscillator jitter dictated by FTOL and JTOL are used to determine the general circuit specifications, such as power consumption. Then, taking the frequency of operation into account, the detailed circuit parameters such as the biasing conditions and the transistor sizes can be determined.
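As a rough numerical illustration of this step, the sketch below evaluates the ideal frequency-tolerance bound and uses (1) to turn a clock-jitter budget into a bound on $\kappa$. It assumes that the free-running interval is $\Delta T = n T_0$ for a run of *n* identical bits, and the 0.02 UI rms jitter budget is a hypothetical value chosen only for the example; neither is taken from the measured design.

```python
import math


def ideal_ftol(n_cid):
    """Ideal relative frequency tolerance: |f_ck - f_0| / f_0 < 1 / (2 n)."""
    return 1.0 / (2.0 * n_cid)


def accumulated_jitter(kappa, delta_t):
    """Equation (1): rms jitter accumulated over a free-running time delta_t."""
    return kappa * math.sqrt(delta_t)


def max_kappa(sigma_max, n_cid, bit_rate):
    """Largest kappa that keeps sigma_ck <= sigma_max after n_cid bit periods.

    Assumes delta_t = n_cid / bit_rate (an assumption of this sketch).
    """
    delta_t = n_cid / bit_rate
    return sigma_max / math.sqrt(delta_t)


bit_rate = 2.5e9                # data rate in bit/s
n_cid = 5                       # longest run with 8B10B coding
sigma_budget = 0.02 / bit_rate  # hypothetical budget: 0.02 UI rms, in seconds

print("ideal FTOL:", ideal_ftol(n_cid))
print("kappa bound [sqrt(s)]:", max_kappa(sigma_budget, n_cid, bit_rate))
```

The resulting bound on $\kappa$ is the quantity that would then be compared against the jitter-versus-bias-current characteristic of the delay cell.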

#### 2.3. Jitter Tolerance

Jitter tolerance (JTOL) is a measure of the capability of a CDR to tolerate input jitter. JTOL is usually tested by adding sinusoidal jitter (SJ) over a given frequency range to the data stream, which already includes the deterministic jitter (DJ) and random jitter (RJ) components added in the channel [17]. The maximum jitter amplitude at which the CDR still operates at a given BER, expressed as a function of jitter frequency, is called the jitter tolerance [5]. Simulation or analysis of JTOL for a non-linear system such as a GO CDR is very complex. A behavioral modeling approach can be applied to find the maximum acceptable sampling clock jitter. Then, according to Fig. 5, this requirement can be translated into circuit parameters.

Based on the approach previously shown by the authors, it is possible to calculate the JTOL of a GO CDR from the variations of the data period [15], [16]. As shown in Fig. 6(a), the sampling clock should remain within the eye opening of the received data. In the presence of sinusoidal jitter, and considering Fig. 6(a), it can be shown that the maximum tolerable input jitter amplitude is:

$$UI_{pp} = \omega_0 / (3\pi\omega_j). \tag{2}$$

in which $UI_{pp}$ is the maximum tolerable jitter amplitude (peak-to-peak), $\omega_0$ is the nominal data rate, and $\omega_j$ is the frequency of the sinusoidal jitter [15]. Ignoring the channel jitter, this expression gives a worst-case approximation for JTOL in a GO topology, since it assumes that the data period always has its lowest (or highest) possible value. It can be shown that in the more general case of *n* consecutive identical digits [as shown in Fig. 4(a)], the data edge must lie within the time interval $(2n-1) \cdot T_0/2 < T_{data} < (2n+1) \cdot T_0/2$, and JTOL can be approximated by:

$$UI_{pp} \approx \frac{1}{(2n+1)\pi} \cdot \frac{\omega_0}{\omega_j} \tag{3}$$
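A short numerical sketch of (3) is given here to make the dependence on the run length explicit. The function name and the example frequencies are illustrative assumptions; the ratio $\omega_0/\omega_j$ is evaluated as the ratio of the data rate to the jitter frequency.

```python
import math


def jtol_data_period(f0, fj, n_cid=1):
    """Equation (3): UI_pp ~ (1 / ((2 n + 1) * pi)) * (w0 / wj).

    f0 is the nominal data rate and fj the sinusoidal jitter frequency,
    both in Hz, so that w0 / wj = f0 / fj.
    """
    return (f0 / fj) / ((2 * n_cid + 1) * math.pi)


f0 = 2.5e9  # nominal data rate
for fj in (1e6, 1e7, 1e8):  # example jitter frequencies
    row = [jtol_data_period(f0, fj, n) for n in (1, 3, 5)]
    print(fj, row)
```

For $n = 1$ the expression reduces to (2), and the tolerable amplitude falls both with increasing jitter frequency and with increasing run length.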

It is also possible to use the jitter transfer (JTRAN) function of a CDR to calculate approximately the JTOL [9]. Based on this approach, the condition to avoid incorrect sampling [5] is:

$$|\phi_{out} - \phi_{in}| \le 0.5 |\phi_{in}| \tag{4}$$

or approximately:

$$JTOL(s) \le 0.5/[1 - JTRAN(s)] \tag{5}$$

In a GO CDR, JTRAN can be approximated by a delay of $T_0/2 = T_{osc}/2$, where $T_{osc}$ is the period of the oscillator:

$$JTRAN(s) \approx e^{-T_0 s/2} \tag{6}$$

where  $T_0 = 2\pi/\omega_0$  is the nominal data period. Therefore,

$$|JTOL(j\omega_j)| = \frac{0.5}{|1 - e^{-j\omega_j T_0/2}|} = \frac{1}{4\,|\sin(\omega_j T_0/4)|} \tag{7}$$

This expression is acceptable as long as JTRAN can be approximated by (6).
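The JTRAN-based estimate can also be evaluated numerically from its complex-exponential form, which avoids any closed-form simplification. The sketch below is only an illustration under the pure-delay approximation in (6); the function name and example frequencies are assumptions, and the expression is undefined at zero jitter frequency.

```python
import cmath
import math


def jtol_from_jtran(f0, fj):
    """Evaluate 0.5 / |1 - JTRAN(j*wj)| with JTRAN(s) = exp(-T0 * s / 2).

    f0 is the nominal data rate and fj the jitter frequency in Hz (fj > 0).
    """
    t0 = 1.0 / f0                # nominal data period
    wj = 2.0 * math.pi * fj      # angular jitter frequency
    jtran = cmath.exp(-1j * wj * t0 / 2.0)
    return 0.5 / abs(1.0 - jtran)


f0 = 2.5e9
for fj in (1e6, 1e7, 1e8, 1e9):
    print(fj, jtol_from_jtran(f0, fj))
```

At low jitter frequencies the result falls roughly in inverse proportion to the jitter frequency, which is the same trend as the data-period-based estimate.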

Fig. 6(b) compares the JTOL calculated in (3) (based on data period variation) and in (7) (based on JTRAN) with the JTOL mask [21]. As can be seen in this figure, as long as channel jitter is negligible, there is good agreement between (3) and the behavioral modeling results.

To have a more practical estimation for JTOL, the channel jitter must be also included in calculations [17][20]. Channel jitter generally includes both types of random (RJ) and deterministic jitter (DJ) with Gaussian and uniform distribution, respectively [17], [22]. If there is no jitter on sampling clock, then BER can be calculated as [15]:

$$BER = \int_{-\infty}^{T_0/2} P_d(\tau) \, d\tau + \int_{3T_0/2}^{+\infty} P_d(\tau) \, d\tau \tag{8}$$

in which  $P_d(\cdot)$  indicates the probability of data transition in specified time. Assuming the impulse model for DJ instead of uniform distribution [23], jitter tolerance can be expressed by:

$$UI_{pp(\min)} < \frac{\eta}{\pi} \cdot \frac{\omega_0}{\omega_j} \tag{9}$$

in which  $\eta$  depends on the specifications of different types of jitter as:

$$\eta = 1 - \left| \frac{1}{(2n \pm 1)/2 - 2T_{pp}/T_0 - \sqrt{2\lambda}\sigma_{RJ}/T_0} \right| \tag{10}$$

In Fig. 6(b), the JTOL estimated by (9) is compared to the behavioral modeling results, and the two are in very good agreement. This figure shows the simulation results for different CID values. As expected, at low jitter frequencies JTOL is reduced by increasing the number of CIDs. However, at higher jitter frequencies, JTOL is reduced by reducing the number of CIDs. The reason is that at high jitter frequencies, when there is a long sequence of consecutive identical bits, the jitter effect is diminished before the next sampling clock edge arrives. As can be seen in Fig. 6(b), due to the high bandwidth of an ideal GO CDR, this topology shows a very good JTOL performance, beyond the minimum requirements.
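To give a feel for the BER integral in (8), the sketch below evaluates it in closed form for one special case: a purely Gaussian distribution of the data-edge time centered at $T_0$, with no deterministic or sinusoidal jitter. This Gaussian-only case is an assumption made for illustration and is simpler than the jitter model behind (9) and (10); the function name is likewise illustrative.

```python
import math


def ber_gaussian_rj(sigma_ui):
    """Evaluate (8) for a Gaussian edge-time density centered at T0.

    sigma_ui is the rms random jitter in unit intervals. The two integration
    limits, T0/2 and 3*T0/2, lie 0.5 UI on either side of the mean, so the
    two tails are equal and sum to erfc(0.5 / (sqrt(2) * sigma)).
    """
    return math.erfc(0.5 / (math.sqrt(2.0) * sigma_ui))


for sigma in (0.03, 0.05, 0.07, 0.10):
    print(sigma, ber_gaussian_rj(sigma))
```

The steep dependence of the result on the rms jitter is the reason a small increase in random jitter consumes a large part of the tolerance budget.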

#### 2.4. Frequency of Operation

The next important parameter that imposes a lower limit on power consumption is the frequency of operation. The heart of a GO CDR is a ring oscillator whose frequency of operation is directly proportional to its power dissipation. As the frequency of operation in this work is very high, an SCL (source-coupled logic) topology is a good choice for implementing the delay cells in the proposed ring oscillator (Fig. 7). In this case, the oscillation frequency is [5]:

$$f_{osc} = \frac{1}{2\pi} \cdot \frac{\alpha}{2N \cdot \tau_L} \tag{11}$$

in which, *N* is the number of delay cells applied in the ring oscillator,  $\tau_L$  is the time constant at the output of the SCL delay cell, and  $\alpha$  is used to take into account the nonlinearity effects. The time constant at the output node of an SCL gate is proportional to the load specifications as:

$$\tau_L = R_L C_L = \frac{V_{sw}}{I_{SS}} \cdot C_L \tag{12}$$

Here, $V_{sw}$ is the voltage swing at the output of each gate and $I_{SS}$ is the tail bias current. As shown in Fig. 7, the voltage swing at the output of the SCL gate can be controlled by a replica bias circuit [9]. From (11) and (12), it follows directly that the oscillation frequency is proportional to $I_{SS}$. Figure 8 shows the normalized oscillation frequency versus the tail bias current $I_{SS}$.
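The relation between bias current and oscillation frequency in (11) and (12) can be sketched numerically as follows. The four-stage ring and the 200 µA tail current follow the text and Fig. 7, whereas the voltage swing, load capacitance, and $\alpha = 1$ are hypothetical values chosen only for illustration.

```python
import math


def load_time_constant(v_sw, i_ss, c_l):
    """Equation (12): tau_L = R_L * C_L = (V_sw / I_SS) * C_L."""
    return (v_sw / i_ss) * c_l


def oscillation_frequency(n_stages, v_sw, i_ss, c_l, alpha=1.0):
    """Equation (11): f_osc = (1 / (2*pi)) * alpha / (2 * N * tau_L)."""
    tau_l = load_time_constant(v_sw, i_ss, c_l)
    return alpha / (2.0 * math.pi * 2.0 * n_stages * tau_l)


N_STAGES = 4   # four-stage ring oscillator
V_SW = 0.4     # hypothetical output swing in volts
C_L = 4e-15    # hypothetical load capacitance in farads

for i_ss in (100e-6, 200e-6, 400e-6):
    f_osc = oscillation_frequency(N_STAGES, V_SW, i_ss, C_L)
    print(i_ss, f_osc)
```

With the swing held constant by the replica bias, doubling the tail current halves the time constant and doubles the frequency in this first-order model.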

In the next section, the criteria explored in this section are used to design a power-efficient GO-based CDR.

#### **3 CIRCUIT IMPLEMENTATION BASED ON SYSTEM REQUIREMENTS**

As illustrated in Section 2, JTOL, FTOL, and the frequency of operation impose restrictions on the design of GO CDRs. To keep the jitter on the sampling clock below the acceptable level imposed by JTOL and FTOL, careful circuit design techniques are required. In the following, the techniques for implementing a power-efficient CDR circuit are explained.

#### 3.1. Ring Oscillator Design

Frequency stability and timing jitter are the two most important specifications of the oscillator in a GO topology. Timing jitter of ring oscillators, or its frequency-domain analog, *phase noise*, has been extensively studied in [18] and [19]. As indicated in Section 2, sampling clock jitter can be described by (1). This equation can also be used to obtain a good estimate of the tradeoff between jitter and power consumption in a differential ring oscillator. Figure 8 illustrates the achievable $\kappa$ value versus the tail bias current $I_{SS}$. Therefore, (1) helps determine the minimum power dissipation that still satisfies the system jitter requirements. The $\kappa$ value in this figure is estimated based on [18]. The tail current of the delay stages in the delay line and the ring oscillator can be chosen based on (1) and Fig. 8; for this work, $I_{SS}=200\,\mu$A has been chosen. The figure also shows, for comparison, the $\kappa$ values estimated using [18] and [19] for the proposed differential ring oscillator.

#### 3.2. Design of GO CDR

Based on the topology shown in Fig. 2(a), the proposed CDR circuit has been implemented in a standard 0.18 µm digital CMOS technology. A PLL with a high-order loop filter is used to suppress the ripples on the controlling signal and thus achieve very low jitter generation.

To achieve a good matching and balance, all the delay cells in delay line and the ring oscillator are built with identical SCL-based two-input multiplexer (MUX) gates optimized for this application (shown in Fig. 7) [10]. The minimum acceptable bias current for the delay cells has been chosen based on the maximum acceptable jitter on oscillator. This results in a low-power circuit while satisfying the system jitter requirements.

#### 3.3. Shared PLL

Figure 9(a) shows the block diagram of the proposed PLL. A third order loop filter has been applied to attenuate the ripples on the control signal. A transconductor  $(g_m)$  cell also converts the controlling voltage to current. Copies of this current will be applied to all CDRs to tune their oscillators on the desired frequency. In the proposed PLL, the parasitic pole introduced by the  $g_m$  cell and parasitic capacitors at the transconductor output can push the loop towards instability. To avoid this problem, it is possible to use this parasitic pole, *i.e.*,  $g_m/C_{parasitic}$  instead of  $1/(R_3C_3)$  for filtering purposes [10].

Figure 9(b) shows the transfer characteristic of the proposed transconductor. The  $g_m$  value of the proposed transconductor is low at low output currents and high at high output currents. This non-linear characteristic helps to achieve both a high current swing (to have a wide CCO tuning range) and also relatively constant CCO gain ( $K_{CCO}$ ) over process corners. In *slow* corners where  $K_{VCO}$  is low and higher control current is required to achieve the desired oscillation frequency, transconductance is high. For the same reason, transconductance must be low when the control current is low. So, regarding:

$$K_{CCO} = \frac{\partial f_{osc}}{\partial I_C} = \frac{\partial f_{osc}}{\partial V_C} \times \frac{\partial V_C}{\partial I_C} = g_m \cdot K_{VCO} \tag{13}$$

the CCO gain will remain almost insensitive to the process variation.

Figure 9(c) shows the circuit schematic of the proposed transconductor. In this circuit, the input voltage ($V_{in}$) is converted to a current by M1. When $V_{in}$ is close to $V_{SS}$, M1 is in the triode region and hence the circuit transconductance is low compared to the case in which M1 is in saturation. When $V_{in}$ approaches $V_{DD}$, M1 moves toward saturation and hence the transconductance increases rapidly. This explains the I-V characteristic in Fig. 9(b), where for $V_{in}$ close to $V_{SS}$ the output current is close to zero and then, as $V_{in}$ increases, the current approaches $I_B$. The switching point from the triode to the saturation region can be changed using $V_R$. The circuit diagrams of the frequency divider and the phase-frequency detector (PFD) are shown in Fig. 10.

#### **4 MEASUREMENT RESULTS**

The proposed multi-channel CDR has been implemented in a digital 0.18 µm CMOS technology. Figure 11 shows the mask layout of the proposed CDR. The delay line and ring oscillator are placed in the middle of the layout, while the biasing circuitry is placed on the two sides of the layout, close to the related circuits. Wherever possible, decoupling and bypass capacitors have been applied. As can be seen in Fig. 12, the measured free-running oscillation frequency of the CCO shows good matching to the post-layout simulation results. Based on this plot, the oscillation frequency shows a low sensitivity to supply voltage variation, thanks to an internal bias control circuit. The proposed bias circuit, illustrated in Fig. 7, keeps the voltage swing at the output of the SCL delay cells constant. Hence, based on (11) and (12), the oscillation frequency remains unchanged.

The eye diagram and bathtub curve in Fig. 13 show a good horizontal eye opening. The eye closure in the vertical direction is mainly due to the bandwidth limitation of the 50 $\Omega$ I/O buffers. Measured with a LeCroy SDA 6000 serial data analyzer, the effective *rms* jitter on the recovered data is 4.1 ps<sub>rms</sub>.

To estimate the frequency tolerance of the proposed CDR, the nominal frequency of the reference clock was changed until incorrect sampling occurred. The measured FTOL is $\pm 3.5\%$, which is slightly smaller than expected. Figure 14 shows how incorrect sampling can happen in the presence of frequency error. In this plot, the first bit after a long run of consecutive identical bits (here 5 bits) has been sampled incorrectly. Meanwhile, at the nominal sampling frequency no bit error was detected for a $2^{31}$-1 PRBS (pseudo random bit stream) input data.

The measured power consumption was 10.5 mW, with each channel operating at 2.5 Gbps. The power consumption could be reduced further by removing the test blocks and the extra buffers and biasing circuits that were used in each channel in this first implementation. Table 1 compares this design with previous work. As can be seen in this table, the PLL-based [25] and phase-interpolator-based [24] CDRs have larger area and normalized power consumption than the GO CDRs [13]. The GO CDR reported in [26] shows a high normalized power dissipation, mainly due to the flexible structure applied to operate at 1/5 of the data rate. As can be seen, the CDR reported in this work shows the lowest normalized power dissipation.
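The normalized power figure used in this comparison is simply the power divided by the data rate. The short sketch below reproduces that arithmetic for this work and checks it against the normalized values listed for the referenced designs in the comparison table; the function and variable names are illustrative.

```python
def normalized_power(power_mw, data_rate_gbps):
    """Normalized power dissipation in mW/Gbps."""
    return power_mw / data_rate_gbps


# Normalized power dissipation of the referenced designs, as listed in the
# comparison table [mW/Gbps].
reported = {"[24]": 22.0, "[25]": 7.2, "[26]": 18.0, "[13]": 5.0}

# This work: 10.5 mW at 2.5 Gbps.
this_work = normalized_power(10.5, 2.5)
print("this work:", this_work, "mW/Gbps")
print("lowest in comparison:", this_work < min(reported.values()))
```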

#### **5 CONCLUSION**

In this article, a structural methodology for implementing low-power GO CDRs has been presented. Based on the proposed approach, the power consumption of the circuit can be reduced as long as the main system requirements, such as speed and jitter performance, are satisfied. By properly choosing the biasing condition, it is also possible to control the sensitivity of the proposed topology to frequency offset. Implemented in a digital 0.18 $\mu$m CMOS technology, the proposed gated-oscillator-based CDR dissipates 10.5 mW and occupies 0.045 mm<sup>2</sup>.

## ACKNOWLEDGEMENT

A. Tajalli acknowledges the support of MERDCI during the initial stage of this work.

#### REFERENCES

- [1] H. Takauchi, H. Tamura, S. Matsubara, M. Kibune, Y. Doi, T. Chiba, H. Anbutsu, H. Yamaguchi, T. Mori, M. Takatsu, K. Gotoh, T. Sakai, T. Yamamura. A CMOS Multichannel 10-Gb/s Transceiver. IEEE J. of Solid-State Circuits, vol. 38, pp. 2094-2100 (2003).
- [2] Y. Moon, Y.-S. Park, N. Kim, G. Ahn, H. J. Shin, D.-K. Jeong. A Quad 0.6/3.2 Gb/s/Channel Interference-Free CMOS Transceiver for Backplane Serial Link. IEEE J. of Solid-State Circuits, vol. 39, pp. 795-803 (2004).
- [3] J. Kim, J. Yang, S. Byun, H. Jun, J. Park, C. S. G. Conroy, B. Kim. A Four-Channel 3.125-Gb/s/ch CMOS Serial-Link Transceiver with a Mixed-Mode Adaptive Equalizer. IEEE J. of Solid-State Circuits, vol. 40, pp. 462-471 (2005).
- [4] Y. Miki, T. Saito, H. Yamashita, F. Yuki, T. Baba, A. Koyama, M. Sonehara. A 50-mW/ch 2.5-Gb/s/ch Data Recovery Circuit for the SFI-5 Interface with Digital Eye-Tracking. IEEE J. of Solid-State Circuits, vol. 39, pp. 613-621 (2004).
- [5] B. Razavi. Design of Integrated Circuits for Optical Communications. McGraw-Hill (2003).
- [6] M. K. Emsley, O. Dosunmu, M. S. Unlü, P. Muller, and Y. Leblebici. Realization of High-Efficiency 10 GHz Bandwidth Silicon Photodetector Arrays for Fully Integrated Optical Data Communication Interfaces. Proceedings of European Solid-State Device Research Conference (ESSDERC), pp. 47-50, (2003) September, Estoril, Portugal.
- [7] P. Muller, Y. Leblebici, M. K. Emsley, and M. S. Unlü. A 4-Channel 2.5Gb/s/Channel 66dBOhm Inductorless Transimpedance Amplifier. Proceedings of European Solid-State Circuits Conference (ESSCIRC), pp. 491-494, (2004) September, Leuven, Belgium.
- [8] P. Muller and Y. Leblebici. Limiting amplifiers for next-generation multi-channel optical I/O interfaces in SoCs. Proceedings of SoC Conference, pp. 193-196, (2005) September.
- [9] A. Tajalli, P. Muller, M. Atarodi, and Y. Leblebici. A Low-Power, Multichannel Gated Oscillator-Based CDR for Short-Haul Applications. Proceedings Low Power Electronics and Design (ISLPED), pp. 107-110, (2005) August, San Diego, USA.
- [10] A. Tajalli, P. Muller, M. Atarodi, and Y. Leblebici. A Multichannel 3.5mW/Gbps/Channel Gated Oscillator Based CDR in a 0.18µm Digital CMOS Technology. Proceedings European Solid-State Circuits Conference (ESSCIRC), pp. 193-196, (2005) September, Grenoble, France.
- [11] M. Nogawa, K. Nishimura, S. Kimura, T. Yoshida, T. Kawamura, M. Togashi, K. Kumozaki, Y. Ohtomo. A 10Gb/s Burst-Mode CDR IC in 0.13µm CMOS. Proceedings of IEEE International Solid State Circuits Conference (ISSCC), (2005) February.
- [12] S. Kobayashi and M. Hashimoto. A Multirate Burst-Mode CDR Circuit with Bit-Rate Discrimination Function from 52 to 1244 Mb/s. IEEE Photonics Technology Letters, vol. 13, no. 11, pp. 1221-1223 (2001).
- [13] S. Kaeriyama and M. Mizuno. A 10Gb/s/Ch 50mW 120×120µm<sup>2</sup> Clock and Data Recovery Circuit. Proceedings IEEE International Solid State Circuits Conference (ISSCC), (2003) February.
- [14] M. Nakamura, N. Ishihara, Y. Akazawa. A 156 Mbps CMOS Clock Recovery Circuit for Burst-Mode Transmission. Proceedings IEEE Symposium on VLSI Circuits, Digest of Technical Papers, pp. 122-123, (1996).
- [15] A. Tajalli, P. Muller, M. Atarodi, and Y. Leblebici. Analysis and Modeling of Jitter and Frequency Tolerance in Gated Oscillator based CDRs. Proceedings IEEE International Symposium on Circuits and Systems (ISCAS), pp. 2109-2112, (2006) May, Greece.
- [16] A. Tajalli, P. Muller, M. Atarodi, and Y. Leblebici. Top-Down Design of a Low-Power Multi-Channel 2.5-Gbit/s/Channel Gated Oscillator Clock-Recovery Circuit. Proceedings Design, Automation and Test in Europe (DATE), pp. 258-263, (2005) March, Munich, Germany.
- [17] J. Kim and D.-K. Jeong. Multi-Gigabit-Rate Clock and Data Recovery Based on Blind Oversampling. IEEE Communications Magazine, pp. 68-74, (2003).
- [18] J. A. McNeill. Jitter in Ring Oscillators. IEEE J. of Solid-State Circuits, vol. 32, pp. 870-879 (1997).
- [19] A. Hajimiri, S. Limotyrakis, and T. H. Lee. Jitter and Phase Noise in Ring Oscillators. IEEE J. of Solid-State Circuits, vol. 34, pp. 790-804 (1999).
- [20] L. M. De Vito. A Versatile Clock Recovery Architecture and Monolithic Implementation. In Monolithic Phase-Locked Loops and Clock Recovery Circuits, Theory and Design, B. Razavi, Editor, New York: IEEE Press (1996).
- [21] InfiniBand Trade Association. InfiniBand Architecture Specification. Revision 1.0.a. June 19th (2001).
- [22] Maxim Integrated Products. Converting Between RMS and Peak-to-Peak Jitter at a Specified BER. Application Note HFAN-4.0.2 (2000).
- [23] N. Ou, T. Farahmand, A. Kuo, S. Tabatabaei, and A. Ivanov. Jitter Models for the Design and Test of Gbps-Speed Serial Interconnects. IEEE Design & Test of Computers, vol. 21, pp. 302-313 (2004).
- [24] R. Krienenkamp, U. Langmann, C. Zimmermann, T. Aoyama, H. Siedhoff. A 10-Gb/s CMOS Clock and Data Recovery Circuit with an Analog Phase Interpolator. IEEE J. of Solid-State Circuits, vol. 40, pp. 736-743 (2005).
- [25] J. Savoj, B. Razavi. A 10-Gb/s CMOS Clock and Data Recovery Circuit with a Half-Rate Linear Phase Detector. IEEE J. of Solid-State Circuits, vol. 36, pp. 761-767 (2001).
- [26] T. Iwata, T. Hirata, H. Sugimoto, H. Kimura, T. Yoshikawa. A 5Gbps CMOS Frequency Tolerant Multi Phase Clock Recovery Circuit. Proceedings IEEE Symposium on VLSI Circuits Digest of Technical Papers, pp. 82-83, (2002) June.

# **FIGURES AND TABLES**



Figure 1. (a) Conceptual picture of multiple optical links, (b) block diagram of an integrated optical receiver






Figure 2. (a) Proposed GO CDR topology, (b) timing of operation



Figure 3. Proposed 8-channel CDR topology which uses a shared-PLL for frequency tuning



Figure 4. (a) Incorrect sampling in presence of frequency offset and also jitter on sampling clock and received data, (b) simulated BER in different values of frequency error and jitter on sampling clock (input data specifications: RJ=0.015-UIrms, DJ=0.2-UIpp)



Figure 5. Proposed GO CDR top-down design methodology




Figure 6. (a) Data period in presence of SJ, (b) JTOL based on (3), (7), (9), and behavioral modeling in comparison to the JTOL mask [21] when channel jitter is negligible (RJ=0.01-UIrms, and without DJ)



Figure 7. (a) Delay cell and replica bias circuit, (b) four-stage ring oscillator applied in the GO CDR. In the delay cells, $V_B=0$ and $V_{sel}=1$ (except for the first delay cell, in which $V_{sel} = EDET$)



Figure 8. Jitter-power and frequency-power tradeoffs in a ring oscillator



Figure 9. (a) Block diagram of the proposed PLL, (b) the transfer characteristics of the transconductor used in PLL loop, (c) proposed non-linear transconductor



Figure 10. Building blocks used in the PLL: (a) frequency divider (divide-by-two) consisting of two SCL-based latches, (b) phase frequency detector (PFD)



Figure 11. Proposed CDR mask layout (250µm × 180µm)



Figure 12. Measured free-running tuning characteristics of the oscillator compared with simulation results



Figure 13. Eye diagram of the recovered output data and the bathtub curve at $f_{clk}$=2.5 GHz



Figure 14. Incorrect sampling due to frequency error for a 2<sup>5</sup>-1 PRBS input data stream

|           | Year | Technology | Supply [V] | Data Rate [Gbps] | Normalized Power Diss. [mW/Gbps] | Area [mm<sup>2</sup>] | CDR Type           |
|-----------|------|------------|------------|------------------|----------------------------------|-----------------------|--------------------|
| [24]      | 2005 | 0.11 µm    | 1.5        | 10               | 22                               | 0.35                  | Phase interpolator |
| [25]      | 2001 | 0.18 µm    | 2.5        | 10               | 7.2                              | 0.99                  | PLL                |
| [26]      | 2002 | 0.18 µm    | 1.8        | 5                | 18                               |                       | GO                 |
| [13]      | 2003 | 0.15 µm    | 1.5        | 10               | 5                                | 0.02                  | GO                 |
| This Work | 2007 | 0.18 µm    | 1.8        | 2.5              | 4.2                              | 0.05                  | GO                 |

Table 1. Comparison with previous work (all in CMOS technology)
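As a quick consistency check on the "This Work" row, the normalized power dissipation can be reproduced from the figures quoted in the abstract, assuming the column is simply total CDR power divided by data bit rate (the table does not define it explicitly; the symbols $P_{norm}$, $P_{CDR}$, and $R_b$ are introduced here for illustration only):

$$
P_{norm} = \frac{P_{CDR}}{R_b} = \frac{10.5\ \text{mW}}{2.5\ \text{Gbps}} = 4.2\ \text{mW/Gbps}
$$

Under the same assumption, the area entry of 0.05 mm<sup>2</sup> corresponds to the reported 0.045 mm<sup>2</sup> rounded to two decimal places.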

#### **BIOGRAPHIES**

**Armin Tajalli** received the B.S. and M.S. degrees (with honors) in electrical engineering from Sharif University of Technology, Tehran, and Tehran Polytechnic University in 1997 and 1999, respectively, and the Ph.D. degree (with honors) from Sharif University of Technology in 2006. From 1998 to 2004 he was with Emad Semicon as a senior analog design engineer. In 2006 he joined the Microelectronic Systems Laboratory (LSM) at the Swiss Federal Institute of Technology in Lausanne (EPFL), working on ultra-low power circuit design techniques. He received the Kharazmi award on Research and Development in 2000 and the Presidential award of the best Iranian researchers in 2003.

**Paul Muller** received his engineering degree in electrical engineering from the Ecole Polytechnique Fédérale de Lausanne (EPFL) in 1999 and the Dr. Sc. degree in 2006. From 1999 to 2002, he worked as an analog and mixed-signal design engineer at XEMICS SA, where he contributed to several sensing and data acquisition circuit designs. In 2002, he joined the Microelectronic Systems Laboratory at EPFL as a research assistant, working on the modeling and design of multi-channel gigabit receivers for short-distance optical communication interfaces. Since 2006, he has been with the wireless division of Marvell Semiconductor in Etoy, Switzerland, and Santa Clara, CA.

**Yusuf Leblebici** received the B.S. and M.S. degrees in electrical engineering from Istanbul Technical University in 1984 and 1986, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Illinois at Urbana-Champaign in 1990. From 1991 to 1993 he worked as Visiting Assistant Professor of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. From 1993 to 1998, he was on the faculty of Istanbul Technical University as Associate Professor of Electrical and Computer Engineering. Dr. Leblebici worked as Associate Professor of Electrical and Computer Engineering at Worcester Polytechnic Institute (WPI) in Massachusetts between 1998 and 2001, where he established and directed the VLSI Design Laboratory, and served as a project director at the New England Center for Analog and Mixed-Signal IC Design. From 2000 to 2001, he also took the responsibility of developing the microelectronics degree program at Sabanci University, as the Microelectronics Program Coordinator.

Since 2002, Dr. Leblebici has been a chair professor at the Swiss Federal Institute of Technology in Lausanne (EPFL) and director of the Microelectronic Systems Laboratory. His research interests include design of high-speed CMOS digital and mixed-signal integrated circuits, computer-aided design of VLSI systems, intelligent sensor interfaces, modeling and simulation of semiconductor devices, and VLSI reliability issues. Dr. Leblebici is the coauthor of two textbooks, *Hot-Carrier Reliability of MOS VLSI Circuits* (Kluwer Academic Publishers, 1993) and *CMOS Digital Integrated Circuits: Analysis and Design* (McGraw Hill, 1996, 1998, and 2002), as well as more than 150 scientific articles published in international journals and conferences. He was on the organizing and steering committees of several international conferences. He served as an Associate Editor of IEEE Transactions on Circuits and Systems II between 1998 and 2000, and as an Associate Editor of IEEE Transactions on VLSI between 2001 and 2003. Dr. Leblebici received the Young Scientist Award of the Turkish Scientific and Technological Research Council in 1995 and the Joseph Samuel Satin Distinguished Fellow Award of the Worcester Polytechnic Institute in 1999.