# AN ABSTRACT OF THE THESIS OF

<u>Chao Ma</u> for the degree of <u>Master of Science</u> in <u>Electrical and Computer Engineering</u> presented on <u>September 13, 2012</u>.

Title: <u>Energy-Efficient Clock Generation for Communication and Computing Systems Using Injection Locking.</u>

Abstract approved:

Patrick Y. Chiang

The design of high-performance, high-speed clock generation and distribution becomes challenging in terms of phase noise, jitter and power consumption, due to the fast development of communication and computing systems. Injection locking is a promising clocking technique since it can significantly improve the energy efficiency, suppress the phase noise of the ring oscillator, enable a fast startup and conveniently generate multiple time-interleaved phases.

A quasi-linear model of injection-locked ring oscillator (ILRO) is utilized to mathematically formulate the frequency and time domain characteristics of the system, as well as the phase noise shaping and jitter tracking behavior. The settling behavior of ILRO is also exploited and shows a strong dependence on the locking range and the initial phase difference of the injected and the resultant oscillation signals.

A forwarded-clock synchronization based on injection locking is designed for a 10 Gb/s photonic interconnect according to the specific features of optical links. A single clock recovery can be used for all the four channels, resulting in a large amount of power and area saving. The applications of sub-harmonic and super-harmonic injection locking in wireless communications for frequency multiplying and division are also discussed.

©Copyright by Chao Ma September 13, 2012 All Rights Reserved

# Energy-Efficient Clock Generation for Communication and Computing Systems Using Injection Locking

by Chao Ma

# A THESIS

submitted to

Oregon State University

in partial fulfillment of the requirements for the degree of

Master of Science

Presented September 13, 2012

Commencement June 2013

Master of Science thesis of Chao Ma presented on September 13, 2012

APPROVED:

Major Professor, representing Electrical and Computer Engineering

Director of the School of Electrical Engineering and Computer Science

Dean of the Graduate School

I understand that my thesis will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my thesis to any reader upon request.

Chao Ma, Author

# ACKNOWLEDGEMENTS

I really appreciate that there are always lots of people helping and supporting me in my life. Without them, it would be impossible for me to finish this thesis. I would like to express my sincere appreciation to my advisors, the VLSI research group members, friends, and family.

First, I would like to thank my major advisor, Prof. Patrick Y. Chiang, for the opportunity to study in his group. I am always inspired by his broad knowledge, great ideas, and boundless passion for academic research. What I learned goes beyond my specific major: foresight and a global vision, critical analysis, and how to collaborate with others. His advice will definitely influence my career and my whole life.

Secondly, I would like to extend my gratitude to Prof. Sam Palermo of Texas A&M University for his advice on joint projects. I would like to thank Prof. Gabor C. Temes and Prof. Huaping Liu for serving on my academic committee and for their valuable advice on my coursework and oral exam. I also want to thank Prof. Pavan K. Hanumolu, Prof. Karti Mayaram, and Prof. Un-Ku Moon for their great help in and after classes. I am grateful to Prof. Merrick C. Haller for taking the time to serve as graduate council representative.

I would like to thank the group members, Lingli Xia, KangMin Hu, Changhui Hu, Nariman Moezzi, Tao Jiang, Jacob Postman, Rui Bai, Jiao Cheng, Joe Crop, Robert Pawlwski, Ben Goska, Ryan Albright, Thomas Ruggeri and Sam House, for their help and collaboration. I also want to thank my friends, Fanghui Ren, Fang Yuan, Weiyang Li, Ce Cheng, and others who have shared wonderful times with me.

Finally, I would like to express my deepest gratitude to my family members for their love and support, especially my parents and boyfriend. No matter where I am, they are always with me.

# TABLE OF CONTENTS

- CHAPTER 1. INTRODUCTION
  - 1.1 Clocking in Communication and Computing Systems
    - 1.1.1 Clocking for Wireline Communications
    - 1.1.2 Clocking for Wireless Communications
  - 1.2 Design Considerations
    - 1.2.1 Jitter
    - 1.2.2 Phase Noise
    - 1.2.3 Deskewing and Multi-Phase Generation
    - 1.2.4 Power Consumption
    - 1.2.5 Motivation of Injection-Locked Ring Oscillator (ILRO)
  - 1.3 Thesis Organization
- CHAPTER 2. RING OSCILLATOR
  - 2.1 Introduction
  - 2.2 Type of Ring Oscillator
    - 2.2.1 Single-Ended Signal Ring Oscillator
    - 2.2.2 Differential Ring Oscillator
    - 2.2.3 Pseudo-Differential Ring Oscillator
  - 2.3 Phase Noise and Jitter of Ring Oscillator
  - 2.4 Design Considerations of Ring Oscillator
    - 2.4.1 Frequency
    - 2.4.2 Phase Noise
  - 2.5 Summary
- CHAPTER 3. INJECTION LOCKING
  - 3.1 Introduction
  - 3.2 Modeling of Injection Locking
  - 3.3 Phase Noise and Jitter of Injection-Locked Ring Oscillator
  - 3.3 Fast Wakeup
  - 3.4 Harmonic Injection Locking
    - 3.4.1 Implementation in Wireless Transmitter
    - 3.4.2 Implementation in Wireless Receiver
  - 3.5 Summary
- CHAPTER 4. PHOTONIC FORWARDED-CLOCK INTERCONNECT SYNCHRONIZATION USING INJECTION-LOCKED RING OSCILLATOR
  - 4.1 Introduction
  - 4.2 Photonic Interconnect Features
    - 4.2.1 Photodiode
    - 4.2.2 Optical Front-End
  - 4.3 Architecture of Optical Clock Receiver
    - 4.3.1 Current-Integrating Clock Receiver with Super-Harmonic Injection Locking
    - 4.3.2 Proposed Architecture
  - 4.4 Circuit Implementations
    - 4.4.1 TIA and Limiting Amplifier
    - 4.4.2 Single-to-Differential Converter
    - 4.4.3 ILRO
    - 4.4.4 Other Building Blocks for Testing
  - 4.5 Results
  - 4.6 Summary
- CHAPTER 5. CONCLUSION
- Bibliography

# LIST OF FIGURES

- Figure 1.1. NoC projections about number of cores and per-channel I/O bandwidth.
- Figure 1.2. NoC projections about (a) on-chip and (b) off-chip clock frequency.
- Figure 1.3. Clocking in a basic electrical link.
- Figure 1.4. Time-division multiplexing in a serial link.
- Figure 1.5. Clocking in a typical wireless transceiver.
- Figure 1.6. Illustration of the clock jitter.
- Figure 1.7. Illustration of the phase noise: (a) typical profile and (b) its effects on producing the adjacent channel interference.
- Figure 2.1. Simplified ring oscillator model.
- Figure 2.2. Unity-gain negative feedback system.
- Figure 2.3. Delay cell of the ring oscillator with single-ended signal structure.
- Figure 2.4. "Current-starved" delay cell of ring oscillator.
- Figure 2.5. Ring oscillator with differential structure and even number of stages.
- Figure 2.6. Delay cell of a differential ring oscillator with symmetric loads.
- Figure 2.7. Pseudo-differential ring oscillator with (a) odd stages and (b) even stages.
- Figure 2.8. Pseudo-differential delay cell with (a) cross-coupled load and (b) cross-coupled input stage.
- Figure 2.9. Delay modulation of pseudo-differential signals.
- Figure 2.10. Modeled single-sideband phase noise spectrum.
- Figure 2.11. Frequency tuning of a current-starved ring oscillator with supply voltage.
- Figure 2.12. Phase noise of a current-starved ring oscillator with different power consumption.
- Figure 2.13. Phase noise of a current-starved ring oscillator at different supply voltages.
- Figure 3.1. Model of first-harmonic injection locking.
- Figure 3.2. Phase noise of a ring oscillator with and without injection locking.
- Figure 3.3. Injection-locked ring oscillator.
- Figure 3.4. Settling behavior of ILRO with different locking ranges.
- Figure 3.5. Settling behavior of ILRO with different initial phase difference between the resultant oscillation signal and the injected signal.
- Figure 3.6. Model of super-harmonic injection locking.
- Figure 3.7. Schematic of the sub-harmonic injection-locked oscillator.
- Figure 3.8. Phase noise with and without injection locking at VDD=0.6V, 1V for ILFM.
- Figure 3.9. 5-stage ring oscillator based ILFD.
- Figure 4.1. A WDM photonic interconnect.
- Figure 4.2. One channel of an embedded-clock architecture.
- Figure 4.3. Equivalent electrical model of a reverse-biased photodiode.
- Figure 4.4. Current-integrating clock receiver front-end.
- Figure 4.5. TIA based clock receiver front-end.
- Figure 4.6. Current-integrating clock receiver with super-harmonic injection locking.
- Figure 4.7. TIA based clock receiver with first-harmonic injection locking.
- Figure 4.8. Single-to-differential conversion with CML buffer.
- Figure 4.9. Proposed single-to-differential conversion with transmission gate.
- Figure 4.10. Monte Carlo simulation of duty cycle caused by single-to-differential conversion: (a) 100 runs of transient simulations and (b) distributions of half-cycle of the output ($\mu = 201.2$ ps, $\sigma = 1.1$ ps).
- Figure 4.11. Schematic of ILRO.
- Figure 4.12. Layout of (a) the whole receiver bank and (b) the clock receiver.
- Figure 4.13. Frequency tuning of ILRO with bias current.
- Figure 4.14. Deskewing of ILRO by tuning free-running frequency.
- Figure 4.15. Phase spacings when 2.5 GHz clock is injected.
- Figure 4.16. Jitter performance of ILRO.
- Figure 4.17. Phase noise performance of ILRO: (a) single-sided band phase noise profile and (b) phase noise at frequency offset of 1 MHz for different injection frequencies.

# LIST OF TABLES

- Table 4.1 Power Breakdown

# Energy-Efficient Clock Generation for Communication and Computing Systems Using Injection Locking

# **CHAPTER 1. INTRODUCTION**

The continuous development of high-performance microprocessors and of wireline and wireless communications has enabled fast data transfer over different distances and through different transmission media. The clock signal, which is common to all of these systems, has been pushed into the multi-gigahertz range by this rapid progress. High-performance clocking therefore faces new challenges in jitter, phase noise, skew, and power consumption. Clock generation and distribution techniques other than the conventional PLL/DLL have drawn much attention recently, and the injection-locked ring oscillator is a promising candidate among them.

## **1.1 Clocking in Communication and Computing Systems**

Communication and computing systems are the foundations of modern information technology, and their speed has increased dramatically over the past few decades. The speed of Ethernet, the most widely used wireline communication standard, has increased from 3 Mb/s in 1973-1975 to the 10 Gb/s widely used today, and faster 40- and 100-GbE have been standardized by the IEEE 802.3 High Speed Study Group (HSSG) [1]. At the same time, wireless systems are expanding from the crowded sub- and low-gigahertz range to the multi-gigahertz range with the introduction of wireless local area networks (WLAN) and wireless personal area networks (WPAN). The speed of computing systems, advanced by fabrication and many-core technology, is also increasing at an astounding rate. According to the recent ITRS (International Technology Roadmap for Semiconductors) [2], the number of cores is projected to increase by 1.4x per year and the per-channel I/O bandwidth is projected to exceed 70 Gb/s in 2024, as shown in Fig. 1.1. Even so, a huge gap remains between the aggregate I/O bandwidth and the total NoC throughput, because packaging technology allows only a modest increase in the number of I/O channels (about 2x to 4x during this period) [2]. ITRS 2010 also predicts that the processor clock speed will increase by 1.25x per year and that the off-chip clock will increase faster, reaching 75 GHz in 2024 (Fig. 1.2). Because the clock frequency is also limited by the power budget, a highly energy-efficient design is in great demand.

Figure 1.1. NoC projections about number of cores and per-channel I/O bandwidth.



Figure 1.2. NoC projections about (a) on-chip and (b) off-chip clock frequency.

### **1.1.1 Clocking for Wireline Communications**

Electrical and optical interconnects are both commonly used for wireline communications, and many internal blocks and principles are identical for the two kinds of link. We therefore use an electrical link as an example of the clocking, as shown in Fig. 1.3. A transmitter (TX) relies on a TX clock to convert digital data into an electrical signal that travels through the channel, and a receiver (RX) relies on an RX clock to convert the incoming electrical signal back into digital data [3]. Depending on the clock generation and recovery scheme, links are usually classified as synchronous (common clock), mesochronous (forwarded clock), or plesiochronous (embedded clock).

The data rate of a serial link can be increased well beyond the clock frequency by using time-division multiplexing [4], as shown in Fig. 1.4. With *M* transmitter and receiver branches driven by *M* equally spaced clock phases, the data rate is *M* times the clock frequency.
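As a rough illustration of this time-division multiplexing, the following Python sketch computes the aggregate data rate and the ideal offsets of *M* equally spaced clock phases. The function name and the example values (four phases of a 2.5 GHz clock) are assumptions chosen for illustration only, and the model assumes one bit per branch per clock period.

```python
def interleaved_clocking(clock_hz: float, m_phases: int):
    """Return (aggregate data rate in b/s, phase offsets in seconds).

    Illustrative model: each of the M branches transfers one bit per
    clock period, and the M clock phases are ideally equally spaced.
    """
    if m_phases < 1 or clock_hz <= 0:
        raise ValueError("need m_phases >= 1 and clock_hz > 0")
    period = 1.0 / clock_hz
    data_rate = m_phases * clock_hz  # M branches, each running at the clock rate
    offsets = [k * period / m_phases for k in range(m_phases)]
    return data_rate, offsets


# Hypothetical example: four phases of a 2.5 GHz clock.
rate, offsets = interleaved_clocking(2.5e9, 4)
print(f"data rate = {rate / 1e9:.1f} Gb/s")
print("phase offsets (ps):", [round(t * 1e12, 1) for t in offsets])
```

In this idealized picture, any deviation of the offsets from their equally spaced positions directly reduces the timing margin of the corresponding branch.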



Figure 1.3. Clocking in a basic electrical link.



Figure 1.4. Time-division multiplexing in a serial link.

The timing of the data transmission must be controlled very precisely in order to minimize the bit error rate (BER) and achieve a very high data rate. The optimum sampling point of the data receiver is usually close to the middle of the bit period, so both the frequency and the phase stability of the clock signal must be maintained.

### 1.1.2 Clocking for Wireless Communications

In wireless communications, the clock signal is the RF carrier or LO signal that determines the RF transmission frequency. A typical wireless transceiver is shown in Fig. 1.5. The output of a local frequency synthesizer upconverts the low-frequency baseband signal to the radio frequency in the TX and downconverts the received signal from the radio frequency to a low frequency in the RX. The main concern is adjacent-channel interference caused by phase noise.



Figure 1.5. Clocking in a typical wireless transceiver.

## **1.2 Design Considerations**

In high-speed communication and computing systems, the tolerable timing error shrinks as the clock cycle time decreases. Clock timing errors are described by jitter in wireline communication and computing systems and by phase noise in wireless communication systems. Together with these, skew, power consumption, and multi-phase generation are the clocking design considerations that must be investigated.

### 1.2.1 Jitter

Jitter can be defined as "the short-term variation of a signal from its ideal position in time" [5], as shown in Fig. 1.6. Deterministic jitter, usually caused by duty-cycle distortion and data-dependent effects, is predictable, and its peak-to-peak value is bounded. Random jitter, which originates from device noise, is unbounded and can be modeled with a Gaussian distribution whose RMS value $\sigma_{RMS}$ is used to characterize the jitter performance. Jitter transfer, another important characteristic, is the relationship between the applied input jitter and the resulting output jitter as a function of frequency. The jitter tracking bandwidth is usually determined by the loop bandwidth of the clock recovery circuit.



Figure 1.6. Illustration of the clock jitter.
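The distinction between the RMS and peak-to-peak values can be made concrete with a small numerical sketch. The Python code below generates clock edges with purely Gaussian random jitter and estimates both quantities. The clock period, the 1 ps RMS value, the sample count, and the function name are all hypothetical choices, not values taken from this work.

```python
import random
import statistics


def jitter_stats(edge_times, period):
    """Return (rms, peak_to_peak) of the edge timing error.

    The error of edge k is measured against an ideal clock grid k * period.
    """
    errors = [t - k * period for k, t in enumerate(edge_times)]
    rms = statistics.pstdev(errors)       # RMS deviation about the mean error
    pk_pk = max(errors) - min(errors)     # peak-to-peak over this record only
    return rms, pk_pk


# Synthetic example (assumed numbers): 400 ps period, 1 ps RMS Gaussian jitter.
random.seed(0)
period = 400e-12
sigma = 1e-12
edges = [k * period + random.gauss(0.0, sigma) for k in range(10_000)]
rms, pk_pk = jitter_stats(edges, period)
print(f"RMS jitter ~ {rms * 1e12:.2f} ps, peak-to-peak ~ {pk_pk * 1e12:.2f} ps")
```

Because Gaussian jitter is unbounded, the peak-to-peak value obtained this way keeps growing as more edges are observed, which is why the RMS value is the preferred metric for random jitter.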

### 1.2.2 Phase Noise

The frequency domain representation of jitter is phase noise [6], which is usually characterized by the single-sideband noise spectral density (dBc/Hz):

$$L(\Delta f) = 10\log \frac{P_{sideband}(f_0 + \Delta f, 1Hz)}{P_{carrier}} \tag{1.1}$$

where $P_{sideband}(f_0 + \Delta f, 1Hz)$ is the single-sideband power at a frequency offset of $\Delta f$ from the carrier in a measurement bandwidth of 1 Hz, and $P_{carrier}$ is the carrier power. The typical phase noise profile is shown in Fig. 1.7(a).
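As a minimal sketch of how (1.1) is evaluated, the following Python function converts a sideband power and a carrier power into dBc/Hz. The function name and the example powers are assumptions for illustration; the logarithm is taken to base 10, as is usual for decibel quantities.

```python
import math


def phase_noise_dbc_hz(p_sideband_w: float, p_carrier_w: float) -> float:
    """Evaluate (1.1): sideband power in a 1 Hz bandwidth relative to the carrier."""
    if p_sideband_w <= 0 or p_carrier_w <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(p_sideband_w / p_carrier_w)


# Hypothetical numbers: 1 mW carrier and 1 pW of noise power in 1 Hz
# at the offset of interest, i.e. a ratio of 1e-9.
print(f"{phase_noise_dbc_hz(1e-12, 1e-3):.1f} dBc/Hz")
```

A power ratio of $10^{-9}$ corresponds to about $-90$ dBc/Hz, and every further factor of ten in the ratio changes the result by 10 dB.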

Phase noise degrades the spectral purity of the LO signal in wireless communication systems, which produces adjacent-channel interference as shown in Fig. 1.7(b). When the LO signal downconverts the desired RF signal, an adjacent undesired signal is also convolved with the LO spectrum and downconverted, so that its spectrum overlaps that of the desired signal.



Figure 1.7. Illustration of the phase noise: (a) typical profile and (b) its effects on producing the adjacent channel interference.

### 1.2.3 Deskewing and Multi-Phase Generation

As discussed in Section 1.1.1, the relative position of the sampling clock and the received data must be controlled to minimize the BER, and multiple clock phases must be generated to achieve time-division multiplexing. In a forwarded-clock receiver of a serial link, for example, the conventional approach to clock generation and phase deskewing uses a local delay- or phase-locked loop (DLL/PLL) [7] to generate multiple time-interleaved phases, followed by a phase rotator that interpolates the appropriate phase position to sample the received data at its center. However, this consumes significant power, because each link needs a local, phase-rotator-based PLL to generate and deskew the clock phases for data recovery; [8] shows that phase rotation alone accounts for almost half of the total receiver power.

### **1.2.4 Power Consumption**

Power consumption is one of the most critical requirements for future wireless body-area network (WBAN) sensors [9], [10], especially battery-powered implanted medical devices. In the transceivers of these devices, the frequency synthesizer consumes a large portion of the total radio power, especially when high frequency and low phase noise are required. In wireline communication and computing systems, the power limitation is also prominent: an energy-efficient implementation is required to comply with power budgets that have plateaued near 100 W due to thermal constraints [2]. Therefore, an energy-efficient clock generation and distribution scheme that maintains adequate performance with respect to the considerations discussed above is desired for both communication and computing systems.

### 1.2.5 Motivation of Injection-Locked Ring Oscillator (ILRO)

Among the frequency synthesis techniques that offer an alternative to the conventional PLL/DLL, injection locking, which synchronizes the frequency and phase of a free-running oscillator with a source, has attracted much attention recently [11], [12]. This technique can significantly suppress the phase noise of the ring oscillator, improve the energy efficiency, enable a fast startup, and conveniently generate multiple time-interleaved phases for time-division multiplexed applications. An ILRO consumes only 1.08 mW at 1.8 GHz in a forwarded-clock receiver [12], whereas phase rotation alone needs about 4 mW in a software-CDR-based receiver [8]. Its possible disadvantages are frequency stability inferior to that of a PLL and increased reference spurs. Most of the applications to which we apply the ILRO can either tolerate these disadvantages or eliminate them by adding specific calibration blocks.

## **1.3 Thesis Organization**

The remainder of the thesis is organized as follows.

Chapter 2 introduces the background of the ring-based voltage-controlled oscillator (VCO) and discusses design intuitions with some simulation results.

Chapter 3 presents a model of injection locking to understand this phenomenon and guide the circuit design. Issues such as fast wakeup, phase noise suppression, jitter tracking, and harmonic injection are also discussed, and the applications of the ILRO in wireless communication systems are briefly described.

In Chapter 4, a forwarded-clock synchronization based on first-harmonic injection locking is designed for a 10 Gb/s photonic interconnect according to its specific features. A TIA-based front-end is chosen to satisfy the sensitivity requirement. Significant power and area savings are achieved by the ring oscillator without degrading the phase noise and jitter performance.

Finally, Chapter 5 concludes the thesis and proposes some suggestions for future work.

# **CHAPTER 2. RING OSCILLATOR**

The ring oscillator has been a crucial building block in many communication systems because it can be fully integrated on chip. This chapter presents an overview of the basic concepts of the ring-based voltage-controlled oscillator (VCO) and then discusses design intuitions concerning frequency, power, phase noise, and jitter.

## **2.1 Introduction**

A ring oscillator can be modeled as a chain of delay cells in which the output of the last stage is fed back to the input of the chain (Fig. 2.1). The oscillation period can be viewed intuitively as the time a transition takes to propagate twice around the loop. In an *N*-stage ring oscillator, the oscillation frequency is therefore approximately:



Figure 2.1. Simplified ring oscillator model.

$$f_{osc} = \frac{1}{2NT_D} \tag{2.1}$$

where $T_D$ is the propagation delay of each delay cell, which is sensitive to variations in process, supply and control voltage, and temperature, to a degree that depends on the implementation of the delay cell.
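A quick numerical reading of (2.1) is sketched below in Python. The function names and the example values (five stages, 40 ps per stage) are hypothetical and serve only to show the orders of magnitude involved.

```python
def ring_osc_frequency(n_stages: int, t_d: float) -> float:
    """Evaluate (2.1): f_osc = 1 / (2 * N * T_D)."""
    if n_stages < 1 or t_d <= 0:
        raise ValueError("need n_stages >= 1 and t_d > 0")
    return 1.0 / (2.0 * n_stages * t_d)


def required_stage_delay(n_stages: int, f_osc: float) -> float:
    """Invert (2.1): per-stage delay needed to reach a target frequency."""
    if n_stages < 1 or f_osc <= 0:
        raise ValueError("need n_stages >= 1 and f_osc > 0")
    return 1.0 / (2.0 * n_stages * f_osc)


# Hypothetical example: 5 stages with 40 ps delay each.
f = ring_osc_frequency(5, 40e-12)
print(f"f_osc = {f / 1e9:.2f} GHz")
print(f"T_D for 10 GHz with 5 stages = {required_stage_delay(5, 10e9) * 1e12:.1f} ps")
```

The inverse relation shows why high-frequency rings favor few stages: for a fixed target frequency, every added stage shrinks the delay budget available to each cell.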

Figure 2.2. Unity-gain negative feedback system.

A feedback system satisfying the "Barkhausen Criteria" has the potential to oscillate. Consider the unity-gain negative feedback system shown in Fig. 2.2, where the closed-loop transfer function can be written as

$$\frac{V_{out}(j\omega)}{V_{in}(j\omega)} = \frac{H(j\omega)}{1 + H(j\omega)}. \tag{2.2}$$

If $H(j\omega_0) = -1$ at some frequency $\omega_0$, the closed-loop gain approaches infinity at $\omega = \omega_0$. Under this condition, the noise component at $\omega_0$ is amplified by the circuit, resulting in oscillation [13]. Thus, the necessary but not sufficient conditions for a negative-feedback circuit to oscillate are:

$$|H(j\omega_0)| \ge 1 \tag{2.3}$$


$$\angle H(j\omega_0) = 180^{\circ} \tag{2.4}$$

In practical implementations, the loop gain is usually chosen to be at least 2 to 3 in order to ensure oscillation in the presence of temperature and process variations [13].

A ring oscillator can be constructed simply from a chain of inverters that creates the needed phase shift. Assuming that each inverter contributes one dominant pole at its 3-dB frequency, the open-loop transfer function of an *N*-stage ring oscillator is:

$$H(j\omega) = \left(\frac{-H_0}{1 + j\omega/\omega_{3dB}}\right)^N \tag{2.5}$$

where $H_0$ and $\omega_{3dB}$ are the gain and the 3-dB pole frequency of an inverter stage. Applying the "Barkhausen Criteria" gives:

$$\left|H(j\omega_0)\right| = \left(\frac{H_0}{\sqrt{1 + (\omega_0/\omega_{3dB})^2}}\right)^N \ge 1 \Longrightarrow H_0 \ge \sqrt{1 + \tan^2\left(\frac{\pi}{N}\right)} \tag{2.6}$$

$$\Phi(j\omega_0) = -N \tan^{-1}(\omega_0 / \omega_{3dB}) = -\pi \Longrightarrow \omega_{3dB} = \frac{\omega_0}{\tan(\pi / N)} \tag{2.7}$$

Each inverter stage contributes a frequency-dependent phase shift of $\pi/N$, so that the total phase shift around the loop, including the inversion, is $2\pi$. For $N = 2$, (2.6) would require an infinite gain because $\tan(\pi/2)$ is unbounded, so ideally at least three cascaded inverter stages are needed to implement $H(j\omega)$ and form an oscillator. The implementation of an oscillator with an even number of stages is discussed in the following sections.
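The conditions (2.6) and (2.7) are easy to evaluate numerically. The Python sketch below tabulates the minimum stage gain and the ratio $\omega_0/\omega_{3dB}$ for a few stage counts under the same single-pole-per-stage assumption; the function names are illustrative only.

```python
import math


def min_stage_gain(n_stages: int) -> float:
    """Minimum per-stage gain H0 from (2.6) for an N-stage ring."""
    if n_stages < 3:
        raise ValueError("the single-pole model needs at least 3 stages")
    return math.sqrt(1.0 + math.tan(math.pi / n_stages) ** 2)


def osc_to_pole_ratio(n_stages: int) -> float:
    """Ratio w0 / w3dB from (2.7): each stage must contribute pi/N of phase."""
    if n_stages < 3:
        raise ValueError("the single-pole model needs at least 3 stages")
    return math.tan(math.pi / n_stages)


for n in (3, 4, 5):
    print(f"N={n}: H0 >= {min_stage_gain(n):.3f}, "
          f"w0/w3dB = {osc_to_pole_ratio(n):.3f}")
```

For $N = 3$ this gives $H_0 \ge 2$ and $\omega_0 = \sqrt{3}\,\omega_{3dB}$. The required gain per stage falls toward unity as $N$ grows, at the cost of a lower oscillation frequency for the same stage bandwidth.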


## 2.2 Type of Ring Oscillator

### 2.2.1 Single-Ended Signal Ring Oscillator

The simplest ring oscillator uses a single-ended signal structure, whose delay cell is shown in Fig. 2.3. The number of stages *N* should be odd and at least 3 for oscillation to build up. A practical 2-stage ring oscillator oscillates only if the stages have very high gain and special effort is made to shape the phase of the open-loop transfer function.



Figure 2.3. Delay cell of the ring oscillator with single-ended signal structure.

The propagation delay $T_D$ of each inverter cell must be derived in order to calculate the oscillation frequency according to (2.1). It can be calculated as:

$$T_D = C \int_{v_1}^{v_2} \frac{dv}{i} \tag{2.8}$$

where *i* is the current charging or discharging the load capacitor *C*, and $v_1$ and $v_2$ are the initial and final voltages of this capacitor. A full output swing is assumed, so the output voltage changes between 0 and $V_{DD}$. For simplicity, we assume that *i* is constant and equal to $I_{av}$, the average of the currents at the endpoints of the voltage transition. Since the propagation delay is defined as the time the output takes to reach half of the transition, $T_D$ can be expressed as:

$$T_D = \frac{CV_{DD}}{2I_{av}} \tag{2.9}$$

Therefore, the frequency of the ring oscillator can be tuned in several ways: 1) tuning the supply voltage $V_{DD}$; 2) tuning the capacitive load *C*; 3) varying the current available for charging and discharging the load capacitance, as in the "current-starved" structure shown in Fig. 2.4. Tuning the supply voltage and the load driving strength



Figure 2.4. "Current-starved" delay cell of ring oscillator.

usually provides a wide tuning range, which is necessary to cover PVT variations (PVT variation can cause more than 50% frequency variation), while varying the load capacitor does not, since the range of capacitance variation available in most varactor technologies is limited. Thus, capacitive tuning is usually used as a fine-tuning method for the oscillation frequency. Basically, a good tuning scheme should provide: 1) wide tuning range; 2) good linearity; 3) no jitter degradation [14].
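The tuning knobs in (2.9) can be illustrated with a short calculation. This is a sketch only: it assumes that (2.1) is the usual relation $f_0 = 1/(2NT_D)$, and the component values are hypothetical rather than taken from any design in this thesis.

```python
def stage_delay(c_load_f: float, v_dd: float, i_avg_a: float) -> float:
    """Propagation delay per stage from (2.9): T_D = C * V_DD / (2 * I_av)."""
    return c_load_f * v_dd / (2.0 * i_avg_a)

def ring_frequency(n_stages: int, c_load_f: float, v_dd: float, i_avg_a: float) -> float:
    """Oscillation frequency, assuming the usual relation f0 = 1 / (2 * N * T_D)."""
    return 1.0 / (2.0 * n_stages * stage_delay(c_load_f, v_dd, i_avg_a))

# Hypothetical values: 3 stages, 10 fF load, 1.0 V supply, 100 uA average current.
base = ring_frequency(3, 10e-15, 1.0, 100e-6)
starved = ring_frequency(3, 10e-15, 1.0, 50e-6)   # halve the available current
print(f"f0 = {base / 1e9:.2f} GHz, current-starved f0 = {starved / 1e9:.2f} GHz")
```

In this constant-current model the frequency is proportional to $I_{av}$. In a real inverter $I_{av}$ itself depends on $V_{DD}$, so supply tuning cannot be read off from (2.9) alone.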

The advantages of the single-ended signal ring oscillator are its good power efficiency, since the delay stage only draws current when there is a signal transition, and its full output swing. However, its disadvantage is also obvious: a single-ended stage is intrinsically susceptible to supply and substrate interference. Both amplitude coupling and delay modulation exist in this structure and cause variation in the oscillation waveform, which appears as jitter.

#### 2.2.2 Differential Ring Oscillator

To reject common-mode noise and supply/substrate noise, a differential structure can be used, as shown in Fig. 2.5. The delay cell usually uses a differential pair as the input and different types of load to obtain enough gain and frequency tuning. An example of a delay cell is shown in Fig. 2.6. It employs a load with I-V characteristics that are symmetric about the center of the voltage swing, thereby improving the power supply rejection ratio (PSRR) and the linearity of frequency tuning [15].

The number of stages is no longer limited to odd values, since wire inversion can be used



Figure 2.5. Ring oscillator with differential structure and even number of stages.



Figure 2.6. Delay cell of a differential ring oscillator with symmetric loads.

to meet the oscillation criterion. An even number of stages is useful in wireless communication systems, where both in-phase and quadrature signals are needed. Although the differential structure is immune to common-mode noise, the signal swing should be limited to a small value to keep all devices in the saturation region. It also consumes more power and has poor jitter performance since the transistors contribute noise all the time [14].

#### 2.2.3 Pseudo-Differential Ring Oscillator

The pseudo-differential ring oscillator combines the advantages of the above two types of ring oscillator, providing some degree of common-mode rejection while maintaining a large output swing. One example is shown in Fig. 2.7(a), in which small latches couple two single-ended ring oscillators, forcing differential operation. However, the oscillation frequency is usually lower than that of the single-ended ones, since the latches fight with the delay cells [14]. It can also be implemented with an even number of delay stages, as in Fig. 2.7(b), by inverting the wires. Another implementation of the pseudo-differential ring oscillator is shown in Fig. 2.8, with the same structure as the differential ring but without the tail current source. Both the input pair and the load of the delay cells can be configured as cross-coupled structures in order to obtain pseudo-differential operation [15].

Although the pseudo-differential ring oscillator provides some degree of common-mode rejection, there is no rejection of delay modulation due to supply voltage variation, as shown in Fig. 2.9 [14]. The same delay change $\Delta T_{\rm D}$ occurs for both signals under a supply voltage variation $\Delta V_{\rm DD}$.



Figure 2.7. Pseudo-differential ring oscillator with (a) odd stages and (b) even stages.



Figure 2.8. Pseudo-differential delay cell with (a) cross-coupled load and (b) cross-coupled input stage.



Figure 2.9. Delay modulation of pseudo-differential signals.

## 2.3 Phase Noise and Jitter of Ring Oscillator

The spectrum of an ideal oscillator should be an impulse function at the oscillation frequency, while in fact it exhibits phase noise "skirts" around the center frequency. Phase noise is the random fluctuation in the phase of a signal due to time-domain instabilities in an oscillator, which are caused by intrinsic thermal and flicker noise. It is usually expressed as the ratio of noise power density to carrier power (dBc), normalized to a 1-Hz bandwidth (dBc/Hz) at a specified offset frequency from the carrier [6].

Jitter is the time-domain uncertainty of the transition spacing of the oscillation, and it increases with the measurement interval $\Delta T$. The standard deviation of the jitter after $\Delta T$ seconds is given in [16] as:

$$\sigma_{\Delta T} = \kappa \sqrt{\Delta T} \tag{2.10}$$

where  $\kappa$  is a proportionality constant determined by circuit parameters.

There has been much work on modeling phase noise and jitter, some in the time domain [16], [17], and some in the frequency domain [18], [19]. One of the models for the phase noise of an oscillator is derived by Leeson [20], showing that the phase noise at a frequency offset of $\Delta \omega$ is:

$$L(\Delta\omega) = \frac{1}{4Q^2} \left(\frac{\omega_0}{\Delta\omega}\right)^2 \tag{2.11}$$

where  $\omega_0$  is the center frequency, Q is the quality factor of the oscillator. According to Razavi in [18] the quality factor Q can be defined as:

$$Q = \frac{\omega_0}{2} \frac{d\Phi}{d\omega}$$
(2.12)

where  $\Phi$  is the phase of the open-loop transfer function of the oscillator. With the expression given by (2.7), quality factor can be calculated as

$$Q = \frac{\omega_0}{2} \left| \frac{d\Phi}{d\omega} \right|_{\omega = \omega_0} = \frac{N}{4} \sin \frac{2\pi}{N}$$
(2.13)

where N is the number of delay stages.
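The closed form (2.13) can be cross-checked against the definition (2.12) by differentiating the loop phase numerically. The sketch below assumes the single-pole stage model of (2.7); the function names and the finite-difference step are illustrative choices.

```python
import math

def ring_q_closed_form(n_stages: int) -> float:
    """Q = (N/4) * sin(2*pi/N), equation (2.13)."""
    return n_stages / 4.0 * math.sin(2.0 * math.pi / n_stages)

def ring_q_numeric(n_stages: int, w0: float = 1.0, rel_step: float = 1e-6) -> float:
    """Q = (w0/2) * |dPhi/dw| at w0, with Phi(w) = -N * atan(w / w_3dB)
    and w_3dB = w0 / tan(pi/N) from (2.7). Uses a central finite difference."""
    w_3db = w0 / math.tan(math.pi / n_stages)

    def phase(w):
        return -n_stages * math.atan(w / w_3db)

    h = rel_step * w0
    slope = (phase(w0 + h) - phase(w0 - h)) / (2.0 * h)
    return 0.5 * w0 * abs(slope)

for n in (3, 4, 5, 8):
    print(n, round(ring_q_closed_form(n), 4), round(ring_q_numeric(n), 4))
```

Both columns agree, and Q stays of order one for any practical number of stages, far below that of an LC tank.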

Hajimiri in [21] develops a general model for phase noise and jitter which gives some design intuition. Based on the impulse sensitivity function (ISF) $\Gamma(x)$, which is a time-varying proportionality factor, the phase noise in the $1/f^2$ region (Fig. 2.10) is quantitatively predicted as:

$$L(\Delta f) = 10 \cdot \log\left(\frac{\Gamma_{rms}^2}{2 \cdot (2\pi\Delta f)^2} \cdot \frac{\overline{i_n^2} / \Delta f}{q_{max}^2}\right)$$
(2.14)

where $\Gamma_{rms}$ is the RMS value of the ISF, $\overline{i_n^2} / \Delta f$ is the single-sideband power spectral density of the noise current source, $q_{max}$ is the maximum charge of the node of interest and $\Delta f$ is the frequency offset from the carrier. Phase noise in the $1/f^3$ region is caused by the device 1/f noise upconverted by the DC value of the ISF, and phase noise in the $1/f^2$ region is due to upconverted thermal noise around the frequency of the carrier and its harmonics. The corner frequency between the $1/f^3$ and $1/f^2$ regions ( $f_{1/f^3}$ ) is related to the 1/f noise corner $f_{1/f}$ by the following expression:




Figure 2.10. Modeled single-sideband phase noise spectrum.

$$\Delta f_{1/f^3} = \Delta f_{1/f} \cdot \frac{\Gamma_{dc}^2}{\Gamma_{rms}^2}$$
(2.15)

where  $\Gamma_{dc}$  is the DC value of the ISF. Hajimiri also derives an equation for the proportionality constant  $\kappa$  to calculate jitter:

$$\kappa = \frac{\Gamma_{rms}}{q_{max}\omega_0} \sqrt{\frac{1}{2} \frac{\overline{i_n^2}}{\Delta f}}$$
(2.16)

Hajimiri's model is accurate for predicting phase noise and jitter, although the ISF is hard to calculate. However, the paper derives expressions for phase noise and jitter for different types of ring oscillator, which clarify the design tradeoffs. For the single-ended signal ring oscillator, the phase noise and jitter are expressed as:

$$L(\Delta f) = \frac{8}{3\eta} \cdot \frac{kT}{P} \cdot \frac{V_{DD}}{V_{char}} \cdot \frac{f_0^2}{\Delta f^2}$$
(2.17)

$$\kappa \approx \sqrt{\frac{8}{3\eta}} \cdot \sqrt{\frac{kT}{P} \cdot \frac{V_{DD}}{V_{char}}}$$
(2.18)

where $\eta$ is a proportionality constant (close to one), *P* is the power consumption of the ring oscillator and $V_{char}$ is the characteristic voltage of the device. It should be noted that the phase noise is inversely proportional to the power dissipation and increases quadratically with the oscillation frequency, while it does not depend on the number of stages or the load capacitor *C*. Thus, *C* is a free parameter for the designer, allowing the oscillation frequency to be tuned without affecting the phase noise and jitter performance.
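The scaling described by (2.17) and (2.18) can be made concrete with a small calculation. The following sketch assumes $\eta = 1$ and a temperature of 300 K; the function names, the power levels and in particular the value used for $V_{char}$ are hypothetical and serve only to show the trends.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def ring_phase_noise_dbc(p_w, v_dd, v_char, f0_hz, df_hz, eta=1.0, temp_k=300.0):
    """Single-ended ring phase noise from (2.17), returned in dBc/Hz."""
    ratio = ((8.0 / (3.0 * eta)) * (K_BOLTZMANN * temp_k / p_w)
             * (v_dd / v_char) * (f0_hz / df_hz) ** 2)
    return 10.0 * math.log10(ratio)

def ring_jitter_kappa(p_w, v_dd, v_char, eta=1.0, temp_k=300.0):
    """Jitter proportionality constant from (2.18), in sqrt(seconds)."""
    return math.sqrt((8.0 / (3.0 * eta)) * (K_BOLTZMANN * temp_k / p_w) * (v_dd / v_char))

# Hypothetical operating point: 1 GHz oscillator, 1 MHz offset, V_char assumed 0.5 V.
for power_mw in (2.0, 4.0, 8.0):
    pn = ring_phase_noise_dbc(power_mw * 1e-3, 1.0, 0.5, 1e9, 1e6)
    kappa = ring_jitter_kappa(power_mw * 1e-3, 1.0, 0.5)
    sigma = kappa * math.sqrt(1e-6)   # (2.10) with a 1 us measurement interval
    print(f"P={power_mw} mW: L={pn:.1f} dBc/Hz, jitter after 1 us = {sigma * 1e12:.3f} ps")
```

Each doubling of the power lowers the predicted phase noise by about 3 dB, and neither function takes the number of stages or the load capacitance as an argument.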

In contrast with the single-ended signal ring oscillator, the phase noise of the differential ring oscillator degrades as the number of stages increases for a given frequency and power consumption. Moreover, for an *N*-stage ring oscillator, the phase noise of the differential ring oscillator is approximately $N \cdot \{1 + V_{char} / (R_L \cdot I_{tail})\}$ times larger than that of a single-ended signal ring oscillator with the same values of *N*, *P* and *f*<sub>0</sub>. For a large number of stages *N*, the difference is even more significant, so differential ring oscillators with 3 to 5 stages are commonly used. The single-ended structure has fundamentally better phase noise and jitter performance since it intrinsically has a larger swing and lower average current, and needs no bias current, which would contribute additional noise.


# 2.4 Design Considerations of Ring Oscillator

# 2.4.1 Frequency

Equation (2.9) shows that the frequency of the ring oscillator can be tuned by varying the supply voltage $V_{DD}$ or the capacitive load *C*, or by using a current-starved structure. Fig. 2.11 shows the frequency versus supply voltage of a current-starved ring oscillator.



Figure 2.11. Frequency tuning of a current-starved ring oscillator with supply voltage.

When the current starving transistors are operating in saturation region, (2.9) can be modified as:

$$T_D = \frac{CV_{SW}}{2I_{av}} \tag{2.19}$$

where $V_{SW}$ ( $< V_{DD}$ ) is the voltage swing of the inverter stage. However, when the current-starving transistors operate in the triode region, which is usually the case at low supply voltage, the delay of each stage is determined by the resistance of the current-starving transistors rather than by the delay cell ( $T_D \propto (R_{triode} + R_{SW})C$ ). Therefore, techniques such as pre-distortion linearization are needed to linearize the tuning characteristic of the oscillation frequency.

A higher oscillation frequency can be achieved by increasing the bias current, so there is a tradeoff between power dissipation and frequency. A shorter transistor length leads to a higher frequency, while changing the device width does not help, since in theory the bias current and the capacitance change by the same factor.

#### 2.4.2 Phase Noise

Both systematic noise and random noise contribute to the phase noise of the ring oscillator. Systematic noise such as common-mode supply noise can be avoided by using symmetric architectures, while the influence of random noise such as thermal noise and flicker noise cannot be easily alleviated.

Phase noise performance can be improved by increasing the power consumption, as discussed before. It can be predicted from (2.17) and (2.18) that the phase noise and jitter of the ring oscillator decrease as the power consumption increases. Other ways of improving phase noise, such as increasing the device width and minimizing the channel length, also increase the power dissipation. The phase noise of a current-starved ring oscillator is simulated with different transistor widths, as shown in Fig. 2.12. The widths of the NMOS transistors are 3 $\mu$m and 30 $\mu$m in the two cases. When the transistor width is increased by a factor of 10, the power consumption increases from 2.1 mW to 12.4 mW, while the phase noise decreases from -85.13 dBc/Hz to -94.25 dBc/Hz at a frequency offset of 1 MHz.



Figure 2.12. Phase noise of a current-starved ring oscillator with different power consumption.

Dynamic power consumption is proportional to $V^2 f$, and thus is reduced quadratically as the supply voltage is decreased. However, problems such as constrained headroom, device variation, and leakage current become significant as the supply voltage approaches the near-threshold region. It has already been shown in the previous section that a low supply voltage introduces nonlinear frequency tuning. Phase noise degradation of the ring oscillator is another important factor that limits the potential supply scaling. As the supply voltage decreases, the intrinsic thermal noise (kT/C) becomes larger relative to the linearly reduced capacitor voltage swing, resulting in a degraded signal-to-noise ratio and therefore larger oscillator phase noise. Furthermore, the slower inverter rise/fall edge rates degrade the impulse sensitivity function (ISF), resulting in higher phase noise [21]. Fig. 2.13 shows the simulated phase noise of the ring oscillator running at different supply voltages, while maintaining the same oscillation frequency by changing the transistor size. The phase noise at a frequency offset of 1 MHz for the $V_{\rm DD}$ = 1.0 V ring oscillator is -85.13 dBc/Hz, while for $V_{\rm DD}$ = 600



Figure 2.13. Phase noise of a current-starved ring oscillator at different supply voltages.

mV it is -77.67 dBc/Hz. The integrated jitter also increases from 289.37 ps (RMS) to 459.28 ps (RMS) when the supply voltage is decreased from 1.0 V to 0.6 V.

# 2.5 Summary

Building on the basic concepts of the VCO, this chapter has compared three ways to implement a ring oscillator. The pseudo-differential structure is a compromise among them with respect to the tradeoff of power, voltage swing, phase noise and jitter. Design considerations such as frequency tuning, phase noise and jitter are discussed, and the related conclusions are verified by simulation results.

# **CHAPTER 3. INJECTION LOCKING**

Recently, injection locking has attracted much attention in the applications including clock recovery, frequency synthesis, phase lock and quadrature generation. In this chapter, a model of injection locking is presented to understand this phenomenon and guide the circuit design. The issues such as fast wakeup, phase noise suppression and jitter tracking, and harmonic injection are also discussed.

### **3.1 Introduction**

Injection locking is the phenomenon in which oscillating systems with close frequencies interact through environmental coupling, leading to changes in their phases and frequencies. It was first observed by Christiaan Huygens in the 17<sup>th</sup> century that the pendulums of two clocks hung on the same wall would eventually swing at exactly the same frequency and 180 degrees out of phase [22]. This phenomenon has been studied for years and has proved to be very useful in a number of applications such as clock recovery, frequency synthesis, synchronization, and so on. Its intrinsic properties of energy efficiency and phase noise reduction are very attractive, especially for low-power applications such as medical implants and wireline interconnects.

Basically, injection locking can be categorized into three groups: first-harmonic, super-harmonic, and sub-harmonic injection locking. These three types can be utilized for different applications based on the relationship between the injection and oscillation frequencies. In first-harmonic injection locking, the injection frequency is close to the fundamental oscillation frequency, while in super-harmonic and sub-harmonic injection locking, the injection frequency is a harmonic or sub-harmonic of the oscillation frequency. However, all three types can be explained by the general model of first-harmonic injection locking, so the behavior of first-harmonic injection locking is analyzed first and then extended to the other two types.

## **3.2 Modeling of Injection Locking**

In order to understand the phenomenon of injection locking, different kinds of models have been developed, in both the frequency domain and the time domain. The most intuitive model of injection locking is given by Adler [23] in the frequency domain, and it is usually accurate for harmonic oscillators. To analyze ring oscillators, transient waveform-based methods are needed, as shown in [24], [25]. However, these time-domain methods usually require a full circuit description and do not give an intuitive analytical expression.

In this section, the injection-locking model of the ring oscillator is built in the frequency domain, assuming quasi-linear operation of the circuit [26], in which the large-signal operation of each stage of a non-harmonic oscillator is similar to its small-signal ac behavior. Then, based on the linear open-loop transfer function $H(j\omega)$ of the ring oscillator derived in Section 2.1, the model of first-harmonic injection locking is shown in Fig. 3.1 [26], [27]. The injected signal $S_{inj} \cos(\omega_{inj}t + \alpha_{inj})$ is modeled as an additive input, and the output $S_{osc} \cos(\omega_{inj}t + \varphi_{osc})$ has a carrier frequency of $\omega_{inj}$ (rather than $\omega_0$).



Figure 3.1. Model of first-harmonic injection locking.

The signal $S_x \cos(\omega_{inj}t + \psi)$, which is the input to the open-loop transfer function $H(j\omega)$ of the ring oscillator, is calculated as:

$$S_x \cos(\omega_{inj}t + \psi) = S_{inj} \cos(\omega_{inj}t + \alpha_{inj}) + S_{osc} \cos(\omega_{inj}t + \varphi_{osc})$$
(3.1)

$$= (S_{inj} + S_{osc} \cos \theta) \cos(\omega_{inj}t + \alpha_{inj}) - S_{osc} \sin \theta \sin(\omega_{inj}t + \alpha_{inj})$$
(3.2)

where  $\theta = \varphi_{osc} - \alpha_{inj}$ . So the phase  $\psi$  can be calculated as:

$$\tan(\psi - \alpha_{inj}) = \frac{S_{osc} \sin \theta}{S_{inj} + S_{osc} \cos \theta}$$
(3.3)

Assuming weak injection, that is, $S_{\rm inj} \ll S_{\rm osc}$,

$$\frac{d\psi}{dt} \approx \frac{d\theta}{dt} \tag{3.4}$$
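The phasor addition behind (3.1) to (3.3), and the weak-injection limit used for (3.4), can be checked numerically. This is a small sketch with illustrative amplitudes and phases; the function names are not from the thesis.

```python
import cmath
import math

def summed_phase_offset(s_inj: float, s_osc: float, theta: float) -> float:
    """psi - alpha_inj from (3.3), using atan2 to keep the correct quadrant."""
    return math.atan2(s_osc * math.sin(theta), s_inj + s_osc * math.cos(theta))

def summed_phase_offset_phasor(s_inj: float, s_osc: float, theta: float) -> float:
    """Same quantity obtained by adding the two phasors directly, as in (3.1)."""
    return cmath.phase(s_inj + s_osc * cmath.exp(1j * theta))

theta = math.radians(40.0)          # illustrative phase difference
for ratio in (0.5, 0.1, 0.01):      # S_inj / S_osc, illustrative values
    offset = summed_phase_offset(ratio, 1.0, theta)
    print(f"S_inj/S_osc={ratio}: psi - alpha_inj = {math.degrees(offset):.2f} deg")
```

As the injection ratio shrinks, $\psi - \alpha_{inj}$ approaches $\theta$, so $\psi$ and $\theta$ differ only by a nearly constant offset, which is what (3.4) relies on.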

Upon travelling through  $H(j\omega)$ ,  $S_x \cos(\omega_{inj}t + \psi)$  experiences a phase shift given by:

$$S_{osc}\cos(\omega_{inj}t+\varphi_{osc})\approx S_x\cos\left(\omega_{inj}t+\psi+\tan^{-1}(\frac{2Q}{\omega_0}(\omega_0-\omega_{inj}-\frac{d\psi}{dt}))\right)$$
(3.5)

where Q is the quality factor of ring oscillator. The phases of the right and left expressions should be the same, thus

$$\varphi_{osc} = \psi + \tan^{-1}\left(\frac{2Q}{\omega_0}\left(\omega_0 - \omega_{inj} - \frac{d\psi}{dt}\right)\right)$$
(3.6)

Combining (3.4) and

$$\tan \Phi = \tan(\varphi_{osc} - \psi) = \tan(\theta - (\psi - \alpha_{inj})) = \frac{S_{inj} \sin \theta}{S_{osc} + S_{inj} \cos \theta} \approx \frac{S_{inj}}{S_{osc}} \sin \theta \qquad (3.7)$$

( $\Phi$  is the phase shift of the open-loop transfer function), (3.6) can be reformed as:

$$\frac{d\theta}{dt} = \omega_0 - \omega_{inj} - \frac{\omega_0}{2Q} \frac{S_{inj}}{S_{osc}} \sin \theta = \omega_0 - \omega_{inj} - \omega_L \sin \theta$$
(3.8)

The phase difference $\theta$ between the resultant oscillation signal and the injected input signal can therefore be adjusted (deskewed) by changing the free-running oscillation frequency $\omega_0$. Equation (3.8) is the same result as Adler's equation, where the single-sided locking range is:


$$\omega_L = \frac{\omega_0}{2Q} \frac{S_{inj}}{S_{osc}}$$
(3.9)

Clearly, a larger injection strength and a smaller quality factor of the ring oscillator result in a larger locking range. The quality factor Q of the ring oscillator can be expressed as:

$$Q = \frac{\omega_0}{2} \left| \frac{d\Phi}{d\omega} \right|_{\omega = \omega_0}$$
(3.10)

Based on the expression of  $\Phi$  derived in Section 2.1 for *N*-stage ring oscillator, quality factor is:

$$Q = \frac{N}{4} \sin \frac{2\pi}{N} \tag{3.11}$$

Thus the locking range  $\omega_{\rm L}$  can be expressed as:

$$\omega_{L} = \frac{2\omega_{0}}{N\sin(2\pi/N)} \frac{S_{inj}}{S_{osc}}$$
(3.12)

The locking range decreases as the number of stages *N* of the ring oscillator increases, because the factor $N\sin(2\pi/N)$ in the denominator of (3.12) grows with *N* (approaching $2\pi$ for large *N*). This can be explained as follows: as *N* increases, the slope of the phase transfer function of the ring oscillator becomes steeper, which reduces the locking range. Therefore, a large injection strength and a minimum number of stages are preferred to achieve a wide locking range.
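The dependence of (3.12) on the number of stages can be tabulated directly. The sketch below uses a 3 GHz free-running frequency and an injection ratio of 1/4, the values quoted for the simulated 4-stage oscillator later in this chapter; the function name is illustrative.

```python
import math

def locking_range(f0_hz: float, n_stages: int, injection_ratio: float) -> float:
    """Single-sided locking range from (3.12), in the same units as f0_hz."""
    return 2.0 * f0_hz * injection_ratio / (n_stages * math.sin(2.0 * math.pi / n_stages))

for n in (3, 4, 5, 8):
    f_l = locking_range(3e9, n, 0.25)
    print(f"N={n}: locking range = {f_l / 1e6:.0f} MHz")
```

For $N = 4$ this evaluates to 375 MHz, and the range shrinks steadily as stages are added.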

Based on (3.7), the total phase shift in an oscillation loop is changed by the injection locking and can be calculated in another way as follows [28]:

$$\Phi(j\omega_{inj}) = -N\tan^{-1}(\frac{\omega_{inj}}{\omega_o}\tan\frac{\pi}{N}) = -N\tan^{-1}\left((1+\frac{\Delta\omega}{\omega_o})\tan\frac{\pi}{N}\right)$$
(3.13)

where  $\Delta \omega$  is the frequency difference between the injection frequency and the oscillation frequency, and it is assumed that  $\Delta \omega \ll \omega_0$ , then by using first-order Taylor approximation, equation (3.13) can be written as:

$$\Phi(j\omega_{inj}) = -N\left(\tan^{-1}(\tan\frac{\pi}{N}) + \frac{\Delta\omega}{\omega_o}\tan\frac{\pi}{N}\frac{1}{1 + \tan^2\frac{\pi}{N}}\right) = -\pi - \frac{\Delta\omega}{\omega_o}\frac{N}{2}\sin(\frac{2\pi}{N}) \tag{3.14}$$

For a given positive $\Delta\omega$, the phase shift of the open-loop transfer function $\Phi$ decreases further below $-\pi$ as the number of inverter stages *N* increases.
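The accuracy of the first-order expansion (3.14) relative to the exact phase (3.13) can be inspected with a few lines of code. This is only a sketch; the relative offsets are illustrative and the function names are not from the thesis.

```python
import math

def loop_phase_exact(n_stages: int, rel_offset: float) -> float:
    """Equation (3.13); rel_offset is delta_omega / omega_0."""
    return -n_stages * math.atan((1.0 + rel_offset) * math.tan(math.pi / n_stages))

def loop_phase_linear(n_stages: int, rel_offset: float) -> float:
    """Equation (3.14), the first-order expansion around omega_0."""
    return -math.pi - rel_offset * (n_stages / 2.0) * math.sin(2.0 * math.pi / n_stages)

for rel in (0.01, 0.05, 0.2):
    exact = loop_phase_exact(4, rel)
    approx = loop_phase_linear(4, rel)
    print(f"delta_w/w0={rel}: exact={exact:.4f} rad, linear={approx:.4f} rad")
```

The two expressions agree closely for small offsets and drift apart as $\Delta\omega/\omega_0$ grows, which is consistent with the assumption $\Delta\omega \ll \omega_0$.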

# 3.3 Phase Noise and Jitter of Injection-Locked Ring Oscillator

The phase noise of an injection-locked ring oscillator (ILRO) mainly comes from two sources: the phase noise of the free-running oscillator ( $L_{osc}$ ) and the phase noise of the injected signal ( $L_{inj}$ ). The final output phase noise can be calculated by adding two uncorrelated random processes to the noiseless system of the previous section.

Equation (3.14) can be rewritten by substituting the expression for *Q*:

$$\Phi(j\omega_{inj}) = -\pi - \frac{\Delta\omega}{\omega_o} 2Q \tag{3.15}$$

From (3.7), the total phase shift in an oscillation loop  $\Phi$  can also be expressed as:


$$\Phi(j\omega_{inj}) \approx \tan^{-1}(\frac{S_{inj}}{S_{osc}}\sin\theta)$$
(3.16)

So, combining (3.15) and (3.16), we have

$$-\frac{2Q}{\omega_o}(\omega_{inj}-\omega_0) \approx \tan^{-1}\left(\frac{S_{inj}}{S_{osc}}\sin(\varphi_{osc}-\alpha_{inj})\right)$$
(3.17)

The noise of the free-running oscillator is added to (3.17) as the phase perturbation $\Delta \Phi$, the noise of the injected signal is represented by the added phase perturbation $\Delta \alpha_{inj}$, and the resultant output phase is $\varphi_{osc}+\Delta\varphi_{osc}$ [28]. Equation (3.17) is then modified as:

$$-\frac{2Q}{\omega_o}(\omega_{inj} + \frac{d\Delta\varphi_{osc}}{dt} - \omega_0 - \frac{d\Delta\Phi}{dt}) \approx \tan^{-1}\left(\frac{S_{inj}}{S_{osc}}\sin(\varphi_{osc} + \Delta\varphi_{osc} - \alpha_{inj} - \Delta\alpha_{inj})\right) \tag{3.18}$$

The right side of (3.18) can be expanded by first-order Taylor approximation as:

$$\tan^{-1}\left(\frac{S_{inj}}{S_{osc}}\sin(\theta + \Delta\theta)\right) \approx \tan^{-1}\left(\frac{S_{inj}}{S_{osc}}\sin\theta\right) + \frac{\frac{S_{inj}}{S_{osc}}\cos\theta}{1 + \left(\frac{S_{inj}}{S_{osc}}\sin\theta\right)^2}\Delta\theta \qquad (3.19)$$

Assuming weak injection, $S_{inj} / S_{osc} \ll 1$, the denominator in (3.19) is close to one, so (3.18) is approximated as

$$-\frac{2Q}{\omega_o}(\omega_{inj} + \frac{d\Delta\varphi_{osc}}{dt} - \omega_0 - \frac{d\Delta\Phi}{dt}) \approx \tan^{-1}(\frac{S_{inj}}{S_{osc}}\sin\theta) + \frac{S_{inj}}{S_{osc}}\cos\theta\Delta\theta \qquad (3.20)$$

Subtracting the noiseless relation (3.17) and using (3.9), (3.20) can be rewritten as

$$\frac{d\Delta\varphi_{osc}}{dt} + \omega_L \cos\theta\Delta\varphi_{osc} = \omega_L \cos\theta\Delta\alpha_{inj} + \frac{d\Delta\Phi}{dt}$$
(3.21)

After Laplace transform, (3.21) is

$$\Delta \varphi_{osc}(s) = \frac{\omega_L \cos\theta}{s + \omega_L \cos\theta} \Delta \alpha_{inj}(s) + \frac{s}{s + \omega_L \cos\theta} \Delta \Phi(s)$$
(3.22)

In order to get noise power spectral density of the injection-locked oscillator  $|\Delta \varphi_{\rm osc}|^2$ , *s* in (3.22) is substituted by  $j\Delta \omega$ , and  $|\Delta \alpha_{\rm inj}|^2$  and  $|\Delta \Phi|^2$  are the noise spectral densities of injected signal ( $L_{\rm inj}$ ) and free running oscillator ( $L_{\rm osc}$ ), therefore the total phase noise after injection locking is

$$L_{out}(\Delta\omega) = \frac{\omega_L^2 \cos^2 \theta}{(\Delta\omega)^2 + \omega_L^2 \cos^2 \theta} L_{inj}(\Delta\omega) + \frac{(\Delta\omega)^2}{(\Delta\omega)^2 + \omega_L^2 \cos^2 \theta} L_{osc}(\Delta\omega) \quad (3.23)$$

If the injection-locked system has reached its steady state, that is, when $d\theta/dt = 0$, (3.8) can be solved as:

$$\sin\theta = \frac{\omega_0 - \omega_{inj}}{\omega_L} \tag{3.24}$$

Assuming  $\omega_0$  is very close to  $\omega_{inj}$ ,  $\cos^2\theta$  is approximated as 1, so (3.23) can be simplified as:

$$L_{out}(\Delta\omega) = \frac{1}{1 + (\Delta\omega/\omega_L)^2} L_{inj}(\Delta\omega) + \frac{1}{1 + (\omega_L/\Delta\omega)^2} L_{osc}(\Delta\omega)$$
(3.25)

Equation (3.25) shows that the ILRO low-pass filters the noise from the injected signal while high-pass filtering its own noise, which is very similar to the phase noise shaping of a $1^{\text{st}}$-order PLL with a loop bandwidth of $\omega_{\text{L}}$. A similar equation describes the jitter tracking behavior:

$$\sigma_{out}^{2} = \frac{1}{1 + (\Delta \omega / \omega_{L})^{2}} \sigma_{inj}^{2} + \frac{1}{1 + (\omega_{L} / \Delta \omega)^{2}} \sigma_{osc}^{2}$$
(3.26)

If the injection frequency is at the edge of the locking range, (3.23) can be reduced to

$$L_{out}(\Delta \omega) = L_{osc}(\Delta \omega) \tag{3.27}$$

in which case, the total phase noise is that of the free-running oscillator. Therefore, the in-band phase noise of ILRO will be reduced significantly if the oscillator is locked to a low-phase-noise source at the center of the locking range.
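The noise shaping in (3.23) and its limiting cases can be explored numerically. The sketch below works with offset frequencies in Hz (only ratios of $\Delta\omega$ and $\omega_L$ enter the expression); the free-running profile, the injected-signal noise level and the 100 MHz locking range are hypothetical values chosen for illustration.

```python
import math

def ilro_output_noise(df_hz, f_lock_hz, l_inj_dbc, l_osc_dbc, theta=0.0):
    """Output phase noise in dBc/Hz from (3.23).
    l_inj_dbc and l_osc_dbc are the injected and free-running noise at offset df_hz."""
    k = (f_lock_hz * math.cos(theta)) ** 2
    low_pass = k / (df_hz ** 2 + k)             # weight on the injected-signal noise
    high_pass = df_hz ** 2 / (df_hz ** 2 + k)   # weight on the oscillator noise
    total = low_pass * 10 ** (l_inj_dbc / 10) + high_pass * 10 ** (l_osc_dbc / 10)
    return 10 * math.log10(total)

def free_running_noise(df_hz):
    """Hypothetical 1/f^2 profile: -85 dBc/Hz at 1 MHz offset."""
    return -85.0 - 20.0 * math.log10(df_hz / 1e6)

# Hypothetical clean reference at -140 dBc/Hz and a 100 MHz locking range.
for df in (1e5, 1e6, 1e7, 1e8, 1e9):
    out = ilro_output_noise(df, 100e6, -140.0, free_running_noise(df))
    print(f"{df:.0e} Hz: free-running {free_running_noise(df):.1f}, locked {out:.1f} dBc/Hz")
```

Setting `theta` to $\pi/2$ reproduces the edge-of-lock case (3.27), where the output noise falls back to that of the free-running oscillator.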

The jitter of the ILRO $\sigma_{\Delta T}$ can be expressed in terms of its phase noise power spectral density $S_{\phi}(f)$, as shown in [19], [29]:

$$\sigma_{\Delta T}^{2} = \frac{8}{\omega_0^2} \int_0^\infty S_{\phi}(f) \sin^2(\pi f \Delta T) df \qquad (3.28)$$

where $\omega_0$ is the free-running oscillation frequency. Assuming long delays ( $\Delta T \rightarrow \infty$ ), (3.28) can be simplified as

$$\sigma_T^{\ 2} = \frac{4}{\omega_0^2} \int_0^\infty S_{\phi}(f) df$$
 (3.29)

where $\sigma_T$ is the RMS jitter. Equation (3.29) shows that the jitter is determined by the area under the phase noise profile of the output spectrum. Injection locking to a low-phase-noise source therefore also reduces jitter, since the noise power under the phase noise profile is decreased. This is especially advantageous for a ring oscillator, which has high intrinsic phase noise but is easily implemented on chip in a very small area. The phase noise of a ring oscillator with and without injection locking is shown in Fig. 3.2; the integrated jitter decreases from 289.37 ps (RMS) to 0.147 ps (RMS) with an ideal injected signal. The locking range can be read from the transition point of the phase noise profile after injection locking (red line). For a ring oscillator, the injection strength $S_{inj}/S_{osc}$ is approximately the size ratio of the injection buffer to the inverter forming the oscillator ( $W_{inj}/W_{osc}$ ), as shown in Fig. 3.3. The simulated locking range of a 4-delay-cell ring oscillator running at 3 GHz is 340 MHz, which is close to the 375 MHz calculated from (3.12) with $W_{inj}/W_{osc} = 1/4$.
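The relation (3.29) between the phase noise profile and RMS jitter can be evaluated numerically. The following is a rough sketch under stated assumptions: the integral is truncated to finite limits, $S_{\phi}(f)$ is taken in linear units of rad²/Hz, and the $1/f^2$ profile, its coefficient and the 100 MHz locking range are hypothetical rather than the simulated data of Fig. 3.2.

```python
import math

def rms_jitter(s_phi, f0_hz, f_min_hz, f_max_hz, points=4000):
    """RMS jitter from (3.29), integrating s_phi(f) between finite limits with the
    trapezoidal rule on a logarithmic grid. s_phi is in rad^2/Hz (linear units)."""
    log_lo, log_hi = math.log(f_min_hz), math.log(f_max_hz)
    freqs = [math.exp(log_lo + (log_hi - log_lo) * i / (points - 1)) for i in range(points)]
    area = 0.0
    for f_a, f_b in zip(freqs, freqs[1:]):
        area += 0.5 * (s_phi(f_a) + s_phi(f_b)) * (f_b - f_a)
    w0 = 2.0 * math.pi * f0_hz
    return math.sqrt(4.0 * area / w0 ** 2)

C_NOISE = 3000.0   # hypothetical 1/f^2 coefficient, in Hz
F_LOCK = 100e6     # hypothetical locking range, in Hz

def free_running(f):
    return C_NOISE / (f * f)                     # 1/f^2 region only

def locked(f):
    return C_NOISE / (f * f + F_LOCK * F_LOCK)   # (3.25) with a noiseless injected signal

for name, profile in (("free-running", free_running), ("locked", locked)):
    sigma = rms_jitter(profile, 3e9, 1e3, 1e10)
    print(f"{name}: {sigma * 1e12:.3f} ps RMS")
```

With an ideal injected signal the locked profile is flat below the locking range instead of rising as $1/f^2$, which removes most of the area under the curve and hence most of the integrated jitter.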



Figure 3.2. Phase noise of a ring oscillator with and without injection locking.



Figure 3.3. Injection-locked ring oscillator.

#### 3.3 Fast Wakeup

In applications with multiple links, not all the links are fully occupied all the time; therefore, it is energy-efficient to idle the unused links and wake them up quickly when burst data arrives. It is important to understand the transient behavior of the ILRO, especially at startup.

When the ILRO system reaches the stable point  $(d\theta / dt = 0)$ , the solution of (3.8) is:

$$\theta_{ss} = \sin^{-1}(\frac{\omega_0 - \omega_{inj}}{\omega_L}). \tag{3.30}$$

where  $\theta_{ss}$  is the steady-state phase difference between the resultant oscillation signal and the injected signal, while  $\pi - \theta_{ss}$  is the meta-stable point. In order to simplify the calculation, a linearized solution of (3.8) is presented in [30], which is valid when the injection frequency is very close to the free running frequency ( $\omega_0 - \omega_{inj} \approx 0$ ). The settling of the phase difference and the frequency difference exhibits exponential behavior in nature, as shown:

$$\theta(t) = \theta_{ss} + (\theta(0) - \theta_{ss}) \cdot e^{-\omega_L t}$$
(3.31)

$$\omega_{osc}(t) = \omega_{inj} + \omega_L \cdot (\theta_{ss} - \theta(0)) \cdot e^{-\omega_L t}$$
(3.32)

where $\theta(t)$ is the instantaneous phase difference between the resultant and injected signals, $\omega_{osc}(t)$ is the instantaneous oscillation frequency, and $\theta(0)$ is the initial phase difference at the point of injection. The locking time of the ILRO is determined by two factors: 1) the time constant of the exponential decay, $1/\omega_{\rm L}$, and 2) the initial phase difference $\theta(0)$. Locking of an ILRO is potentially faster than that of a PLL since it usually has a larger loop bandwidth $\omega_{\rm L}$, and maximizing the locking range speeds up locking further. When the initial phase difference equals the steady-state phase difference $\theta_{\rm ss}$, the ILRO locks immediately. If instead $\theta(0) = \pi - \theta_{\rm ss}$, the resultant oscillation frequency $\omega_{\rm osc}$ equals the injection frequency $\omega_{\rm inj}$ and, in the ideal noiseless case, $\theta$ remains at $\pi - \theta_{\rm ss}$ forever. In reality, phase noise disturbs the meta-stable point and slowly moves the phase difference toward $\theta_{\rm ss}$, leading to a long locking time.



Figure 3.4. Settling behavior of ILRO with different locking ranges.

Fig. 3.4 shows the settling behavior of the ILRO with different single-sided locking ranges $\omega_{\rm L}$. A larger locking range leads to faster settling. The influence of the initial phase difference $\theta(0)$ on the settling behavior is shown in Fig. 3.5. The minimum locking time is observed when the initial phase difference $\theta(0)$ is close to the final stable phase difference $\theta_{\rm ss}$, while the maximum one occurs at $\theta(0) = \pi - \theta_{\rm ss}$. This suggests that controlling the initial phase difference can greatly improve the locking speed.
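The dependence of the locking time on the initial phase can also be reproduced by integrating the nonlinear equation (3.8) directly instead of using the linearized solution. The sketch below uses a fixed-step fourth-order Runge-Kutta integrator; the frequencies, the lock tolerance and the time step are illustrative choices and do not correspond to the simulations behind Fig. 3.4 and Fig. 3.5.

```python
import math

def lock_time(theta0, w0, w_inj, w_lock, tol=1e-3, dt=1e-12, t_max=1e-6):
    """Integrate (3.8) with fixed-step RK4 and return the time (s) at which the
    phase difference comes within tol (rad) of theta_ss, or None if it never does."""
    theta_ss = math.asin((w0 - w_inj) / w_lock)      # steady state, (3.30)

    def rate(theta):
        return w0 - w_inj - w_lock * math.sin(theta)  # right-hand side of (3.8)

    theta, t = theta0, 0.0
    while t < t_max:
        # Compare phases modulo 2*pi so a full-cycle slip still counts as locked.
        if abs(math.remainder(theta - theta_ss, 2.0 * math.pi)) < tol:
            return t
        k1 = rate(theta)
        k2 = rate(theta + 0.5 * dt * k1)
        k3 = rate(theta + 0.5 * dt * k2)
        k4 = rate(theta + dt * k3)
        theta += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return None

# Illustrative values: 3 GHz free-running, 2.95 GHz injection, 375 MHz locking range.
w0, w_inj, w_lock = (2.0 * math.pi * f for f in (3.0e9, 2.95e9, 375e6))
theta_ss = math.asin((w0 - w_inj) / w_lock)
for label, start in (("near theta_ss", theta_ss + 0.1),
                     ("quarter cycle away", theta_ss + math.pi / 2),
                     ("near pi - theta_ss", math.pi - theta_ss - 1e-3)):
    t_lock = lock_time(start, w0, w_inj, w_lock)
    print(f"{label}: locks in {t_lock * 1e9:.2f} ns")
```

Starting close to $\pi - \theta_{ss}$ gives the longest lock time because the phase first has to drift away from the meta-stable point before the exponential settling takes over.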



Figure 3.5. Settling behavior of ILRO with different initial phase difference between the resultant oscillation signal and the injected signal.

### **3.4 Harmonic Injection Locking**

There are cases in which the injection frequency is a sub-harmonic or harmonic of the free-running frequency of the ring oscillator, such as in frequency multipliers and dividers. Their behavior can also be predicted by the previously discussed first-harmonic injection-locking theory with some modification.

For the sub-harmonic injection locking, the incident frequency is a sub-harmonic of the oscillator free-running frequency ( $\omega_{inj} = \omega_0/n$ ), which means the injection occurs once every *n* cycles. The locking range  $\omega_L$  can be easily modified by dividing the injection strength in (3.12) by a factor of *n*, as follows:

$$\omega_L = \frac{2\omega_0}{N\sin(2\pi/N)} \frac{S_{inj}}{S_{osc}} \frac{1}{n}.$$
(3.33)

The in-band phase noise is constrained to $L_{inj}+20 \log_{10} n$, according to the derivations in [31]. As the division ratio *n* increases, the noise rejection degrades accordingly, as corrections from the injected signal are too sparse to clean up the oscillation.
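The two sub-harmonic relations above, the reduced locking range (3.33) and the in-band noise bound $L_{inj}+20\log_{10} n$, can be put into a small helper. This is a sketch with hypothetical oscillator parameters; only the factor $n = 5$ corresponds to the example discussed in the following section.

```python
import math

def subharmonic_locking_range(f0_hz, n_stages, injection_ratio, n_sub):
    """Equation (3.33): the first-harmonic locking range of (3.12) divided by n."""
    return (2.0 * f0_hz * injection_ratio
            / (n_stages * math.sin(2.0 * math.pi / n_stages) * n_sub))

def inband_noise_floor_dbc(l_inj_dbc, n_sub):
    """In-band output phase noise bound, L_inj + 20*log10(n), in dBc/Hz."""
    return l_inj_dbc + 20.0 * math.log10(n_sub)

# Hypothetical 4-stage, 3 GHz oscillator with injection ratio 1/4, locked to its 5th sub-harmonic.
print(f"locking range: {subharmonic_locking_range(3e9, 4, 0.25, 5) / 1e6:.0f} MHz")
print(f"in-band penalty for n = 5: {20.0 * math.log10(5):.2f} dB")
```

For $n = 5$ the in-band penalty is about 14 dB, and the locking range is one fifth of the first-harmonic value.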

In the modeling of super-harmonic injection locking, Fig. 3.1 should be modified to Fig. 3.6 [32], [33]. The model consists of an injector, which mixes the injected signal ($\omega_{inj}$) with the output signal ($\omega_0$) and generates harmonic tones ($n\omega_{inj} \pm m\omega_0$), and a frequency-selective block, which filters the undesired tones and passes the tone at $\omega_{inj}/N$. The single-sided locking range $\omega_L$ can be expressed as [33], [34]:

$$\omega_{L} = \gamma \cdot \alpha_{N} \frac{2\omega_{0}}{N\sin(2\pi/N)} V_{inj} \tag{3.34}$$



Figure 3.6. Model of super-harmonic injection locking.

where  $\gamma$  is the injection efficiency,  $\alpha_N$  is the *N*-th order harmonic coefficient, *N* is also the number of stages, and  $V_{inj}$  is the amplitude of the excitation signal.

#### 3.4.1 Implementation in Wireless Transmitter

Frequency multiplication is usually needed for the generation of the RF carrier and the local oscillator (LO). The injection-locked frequency multiplier (ILFM) based on sub-harmonic injection locking and edge combining has attracted much attention recently, especially for applications that demand very low power consumption and tolerate relaxed frequency stability.

One example [35], shown in Fig. 3.7, is used in a MICS (Medical Implant Communications Service) band near-threshold transmitter. It consists of an AC-coupled injection stage with a pulse generator, and a five-stage single-ended current-starved ring oscillator in which all stages share a tunable current source. Phase noise degradation of the injection-locked ring oscillator is one important factor that limits the potential supply scaling. Fig. 3.8 shows the simulated phase noise of the ring oscillator running at two operating conditions, both with their 5<sup>th</sup> sub-harmonics used as the input injection frequencies: VDD = 1.0 V, $f_{osc}$ = 400 MHz; VDD = 0.6 V, $f_{osc}$ = 80 MHz.

Figure 3.7. Schematic of the sub-harmonic injection-locked oscillator.

Figure 3.8. Phase noise with and without injection locking at VDD=0.6V, 1V for an ILFM.

These two cases are compared because the 400 MHz output can be generated either directly by the 1V-400MHz-SHILRO with higher power consumption, or by the 0.6V-80MHz-SHILRO with an edge combiner. Note that in order to compare these two conditions at the same output frequency (400 MHz), the phase noise of the multiplied output for the 80 MHz oscillator should be taken as higher than that in Fig. 3.8 (blue line) by 20 log<sub>10</sub> 5 $\approx$ 14 dB. At 300 kHz offset, the injection-locked oscillator phase noise is -87.91 dBc/Hz when VDD = 1.0 V and $f_{osc}$ = 400 MHz. For VDD = 0.6 V and $f_{osc}$ = 80 MHz, the phase noise is -81.31 dBc/Hz. As the spectral mask of the MICS band only requires an attenuation of 20 dB at the edge of the 300 kHz channel, the phase noise requirement of the transmitter is relaxed. Therefore, the phase noise degradation is tolerable at VDD = 0.6 V, while a ~3x power saving is achieved.
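The comparison at a common 400 MHz output can be worked through in a few lines. This is only a sketch under one stated assumption: the -81.31 dBc/Hz figure is taken to be the value read at the 80 MHz oscillator output, before the $20 \log_{10} 5$ multiplication penalty is added (the text does not state this explicitly). The function and variable names are illustrative.

```python
import math


def multiplication_penalty_db(factor):
    """Phase-noise increase, in dB, when a frequency is multiplied by `factor`."""
    return 20.0 * math.log10(factor)


# Phase noise values at 300 kHz offset quoted in the text (dBc/Hz).
pn_400mhz_direct = -87.91   # VDD = 1.0 V, f_osc = 400 MHz
pn_80mhz = -81.31           # VDD = 0.6 V, f_osc = 80 MHz

# ASSUMPTION: pn_80mhz is referred to 80 MHz, so the x5 penalty still applies.
pn_80mhz_at_400mhz = pn_80mhz + multiplication_penalty_db(5)
degradation_db = pn_80mhz_at_400mhz - pn_400mhz_direct

if __name__ == "__main__":
    print(f"x5 penalty: {multiplication_penalty_db(5):.2f} dB")
    print(f"80 MHz case referred to 400 MHz: {pn_80mhz_at_400mhz:.2f} dBc/Hz")
    print(f"degradation vs. direct 400 MHz: {degradation_db:.2f} dB")
```

If the quoted -81.31 dBc/Hz already includes the penalty, the degradation is simply the difference between the two quoted numbers.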

Spur suppression is another important consideration for wireless communication. With sub-harmonic injection locking, the injected signal is typically a square-wave pulse waveform with rich harmonic content. However, only the *n*-th harmonic locks the oscillator, while the others appear at the output as spurs with limited suppression. The relative spur level is given by [36]:

$$\frac{\left|A_{spur}\right|}{\left|A_{out,\omega_{inj}}\right|} = \frac{\omega_L}{\omega_m},\tag{3.35}$$

where $\omega_{\rm L}$ is the locking range and $\omega_{\rm m}$ is the frequency difference between the spurious tone and the desired one. Hence, decreasing the locking range improves the spur suppression, at the cost of increased locking time and an increased probability of losing lock. There is also a tradeoff with phase noise performance, since the locking range $\omega_{\rm L}$ also sets the loop bandwidth and hence the amount of phase noise rejection. Multi-phase asymmetry can also contribute to significant increases in spur-to-carrier ratio. These unequal phase spacings can arise from asymmetric single-phase injection into the ring [11], [12], device variations such as $V_{\rm TH}$ mismatches under low supply voltage, and capacitive layout mismatches in the wiring breakout from the ring oscillator. These phase offsets are low-frequency in nature and can be minimized at chip startup.
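To get a feel for the tradeoff in (3.35), the amplitude ratio can be expressed in dBc. The sketch below assumes a hypothetical 2 MHz locking range and an 80 MHz spur offset (the spacing of the injected harmonics when the injection frequency is 80 MHz); neither number is a result from the thesis.

```python
import math


def spur_level_dbc(w_lock, w_offset):
    """Relative spur level from (3.35), expressed in dBc.

    w_lock   : single-sided locking range
    w_offset : offset of the spurious tone from the desired tone
               (same unit as w_lock)
    """
    return 20.0 * math.log10(w_lock / w_offset)


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    for f_lock_mhz in (4.0, 2.0, 1.0):
        level = spur_level_dbc(f_lock_mhz, 80.0)
        print(f"locking range {f_lock_mhz:.1f} MHz -> spur at {level:.1f} dBc")
```

Each halving of the locking range buys about 6 dB of spur suppression, at the cost of the slower locking and weaker noise rejection noted above.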

#### 3.4.2 Implementation in Wireless Receiver

Super-harmonic injection locking, in the form of an injection-locked frequency divider (ILFD), can be utilized in a wireless receiver for synchronized clock generation. It can significantly reduce the power and area of frequency division, as verified in [32], where the input wireless clock in the MICS band (~400 MHz) is divided by 5. The divider consumes only 3 $\mu$W of power at a 1.0 V supply and occupies 100 $\mu$m<sup>2</sup> of area.

A divide-by-5 ILFD is shown in Fig. 3.9, which consists of a 5-stage ring oscillator and an injection stage at the tail connected to all 5 delay cells. Therefore, each of the 5 stages draws current sequentially when the external signal is applied to the common point.

Figure 3.9. 5-stage ring oscillator based ILFD.

The non-linearity arises in the circuit when $V_{G1}$ is applied at the gate of MN1 and the injected signal $V_{inj}$ is applied to the tail transistor MS. Assuming weak injection, the harmonics of the injected signal can be ignored so that *n* equals 1. Therefore, the harmonics of $\omega_0$ mix with the fundamental component of $\omega_{inj}$. The parasitic capacitance and the output resistance of the delay cell form a low-pass filter with a cut-off frequency close to $\omega_0$. High-frequency components are filtered out, and the DC and low-frequency components are rejected by the oscillation loop. Thus, only the component at $\omega_0$ is left, for which *m* equals (*N*-1). It can also be analyzed in another way: the lowest operating frequency sustained at V<sub>S</sub> is the *N*-th harmonic of V<sub>Gj</sub> (j = 1, 2, ..., 5). Therefore, when an injected signal in the vicinity of $N\omega_0$ is incident, locking occurs. The injection efficiency $\gamma$ in (3.34) for this implementation is the ratio of the current caused by the injected signal to the DC bias current. Therefore, reducing the number of delay stages, improving the injection efficiency and using a large-swing injected signal can increase the locking range.
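The mixing argument can be checked by enumerating the difference products for $n = 1$ and $\omega_{inj} = N\omega_0$. The helper below is a hypothetical illustration, with frequencies expressed in units of $\omega_0$.

```python
def orders_landing_on_w0(n_stages, m_max):
    """Orders m for which |w_inj - m*w0| equals w0, with w_inj = N*w0 and n = 1.

    Frequencies are in units of w0, so the condition is |N - m| == 1.
    """
    return [m for m in range(m_max + 1) if abs(n_stages - m) == 1]


if __name__ == "__main__":
    n_stages = 5
    orders = orders_landing_on_w0(n_stages, 2 * n_stages)
    print(f"N = {n_stages}: difference products at w0 for m in {orders}")
    print(f"lowest order: m = {min(orders)} (that is, N - 1)")
```

Both $m = N-1$ and $m = N+1$ fall at $\omega_0$ in magnitude. The lowest-order term is $m = N-1$, which is the one retained in the analysis above.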

A divide-by-5 ILFD is applied to a MICS band super-regenerative receiver [35], [37] to generate a periodic quench signal from the synchronized wireless clock. Measurements show that a sensitivity of only -32 dBm is required when the injection frequency is close to the *N*-th harmonic of the free-running frequency. Therefore, a very small voltage swing of the injected signal is sufficient for super-harmonic injection locking.

### 3.5 Summary

A quasi-linear model of the ILRO is used to mathematically formulate the frequency- and time-domain characteristics of the system, as well as the phase noise shaping and jitter tracking behavior. Once injection locked, the system shows a great improvement in in-band phase noise, and the total integrated jitter decreases by a factor on the order of 10<sup>3</sup>. The settling behavior of the ILRO is also examined and shows a strong dependence on the locking range and on the initial phase difference between the injected and the resultant signals. Simulation results are presented to verify the accuracy of the theoretical model. A sub-harmonic injection-locked ring oscillator and a super-harmonic injection-locked ring oscillator are applied to a wireless transmitter for frequency multiplication and to a wireless receiver for frequency division, respectively. Significant power and area savings are achieved with tolerable phase noise and spur degradation.

# CHAPTER 4. PHOTONIC FORWARDED-CLOCK INTERCONNECT SYNCHRONIZATION USING INJECTION-LOCKED RING OSCILLATOR

In the previous two chapters, design considerations of the ring oscillator and injection locking have been discussed. In this chapter, an injection-locked ring oscillator is applied to a photonic interconnect system to perform clock synchronization. Simulation results of a 2.5 GHz first-harmonic ILRO based clock receiver specially designed for optical interconnect are shown. The ILRO occupies 970 $\mu$m<sup>2</sup> of die area and consumes 2 mW of power under a 1.0 V supply voltage. The input-referred locking range is 350 MHz, the locked phase noise is -134 dBc/Hz at a frequency offset of 1 MHz, and the corresponding integrated jitter is 0.246 ps (RMS) excluding input reference clock jitter.

### 4.1 Introduction

Monolithic silicon photonic interconnects have emerged recently and show great potential for on-chip core-to-core [38] and off-chip core-to-memory and core-to-other-processor [39] applications in future many-core systems. They exhibit significant advantages over electrical interconnects, especially for bandwidth×distance products over 100 Gb/s×m [40], as electrical communications suffer from severe high-frequency loss in the electrical traces, reflections from impedance discontinuities, and signal crosstalk, all of which limit the energy efficiency. Besides, features of silicon photonics such as wavelength-division multiplexing (WDM), THz-bandwidth low-loss waveguides [41] and distance-insensitive energy-per-bit help meet the bandwidth requirements without exceeding the power budget.

Fig. 4.1 shows a WDM link in which multiple channels of data can be transmitted along a single waveguide without interference, so that much greater bandwidth and lower latency can be achieved [42]. A continuous-wave (CW) laser couples multiple wavelengths to the TX through a fiber. Each of the ring resonators is tuned to a separate wavelength and modulates the data of one channel. Each wavelength propagates along the same waveguide until it is routed through a matching ring to a photodiode (PD)/amplifier, which outputs the digital electrical data based on the PD photocurrent. The clock signal is also routed through the same waveguide on $\lambda_0$ and converted to an electrical signal by the optical clock receiver to sample the received data. The resonances of the modulator and drop rings are tuned by the ring tuning control circuits.

Figure 4.1. A WDM photonic interconnect.

The clock generation in Fig. 4.1 is based on the forwarded-clock architecture [12], in which the clock in the receiver is propagated directly from the transmitter and synchronizes the channel's data transmission. It ensures that the frequencies of the receiver and transmitter are exactly the same, so the clock recovery only needs to rotate the phases so that the clock samples the data at the center. In the traditional embedded-clock architecture shown in Fig. 4.2, the clock is recovered from the received data and needs to tolerate the frequency offset between the TX and RX caused by the variations of different crystal oscillators. The jitter of the recovered clock tracks that of the data up to the CDR loop bandwidth; however, this bandwidth is usually smaller than the jitter tracking range of the forwarded-clock structure [43]. The embedded-clock architecture generally consumes much more power than the forwarded-clock one, since recovering the clock from the data typically requires 2x oversampling [44]. The forwarded-clock synchronization provides the advantages of low power and low complexity, at the cost of an additional forwarded link. However, the power and pin overhead of the clock link can be amortized over many parallel channels.

Figure 4.2. One channel of an embedded-clock architecture.

### **4.2 Photonic Interconnect Features**

Compared with electrical links, photonic interconnects for short-haul communication have some unique features that make the forwarded-clock architecture more advantageous. Electrical interconnects usually require timing adjustment for each channel because of inter-channel delays, while for optical links a single clock recovery can be used for the whole receiver bank, since the optical clock and data signals travel on a single low-latency waveguide, leading to a large saving of power and area. Furthermore, the relative jitter between the clock and data is very small, since the clock TX and data TX share the same clock fabric and no jitter is added in the channel (optical signaling does not suffer from supply-rail-injected noise and crosstalk) [42]. This is very beneficial, as the phase/delay-locked loop (PLL/DLL) and the phase rotator that deskew the clock phase for recovery of the data can be eliminated, significantly saving power and area overhead. Phase rotation itself accounts for almost half of the total receiver power in [8].

Most of the design considerations of optical clock receivers are the same as those of electrical ones, except for the optical front-end and the optical-to-electrical conversion. However, these two blocks directly affect the bandwidth, sensitivity, area, and power consumption of the link. The optical front-end converts the optically modulated clock into the electrical domain by sensing the photocurrent produced by the PD and then converting the current to a voltage, as discussed in the following sections.

#### 4.2.1 Photodiode

The optical clock receiver tunes the drop ring to filter the wavelength of the clock signal and couples the light of the clock to a photodiode, resulting in a current proportional to the number of photons absorbed per second. The *p-i-n* diode is commonly used as the photodiode; it is reverse-biased to ensure a strong field in the *i* region and a very small current in the absence of light. Thus, the DC level of the PD output should be well controlled to ensure an adequate reverse-bias voltage.

Fig. 4.3 shows the equivalent electrical model of the PD, which consists of a capacitor in parallel with a photocurrent source. As we assume that the PD is connected to the electrical chip through wire bonding, the parasitic elements of the bond wire and bond pad are also included. If flip-chip bonding is used, the larger parasitic inductance and capacitance of the connections can be alleviated. The capacitance of the PD is usually the dominant input load of the optical receiver, affecting the sensitivity, bandwidth and power consumption of the front-end. We choose the parameters for simulations from a commercial photodiode chip, PDCS20T [45], which shows a parasitic capacitance of 100 fF.

Figure 4.3. Equivalent electrical model of a reverse-biased photodiode.

The photocurrent $I_p$ is proportional to the input optical power $P_p$, and the diode responsivity *R* is defined as:

$$R = I_p / P_p. \tag{4.1}$$

If a target sensitivity of -15 dBm is desired, a current of 25 $\mu$A must be detectable with a responsivity *R* of 0.8 A/W. Therefore, the input signal to the following electrical circuit is a very small single-ended current. This challenges the design, since the total input-referred noise current from the following circuits and the photodiode itself should be well below the photocurrent, and a large transimpedance gain is required. This is especially hard for high-speed links, where power must be sacrificed to achieve it.
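The 25 $\mu$A figure follows directly from (4.1). A minimal sketch of the conversion is given below; the function names are illustrative.

```python
def dbm_to_watts(p_dbm):
    """Convert optical power from dBm to watts."""
    return 1e-3 * 10.0 ** (p_dbm / 10.0)


def photocurrent(p_dbm, responsivity_a_per_w):
    """Photocurrent I_p = R * P_p from (4.1), in amperes."""
    return responsivity_a_per_w * dbm_to_watts(p_dbm)


if __name__ == "__main__":
    i_p = photocurrent(-15.0, 0.8)   # sensitivity and responsivity from the text
    print(f"P_p = {dbm_to_watts(-15.0) * 1e6:.1f} uW, I_p = {i_p * 1e6:.1f} uA")
```

An input power of -15 dBm corresponds to about 31.6 $\mu$W, so a responsivity of 0.8 A/W yields roughly 25 $\mu$A.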

#### 4.2.2 Optical Front-End

As mentioned before, the front-end circuits convert the optically generated current to a voltage waveform. The simplest way to achieve the conversion is to add a resistance $R_{in}$ at the input node, which limits the bandwidth to:

$$BW = \frac{1}{2\pi R_{in}C_{in}} \tag{4.2}$$

where $C_{in}$ is the equivalent input capacitance, mainly determined by the PD capacitance and the parasitic capacitance of the bond pad. On the other hand, the maximum resultant voltage swing at the input node is $V = R_{in} \cdot I_p$. There is therefore a strong tradeoff between bandwidth and sensitivity, which can be alleviated by the current-integrating receiver and the transimpedance amplifier (TIA) discussed below.
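The tradeoff can be made concrete by solving (4.2) for $R_{in}$ at a target bandwidth and computing the resulting swing. In the sketch below, the 200 fF input capacitance and the 25 $\mu$A photocurrent are values quoted elsewhere in this chapter, while the target bandwidths are hypothetical choices for illustration.

```python
import math


def resistive_front_end(bw_hz, c_in_f, i_p_a):
    """Return (R_in, swing) for a purely resistive front-end.

    R_in follows from (4.2): BW = 1 / (2*pi*R_in*C_in).
    The swing is V = R_in * I_p.
    """
    r_in = 1.0 / (2.0 * math.pi * bw_hz * c_in_f)
    return r_in, r_in * i_p_a


if __name__ == "__main__":
    c_in = 200e-15   # equivalent input capacitance (F)
    i_p = 25e-6      # photocurrent at -15 dBm, R = 0.8 A/W (A)
    for bw in (1.25e9, 2.5e9, 5e9):   # hypothetical bandwidth targets
        r_in, swing = resistive_front_end(bw, c_in, i_p)
        print(f"BW = {bw / 1e9:.2f} GHz: R_in = {r_in:.0f} ohm, "
              f"swing = {swing * 1e3:.2f} mV")
```

Doubling the bandwidth halves the available swing, which is why a purely resistive front-end is rarely adequate at these speeds.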

*Current-Integrating Clock Receiver:* The photocurrent is converted to a voltage by integrating it onto the capacitor $C_{in}$ seen at the input, as shown in Fig. 4.4. Since it is a clock receiver, there is no external signal to reset the input node. A discharging mechanism must therefore be added, which is achieved by subtracting an adjustable current from the input node. The DC current is controlled by a feedback loop monitoring the DC value of the input node voltage $V_{in}$ [46]. The feedback loop not only adjusts the DC current but also sets the average voltage of the input node, which is necessary in order to maintain the reverse bias of the PD.

Figure 4.4. Current-integrating clock receiver front-end.

The front-end gain is given by:

$$R_{INT} = \frac{\eta \cdot T_{CLK} / 2}{C_{in}} \tag{4.3}$$

where $\eta$ (<1) is a factor accounting for the discharging and $T_{\text{CLK}}$ is the period of the clock signal. A small $C_{\text{in}}$ is preferred to achieve good sensitivity when receiving a very high-frequency clock.

*TIA:* The simplest TIA can be implemented with an inverter and a feedback resistor $R_f$ (Fig. 4.5), which can significantly reduce the effective input resistance while maintaining a high transimpedance gain. The effective input resistance and the gain of the TIA are calculated as [47]:

$$R_{in} = \frac{R_f + r_o}{1+A} \tag{4.4}$$



Figure 4.5. TIA based clock receiver front-end.

$$R_{TIA} = \frac{r_o - A \cdot R_f}{1 + A} \tag{4.5}$$

where $r_o$ and *A* are the output resistance and the voltage gain of the inverter. Therefore, a large *A* is required to meet the bandwidth requirement for a given $R_{\text{TIA}}$ and $C_{\text{in}}$.
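Equations (4.4) and (4.5) can be evaluated together to see how the inverter gain trades input resistance against transimpedance. The component values below ($R_f$ = 5 k$\Omega$, $r_o$ = 2 k$\Omega$) and the gain values are hypothetical and serve only as an illustration.

```python
def tia_input_resistance(r_f, r_o, a):
    """Effective input resistance from (4.4)."""
    return (r_f + r_o) / (1.0 + a)


def tia_transimpedance(r_f, r_o, a):
    """Transimpedance gain from (4.5); a negative value means inversion."""
    return (r_o - a * r_f) / (1.0 + a)


if __name__ == "__main__":
    r_f, r_o = 5e3, 2e3          # hypothetical values in ohms
    for a in (4.0, 9.0, 19.0):   # hypothetical inverter voltage gains
        r_in = tia_input_resistance(r_f, r_o, a)
        r_tia = tia_transimpedance(r_f, r_o, a)
        print(f"A = {a:4.0f}: R_in = {r_in:6.0f} ohm, R_TIA = {r_tia:7.0f} ohm")
```

As *A* grows, $R_{in}$ falls toward zero (raising the input pole for a given $C_{in}$) while the magnitude of $R_{TIA}$ approaches $R_f$. The two expressions always differ by exactly $R_f$.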

The voltage transient slope (voltage versus time) should also be considered, as it determines the effect of voltage noise on jitter. The maximum achievable slope is $I_p / C_{in}$, obtained by removing the feedback resistor and leaving the path between the input and output open (the case of the current-integrating receiver). A steep slope is desired, as the same amount of voltage noise caused by the transistors or by power supply noise then leads to relatively less jitter [42].
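A first-order way to quantify this is to divide the RMS voltage noise by the edge slope. The sketch below uses the 25 $\mu$A photocurrent and 200 fF input capacitance quoted in this chapter; the 1 mV RMS noise level is a hypothetical value, and the linear noise-to-jitter conversion is a standard small-signal approximation rather than a result from the thesis.

```python
def max_input_slope(i_p_a, c_in_f):
    """Maximum voltage slope I_p / C_in at the input node, in V/s."""
    return i_p_a / c_in_f


def noise_to_jitter(v_noise_rms, slope_v_per_s):
    """First-order conversion of RMS voltage noise to RMS jitter (s)."""
    return v_noise_rms / slope_v_per_s


if __name__ == "__main__":
    slope = max_input_slope(25e-6, 200e-15)
    jitter = noise_to_jitter(1e-3, slope)   # hypothetical 1 mV RMS noise
    print(f"slope = {slope * 1e-9:.3f} mV/ps, jitter = {jitter * 1e12:.1f} ps RMS")
```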

### 4.3 Architecture of Optical Clock Receiver

The unique features and front-end implementations of the photonic interconnect suggest two possible design choices for the optical clock receiver. One is implemented with the current-integrating structure to save power, and the other utilizes a TIA as it has better sensitivity.

#### 4.3.1 Current-Integrating Clock Receiver with Super-Harmonic Injection Locking

The voltage swing at the input node of an integrating receiver is relatively small when only a small optical power is available to the PD, which calls for an injection locking scheme that requires a very small input swing. Utilizing the current-integrating structure enables operation at near-threshold voltage since there is no strong constraint imposed by an analog front-end. Besides, a half-rate forwarded clock matching the Nyquist frequency of the data channel is desirable for better jitter tracking between clock and data; thus a 5-GHz forwarded clock is chosen for the data rate of 10 Gb/s [48]. In order to address the problem of slower transistors at low voltage, the data receiver should utilize a highly parallel architecture with 1:8 demultiplexing, so that the sampling clock and quantizers of the data receivers can operate at a much lower frequency (1.25 GHz).

According to the frequency plan, a super-harmonic injection-locked ring oscillator (SHILRO) can be utilized to generate multiple clock phases, since the injection frequency is the 4th harmonic of the free-running frequency and it generally does not need a full-swing voltage at the injection node. As shown in Fig. 4.6, it consists of an injection stage and a four-stage differential ring oscillator, which generates eight evenly-spaced phases (CK[0:7]) with a free-running frequency of 1.25 GHz so that the spacing between two adjacent phases equals 1 UI (100 ps for 10 Gb/s).

Figure 4.6. Current-integrating clock receiver with super-harmonic injection locking.

For the current-integrating front-end, the simplest way to build the low-pass filter is a single-pole *RC* circuit, and the DC value of $V_{in}$ can be set by $V_{set}$. The small-swing voltage $V_{in}$ is then AC-coupled to the injection node of the SHILRO.
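The frequency plan described above can be checked with a short calculation. The function and dictionary keys below are hypothetical names. The half-rate forwarded clock is modeled as half the data rate, as stated in the text.

```python
def clock_plan(data_rate_bps, demux_ratio, n_phases):
    """Derive sampling frequency, UI and phase spacing for a demuxed receiver."""
    f_sample = data_rate_bps / demux_ratio
    f_forwarded = data_rate_bps / 2.0          # half-rate forwarded clock
    return {
        "f_sample_hz": f_sample,
        "ui_s": 1.0 / data_rate_bps,
        "phase_spacing_s": 1.0 / (f_sample * n_phases),
        "forwarded_clock_hz": f_forwarded,
        "injection_harmonic": f_forwarded / f_sample,
    }


if __name__ == "__main__":
    plan = clock_plan(10e9, demux_ratio=8, n_phases=8)
    for key, value in plan.items():
        print(f"{key}: {value:g}")
```

For 10 Gb/s with 1:8 demultiplexing and eight phases, the sampling clock is 1.25 GHz, the phase spacing equals the 100 ps UI, and the 5 GHz forwarded clock is the 4th harmonic of the oscillator frequency.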

By eliminating the analog front-end (TIA), the integrating receiver significantly saves power, area and complexity, and enables operation in the near-threshold region. However, the sensitivity degrades when the parasitic capacitance of the input node is large and the frequency of the received clock is high, which is the case in this application. Since we will not integrate the PD on-die, the equivalent input capacitance will be at least 200 fF considering the parasitic capacitance of the PD and wire bonding. If $\eta = 0.5$ and $T_{\text{CLK}} = 200$ ps, (4.3) gives a gain of 250 V/A. Thus, the voltage swing at the injection node of the SHILRO is 6.25 mV if a sensitivity of -15 dBm is required. It is quite hard to get the SHILRO locked with such a small signal, and the locking range will be very small according to (3.34).
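The numbers in this paragraph can be reproduced from (4.3) with the values given in the text. The function name is illustrative.

```python
def integrating_gain(eta, t_clk_s, c_in_f):
    """Front-end gain R_INT = eta * (T_CLK / 2) / C_in from (4.3), in V/A."""
    return eta * (t_clk_s / 2.0) / c_in_f


if __name__ == "__main__":
    r_int = integrating_gain(eta=0.5, t_clk_s=200e-12, c_in_f=200e-15)
    i_p = 25e-6   # photocurrent at -15 dBm with R = 0.8 A/W
    print(f"R_INT = {r_int:.0f} V/A, swing = {r_int * i_p * 1e3:.2f} mV")
```

Halving $C_{in}$ would double the swing, which is why this architecture is better suited to a monolithic receiver with an on-die photodiode.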

Even though its application to this project is limited, the current-integrating clock receiver is an energy-efficient architecture for the monolithic photonic receiver, in which the photonic devices are integrated with the standard CMOS process.

#### 4.3.2 Proposed Architecture

Energy efficiency and bandwidth need to be sacrificed to meet the sensitivity requirement; therefore, an optical receiver based on a TIA is utilized. For simplicity, the same TIA and limiting amplifier as those in the optical data receiver are used for the clock receiver.

The scaling of CMOS technology adversely affects the analog front-end, for example through the output resistance $r_o$, which influences the amplifier gain *A*. Another issue that arises when operating at a lower supply voltage is that the ratio of transimpedance to power decreases, especially when a large bandwidth is desired. The reduced efficiency of the TIA demands an increasing number of subsequent limiting amplifier stages to achieve the required sensitivity, thereby resulting in excessive power and area consumption. Therefore, the optical receiver is designed at a moderate supply voltage (1.0 V) to alleviate the tradeoff between power and performance, and a demultiplexing ratio of 1:4 is chosen for the data receiver (2.5 GHz sampling clock). The generation of multiple clock phases is realized by first-harmonic injection locking because of its robustness, larger locking range, better phase noise and jitter, and the achievable full-swing input. However, a single-to-differential conversion is necessary before the clock signal is injected into the two-stage ring oscillator.

The architecture of the proposed clock receiver is shown in Fig. 4.7. The TIA and limiting amplifier are the same as those of the data receiver but with a different bandwidth, converting the photocurrent into a full-swing voltage signal. After the single-to-differential conversion, the clock is AC-coupled into a two-stage, cross-coupled, pseudo-differential current-starved ring oscillator to generate four evenly-spaced phases. There are tunable buffers for each phase to trim the phase mismatch caused by the injection and the device variations. The four clock phases are then delivered across the 1600-µm long clock distribution to drive the four quantizers of each receiver, thereby demultiplexing the 10 Gb/s data. There is a 3-bit delay control for each clock phase in order to trim the delay differences of the clock distributions. Four receivers are integrated for the experimental demonstration of a multiple serial link architecture using the same ILRO. The phase deskewing can be done either by the ILRO or by the global tunable buffer (6-bit delay line). However, the ILRO needs to lock to the injected clock at the edge of the locking range when a large phase deskew is required, thereby degrading the phase noise and yielding more jitter.

Figure 4.7. TIA based clock receiver with first-harmonic injection locking.

### **4.4 Circuit Implementations**

#### 4.4.1 TIA and Limiting Amplifier

The TIA and limiting amplifier are the same as those designed for the data receivers and are simply reused for the clock receiver. The TIA is an inverter with shunt feedback, which also employs a DC level feedback control to maintain the DC level of the input node. The full-swing voltage is achieved by the following limiting amplifier, which has two identical stages; each stage consists of two inverters. The second inverter has shunt feedback similar to that of the TIA [49]. This improves the output bandwidth of both the first and the second inverters, eliminating the need for power-hungry CML buffers or area-consuming peaking inductors. However, the improvement in bandwidth comes at the cost of reduced gain, which is compensated by cascading two identical stages.

#### 4.4.2 Single-to-Differential Converter

Differential injection signals are desired for the ILRO since they intrinsically reject common-mode noise and introduce less phase mismatch due to the symmetrical injection. Therefore, a single-to-differential conversion is needed for this particular application to photonic interconnect.

The CML buffer with a low-pass filter is commonly used for this purpose, as shown in Fig. 4.8. However, the swings of the differential outputs are not equal because of the finite common-mode rejection ratio (CMRR), leading to duty-cycle problems. Besides, CML buffers are quite power-hungry and area-consuming (because of the implementation of the load resistors), especially when high bandwidth is desired. A simple substitute is shown in Fig. 4.9, which uses a transmission gate to control the delay of the negative output.

Figure 4.8. Single-to-differential conversion with CML buffer.

Figure 4.9. Proposed single-to-differential conversion with transmission gate.

Figure 4.10. Monte Carlo simulation of duty cycle caused by single-to-differential conversion: (a) 100 runs of transient simulations and (b) distributions of half-cycle of the output ( $\mu = 201.2 \text{ ps}, \sigma = 1.1 \text{ ps}$ ).

The cross-coupled inverters are utilized to further trim the duty-cycle error caused by mismatch. A Monte Carlo simulation of 100 transient runs at 2.5 GHz shows that the duty cycle stays within 49.5%~51% (Fig. 4.10). Compared with the CML buffer based single-to-differential conversion, this implementation requires a full-swing input, which demands a larger gain from the optical front-end. However, this overhead is much smaller than that caused by the CML buffer.
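The half-cycle statistics in the caption of Fig. 4.10 ($\mu$ = 201.2 ps, $\sigma$ = 1.1 ps) can be translated into duty cycle for a 2.5 GHz clock. The sketch below does this conversion. The $\pm 3\sigma$ band is an illustrative assumption and is not the criterion used for the simulated 49.5%~51% range.

```python
def duty_cycle_percent(half_cycle_s, period_s):
    """Duty cycle, in percent, for a given high (or low) half-cycle duration."""
    return 100.0 * half_cycle_s / period_s


if __name__ == "__main__":
    period = 1.0 / 2.5e9                       # 400 ps clock period
    mean = duty_cycle_percent(201.2e-12, period)
    sigma = duty_cycle_percent(1.1e-12, period)
    print(f"mean duty cycle = {mean:.2f} %, sigma = {sigma:.3f} %")
    # ASSUMED +/- 3 sigma band, for illustration only.
    print(f"3-sigma band = {mean - 3 * sigma:.2f} % to {mean + 3 * sigma:.2f} %")
```

The mean corresponds to a duty cycle of about 50.3%, with a standard deviation of under 0.3 percentage points.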

#### 4.4.3 ILRO

The ILRO consists of four injectors and a two-stage, cross-coupled, pseudo-differential current-starved ring oscillator, as shown in Fig. 4.11. Of these four injectors, two couple the received differential clock signals, while the other two are dummies (connected to ground) that ensure all four phase outputs see the same load, alleviating the phase asymmetry caused by injection-locking interpolation.

Figure 4.11. Schematic of ILRO.

All the delay cells in the ring oscillator share a single current source, which can be adjusted off-chip, enabling fine tuning of the free-running frequency of the oscillator. The supply voltage can also be tuned for coarse tuning of the free-running oscillator frequency. The phase deskewing can be achieved by tuning the free-running frequency while locked so that the clock phases sample the data at the center of a bit period. Two pairs of cross-coupled inverters are designed to force complementary phases, and their size is carefully chosen to maintain the desired coupling strength. Half the size of the delay cell is required here, since the two-stage pseudo-differential structure demands a relatively strong coupling strength to enable oscillation. The cross-coupling also helps to alleviate the problem of phase asymmetry.

The first stage of clock buffers corrects the duty cycle of the differential clocks by mismatching the sizes of the PMOS and NMOS of an inverter, as the current source consumes about 200 mV of the voltage headroom; the following stage of clock buffers drives the 1600 µm clock distribution. The 6-bit digitally-controlled delay line (DCDL) used to achieve global phase deskew is simulated to have an average resolution of 1.74 ps. It is designed to cover the whole UI (100 ps) so that, together with the 3-bit local phase trim, all four phases can be deskewed to sample the incoming data without affecting the phase noise shaping performance of the ILRO, as discussed before.

#### 4.4.4 Other Building Blocks for Testing

The number of measurement pins is usually limited; thus an open-drain output buffer with pass gates should be utilized to multiplex the desired signal, selected among all the signals sharing the same external test pin, onto the pad and to drive it.

The standalone electrical testing can be achieved with an on-chip PD emulator, which has a programmable output current to emulate various input power levels.

For nodes that are easily disturbed by off-chip testing, such as the output of the TIA, an on-die oscilloscope is useful to monitor their behavior. Both the time and voltage axes should be swept with the desired resolution. The time sweep can be implemented with the 6-bit DCDL to control the phase delay, and the voltage sweep can be achieved by changing the reference voltage of a sense amplifier.

### 4.5 Results

The optical receiver bank has been designed in a 65 nm CMOS process and the whole system is shown in Fig. 4.12 (a), which integrates four data receivers, the 1600-$\mu$m global clock distribution, and a clock receiver (Fig. 4.12 (b)). The layout of the ILRO is designed as symmetrically as possible to further reduce the phase mismatch and occupies only 970 $\mu$m<sup>2</sup> of area. Each phase of the clock can be individually selected to drive the output pad CKOUT for testing.




Figure 4.12. Layout of (a) the whole receiver bank and (b) the clock receiver.

The TIA designed for the optical data receiver achieves a sensitivity better than -18 dBm with 100 fF parasitic capacitance of the PD at 10 Gb/s and consumes 1.5 mW of power. When a 2.5 GHz clock is received instead, either the sensitivity can be improved or the power consumption can be reduced.

The ILRO can be tuned from 2 GHz to 3.1 GHz by tuning its bias current, as shown in Fig. 4.13 (the bias current can be controlled off-chip continuously), while tuning the supply voltage provides additional frequency range. The simulated locking range of the ILRO is 350 MHz at 2.5 GHz. The phase deskew versus the free-running frequency of the ILRO is shown in Fig. 4.14, with a total deskew range of >120 ps obtained by changing the free-running frequency, covering the full UI of 100 ps. The deskewing step can be much smaller than what is shown in Fig. 4.14, as the bias current can be continuously tuned by the off-chip resistor, thereby achieving fine deskewing. To verify the phase symmetry, buffered quadrature output waveforms of the ILRO are overlaid in Fig. 4.15, showing a maximum I/Q phase imbalance of 2.7°.

Figure 4.13. Frequency tuning of ILRO with bias current.

Figure 4.14. Deskewing of ILRO by tuning free-running frequency.

Figure 4.15. Phase spacings when 2.5 GHz clock is injected.
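To relate the 2.7° I/Q imbalance to the timing budget, the phase error can be converted to a time skew at 2.5 GHz. This conversion is a simple illustration with a hypothetical helper name.

```python
def phase_error_to_time(phase_deg, f_clk_hz):
    """Convert a phase error in degrees to a time skew in seconds."""
    return (phase_deg / 360.0) / f_clk_hz


if __name__ == "__main__":
    skew = phase_error_to_time(2.7, 2.5e9)
    ui = 1.0 / 10e9                            # 100 ps UI at 10 Gb/s
    print(f"skew = {skew * 1e12:.1f} ps ({100.0 * skew / ui:.1f} % of UI)")
```

The imbalance corresponds to about 3 ps, a few percent of the 100 ps UI.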

As discussed in Chapter 3 (equation (3.27)), the phase noise shaping of the ILRO degrades at the edge of the locking range, intuitively because $\omega_L$ decreases slightly as $\theta$ changes, so that more noise passes through. The jitter and phase noise performance is simulated by fixing the free-running frequency at 2.5 GHz while sweeping the injection frequency. Fig. 4.16 shows that the jitter of the ILRO gets slightly worse as the injection frequency moves away from the free-running frequency, and then gets dramatically worse at the edge of the locking range. Therefore, the ILRO can be used for fine deskewing when the free-running frequency is close to the injection frequency, while the following 6-bit DCDL performs coarse phase tuning when a large deskewing range is needed. Corresponding simulations of the phase noise show similar results in Fig. 4.17.

Figure 4.16. Jitter performance of ILRO.

Figure 4.17. Phase noise performance of ILRO: (a) single-sideband phase noise profile and (b) phase noise at frequency offset of 1 MHz for different injection frequencies.
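The link between a phase noise profile and RMS jitter can be sketched with the standard relation $\sigma_t = \sqrt{2\int L(f)\,df}\,/\,(2\pi f_0)$. The Python sketch below is a minimal illustration under stated assumptions, not the simulation flow used for Figs. 4.16 and 4.17: the function name `rms_jitter_s`, the flat example profile and the integration limits are all hypothetical placeholders, and the simple linear trapezoidal rule is only adequate for finely sampled profiles.

```python
import math


def rms_jitter_s(profile, f0_hz):
    """RMS jitter (seconds) from a single-sideband phase noise profile.

    profile: list of (offset_hz, L_dBc_per_Hz) pairs sorted by offset.
    f0_hz:   carrier (clock) frequency in hertz.
    """
    area = 0.0
    for (f1, l1), (f2, l2) in zip(profile, profile[1:]):
        p1 = 10.0 ** (l1 / 10.0)   # dBc/Hz -> linear power ratio per Hz
        p2 = 10.0 ** (l2 / 10.0)
        area += 0.5 * (p1 + p2) * (f2 - f1)   # trapezoidal rule
    phase_var_rad2 = 2.0 * area   # factor 2 accounts for both sidebands
    return math.sqrt(phase_var_rad2) / (2.0 * math.pi * f0_hz)


# Hypothetical placeholder profile (NOT the simulated curve of Fig. 4.17):
# a flat -134 dBc/Hz between assumed integration limits of 1 kHz and 100 MHz.
example_profile = [(1e3, -134.0), (1e8, -134.0)]
jitter = rms_jitter_s(example_profile, 2.5e9)
print(f"RMS jitter for the placeholder profile: {jitter * 1e12:.3f} ps")
```

Replacing `example_profile` with sampled points from an actual phase noise curve gives the corresponding integrated jitter; the result depends strongly on the chosen integration limits.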

By using the 6-bit DCDL, phase deskewing can be performed without affecting the jitter performance of the ILRO. The DCDL provides skew compensation that covers the whole UI with a step of 1.74 ps (full tuning range of 109.59 ps).
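The DCDL numbers are self-consistent: a 6-bit code has 63 steps, and 109.59 ps / 63 ≈ 1.74 ps. The Python sketch below shows how a control code might be chosen for a target skew. It assumes an ideal, uniformly spaced delay line with code 0 as the reference delay; the function names are hypothetical, and a real DCDL would need its measured (possibly nonlinear) delay characteristic.

```python
N_BITS = 6
FULL_RANGE_PS = 109.59           # full tuning range quoted in the text
MAX_CODE = 2 ** N_BITS - 1       # 63 for a 6-bit control word
STEP_PS = FULL_RANGE_PS / MAX_CODE


def dcdl_delay_ps(code: int) -> float:
    """Added delay (ps) for a control code, assuming uniform steps."""
    if not 0 <= code <= MAX_CODE:
        raise ValueError(f"code must be in 0..{MAX_CODE}")
    return code * STEP_PS


def dcdl_code_for_skew(target_ps: float) -> int:
    """Nearest control code for a target skew, clamped to the valid range."""
    code = round(target_ps / STEP_PS)
    return max(0, min(MAX_CODE, code))


# Example: compensate a skew of one full UI (100 ps at 10 Gb/s).
code = dcdl_code_for_skew(100.0)
residual_ps = 100.0 - dcdl_delay_ps(code)
print(f"step = {STEP_PS:.2f} ps, code = {code}, residual = {residual_ps:.2f} ps")
```

With ideal steps the residual error after coarse tuning is at most half a step, which is the range that the fine ILRO deskewing would then need to cover.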

The power breakdown of the clock receiver is shown in Table 4.1. Note that the power of clock generation and distribution is amortized over the four receiver channels, so the clocking power per channel is small.

## 4.6 Summary

A forwarded-clock synchronization scheme based on injection locking is designed for a 10 Gb/s photonic interconnect according to its specific features. A single clock recovery can be used for all four channels, resulting in large power and area savings. In addition, the strong tradeoff among sensitivity, bandwidth and power consumption leads to the choice of a TIA-based clock receiver in order to satisfy the sensitivity requirement (< -15 dBm). Simulation results show that the optical clock receiver consumes 5.4 mW in total when receiving a 2.5 GHz clock and generating and distributing four clock phases to all four channels. The power amortized over the four channels is therefore small (1.35 mW per channel).

TABLE 4.1. Power Breakdown

| Supply voltage | Building block                          | Power consumption | Amortized power<br>(4 channels) |
|----------------|-----------------------------------------|-------------------|---------------------------------|
| 1.0 V          | TIA                                     | 1.5 mW            | 0.375 mW                        |
|                | ILRO                                    | 2 mW              | 0.5 mW                          |
|                | 6-bit DCDL                              | 0.2 mW            | 0.05 mW                         |
|                | Clock distribution and local phase trim | 1.7 mW            | 0.425 mW                        |
|                | Total                                   | 5.4 mW            | 1.35 mW                         |
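The totals in Table 4.1 can be cross-checked with a few lines of Python. This is only an arithmetic sketch: the block powers are copied from the table (the clock distribution entry is read as 1.7 mW, which is the value consistent with the 5.4 mW total), and the variable names are illustrative.

```python
# Block powers in mW, copied from Table 4.1 (1.0 V supply).
POWER_MW = {
    "TIA": 1.5,
    "ILRO": 2.0,
    "6-bit DCDL": 0.2,
    "Clock distribution and local phase trim": 1.7,
}
N_CHANNELS = 4  # receiver channels sharing the clock receiver

total_mw = sum(POWER_MW.values())

# Per-channel share of each block when the clock path is shared.
amortized_mw = {name: p / N_CHANNELS for name, p in POWER_MW.items()}
amortized_total_mw = total_mw / N_CHANNELS

for name, p in POWER_MW.items():
    print(f"{name:42s} {p:5.2f} mW  ->  {amortized_mw[name]:.3f} mW/channel")
print(f"{'Total':42s} {total_mw:5.2f} mW  ->  {amortized_total_mw:.3f} mW/channel")
```

The sum reproduces the 5.4 mW total and the 1.35 mW amortized figure reported in the table.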

## **CHAPTER 5. CONCLUSION**

The rapid advance of communication and computing systems makes the design of high-performance, high-speed clock generation and distribution more challenging. Energy-efficient clocking is therefore desired, provided that it maintains adequate performance in terms of phase noise, jitter, area and complexity. Injection locking is very promising since it can suppress the phase noise of the ring oscillator, improve the energy efficiency, enable a fast startup and conveniently generate multiple time-interleaved phases.

Among the three ways of implementing a ring oscillator, the pseudo-differential structure is a compromise in terms of power, voltage swing and jitter. The tradeoffs among power, oscillation frequency and phase noise are explored; higher power is generally required to achieve higher frequency and lower phase noise.

A quasi-linear model of the ILRO is used to mathematically formulate the frequency- and time-domain characteristics of the system, as well as its phase noise shaping and jitter tracking behavior. Once injection locked, the system shows a large improvement in in-band phase noise, and the total integrated jitter decreases by a factor on the order of $10^3$. The settling behavior of the ILRO is also explored and shows a strong dependence on the locking range and on the initial phase difference between the resultant oscillating signal and the injected signal. A sub-harmonic injection-locked ring oscillator is applied to a wireless transmitter for frequency multiplication, and a super-harmonic injection-locked ring oscillator is applied to a wireless receiver for frequency division. Significant power and area savings are achieved with tolerable phase noise and spur degradation.

A forwarded-clock synchronization scheme based on injection locking is designed for a 10 Gb/s photonic interconnect according to its specific features. A single clock recovery can be used for all four channels, resulting in large power and area savings. In addition, the strong tradeoff among sensitivity, bandwidth and power consumption leads to the use of a TIA-based clock receiver in order to satisfy the sensitivity requirement (< -15 dBm). Simulation results of a first-harmonic ILRO-based clock receiver show that the optical clock receiver consumes 5.4 mW in total when receiving a 2.5 GHz clock and generating and distributing four clock phases to the four parallel channels. The power amortized over the four channels is therefore small (1.35 mW per channel). The ILRO occupies 970 $\mu$m<sup>2</sup> of die area and consumes 2 mW of power from a 1.0 V supply voltage. The input-referred locking range is 350 MHz, the locked phase noise is -134 dBc/Hz at a frequency offset of 1 MHz, and the corresponding integrated jitter is 0.246 ps (RMS), excluding the jitter of the input reference clock.

Several issues in the designs, such as PVT variations and phase mismatch, currently rely on off-chip calibration. The phase offsets are low-frequency in nature and can be minimized at chip startup by implementing on-chip, closed-loop multi-phase timing detection and calibration, which was previously shown in [50] to achieve sub-2 ps phase resolution. In addition, the wakeup speed depends on the initial phase calibration. The phase difference between the injected signal and the resultant signal can be detected and fed back to control the phase of the injected signal, so that the initial phase difference tracks the stable phase difference and enables fast wakeup.

For photonic interconnects, integration of the photonic structures with a standard CMOS process is highly desirable, as large power and area savings can be achieved by eliminating the interface elements. Monolithic designs also enable the implementation of the integrating-current optical receiver discussed in Chapter 4, which further reduces the energy cost and complexity, since the parasitic capacitance can be significantly reduced. A large amount of effort is still required to satisfy the power budget while maintaining high-performance clocking.

## **Bibliography**

- [1] http://www.ieee802.org/3/.
- [2] International Roadmap Committee (IRC), International Technology Roadmap for Semiconductors, 2010 edition [Online]. Available: http://www.itrs.net/Links/2010ITRS/Home2010.htm.
- [3] B. Razavi, Monolithic Phase-Locked Loops and Clock Recovery Circuits: Theory and Design, IEEE Press, 1996.
- [4] C.-K. Yang, R. Farjad, and M. Horowitz, "A 0.5-µm CMOS 4-Gb/s serial link transceiver," *IEEE J. Solid-State Circuits*, vol. 33, pp. 713–722, May 1998.
- [5] D.A. Hodges, H.G. Jackson, and R.A. Saleh, *Analysis and Design of Digital Integrated Circuits*, McGraw-Hill, 2000.
- [6] B. Razavi, *RF Microelectronics*, Prentice-Hall, 1998.
- [7] A. Agrawal, P. K. Hanumolu, and G.-Y. Wei, "A  $8 \times 5$  Gb/s source-synchronous receiver with clock generator phase error correction," in *IEEE Proc. CICC*, Sep. 2008, pp. 459–462.
- [8] J. Poulton, R. Palmer, and A. M. Fuller et al., "A 14-mW 6.25-Gb/s transceiver in 90-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 42, no. 12, pp. 2745–2757, Dec. 2007.
- [9] L. Huang, M. Ashouei, F. Yazicioglu, *et al.*, "Ultra-low power sensor design for wireless body area networks: challenges, potential solutions, and applications," *Int. J. Digital Content Technol. and Appl.*, vol. 3, no. 3, pp. 136–148, Sep. 2009.
- [10] A. C. W. Wong, D. McDonagh, O. Omeni, *et al.*, "Sensium: An ultra-low-power wireless body sensor network platform: design & application challenges," in *Proc. 31st Ann. Int. Conf. IEEE EMBS*, Sep. 2009, pp. 6576–6579.
- [11] J. Pandey and B. P. Otis, "A sub-100 μW MICS/ISM band transmitter based on injection-locking and frequency multiplication," *IEEE J. Solid-State Circuits*, vol. 46, no. 5, pp. 1049–1058, May 2011.
- [12] K. Hu, T. Jiang, J. Wang, *et al.*, "A 0.6 mW/Gb/s, 6.4–7.2 Gb/s serial link receiver using local, injection-locked ring oscillators in 90 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 899–908, Apr. 2010.
- [13] B. Razavi, *Design of Analog CMOS Integrated Circuits*. McGraw-Hill, 2000.
- [14] John A. McNeill and David S. Ricketts, *The Designer's Guide to Jitter in Ring Oscillators*. New York: Springer, 2000.
- [15] P. Hanumolu, "ECE599: Phase-Locked Loops –I", Class Notes, 2011.
- [16] J. McNeill, "Jitter in ring oscillators," *IEEE J. Solid-State Circuits*, vol. 32, pp. 870–879, June 1997.
- [17] T. C. Weigandt, B. Kim, and P. R. Gray, "Analysis of timing jitter in CMOS ring oscillators," in *Proc. ISCAS*, June 1994.
- [18] B. Razavi, "A study of phase noise in CMOS oscillators," *IEEE J. Solid-State Circuits*, vol. 31, pp. 331–343, Mar. 1996.
- [19] A. Hajimiri, S. Limotyrakis, and T. H. Lee, "Phase noise in multigigahertz CMOS ring oscillators," in *Proc. Custom Integrated Circuits Conf.*, May 1998, pp. 49–52.
- [20] D. B. Leeson, "Simple model of a feedback oscillator noise spectrum," *Proc. IEEE*, vol. 54, no. 2, pp. 329–330, Feb. 1966.
- [21] A. Hajimiri, S. Limotyrakis and T. H. Lee, "Jitter and phase noise in ring oscillators," *IEEE J. Solid-State Circuits*, vol. 34, no. 6, pp. 790-804, June 1999.
- [22] A. E. Siegman, Lasers. Mill Valley, CA: University Science Books, 1986.
- [23] R. Adler, "A study of locking phenomena in oscillators," *Proc. IRE*, vol. 34, pp. 351–356, Jun. 1946, reprinted in *Proc. IEEE*, vol. 61, pp. 1380–1385, Oct. 1973.
- [24] G. R. Gangasani and P. R. Kinget, "Time-domain model for injection locking in nonharmonic oscillators," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 55, no. 6, pp. 1648–1658, July 2008.
- [25] X. Lai and J. Roychowdhury, "Analytical equations for predicting injection locking in LC and ring oscillators," in *IEEE Proc. CICC*, Sep. 2005, pp. 461–464.
- [26] B. Mesgarzadeh and A. Alvandpour, "A study of injection locking in ring oscillators," in *Proc. IEEE Int. Symp. Circuits and Systems*, vol. 6, pp. 5465–5468, May 2005.
- [27] B. Razavi, "A study of injection locking and pulling in oscillators," *IEEE J. Solid-State Circuits*, vol. 39, pp. 1415–1424, Sept. 2004.
- [28] B. Mesgarzadeh, *Low-Power Low-Jitter Clock Generation and Distribution*, Ph.D. thesis, 2008.
- [29] M. Mansuri and C.-K. K. Yang, "Jitter optimization based on phase-locked loop design parameters," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 11, pp. 1375–1382, Nov. 2002.
- [30] N. Lanka, S. Patnaik, and R. Harjani, "Understanding the behavior of injection locked LC oscillators," in *IEEE Proc. CICC*, Sep. 2007, pp. 667–670.
- [31] J. Lee and H. Wang, "Study of subharmonically injection-locked PLLs," *IEEE J. Solid-State Circuits*, vol. 44, no. 5, pp. 1539–1553, May 2009.
- [32] J. Hu and B. Otis, "A 3 μW, 400 MHz divide-by-5 injection-locked frequency divider with 56% lock range in 90 nm CMOS," in *IEEE Radio Frequency Integrated Circuit (RFIC) Symposium*, June 2008, pp. 665–668.
- [33] W.-Z. Chen and C.-L. Kuo, "18 GHz and 7 GHz superharmonic injection-locked dividers in 0.25 μm CMOS technology," in *Proc. of ESSCIRC*, Sep. 2002, pp. 89–92.
- [34] H. R. Rategh and T. H. Lee, "Superharmonic injection-locked frequency dividers," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 6, pp. 813–821, June 1999.
- [35] C. Ma *et al.*, "A near-threshold, 0.16 nJ/b MICS transmitter with 0.18 nJ/b noise-cancelling super-regenerative receiver for the Medical Implant Communications Service," submitted to *IEEE Trans. Biomed. Circuits Syst.*
- [36] S. Dal Toso, A. Bevilacqua, M. Tiebout, *et al.*, "UWB fast-hopping frequency generation based on sub-harmonic injection locking," *IEEE J. Solid-State Circuits*, vol. 43, no. 12, pp. 2844–2852, Dec. 2008.
- [37] C. Hu, *Energy-Efficient, Short-Range Ultra-Wideband Radio Transceivers*, Ph.D. thesis, Oregon State University, 2011.
- [38] G. Kurian *et al.*, "ATAC: a 1000-core cache-coherent processor with on-chip optical network," in *Proceedings of PACT'10*. New York, NY, USA: ACM, 2010, pp. 477–488.
- [39] C. Batten *et al.*, "Building manycore processor-to-DRAM networks with monolithic silicon photonics," in *HOTI '08*, pp. 21–30, Aug. 2008.
- [40] A. Krishnamoorthy *et al.*, "Progress in low-power switched optical interconnects," *IEEE JSTQE*, vol. 17, no. 2, pp. 357-376, Apr. 2011.
- [41] I. A. Young *et al.*, "Optical I/O technology for tera-scale computing," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 1, pp. 235–248, Jan. 2010.

- [42] M. Georgas, J. Leu, B. Moss, C. Sun, and V. Stojanovic, "Addressing link-level design tradeoffs for integrated photonic interconnects," in *IEEE Custom Integrated Circuits Conf. (CICC)*, Sep. 2011, pp. 1–8.
- [43] B. Casper and F. O'Mahony, "Clocking analysis, implementation and measurement techniques for high-speed data links-a tutorial," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 56, no. 1, pp. 17–39, Jan. 2009.
- [44] J. Jaussi, B. Casper, and M. Mansuri *et al.*, "A 20 Gb/s embedded clock transceiver in 90 nm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2006, pp. 340–341.
- [45] http://www.enablence.com/components/solutions/transmission/photodiodes/product/25-gbs-ingaas-photodiode.
- [46] A. Emami-Neyestanak *et al.*, "A 1.6 Gb/s, 3 mW CMOS receiver for optical communication," in *IEEE Symp. VLSI Circuits Dig.*, Jun. 2002, pp. 84–87.
- [47] C. Kromer *et al.*, "A low-power 20-GHz 52-dBΩ transimpedance amplifier in 80-nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 6, pp. 885–894, 2004.
- [48] K. Hu *et al.*, "0.16-0.25 pJ/bit, 8 Gb/s near-threshold serial link receiver with super-harmonic injection-locking," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 8, pp. 1842–1853, 2012.
- [49] J. Proesel, C. Schow, and A. Rylyakov, "25 Gb/s 3.6 pJ/b and 15 Gb/s 1.37 pJ/b VCSEL-based optical links in 90 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2012, pp. 418–419.
- [50] L. Xia, J. Wang, W. Beattie, J. Postman, and P. Chiang, "Sub-2ps, static phase error calibration technique incorporating measurement uncertainty cancellation for multi-gigahertz time-interleaved T/H circuits," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 59, no. 2, pp. 276–284, Aug. 2011.