Abstract: This paper presents a new standing-wave clock distribution scheme utilizing an inductor loading technique. The scheme can distribute global clocks of several tens of GHz with uniform phase and amplitude over a whole chip. By loading an adequate inductor, the grid pitch can be made finer than that of the conventional standing-wave technique, and the clock frequency can be designed independently of the grid pitch. We designed a 20 GHz 6 x 6 grid clock distribution network using a 0.18 µm CMOS technology. For all 36 clocks, circuit simulation showed a skew of less than 1.3% of the clock period and an amplitude deviation of 2%.
Introduction
Global clock distribution is becoming increasingly difficult for multi-GHz microprocessors. Timing uncertainty must shrink with the clock period, but skew and jitter are proportional to latency, which does not scale with the clock period in the conventional H-tree structure. In addition, clock power consumption dominates the total power because the clock is the most active signal. Almost all global clock distributions today take the form of tree-driven grids. These trends worsen clock skew and jitter, and make global clock power consumption a growing concern. Resonant techniques [1]-[3] are one answer. Traveling-wave clock distributions [1] use coupled transmission line rings to generate low-skew and low-jitter clocks, but must contend with nonuniform phase across the distribution. Standing-wave clock distributions have also been proposed at the chip level [2]. That scheme attained low-skew and low-jitter clocks, but its amplitude varied spatially across the network. In addition, the resonant frequency places a physical restriction on the grid pitch of the distribution network.
To overcome these problems, we propose a new standing-wave clock distribution scheme. The principle of the technique is explained by analyzing the phase behavior of a transmission line with an inductive load, and simulation results for a 20 GHz clock distribution network in a 0.18 µm CMOS technology are presented.
A Standing-Wave Clock Distribution with Inductor Loading
Topology of a Standing-wave Clock Distribution
Figure 1 shows the proposed standing-wave clock distribution scheme. For simplicity, a 6 x 6 clock grid is shown. It has a 2D grid structure of differential transmission lines with 36 grid points. An inductor is placed at each grid point together with a cross-coupled transistor pair, which realizes a negative resistance that compensates the loss of the transmission line. The inductor acts as an inductive load and leads the phase of the reflected wave. Thus, the key idea is to cut the low-amplitude segment away from a conventional standing wave by employing a lumped inductive load. The proposed clock distribution scheme provides a uniform-phase and almost uniform-amplitude global standing-wave clock to the whole chip. By loading an adequate inductor, the grid pitch can be made finer than that of the conventional standing-wave technique, and the clock frequency can be designed independently of the grid pitch. Since the fine-pitch driver structure does not require deep tree driving, latency, skew, jitter and power consumption can all be reduced.
Analysis of standing wave on transmission line with inductive load
Figure 2 (a) shows a simple structure: a transmission line with inductive loads at both ends. As a counterpart, a long shorted transmission line without inductive load is shown in Fig. 2 (b). It has the conventional standing-wave resonance mode, and its first resonant standing wave is illustrated in Fig. 2 (b). As is well known, a standing wave is formed by superposing the incident wave and the reflected wave.
Let us consider an incident wave V_r(Y) and a reflected wave V_l(Y) at location "Y", as shown in Fig. 2 (b). The reflected wave phase θ_l(Y) can be expressed with the incident wave phase θ_r(Y):
where
and β is the phase constant of the transmission line and l is the distance between the location "Y" and the end of the transmission line. Γ_sh and ∠Γ_sh are the reflection coefficient of the shorted transmission line and its phase, respectively.
On the other hand, the incident wave and the reflected wave at location "X" shown in Fig. 2 (a) can be expressed as
and ω is the resonance angular frequency. Here, if the inductive load satisfies the following equation
the phase differences between the incident and the reflected waves are identical in (a) and (b)
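The phase argument above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not necessarily the authors' exact formulation: the line is taken as lossless with characteristic impedance $Z_0$, the load is an ideal lumped inductance $L$, phases follow the $e^{j\omega t}$ convention, and the symbols $Z_0$, $L$ and $\Gamma_L$ are introduced here only for illustration.

```latex
\begin{align}
  % Shorted line of Fig. 2 (b), observed at Y, a distance l from the shorted end
  \theta_l(Y) &= \theta_r(Y) - 2\beta l + \angle\Gamma_{sh},
  & \Gamma_{sh} &= -1, \quad \angle\Gamma_{sh} = \pi \\
  % Inductively loaded line of Fig. 2 (a), observed at the load X
  \theta_l(X) &= \theta_r(X) + \angle\Gamma_L,
  & \Gamma_L &= \frac{j\omega L - Z_0}{j\omega L + Z_0}, \quad
    \angle\Gamma_L = \pi - 2\arctan\frac{\omega L}{Z_0} \\
  % Matching condition on the load (valid for 0 < \beta l < \pi/2)
  \omega L &= Z_0 \tan(\beta l)
  & \Longrightarrow \quad
  \theta_l(X) - \theta_r(X) &= \theta_l(Y) - \theta_r(Y) = \pi - 2\beta l
\end{align}
```

Under these assumptions, an inductor with reactance $\omega L = Z_0\tan(\beta l)$ presents the same impedance as a shorted stub of length $l$, so it can stand in for the part of the line nearest the short, where the standing-wave amplitude is lowest.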
In this case, structure (a) supports a standing wave in which the low-amplitude segment of the conventional standing wave is cut away, as depicted in Fig. 2 (a). Furthermore, (a) can have the same resonance frequency as (b), as explained shortly, despite its shorter transmission line. The proposed standing-wave technique with inductive load therefore yields uniform-phase and almost uniform-amplitude standing-wave oscillation on a short transmission line; that is, a finer grid can be used than with the conventional standing-wave technique. By similar analysis, the resonance frequency can be directly calculated as
where l_t is the length of the transmission line of Fig. 2 (a). This shows that the resonance frequency can be controlled by the value of the inductive load while keeping the transmission line short.
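As a numerical illustration of how the load inductance sets the resonance frequency, the sketch below assumes a lossless single-ended line of length l_t terminated by an ideal inductor L at each end, for which the first resonance satisfies tan(β l_t / 2) = Z_0 / (ωL) with β = ω / v_p. This condition, the function names, and the values Z_0 = 50 Ω and v_p = 1.5e8 m/s are assumptions made for illustration; they are not taken from the paper, and the differential lines, the 2D grid, and the loading by the cross-coupled pairs are ignored.

```python
import math


def inductance_for_frequency(f, l_t, z0, v_p):
    """Inductance L (H) that satisfies tan(beta*l_t/2) = z0/(omega*L) at f.

    Only meaningful below the half-wave resonance v_p / (2 * l_t),
    where the tangent is positive.
    """
    omega = 2.0 * math.pi * f
    half_angle = omega * l_t / (2.0 * v_p)  # beta * l_t / 2
    return z0 / (omega * math.tan(half_angle))


def resonance_frequency(l_t, inductance, z0, v_p, iterations=200):
    """Solve tan(beta*l_t/2) = z0/(omega*L) for the first resonance (Hz).

    The residual g(f) rises monotonically from -inf to +inf on
    (0, v_p / (2 * l_t)), so plain bisection finds the unique root.
    """
    f_half_wave = v_p / (2.0 * l_t)  # first resonance of the bare shorted line

    def g(f):
        omega = 2.0 * math.pi * f
        return math.tan(omega * l_t / (2.0 * v_p)) - z0 / (omega * inductance)

    lo = f_half_wave * 1e-9
    hi = f_half_wave * (1.0 - 1e-9)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical line parameters (assumptions, not values from the paper).
Z0_OHM = 50.0
V_P_M_PER_S = 1.5e8
# Line length between inductors as stated in the test chip description.
L_T_M = 400e-6

required_l = inductance_for_frequency(20e9, L_T_M, Z0_OHM, V_P_M_PER_S)
f_loaded = resonance_frequency(L_T_M, required_l, Z0_OHM, V_P_M_PER_S)
f_unloaded = V_P_M_PER_S / (2.0 * L_T_M)

print(f"inductance for 20 GHz: {required_l * 1e9:.3f} nH")
print(f"loaded resonance: {f_loaded / 1e9:.3f} GHz")
print(f"bare half-wave resonance: {f_unloaded / 1e9:.1f} GHz")
```

In this simplified model the same 400 µm line resonates far below its bare half-wave frequency once the inductors are added, and a larger inductance lowers the resonance further, which mirrors the claim that the frequency is set by the load rather than by the line length alone.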
Test Chip Design and Simulation Results
A 20 GHz clock distribution network with the new scheme was designed using a 0.18 µm digital CMOS technology. The transmission line and the inductor were designed as shown in Fig. 3 (a). The inductor is implemented as a spiral inductor in the 4th and 5th metal layers. The differential transmission lines are implemented as a coplanar structure in the 6th metal layer. VDD and GND lines are placed under the transmission line in the 2nd and 1st metal layers, respectively. The outer and inner diameters of the inductors are 70 µm and 50 µm, respectively, and the number of turns is 1.75. The transmission line length between the inductors is 400 µm. The cross-coupled oscillators are distributed around the inductors. Figure 3 (b) shows the layout of the test chip with the 6 x 6 grid structure. Circuit simulations were carried out for the 6 x 6 grid structure. The inductors, differential transmission lines and VDD-GND lines were modeled as rational functions for time-domain simulation. The parameters of the rational functions were approximated from S-parameters calculated by an electromagnetic field solver [4]. Figure 3 (c) shows simulated 20 GHz oscillating waves at the center and both ends of one of the differential transmission lines. The difference of the amplitudes was less than 32 mV, which is 2% of the peak-to-peak amplitude. Figure 3 (d) shows simulated oscillating waves at all 36 grids. When the inductors have ±10% variation, the skew is less than 670 fs, which is 1.3% of the clock period. The power consumption is 10.8 mW/grid at a 1.8 V supply voltage.
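The reported figures can be cross-checked with simple arithmetic. The sketch below uses only numbers stated above. The derived quantities (implied peak-to-peak amplitude, total power over 36 grids, supply current per grid) are inferences that assume every grid draws the reported per-grid power and that the 2% figure refers exactly to 32 mV. Note that 670 fs over a 50 ps period is about 1.34%, which rounds to the 1.3% quoted.

```python
# Figures stated in the text.
CLOCK_FREQUENCY_HZ = 20e9
SKEW_S = 670e-15
AMPLITUDE_DIFFERENCE_V = 32e-3
AMPLITUDE_DEVIATION_FRACTION = 0.02
POWER_PER_GRID_W = 10.8e-3
SUPPLY_VOLTAGE_V = 1.8
GRID_COUNT = 6 * 6

# Clock period and skew as a fraction of it.
period_s = 1.0 / CLOCK_FREQUENCY_HZ
skew_fraction = SKEW_S / period_s

# Derived values (inferences, not reported in the text).
implied_peak_to_peak_v = AMPLITUDE_DIFFERENCE_V / AMPLITUDE_DEVIATION_FRACTION
total_power_w = POWER_PER_GRID_W * GRID_COUNT  # assumes identical grids
current_per_grid_a = POWER_PER_GRID_W / SUPPLY_VOLTAGE_V

print(f"clock period: {period_s * 1e12:.1f} ps")
print(f"skew / period: {skew_fraction * 100:.2f} %")
print(f"implied peak-to-peak amplitude: {implied_peak_to_peak_v:.2f} V")
print(f"total power for {GRID_COUNT} grids: {total_power_w * 1e3:.1f} mW")
print(f"supply current per grid: {current_per_grid_a * 1e3:.1f} mA")
```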
Conclusion
A standing-wave clock distribution scheme with inductive loads has been presented. The key technique is to cut the low-amplitude segment away from the conventional standing wave by employing the phase shift of an inductive load. We designed a 20 GHz global clock distribution network in a 0.18 µm digital CMOS technology and confirmed high-quality clock distribution: the clock skew among all grids was less than 1.3% of the clock period, the amplitude deviation was less than 32 mV, and the power consumption was 10.8 mW/grid.
