Abstract-We analyze the synchronizability of synchronous distributed oscillators (SDOs) [3], a novel clocking scheme for microprocessors. A computer-aided perturbation analysis is developed for such systems, where analytically tractable equations of the clock phases are reduced from experimental data reflecting all circuit details. Using this reduction, a theory is constructed to explain the underlying mechanism of the synchronization in SDOs. It systematically explains the observed phenomena, the existence and stability of the (mutually) synchronized states, and the transition from the in-phase synchronized state to the out-of-phase (but still synchronized) state. Furthermore, the present theory of phase reduction provides a new design principle of coupled oscillators based on "the equation"; a precise delay control (less than the gate delay) circuit can be designed in a simple and general form.
I. INTRODUCTION
D IGITAL large-scale-integrated circuits (LSIs) are generally based on a synchronous scheme: a global clock signal acts as the "conductor of the orchestra," and each computing element acts as a "member of the orchestra," performing its operations synchronously at the command of the "conductor." Distribution of the clock signal in an LSI is thus an issue directly affecting the computing power of the LSI. However, the increasing size of circuits and rising clock frequencies are making it harder for only one "conductor" to distribute the clock signal (hereafter "the clock" for short) within an allowable phase error due to the skew and jitter. Clock networks using distributed voltagecontrolled oscillators are good candidates for overcoming this difficulty because they can make multiple, mutually synchronized "conductors" that distribute the clock to all the "members" in unison [1] - [3] . Namely, this synchronous distributed oscillator (SDO) approach has advantages: it reduces skew and jitter coming from clock buffers and inter-line coupling in the conventional clock distribution approaches. Also, in [3] , static jitter and skew caused by variation in the threshold voltage (as well as the supply voltage at one of the oscillators) is numerically investigated, where both jitter and skew are always reduced, compared to those in the conventional (noncoupled) multiple phase-locked loop (PLL) method. Although experimental data [2] , [3] supports the feasibility of such distributed clock oscillators for high-performance circuits working at gigahertz frequencies, there is still insufficient understanding of the ability and limit of their synchronization. For this purpose, a general and practical theory of such distributed systems is required. However, in such experimental environments, even though synchronization phenomena can be observed clearly, neither the governing equation nor an analytically tractable model is available from the system, and this hampers theoretical insights into the synchronization.
Here, a method is developed to cope with such situations. We derive a set of simple equations for the clock phases, from experimental data reflecting all circuit details. This derivation is based on the phase response curve theory and the averaging method, which has been established in studies of nonlinear physics and mathematical biology [10] - [14] . What is new and important in the present study is the use of the experimental data about the phase response from the (weak) impulsive perturbations to the oscillating element. Such impulsive perturbation analysis has been recently devised in the insightful study by Hajimiri and Lee [4] on the phase noise in electrical oscillators. The present study, on the other hand, focuses on the effect of certain regular perturbations applied to the oscillator, rather than on statistical properties derived from random perturbation. The resulting phase equation of the clock explains the synchronization ability of SDOs and predicts the limit of synchronization due to signal delay and distortion. This prediction agrees with observations by Mizuno and Ishibashi [3] , and it shows the validity of the phase reduction method for this system. The phase reduction method developed for SDOs can also be used as a design methodology for certain coupled oscillators; we can design a circuit of coupled oscillators from its phase equation with desired properties, and time-consuming exploration for circuits with circuit simulators can be eliminated.
In Section II, we briefly review the background and some observations for an SDO circuit. Section III introduces basic ideas and definitions required for the analysis of the system. In Section IV, a computer-aided impulsive perturbation analysis is developed for the phase equation of the system. As a natural consequence, a systematic explanation is obtained for the synchronization observed in SDOs. Furthermore, based on the phase equation, we are led to a new design methodology for coupled multiple oscillators; a precise phase resolution (less than the gate delay) generator is proposed as an example.
II. SDOS AND EXPERIMENTAL RESULTS
The core circuit of the SDO is simple: it consists of CMOS oscillators and wires interconnecting them (Fig. 1) . The oscillators can be ring oscillators (ROs) or differential ring oscillators depending on the application. As shown in Fig. 1(a) , such oscillators generally have an odd number of inverting devices (e.g., NOT-gates) in a closed loop. This creates an unstable state, resulting in robust oscillation with a rectangular waveform being used as the clock signal. Qualitatively, each inverting device can be roughly modeled as a switch (with a finite delay) that inverts a digital HIGH '1'/LOW '0' input to a LOW '0'/HIGH '1' output. However, modeling this with quantitative accuracy is not an easy task because each inverter is a highly nonlinear device, reflecting the physics of solid-state circuits. Here we consider an RO with a CMOS inverter [i.e., a pair consisting of a pMOS gate and an nMOS gate; see Fig. 1(c) ], as used in the experiments of Mizuno and Ishibashi [3] . Semi-empirical models are used in the circuit simulator for the gates to reflect the measured characteristics of the devices (we used the LEVEL3 model in HSPICE [5] , [6] ), so a numerical transient analysis of the system can be done, taking into account the details of the gates. Mizuno and Ishibashi [3] used 0.25-m CMOS gates in their simulations, but we used 1.6/1.2-m (pMOS/nMOS) CMOS gates to make it easier to reproduce the simulation.
Also, the interconnection between oscillators is a dynamical system. In the simplest case, it can be modeled as a chain of resistors and capacitors [an RC chain shown in Fig. 1(b) ] described by linear ordinary differential equations (ODEs). However, the interconnection between oscillators often includes multiple inverters for waveform regeneration, and nonlinear ODEs are required for this case. Thus, a precise description of the core circuit of the SDO requires a large number of lines of code in the circuit simulator, which can have an "if then, else" structure. Thus, the detailed description of the circuit itself is quite complicated and far from analytically tractable. However, certain clear patterns of mutual synchronization were observed in the experiments done by Mizuno and Ishibashi (using a test chip operating at 200-400 MHz [3] ). They also made systematic (transistor-level) circuit simulations to explore the synchronization ability of the SDOs. We followed their simulations (with different CMOS parameters as mentioned above) and obtained the same patterns of the mutual synchronization, which are summarized as follows.
1) The interaction between oscillators did not alter the original waveform in each (uncoupled) oscillating element 1 ; the waveform itself was robust under the interaction. 2) Depending on the length of the wire between adjacent oscillators, the system showed two different stabilized states: complete synchronization with zero phase difference (for shorter ) and synchronization with a constant phase difference (for longer ) (Fig. 2 ).
3) The above two different synchronized states showed a sharp transition at a certain wire length (Fig. 2 ).
The mechanism underlying these phenomena will be clarified in Section IV.
III. BASIC IDEAS AND DEFINITIONS
To make the description simpler, we focus on the case of a one-dimensional (1-D) SDO [ Fig. 1(a) ] here. This simplification is not essential and the same analysis can be made for any network topology.
While the system can be analyzed by dividing it into oscillators (ROs) and interconnections (wires), we use virtual "cells" [the dotted rectangles in Fig. 1(a) ] to group oscillators and wires together. This simple idea turns out to be powerful; it enables the following impulsive perturbation analysis and facilitates its extension to higher dimensional networks and to more complex interconnections having nonlinear effects.
Given a cell, we need information about its phase response to impulsive perturbations, 2 i.e. If we have a stable periodic oscillation, an impulsive perturbation (e.g., a current injection) can be applied at any phase (timing) of the one-cycle oscillation. The perturbed oscillator is pulled back to the original oscillation immediately after the impulse hits it, but a small phase shift (compared to the nonperturbed oscillator) remains. Thus, we can define a function of the phase shift with respect to the timing at which the impulse is applied, and this function is periodic with the period of the clock signal. We call it the ISF here. If we fix the timing of the perturbation, the amount of the phase shift is determined by the amount of the perturbation (the pulse height). For all the cases (of circuits, circuit parameters, and the timing of perturbation) that we considered, a linear relationship held between the phase shift and the pulse height, in a certain region around the zero pulse height [see the dotted central region in Fig. 3(a) ]. We call this region with linear characteristics the linear response region, and it can be numerically (or experimentally) identified. Because of this linearity, the ISF is uniquely defined in the LRR. Interestingly, the ideas of ISF and LRR have been used in studies related to limit-cycle oscillators in nonlinear physics and theoretical biology [9] - [13] . In those cases, the ISF is commonly called the phase response curve (PRC), and the LRR is considered as the neighborhood of the limit-cycle where an isochrone (i.e., a manifold having the same phase) can be defined.
The notions of LRR and ISF for the voltage-controlled oscillator (VCO) are devised in the recent study of phase noise by Hajimiri and Lee. They first showed that a combination of circuit simulators and the notions of LRR and ISF provide a qualitatively and quantitatively powerful tool for jitter analysis and for designing VCOs with low jitter. More recently, Demir et al. presented a rigorous nonlinear analysis of phase noise based on the Floquet theory [14] . They pointed out that the (small) perturbation to the limit-cycle oscillator can be decomposed into the tangential direction of the limit-cycle and the subspace spanned by the remaining Floquet eigenvectors. Thus, for a weakly perturbed single VCO (limit-cycle oscillator), an experimental approach by Hajimiri and Lee and a mathematical foundation by Demir et al. is now established. Then, we are naturally led to the analysis of interacting multiple VCOs. In the next section, we consider the phase dynamics of the distributed, interacting VCOs.
IV. PHASE MODEL AND ITS IMPLICATIONS
In an LSI, the interconnecting wires have high resistance (more than 100 /cm), and the maximum current in them is so small (less than a few milliamperes) that the interacting oscillating elements in the SDO fall into the LRR. Here we consider the LRR and ISF of the virtual cell shown in Fig. 1(a) . As discussed in Section III, the LRR is defined by the relationship between the injected impulse height and the phase shift of the oscillator in the cell, and the ISF is defined by the timing of the impulse and the corresponding phase shift. Then, the (normalized) phase shift at time (due to the unit impulse injected at time ) is given by (1) where is the maximum charge displacement across the capacitor on the node, is the unit step at , and is the ISF, which has a period of . 3 Because each oscillator has a robust oscillation (a limit-cycle), the waveform of the oscillation (and the natural angular frequency ) is robust under the weak interaction between oscillators. However, the instantaneous phase of each ( th) oscillator can be gradually shifted by the perturbation, and this accumulated phase shift can be expressed by (2) where is the current injected into the cell and is the (fixed) time delay for the injected impulse to reach the oscillator. 4 Thus, the phase is determined by the current . Conversely, this current is determined by the interaction of the adjacent cells, the phases and as follows. In each cell ( th cell), the oscillator has a robust waveform (i.e., the clock signal) and drives the voltages on the interconnecting wire. Then, the oscillating voltage on the outer nodes of the th cell is a certain function of the instantaneous oscillation phase of the oscillator:
, as far as the oscillation is stable and the interconnection is fixed. It should be noted that the waveform is not exactly the same as the clock signal on the oscillator, since the clock is delayed and distorted along the wire until it reaches the outer node. As the outer nodes of the cells are resistively connected, the current (from the th cell to the th cell) is given by , where is the resistance between adjacent cells [ Fig. 1(a) ]. Then, (2) becomes (3) or equivalently (4) which implies that a closed relationship holds between and . By setting we obtain
In (5), is small (less than a few mA), and is very large (several hundred megahertz-1 GHz); the time evolution of is much slower than that of . This situation leads naturally to an application of the averaging method [6] [well-known in the ordinary differential (ODE) theory], where the slow motion of can be reduced by integrating with one cycle of . This integration can be handled by using the (numerically obtained) Fourier series of and , which leads to the following form: and (7) where is a (small) frequency deviation from the synchronized frequency ; this frequency deviation comes from the interaction (perturbation) effect as well as the threshold voltage and the supply voltage variation. Process variation may possibly affect the shape of slightly. However, as we will see, introducing a small variation in as well as does not alter the system's synchronization ability.
As in (6), computing requires and . Although has been defined by the maximum charge displacement , the exact value of is not necessarily needed for computing . Instead of , we considered a fixed charge displacement ( that corresponds to the pulse size) and systematically obtained the phase shift for different cases as in Fig. 6 (later). It should be noted that the time delay of the impulse (the phase delay ) vanishes in (6) and (7). However, and contain information about the effects of the time delay, and the shape of reflects the signal delay and distortion.
A. Synchronization Limit of the SDO
In the 1-D linear SDO, the interconnecting wire (RC chain) has a symmetric bidirectional nature, and the currents from both (left and right) adjacent cells do not interact with each other. Thus, for this particular case, the phase equation (7) takes the following form: (8) As shown in Fig. 4 , the form of strongly depends on wire length . Although holds for any wire length (the in-phase state always exists, see Appendix I), the slope at can be positive or negative. For wire lengths less than 40 mm, this slope is positive, and for longer cases ( mm), it becomes negative and another zero crossing point appears as shown in Fig. 4 .
In Mizuno and Ishibashi's study [3] , the global effect of variation (as well as supply voltage variation) at one of the oscillators is numerically investigated, where it is observed that the static skew is proportional to deviation. The static skew is the phase difference between oscillators and it corresponds to in (8) . Also, the natural frequency deviation is proportional to deviation. From (8) , it is clear that the static skew is proportional to if the oscillators are mutually synchronized . Thus, the presented theory is consistent to the observation in [3] regarding the effect of deviation. If we consider a 1-D circular network of SDO, a synchronized state with uniform phase differences: with ( : integer) is possible. For such a solution of (7), linear stability theory proves that for positive the synchronized state is stable and for negative it becomes unstable (see Appendix II). For example, let's consider circular SDOs with five or six (an odd or even number of) oscillators. Numerical simulations (with HSPICE) show that, for mm, only in-phase synchronized states are observed for both cases. This can be explained theoretically as follows: for , and implies that this state is stable. On the other hand, for a (formal) solution with , a certain amount of the frequency deviation is required because must be satisfied. However, is restricted to some small range around because in ROs the oscillation period is proportional to the gate delay of each invertor and this cannot be altered much by the weak interaction due to
. Then, such a phase-lagged solution is not realized. Thus, the in-phase solution with is the only stable state for the shorter wire case. In contrast to the shorter wire case, for larger ( mm), only phase-lagged synchronized states are observed numerically for the case of . Although the in-phase state is possible ( , as shown in Fig. 4 ), the slope of is negative and this implies that the in-phase state is unstable and cannot be realized. On the other hand, the phase-lagged (anti-phase) state with is realized for the case of , because are positive, and has a zero crossing point close to and becomes small (as observed in the case of 80 mm in Fig. 4 ). For the case of , the phase-lagged state with does not exist, and neither the in-phase nor the phase-lagged synchronized states are observed.
As we have observed, there is a critical length ( mm) between the stable in-phase state and the phase-lagged state, at which holds. This critical length is unpractically large for the CMOS parameters we considered. However, it scales down as the CMOS gates are scaled. As shown in Fig. 2 , is around 14 mm for 0.25-m CMOS parameters (with an operating frequency of around 600 MHz), where the value of is in the "practical" range. Although we have chosen a particular parameter set for the computer-assisted derivation of the phase models, the predicted in-phase to phase-lagged state transition does not depend on a particular set of parameters. Our theory thus agrees with Mizuno and Ishibashi's results [3] . Although the shape of gradually approaches a sinusoidal wave as becomes larger, the position of the (on these curves) is observed to gradually shift to the right as becomes larger. This can be understood by the fact that and reflect the internal delay of the cell and this delay is mapped to the phase shift of . Thus, the emergence of the phase-lagged synchronized state can be explained by the deformation of , and the underlying mechanism of a sharp transition at (but still supposed to be continuous) is clarified. The above predictions guarantee the correct function of the SDO for , that is, there is a unique, stable, in-phase state. The theory also predicts that a limit of the SDO appears at . Beyond , there is no stable in-phase state, which means that the SDO no longer provides correct synchronous clocking for LSIs. Although we have not proved that matches half the wavelength of the clock signal on the wire, we do have numerical evidence for it. Therefore, for the SDO to function correctly, the wire length should be shorter than a certain length, which is thought to be inversely proportional to the clock frequency, that is, there is an upper limit on the operating frequency in the SDO.
B. Novel Circuit Design
Having obtained a method for reducing the interacting oscillators into a set of equations in terms of the phases, it is now possible to design and analyze circuits based on "the equation." We consider here a precise delay generating circuit as a practical example. In a single RO, the delay resolution (minimum phase difference between two voltage signals on the nodes) is bounded by a single gate delay . However, if multiple ROs are coupled in a certain manner, the delay resolution can be of the order of . In the previous (and first) circuit proposed for this purpose, special 2-input, 1-output buffers (instead of usual CMOS inverters) were developed, and the resulting circuit had two-dimenisonal network topology with many interconnections [8] . Their circuit seems beautiful and somewhat ingenious. However, a question arises: if certain good oscillators (in particular, ones having noise immunity) are available, is it not possible to achieve the function by simply interconnecting those oscillators (instead of developing a special buffer)?
Based on "the equation" discussed above and an analogy to the central pattern generator (CPG) circuit [11] , a simpler and general formulation of circuits is possible, where the network topology is 1-D (fewer interconnections), and each oscillator can be any type of oscillator (in practice, noise immunity is essential). As a realization of the above idea, we consider here the simplest case, i.e., ROs and one inverter on the wire (Fig. 5) . The cell can be defined as in Fig. 5 . For this choice of the cell, we have checked that the voltage waveforms at nodes (l), (a), and (r) in the isolated cell (see 5 ) coincide with those in the full system (when the cells are interconnected). Thus, this choice of cell is well defined. In contrast to the symmetric interaction in the SDO, in this CPG-like circuit, the interaction is highly asymmetric; the LRR and ISF must be defined separately for the impulses from the left and right directions. Then, the phase equation for this system takes the following form: (9) In (9), unlike (8) for the SDO, the effect from the left ( )th cell dominates and from the right ( th) cell is negligible, because the phase shift is negligible for any timing of the perturbation from the node (r). has multi-modal characteristics and there are two zero crossing points as shown in Fig. 7 ; one is very close to (this is analytically obtained in Appendix I). At this point, the slope is negative, and the other one is close to and and here is positive and can be close to for a certain . This implies that in (9) , so the phase-lagged state is the only stable synchronized state (see Appendix II). Numerical simulation of this system verifies this prediction (as shown in Fig. 8 for the case). Thus, a difference between our CPG-like circuit and the SDO can be clarified from their phase equations (8) and (9) . For the CPG-like circuit in Fig. 5 , the phase-lagged state (uniform phase difference of radian) is the only stable state, as opposed to the in-phase state (no phase difference) for the SDO. This prediction was verified using HSPICE under the same simulation (9)) with resistance R = 100 . protocol used in the above SDO case (Fig. 8) , including the simulation in a slightly noisy environment.
V. CONCLUSION
This study has presented a computer-aided phase reduction method for coupled oscillators; the SDO. The method is based on the fact that the oscillators interact weakly in a practical situation on an LSI chip, and on the use of the ISF (or PRC) from experimental data. Explicit dynamical phase equations for ROs and coupled ROs can be derived in the first place, which enables theoretical insights into their synchronization ability as well as synchronization limit. The presented phase reduction has general applicability, and this benefits an "equation-based" design of coupled oscillator systems that have a simple and general structure.
APPENDIX I ANALYTICAL DESCRIPTION OF THE PHASE COMPARISON CHARACTERISTICS
This appendix shows an analytical approach to describe some properties of : a zero crossing point of and the slope of at that point. From the definition (6), the phase comparison characteristics can be determined by and . For the 1-D SDO, the waveforms of and are exactly the same and can be given by (10) and the corresponding ISF becomes (11) with . Then, is obtained as (12) From (12), we have , which explains why all phase comparison characteristics in Fig. 4 (obtained from the experimental data of the ISFs) pass the .
For the CPG-like circuit, the waveforms of are slightly different and the phase shift due to the inverter should be taken into account as follows: (13) and (14) with . Then we have (15) If and , then holds. However, as and , and and are slightly different, the zero crossing point of (15) is slightly shifted from . This explains why the zero crossing point in the experimental data is close to , as shown in Fig. 7 . Now we consider the situation where a "cell" (defined in Section III) is perturbed by an external impulse. For this cell, a limit cycle is defined in the (high-dimensional) state space and the external impulse is given by a perturbation vector at some point of the limit cycle . In the neighborhood of , a local coordinate is given by the tangential direction of and the "isochrone: which defines the phase of the oscillation. On this coordinate, the perturbation is decomposed into two elements: one is tangential to and the remaining is tangential to the isochrone. In the present situation we consider, the limit cycle is strongly attractive and the remaining element dies out immediately (after a signal delay in the cell). It is noted that the tangential direction of is not necessarily orthogonal to the isochrone. (A simple model is analyzed to consider this nature in [15] .) However, if this orthogonality is satisfied (at least approximately), a great deal of simplification can be made in the expression of . Here we will consider such a hypothetical situation and see what can be concluded.
If the orthogonality is assumed, the phase shift (due to ) is simply given by (16) where denotes the time derivative of and it is tangential to . If we consider the situation where the state variables are node voltages and an impulse is applied to the th node, then , and are respectively given as , and , where denotes the derivative with respect to phase . Then, the phase shift (16) becomes (17) By the definition of the ISF in (1), corresponds to . In our particular examples (1-D SDO and CPG-like circuit), corresponds to the interaction of the adjacent cells;
. It should be noted that the cell in these cases is spatially extended and the denominator contains information about the signal delay due to the spatially extended interconnections.
If the denominator is a constant, then holds, so and is obtained. This implies that the phase comparison characteristics are an odd function and do not match the experimental data shown in Fig. 4 . Here, the above insight that reflects the spatially extended nature of the cell is essential to explain the characteristics of . This insight and (12) predict that the slope is positive for shorter wire lengths (where can be considered to be nearly a constant and ) and can be negative for longer wire lengths because reflects a larger signal delay in the cell and deviates from a constant. This prediction is consistent with the experimental data in Fig. 4 where changes from positive to negative as the wire length becomes larger.
APPENDIX II
This appendix shows a criterion for the linear stability of the mutually synchronized states. We assume a synchronized state and introduce a small perturbation to such that . Then, (8) 
in which , and . The matrix in (19) is known as a circulant matrix and its eigenvalues and eigenvectors can be explicitly obtained as (20) Then, holds for any , and , and the stability of the synchronized states can be determined by in (19). For the in-phase state with and implies it is stable, and for it is unstable. Also, if the phase-lagged synchronized state satisfies , then it is stable (unstable). The same analysis can be made for the stability for (9) . If the term in (9) is neglected, the linear ODE for the perturbation is obtained from (19) by setting , and the same stability criterion is obtained. For the synchronized state is stable and for , it is unstable.
