Abstract-We describe a new design technique for efficient harmonic resonant rail drivers. The proposed circuit implementation is coupled to a standard pulse source and uses only discrete passive components and no external dc power supply. It can thus be externally tuned to minimize the power consumed in the target IC. A new design technique based on current-fed voltage pulse-forming network theory is proposed to find the value of each discrete component for a target frequency and a given load capacitance. The proposed circuit topology can be used to generate any desired periodic 50% duty-cycle waveform by superimposing multiple harmonics of the desired waveform; however, this paper focuses on the generation of trapezoidal-wave clock signals. We have tested the driver with capacitive loads between 38.3 and 97.8 pF and clock frequencies ranging from 0.8 to 15 MHz. The overall power dissipation of our second-order harmonic rail driver is 19% of $fCV^2$ at 15 MHz with a 97.8-pF load.
Voltage-Pulse Driven Harmonic Resonant Rail Drivers for Low-Power Applications
I. INTRODUCTION

LOW POWER has become a critical feature of many CMOS-VLSI systems because of the increasing demand for longer battery life and the high cost of heat removal. Because clocking circuitry is typically a significant source of power dissipation [1], reducing the power consumed by clock drivers and clock nets has become an important focus. Because clock nets are mostly capacitive, resonant charging techniques that recycle most of the energy stored in clock nets are increasingly promising. The simplest resonant charging technique uses the flyback circuit shown in Fig. 1 to generate a sinusoidal clock signal [2]. Although the circuit is simple, its energy efficiency is poor if the nMOS transistor is driven nonresonantly, which has generally been the case. The blip circuit [3], illustrated in Fig. 2, is more efficient because it is all-resonant, i.e., the energy used to drive every transistor is recycled. This circuit has been used successfully as an efficient power source for drivers of large on-chip signal lines of microprocessors [2], [4] but can also be used to generate two-phase almost-nonoverlapping sinusoidal clocks. A common disadvantage of both these clock drivers is that the output signal frequency and magnitude depend heavily on the load capacitance $C_L$. Because the value of $C_L$ may be data-dependent and can thus vary from cycle to cycle, the clock frequency may also fluctuate, thereby decreasing performance and increasing design effort [5]. Of the two drivers, the frequency fluctuation in the blip driver is more pronounced because of the positive-feedback nature of its two outputs. Another disadvantage of both of these drivers is the need for a distinct dc power supply whose value is determined by the load capacitance and the target frequency.
Lastly, while sinusoidal clock signals are well-suited for special adiabatic circuits [6], [7], their slow slew rates cause two problems for conventional clock nets. First, while adiabatic circuits have special circuitry that prevents slow slew rates from causing high short-circuit current, conventional clock buffers, flip-flops, and latches lack these features and thus require relatively fast clock slew rates. Second, slow slew rates increase the variation in effective clock skew and clock-to-output delay, which may considerably affect performance and system stability. Younis and Knight [8] developed an incremental design approach for a class of efficient harmonic rail drivers that solves these problems. Their drivers approximate a desired square wave (with 50% duty cycle) by superposing its first $N$ harmonics, as illustrated by the third-order driver in Fig. 3. These drivers, however, require distinct dc power supplies, which is prohibitive for most practical implementations.
In this paper, we present a new systematic design approach for $N$th-order harmonic resonant rail drivers that do not require additional dc power supplies. Linear network theory is normally applied to predict the waveform generated by a network of passive components. Our design approach applies it to the inverse problem. That is, we use linear network theory to systematically derive a network of passive components that generates $N$th-order approximations of any desired clock waveform with 50% duty cycle that can be expressed as a periodic trapezoid. In this way, we can approximate both ideal square waves and more practical waveforms with finite rise and fall times. In particular, we use linear network theory to develop a noniterative method for calculating the component values given the desired waveform shape and the nominal value of the load capacitance.
The topology of our proposed driver is based on a modified current-fed voltage pulse-forming network (CFVPN) [9] . This network is traditionally connected to a constant current source, which internally consumes significant power. In contrast, we propose using a conventional pulse generator that consumes much less internal power and is readily available in most systems. Moreover, it requires no additional distinct dc voltage/current supply and reduces the impact of variations in load capacitance on fluctuations in output magnitude and frequency. Self-oscillating resonant circuits such as flyback and blip circuits cannot be trivially synchronized to an external clock signal connected to other blocks in the system. However, this can be easily achieved in our design because it is driven by an external pulse generator.
Our proposed design approach has been implemented and tested for frequencies up to 15 MHz with various load capacitances. The worst case overall power dissipation of the second-order driver is 19% of $fCV^2$ at 15 MHz with a 97.8-pF load. Magnitude and frequency fluctuations due to a broad range of load capacitance variation are observed to be minimal. In addition, the power efficiency as a function of load capacitance and input pulse frequency variations is quantified. The remainder of this paper is organized as follows. In Section II, we briefly review the theory of waveform synthesis using current-fed voltage pulse-forming networks. Section III describes our systematic approach to identifying the values of all driver components. Then, Section IV discusses practical implementations, Section V presents laboratory measurement results, and Section VI concludes with a discussion of potential applications and future work.
II. CURRENT-FED VOLTAGE PULSE-FORMING NETWORK
This section reviews standard implementations of Fourier series approximations of periodic trapezoidal waveforms using current-fed voltage pulse-forming networks.
A trapezoidal wave, shown in Fig. 4, is defined piecewise over each period by (1), where the period index is an integer. Because the trapezoidal waveform is an odd function, its Fourier series (2) contains only sine terms, with the coefficients given by (3).
In practice, only the first few terms are needed to yield a waveform that closely approximates an ideal trapezoidal wave. Notice that the model approaches a square wave as the transition time approaches zero. A CFVPN that can generate an output voltage consisting of the superposition of $N$ harmonics is shown in Fig. 5 [9]. To analyze the output voltage, first assume that switch S opens at $t = 0$ and that no energy is initially stored in the network. The voltage across each individual section is then given by (4).
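The effect of truncating the series can be illustrated with a short numerical sketch. It assumes one particular parameterization of the odd trapezoid (a linear ramp from -amplitude to +amplitude over the interval from -ramp to +ramp, with ramp no larger than a quarter period); this parameterization and the function names are illustrative assumptions and may differ from the definitions used in (1)-(3).

```python
import math

def trapezoid_coefficient(n, amplitude, period, ramp):
    """Sine coefficient b_n of an odd trapezoid (assumed parameterization).

    The wave ramps linearly from -amplitude to +amplitude over [-ramp, +ramp]
    and is flat elsewhere; ramp must satisfy 0 < ramp <= period / 4.
    """
    if n % 2 == 0:
        return 0.0  # half-wave symmetry removes the even harmonics
    omega = 2.0 * math.pi / period
    return (2.0 * amplitude * period * math.sin(n * omega * ramp)
            / (math.pi ** 2 * n ** 2 * ramp))

def harmonic_sum(t, order, amplitude, period, ramp):
    """Superpose the first `order` odd harmonics (1, 3, 5, ...) at time t."""
    omega = 2.0 * math.pi / period
    return sum(trapezoid_coefficient(n, amplitude, period, ramp)
               * math.sin(n * omega * t)
               for n in range(1, 2 * order, 2))
```

As the ramp shrinks toward zero, the first coefficient tends to the familiar square-wave value of 4/pi times the amplitude, which matches the remark that the model approaches a square wave for vanishing transition time.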
Cascading $N$ such sections in series yields (5) for the output voltage. With this analysis, it is straightforward to determine the values of all network components to approximate a trapezoidal waveform defined by (2) and (3). In particular, by comparing (5) with (2), the network parameters for both square and trapezoidal waveforms can be easily determined, as summarized in Table I. As is, however, this network cannot be directly used as a clock rail driver because none of the capacitances in the network represents a load capacitance that resides between the output node and ground. To meet this requirement, an equivalent network can be derived through mathematical transformations of the impedance and admittance functions of the output, given in (6) and (7).
Notice that the impedance function $Z(s)$ has zeros at $s = 0$ and $s = \infty$, which in turn appear as poles in the admittance function $Y(s)$. We can, therefore, rewrite (7) as (8).
This transformation enables the output voltage to be generated using the alternative circuit topology illustrated in Fig. 6, which is now suitable as a clock rail driver because it has an explicit clock load capacitance that lies between the output and ground. To find the values of the other components of Fig. 6, we can use a partial fraction expansion of the admittance function [10], which is an iterative numerical procedure that provides little insight into the operation of the network. In Section III, we present a characteristic equation that constrains the component values so that only the desired frequency components are produced in the network and that, together with a set of linear equations, provides a more insightful closed-form expression for the component values.
III. THEORETICAL ANALYSIS OF CFVPN
There are three steps in our theoretical and algorithmic analysis of the CFVPN and its desired component values. First, we convert all of the node voltage and branch current equations from the time domain to the frequency domain using the Laplace transform. Second, the branch current equations are simplified to find a characteristic equation whose roots are the products of the inductance and capacitance of each branch such that all unwanted frequency components are suppressed. Third, using these roots, we establish a set of linear equations by applying Kirchhoff's current law (KCL) at the output node to identify all of the inductor values in the network. These values are combined with the roots of the characteristic equation to identify all of the capacitor values.
A. Step 1: Convert Voltage and Current Equations to Frequency-Domain Representation
To ease the tedium and complexity of solving the integral and differential equations, we use the Laplace transform to convert the voltage and current equations into the frequency domain. Since the two networks shown in Figs. 5 and 6 are equivalent, we write the Laplace transform of the output voltage of Fig. 6 by approximating (2) to the $N$th order, as given in (9)-(11). By noting that the voltages across all of the branches in Fig. 6 equal the output voltage, it is straightforward to derive the Laplace transform of each branch current, beginning with (12).
By applying KCL at the output node of the network, the relationship among the branch currents can be defined by (20). Because the right side of this equation contains no matching term, the corresponding term in each branch current must evaluate to zero, implying the additional constraint (21). Equations (18) and (21) are combined to produce the characteristic equation shown in (22) and (23). Notice that the numerator of the characteristic equation is an order-($N-1$) polynomial. The roots of this numerator polynomial are the roots of the entire characteristic equation, which can be represented as in (24).
C. Step 3: Set Up Linear Equations to Find the Inductor Values and Combine With the Roots of the Characteristic Equation to Find the Capacitor Values
Using (24), we can substitute each root with the product of the inductance and capacitance of the corresponding branch in (22) and use the result to simplify (20), yielding (25). Comparing both sides of (25), the linear equations shown in (26) determine the inductor values. Note that these values can be combined with (24) to calculate the capacitance values. As an example, consider the task of finding the values of all components of the second-order square-wave driver for a 1-MHz clock and a 100-pF load. From (23), we obtain the root given in (27). Using this value, we can rewrite (26) as (28). Solving these equations gives the two inductor values, in microhenries. Lastly, since the root fixes the product of the inductance and capacitance of the series branch, the remaining capacitance, in picofarads, follows directly.
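The second-order example can be cross-checked numerically with a minimal sketch. It does not use the closed-form equations (23) and (26); instead it applies the partial fraction expansion of the admittance function mentioned in Section II, under two assumptions that are not taken from the text: the components are ideal, and for a square wave every section of the series-form network has the same capacitance, equal to the order times the load capacitance. The function name and return layout are hypothetical.

```python
import math

def square_wave_driver(f0, c_load, order):
    """Return (l_shunt, [(l_k, c_k), ...]) for an ideal square-wave driver.

    Assumes the series-form network has equal section capacitances
    (order * c_load), each resonating with its inductor at an odd harmonic.
    """
    w0_sq = (2.0 * math.pi * f0) ** 2
    harmonics = [2 * k + 1 for k in range(order)]      # 1, 3, 5, ...
    c_section = order * c_load

    # With x = s^2 / w0^2, the admittance obeys Y/s = c_section / (x * q(x)).
    def q(x):
        return sum(1.0 / (x + n * n) for n in harmonics)

    def dq(x):
        return -sum(1.0 / (x + n * n) ** 2 for n in harmonics)

    # The pole of Y at s = 0 corresponds to the single shunt inductor.
    l_shunt = q(0.0) / (c_section * w0_sq)

    branches = []
    for n_lo, n_hi in zip(harmonics, harmonics[1:]):
        # q has exactly one zero between consecutive poles; bisect for it.
        lo, hi = -float(n_hi * n_hi), -float(n_lo * n_lo)
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if q(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        root = 0.5 * (lo + hi)
        residue = 1.0 / (root * dq(root))              # residue of 1/(x*q(x))
        l_k = 1.0 / (c_section * residue * w0_sq)
        c_k = c_section * residue / (-root)
        branches.append((l_k, c_k))
    return l_shunt, branches
```

For a 1-MHz clock and a 100-pF load, `square_wave_driver(1e6, 100e-12, 2)` yields roughly 140.7 uH for the shunt inductor and roughly 79.2 uH with 64 pF for the series branch. These are outputs of the sketch under its stated assumptions, not values quoted from the measurements.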
IV. VOLTAGE-PULSE DRIVEN PULSE-FORMING NETWORK
Even though the CFVPN shown in Fig. 6 has an appropriate configuration for our target applications, two problems preclude the network from being applied directly as a clock rail driver. First, the network must be driven by a dc current source, which in practice consumes a large amount of power internally, canceling out the benefits of the CFVPN clock rail driver. Second, the waveform swings symmetrically between a negative and a positive peak, as opposed to between 0 and the positive supply level required for driving CMOS clock nets. We propose a solution that overcomes both impediments.
A. Voltage-Pulse Driven Network for dc Current Source Elimination
In theory, to eliminate the dc current source of the CFVPN, we can use the equivalent network shown in Fig. 7, which is triggered by a voltage source generating a waveform identical to the desired output; such a source provides no current and thus consumes no power. However, it is impractical to build a voltage source that generates a waveform matching the desired $N$th-order harmonic voltage waveform. Thus, we propose a solution that uses a more practical voltage pulse whose undesired harmonics are effectively absorbed using a series resistor.
The cheapest source of voltage pulses is a conventional clock oscillator that, given finite rise and fall times, approximates a trapezoidal wave. The first $N$ harmonics of the trapezoidal waveform should match those of the network. However, the trapezoidal-wave clock signal also contains harmonics of higher order than those generated by the network. Left unchecked, these would cause significant current draw from the voltage source, reducing the power efficiency of the proposed rail driver.
We propose to reject these higher order harmonics from the input pulse generator by placing a resistor $R$ between the input and output of the network, as depicted in Fig. 8. To understand the benefits of adding this resistor, we first write the impedance function $Z$ of the original network as (29). Note that (29) is the impedance function of the network shown in Fig. 5, which is equivalent to the network of Fig. 6. Then, by adding the resistance $R$, the overall impedance seen by the input pulse generator becomes $R + Z$ (30). The transfer function of the network, expressed as the ratio of $Z$ to this overall impedance, is given by (31), where $Z = jX$ and $X$ is real. The magnitude and phase shift of this transfer function are given by (32).
Thus, for all frequencies other than the harmonic frequencies (where $X$ is finite), the magnitude approaches zero and the phase shift approaches 90° as the value of $R$ increases. Moreover, at each of the harmonic frequencies, where the network impedance grows without bound, the magnitude and phase of the transfer function are given by (33). Thus, the value of $R$ does not affect the phase or magnitude of any of the generated harmonics. A more detailed analytical proof of (33) for the second-order driver example is presented in the Appendix. Fig. 9 depicts the frequency response of the second-order driver for a 1-MHz clock signal with different resistance values. It clearly shows that no distortion is incurred at the two resonant frequencies (1 and 3 MHz) for any resistance value. From this graph, it seems beneficial to increase the resistance to reject higher harmonics. However, the parasitic resistances of the components and wires reduce the voltage level of the output signal as $R$ becomes larger because of the inherent voltage divider formed by the parasitic resistances and $R$. To understand this more clearly, the second-order driver is redrawn in Fig. 10 with a parasitic dc resistance associated with each inductor. Other parasitic components whose values are negligible compared with those of the components used are omitted to simplify the analysis. Applying KCL at the output node yields (34).
Rearranging (34) for the ratio of the output voltage to the input voltage gives (35). Fig. 11 shows the magnitude and phase of this ratio at the first two harmonic frequencies. The inductor parasitic resistances are those specified by the data sheet for the inductor used in our implementation [11]. Though there is negligible change in the phase, the magnitude decreases from 0.96 to 0.69 when $R$ increases from 1 to 10 kΩ. This is in sharp contrast to the ideal network, whose frequency response at the harmonic frequencies is not affected by the resistance value, as demonstrated in Fig. 9. Consequently, it is important to choose a resistance value that keeps power dissipation low while maintaining proper voltage levels of the output signal. For driving a 97.8-pF load capacitance at 1 MHz, our test measurement demonstrates that only 15% of $fCV^2$ is dissipated with a 2-kΩ resistance, with negligible degradation in output voltage.
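The harmonic-rejecting role of the series resistor in the ideal case can be illustrated with a brief sketch. It models only the lossless network, so it reproduces the ideal behavior of Fig. 9 and not the parasitic voltage-divider effect of Fig. 11. The default 1-MHz, 100-pF, second-order values mirror the running example, while the equal-section-capacitance assumption and the function name are illustrative choices rather than details taken from the text.

```python
import cmath
import math

def transfer(freq, r_series, f0=1e6, c_load=100e-12, order=2):
    """Ideal transfer function H = Z / (R + Z) of a lossless square-wave network.

    Evaluating exactly at a harmonic of f0 divides by zero because the ideal
    impedance is infinite there; evaluate slightly off resonance instead.
    """
    w = 2.0 * math.pi * freq
    w0 = 2.0 * math.pi * f0
    c_section = order * c_load            # assumed equal section capacitances
    # Z = j*X with X real: sum of the parallel-LC section reactances.
    x = sum((w / c_section) / ((n * w0) ** 2 - w ** 2)
            for n in range(1, 2 * order, 2))
    z = complex(0.0, x)
    return z / (r_series + z)
```

In this sketch, the magnitude at 5 MHz (a harmonic of the input pulse that the second-order network does not generate) falls from about 0.38 with 1 kΩ to about 0.04 with 10 kΩ, while the magnitude within 1 Hz of the 1-MHz fundamental stays essentially at one for both resistances.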
B. A Tank Capacitor for a Positive-Swing Waveform
The output of the network in Fig. 8 swings symmetrically about ground because one branch between the output and ground contains only a single inductor. To redesign the network to swing from 0 to the positive supply level, we must introduce a dc offset at the output. We propose accomplishing this by introducing a dc offset at the pulse generator input and adding a tank capacitance in series with this inductor.
To understand how a dc offset from the input pulse generator affects the network, two circuits that differ only in the presence of the tank capacitance are shown at the dc steady-state condition in Fig. 12. Without the tank capacitance, the induced output dc voltage is zero because the output and ground nodes are shorted by the inductor branch, as shown in Fig. 12(a). As a result, dc current flows into the network, dissipating significant unwanted dc power. For the network shown in Fig. 12(b), the tank capacitance connected in series with the inductor induces a matching dc offset voltage at the output node, eliminating the dc current into the network.
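Moreover, the introduction of the tank capacitor has a negligible impact on the overall frequency response of the rail driver. To see this, compare the impedance functions of the inductor branch in Fig. 12 without and with the tank capacitance. Assuming the tank capacitance is very large, the impedance of this branch is negligibly affected by the tank capacitance when the reactance of the tank capacitance is much smaller than that of the inductor at the frequencies of interest.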
In our laboratory tests, a 10-nF off-the-shelf capacitor was sufficient to achieve the desired dc offset voltage over the 0.8- to 15-MHz frequency range. The final proposed voltage-pulse driven positive-swing driver is shown in Fig. 13.
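How small the influence of the tank capacitor is can be estimated with a short sketch that compares the reactance of the tank capacitor with that of the series inductor. The 10-nF value comes from the laboratory test above; the 140-uH inductance used in the usage note is a hypothetical round number for a 1-MHz second-order design, not a reported component value, and the function name is likewise an assumption.

```python
import math

def tank_reactance_ratio(freq, l_shunt, c_tank):
    """Ratio of the tank capacitor's reactance to the inductor's reactance.

    A ratio much smaller than one means the series branch still behaves
    almost like the inductor alone at this frequency.
    """
    w = 2.0 * math.pi * freq
    return (1.0 / (w * c_tank)) / (w * l_shunt)
```

With these assumed values, `tank_reactance_ratio(1e6, 140e-6, 10e-9)` is about 0.018, i.e., the tank capacitor changes the branch reactance by under 2% at the fundamental, and the ratio shrinks further at the higher harmonics because it scales with the inverse square of frequency.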
V. MEASUREMENT
The proposed harmonic resonant square-wave rail drivers containing up to four terms (i.e., fourth-order) were designed and tested on a wire-wrap board that included tunable inductors and capacitors. We varied the frequency from 0.8 to 15 MHz, setting these components to the theoretical values calculated using (23) and (26). We then tuned each component to achieve minimum measured power dissipation and compared the tuned values with the theoretical values. Testing at higher frequencies was limited by the test setup and equipment available to the authors. Table II summarizes the laboratory measurement results for various configurations. In most cases, the measured values of the components are within 7% of the theoretical values. The deviation between the theoretical and tuned values is larger for the capacitors than for the inductors, presumably because of the large parasitic capacitances in our wire-wrapped board. As reported in Table II, the second-order driver dissipated approximately 19% of the calculated conventional power dissipation when driving a 97.8-pF load capacitance at 15 MHz. Power dissipation increases as the order of the driver increases. This effect appears to be due to the larger number of parasitic components in the test board. In addition, tuning the circuit for minimum measured power dissipation is increasingly error prone because more design variables are involved. Note that as we increase the order of the driver, we must include additional capacitors in the network. However, this does not increase the power dissipation significantly because only a small fraction of the current is drawn from the input pulse generator. In particular, the pulse generator needs to provide only the small current required to compensate for the energy lost in the parasitics of the components.
The last row in Table II shows the measurement data of the second-order driver for different load capacitances at 1 MHz. The resistance values were reduced to achieve rise and fall times equal to 10% of the total cycle time. Power dissipation increased by approximately 7% in this case, while the rise and fall times were shortened by 3% relative to the minimum-power-dissipation mode. This result suggests that by changing the resistance value, we can control the rise and fall times at the expense of power dissipation. Fig. 14 illustrates the measured power dissipation as we changed the resistance value for 1 MHz and 100 pF. The transition time with a 2-kΩ resistance was measured as 110 ns, which is 11% of the total cycle time. Notice that the transition times in Fig. 14 are normalized to this value. At 285 Ω, the transition time drops to 50 ns (45%) while the power dissipation increases from 15% to 57.9% of $fCV^2$. Figs. 15 and 16 show oscilloscope traces of the output signal of the second- and third-order drivers. To show how the output signal is synchronized, the input pulse is also shown. A fast Fourier transform (FFT)-enabled oscilloscope trace of the fourth-order driver output is presented in Fig. 17. The figure shows that only four harmonic frequencies are present in the output signal. Fig. 18 presents the trace of the output signal of the second-order harmonic driver at 10 MHz.
When the driver is directly connected to the clock network, the nonlinear characteristics of the transistors can cause load capacitance variations. To measure power dissipation as a function of the load capacitance variation, we varied the load capacitance from -30% to +30% of the nominal value while keeping all other components the same and measured the power at each point.
The results for a 1-MHz clock and a 100-pF nominal load capacitance are plotted in Fig. 19. The normalized power dissipation in the graph is the ratio of the measured power dissipation to $fCV^2$. Power dissipation is minimum at 100 pF because the circuit is designed to resonate harmonically at this value. No frequency variation was observed over this range of capacitances, as is expected for any externally driven driver.
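The normalization used in these plots is simple enough to state as a sketch. The voltage swing is not stated in this section, so the 3.3-V figure in the usage note is purely a hypothetical placeholder, and the function names are assumptions.

```python
def conventional_power(freq, c_load, v_swing):
    """Power of a conventional driver that charges c_load to v_swing each cycle."""
    return freq * c_load * v_swing ** 2

def normalized_power(p_measured, freq, c_load, v_swing):
    """Measured power as a fraction of the conventional f*C*V^2 power."""
    return p_measured / conventional_power(freq, c_load, v_swing)
```

For example, with a hypothetical 3.3-V swing, a 100-pF load clocked at 1 MHz corresponds to about 1.09 mW of conventional power, so a normalized dissipation of 15% would correspond to roughly 0.16 mW.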
Unlike self-oscillating rail drivers, whose frequency varies with the square root of variations in capacitance [5], our drivers hold the output frequency fixed, which significantly increases system stability. For capacitances greater than 130% of nominal, however, significant voltage-level degradation is observed. On the other hand, if we reduce the load capacitance below 70% of nominal, the power dissipation increases rapidly because current from the input pulse generator mostly charges the load capacitance directly instead of the load being charged resonantly. In addition to the increased power dissipation caused by load capacitance variation, the phase shift between the input pulse and the generated output causes clock jitter. To quantify this effect, we measured the delay of the output with respect to the input pulse at a fixed reference voltage level while varying the load capacitance from -30% to +30% of the nominal value. The measurement results are shown in Fig. 20. We observed clock jitter ranging from -47 to 43 ns for a 1-MHz clock frequency. This relatively high clock jitter can be compensated by an increased clock cycle time. Therefore, a 5% to 10% performance loss is expected for applications with high load capacitance variation.
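For contrast, the frequency sensitivity of a self-oscillating driver can be estimated with a one-line sketch. It assumes an ideal LC resonance, so that the oscillation frequency scales with the inverse square root of the capacitance; this idealization and the function name are assumptions, not a model taken from [5].

```python
import math

def frequency_shift(cap_change):
    """Fractional frequency change of an ideal LC oscillator when the
    capacitance changes by the fraction cap_change (0.3 means +30%)."""
    return 1.0 / math.sqrt(1.0 + cap_change) - 1.0
```

Under this assumption, the same -30% to +30% capacitance range would move a self-oscillating clock by roughly +20% to -12% in frequency, whereas the externally driven network stays at the frequency of the input pulse.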
Another experiment was carried out to measure power dissipation as a function of the frequency of the input pulse generator. We varied the frequency of the input pulse generator from -10% to +10% of its nominal value and measured the power at each point. Fig. 21 shows the measured power dissipation. At the nominal frequency (1 MHz), the normalized power dissipation is 14% of $fCV^2$. When the frequency is reduced by 10% from its nominal value, the power dissipation increases to 39% of $fCV^2$; when it is increased by 10%, the power dissipation is 36% of $fCV^2$. For a 2% frequency fluctuation, the power dissipation increases to 15% of $fCV^2$.
VI. CONCLUSION
In this paper, we presented a new algorithm and a prototype implementation of a harmonic resonant rail driver. The design goal was to produce an energy-efficient harmonic resonant clock signal using a simple network topology requiring no additional dc power supply. The experimental results show that a significant amount of the energy used to drive the clock load can be recycled and saved by the resonant characteristic of the proposed driver. Depending on the number of harmonics in the driver, we were able to save 70% to 85% of the conventional power dissipation. Moreover, our driver is much more robust than previously reported resonant clock drivers in that it greatly reduces the frequency variation caused by changes in load capacitance.
Implementation with integrated inductors and capacitors requires further work. The straightforward procedure for implementing the external clock driver is to measure the load capacitance by driving it with a conventional clock driver and to tune the other components correspondingly. Therefore, high-accuracy capacitance extraction is crucial for integrating the driver inside ICs. A phase comparator and varactors that compensate for errors caused by capacitance estimation and load capacitance variation could be a feasible solution, but this additional circuitry will increase power dissipation. In addition, the low Q of integrated inductors may reduce the power efficiency of the harmonic clock driver.
For high-frequency applications, the parasitic components of each inductor and capacitor will significantly affect the output waveform and power dissipation. Therefore, a simulation-based optimization strategy is needed to achieve optimum performance and power dissipation. Nevertheless, the theoretically optimal circuit derived from our analysis can serve as the initial circuit for a more detailed nonlinear simulation-based optimization strategy. Providing such a good initial condition can dramatically reduce the run time of the simulation-based approach. In addition, the theoretical analysis may also be useful in guiding the search strategy.
APPENDIX
PROOF OF (33) FOR THE SECOND-ORDER DRIVER
Applying KCL at the output node of Fig. 8 (for the second-order driver) yields an equation for the output voltage.
From this equation, we can conclude that the first two harmonics from the input pulse generator are not affected by the series resistor and thus appear at the output node with the same magnitude and phase. Fig. 22 shows the time-domain waveform of the remaining transient term for various values of $R$. As shown in the figure, the magnitude of this term always converges to zero regardless of the value of $R$.
Therefore, when the circuit reaches the steady-state condition, only the first two harmonics are present at the output node, and no power is dissipated if the input is composed of these two harmonics.
Joong-Seok Moon
