Abstract-An analog synchronous mirror delay (ASMD) is proposed, which provides fast locking characteristics in recovery from power-down mode in a DRAM application. As an open-loop fast locking system, ASMD measures and compensates the skew between external and internal clocks in analog operation mode within two cycles of an input clock using a charge-pumping scheme. This ASMD does not suffer from the static phase error problem that is related to the path selection operation of previously implemented SMD schemes. To enhance the linearity of delay characteristics and to increase the maximum operating frequency, dual pumping and multiple folding schemes are also proposed. An experimental chip with basic ASMD configuration is fabricated using 0.6-μm CMOS technology.
Index Terms-Charge-pump phase-locked loop (PLL), clock synchronization, comparator, synchronous mirror delay.
I. INTRODUCTION
WIDE-bandwidth information and communication systems require more and more capacity and speed of memory systems. Additionally, the prevalent use of personal devices in multimedia systems restrains the amount of power consumption, in standby mode as well as in working mode. The fundamental reason for this emphasis on memory devices is that signal-processing devices and CPU's operate at several hundred megahertz, outpacing the access time of memory systems.
To alleviate the problem of speed discrepancy between the CPU and main memories, for example, high-speed SRAM's usually have been used as a cache memory. Developers of DRAM also are trying to adapt to changes in environment by modifying the internal architecture, as is the case with SDRAM, RDRAM, and SLDRAM. All of these are using system clocks to synchronize and speed up their operation.
In implementing these kinds of higher bandwidth memory systems, the main concern from the viewpoint of circuit technique is how to minimize the skew between I/O data and the clock. The common methods of phase-locked loop (PLL) or delay-locked loop (DLL) used as a means of deskewing are not applicable here, since the special requirement of memory devices, i.e., a need for a fast locking clock generator from the power-down state, cannot be met. Therefore, the recently introduced concept of synchronous mirror delay (SMD) schemes [1]-[3] is attracting the interest of high-speed memory designers.
Publisher Item Identifier S 0018-9200(99)02431-2.
The main advantage of SMD is the fast locking characteristics in recovering from power-down or standby mode within a few cycles of the system clock. In other words, the power supply can be turned off completely in standby mode, and the recovery from standby mode can be obtained within a few clocks after power settling time. This is in contrast to the PLL or DLL methods, where at least several hundred cycles are required to arrive at the locked state. Shutting down the power supply of a PLL or DLL is therefore not practical, so these circuits keep consuming power in standby mode.
Nevertheless, special care is needed regarding the phase-tracking methods for SMD. Since the fast locking characteristics originate from the open-loop architecture of SMD, the phase error of I/O cannot be controlled as accurately as in the case of PLL and DLL.
In this paper, we propose an analog SMD (ASMD), which enhances the performance of the SMD while reducing the chip area overhead. Section II discusses the SMD scheme and introduces the concept of an ASMD scheme. The design aspects and related performance-enhancing techniques are explained in Section III. Section IV presents the basic building blocks of ASMD. The measured results of an experimental ASMD system will be discussed in Section V. Conclusions are drawn in Section VI.
II. THE CONCEPT OF ANALOG SYNCHRONOUS MIRROR DELAY

A. Conventional SMD Revisited
Before discussing the ASMD scheme, the concept of SMD will be reviewed briefly. Fig. 1 shows a representative example among the conventional SMD's, which we call digital SMD (DSMD). This DSMD scheme, proposed by Saeki et al. [1], is composed of a forward delay array (FDA), a backward delay array (BDA), and a mirror control circuit (MCC). The function of the MCC is to detect and produce a mirror-delayed output by measuring the delay of the delay monitor (DM) block. This mirror control signal is used for phase synchronization of the output of the clock driver to the input clock. As a result of the delay mirroring from FDA to BDA, it is claimed that the input and output clock can be locked within two clock cycles.
0018-9200/99$10.00 © 1999 IEEE
There are some drawbacks with DSMD's of this configuration. Fig. 2 shows the internal circuit of DSMD, which has an open-loop configuration. Since the output delay is determined by the path-selection function of MCC, the output phase is quantized by the step of the unit delay element. Another requirement in DSMD is that the input pulse width or duty cycle must be kept smaller than the delay value of the DM. Otherwise, DSMD cannot operate correctly. In implementing DSMD, the total size of the delay array must cover the minimum clock period. Also, the propagation delay of the unit delay element in FDA and BDA must be small enough to have fine phase characteristics. The result is a large array size and increased power consumption.
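The quantization and array-size drawbacks described above can be illustrated with a small numerical sketch. It is not the circuit of [1]: the function names are hypothetical, and the model assumes that the MCC effectively rounds the measured interval down to a whole number of unit delays.

```python
import math


def dsmd_quantization(t_measure, t_unit):
    """Return (stages_used, residual_error) for a delay-line measurement.

    Assumption of this sketch: the mirrored delay is the largest whole
    number of unit delays that fits in the measured interval.
    """
    stages = math.floor(t_measure / t_unit)
    residual = t_measure - stages * t_unit  # phase error left by quantization
    return stages, residual


def stages_to_cover(t_period, t_unit):
    """Number of unit-delay stages needed for the array to span t_period."""
    return math.ceil(t_period / t_unit)


if __name__ == "__main__":
    stages, err = dsmd_quantization(8.0, 0.3)   # times in ns (illustrative)
    print(stages, round(err, 3))
    print(stages_to_cover(10.0, 0.25))
```

A smaller unit delay reduces the residual error but increases the stage count, which is the area and power trade-off noted in the text.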
Han et al. [2] proposed an SMD of hierarchical structure, where a coarse delay element and a fine delay element are used to cope with the size and power problem. The hierarchical SMD is successful in extending the operating frequency range, but the penalty thereof is an increase of locking time of four cycles. In addition, the increased complexity of the control circuit influences the cost of overhead. Saeki et al. [3] proposed an interleaved SMD scheme that has multiple arrays with initial delay difference to reduce the output delay quantization effect. The penalty is the use of a double or quadruple array. Generally, the approaches up to now have involved increased area and power consumption to reduce the output quantization effect.
B. ASMD
We propose an ASMD concept that overcomes these drawbacks of DSMD's simultaneously. The fundamental difference is that ASMD measures and mirrors the delay of the DM block in an analog manner, which has no output phase quantization problem by nature. ASMD also uses toggled clocks to eliminate the duty-cycle dependency of the input clock. The basic operation mode of ASMD is depicted in Fig. 3.
In ASMD's basic scheme, the input clock is fed to a toggle flip-flop (T-FF) to produce a pair of toggled signals. The input clock is also fed to the DM block followed by another T-FF to produce a second, delayed pair of toggled signals. The delay difference between the two toggled signals is therefore kept equal to the delay of the DM block. The measurement and mirroring of this delay can then be carried out using current sources and capacitors, as in the case of a charge-pump PLL.
First, we examine the capacitor voltage of the left side, "left," with the pump-up and pump-down procedure as follows.
1) Eq-period: Initialize the voltage of the left node to the reference voltage.
2) Up-period: Pump up the voltage of the left node from the reference voltage using a current source.
3) Down-period: Pump down the voltage of the left node from the final up-period value toward ground using the current source, detect the time when "left" crosses the reference voltage, and generate an output pulse at that time. Then go back to the eq-period.
Since the period of the generated pulse is two clock cycles, we define another signal "right," i.e., the voltage of the right-side capacitor, whose operation is similar to that of "left" and whose timing is shown in Fig. 3. By the interleaved operation of "left" and "right" using the pump-up and pump-down operation, we obtain the final output of ASMD, which leads the rising edge of the input clock by the delay of the DM block. Fig. 4 shows the schematic of the proposed ASMD. Due to its analog pumping scheme and symmetric topology using toggled clocks, neither the output phase quantization problem mentioned above nor the restriction on the duty cycle of the input clock is of concern anymore.
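The three-step procedure can be checked with an idealized timing model. This is a minimal sketch, not the circuit of Fig. 4. It assumes constant pumping currents, an eq-period that lasts for the DM delay, and an up-period that lasts for the rest of the clock cycle (an interpretation consistent with Section III-B). The names `t_clk`, `t_dm`, `i_up`, `i_down`, `cap`, and `v_ref` are hypothetical.

```python
def asmd_output_time(t_clk, t_dm, i_up=1.0, i_down=1.0, cap=1.0, v_ref=0.0):
    """Time of the "left" output pulse, measured from the clock edge that
    starts the eq-period (idealized constant-current model)."""
    if not 0.0 < t_dm < t_clk:
        raise ValueError("this sketch assumes 0 < t_dm < t_clk")
    t_eq = t_dm                      # eq-period: node held at v_ref
    t_up = t_clk - t_dm              # up-period: pump up until the next clock edge
    v_peak = v_ref + i_up * t_up / cap
    t_down = (v_peak - v_ref) * cap / i_down   # time to fall back to v_ref
    return t_eq + t_up + t_down


def output_lead(t_clk, t_dm, **kwargs):
    """How far the output pulse leads the clock edge two cycles after the start."""
    return 2.0 * t_clk - asmd_output_time(t_clk, t_dm, **kwargs)


if __name__ == "__main__":
    # 10-ns clock, 2-ns DM delay (illustrative values)
    print(output_lead(10.0, 2.0))
```

With matched currents the lead equals the DM delay for any clock period in the valid range, and it takes on a continuum of values rather than discrete steps.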
With a minor variation of ASMD, the measuring and mirroring of the delay can be shortened by one further cycle. This is possible by using timing similar to that of Fig. 3 when the duty cycle of the input clock is assumed to be 50%, since the T-FF's are then not needed to generate 50% duty signals. In a practical situation, however, any variation of the duty cycle directly affects the output jitter.
III. DESIGN ASPECTS FOR ASMD
The performance of ASMD is determined mainly by two subblocks: a linear current source and a high-speed comparator. Since the output pulse must be generated when the "left" or "right" signal crosses the reference voltage, there is a concern that the conversion delay of the comparator may add an extra delay, i.e., a phase error, to the ASMD output. This problem is avoided here by placing a replica of the comparator and of the other gate delays in the DM block, so that the added conversion delay is canceled out. The second concern is that the total delay of the DM block, the sum of the delay to be compensated and the added conversion delay of the comparator, may become greater than the clock period as the input clock frequency increases. Fortunately, this problem can also be resolved by generating the control signals from the total delay of the DM block modulo the clock period.
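The modulo operation mentioned for the second concern can be written out as a one-line model. This is only an illustrative sketch with hypothetical names. In the circuit the wrap-around comes from the timing of the control signals, not from an explicit arithmetic operation.

```python
def effective_dm_delay(total_dm_delay, t_clk):
    """Delay seen by the pumping control when the DM delay exceeds one period.

    total_dm_delay: monitored delay plus replica conversion delay (same unit as t_clk)
    """
    if t_clk <= 0:
        raise ValueError("clock period must be positive")
    return total_dm_delay % t_clk


if __name__ == "__main__":
    # 7.5-ns total DM delay with a 3-ns clock period (illustrative values)
    print(effective_dm_delay(7.5, 3.0))
```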
The third concern with the ASMD is that the unwanted difference between the pump-up and pump-down current may render the timing reproduction invalid. This also would be the case with the DSMD when the unit delay of FDA differs from that of BDA. To prevent this kind of current mismatch problem, we propose the dual pumping scheme to have more accurate pumping and detecting characteristics (Section III-A) of ASMD.
The fourth point of concern is that an insufficient eq-period or up-period in the DM block may result in mistracked output of ASMD. To use ASMD as a general locking solution for high frequency over 500 MHz, we have to consider this situation. In this aspect, we propose a multiple folding scheme of ASMD (Section III-B) to guarantee a stable operation for an extended range of input delay value.
Also, in a practical situation, the delay of the DM block cannot exactly track that of the clock driver because of the open-loop characteristics, as in SMD, which causes static phase error. The effect of a modified pumping current ratio is discussed in Section III-C. It can be used further to compensate the delay mismatch between the DM block and the actual clock driver.
A. Dual Pumping Scheme
The current pumping scheme for the basic ASMD configuration is shown in Fig. 5(a), where the up-down waveform and the fixed reference voltage are fed to the comparator to generate ASMD output pulses. To obtain an exact reproduction of the delay measurement, the pump-up and pump-down currents must be matched exactly. In practice, however, it is hard to match the PMOS and NMOS current mirrors over different operating conditions, not to mention the variations induced by the fabrication process. To solve this current matching problem, we employ a dual pumping scheme as shown in Fig. 5(b). It uses upper and lower triangular waveforms rather than the single waveform and fixed reference of Fig. 5(a). The ASMD output is generated by using these two waveforms as inputs to the comparator. The operation of the dual pumping scheme is as follows. During the eq-period, the voltages of the two waveforms are initialized to the reference voltage. During the up-period, waveform A charges up and waveform B charges down. Last, during the down-period, A goes down and B goes up, and a crossing point is produced. Because of the symmetry of the waveforms, the crossing time of the two waveforms always coincides with the ideal value, regardless of the levels of the two current sources.
Here, the use of mirrored currents from the same type of transistor sources provides a more stable operation than using those of a different type. However, the current deviation due to the drain-source voltage dependency still remains even in the dual pumping scheme, which results in nonlinear waveforms of A and B.
To examine how this factor influences timing behavior, we define the charging slopes of A and B in the up-period and let the output resistance of a MOS transistor change the corresponding discharging slopes. The amount of the crossing-point shift can then be expressed for the dual pumping and single pumping schemes in terms of these slopes.
The performance improvement is evident: the shift of the crossing point with a single pumping scheme depends on the absolute amount of current errors, while in a dual pumping scheme it depends only on relative errors, which means it is more tolerant to random device mismatches.
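The ideal-case statement that the dual pumping crossing time does not depend on the current levels can be checked with a small model. This is a sketch under stated assumptions, not the paper's expression for the crossing-point shift. It assumes constant currents (no output-resistance effect), equal capacitors, and that the source charging A in the up-period is the one charging B in the down-period, and vice versa. All names are hypothetical.

```python
def single_pump_crossing(t_up, i_up, i_down):
    """Down-period time until the single waveform returns to the fixed reference."""
    return t_up * i_up / i_down


def dual_pump_crossing(t_up, i_1, i_2):
    """Down-period time until waveforms A and B cross.

    Up-period:   A rises with i_1, B falls with i_2 -> they separate at (i_1 + i_2).
    Down-period: A falls with i_2, B rises with i_1 -> they close at (i_2 + i_1).
    """
    separation_rate = i_1 + i_2
    closing_rate = i_2 + i_1
    return t_up * separation_rate / closing_rate


if __name__ == "__main__":
    t_up = 8.0                 # ns, illustrative
    i_a, i_b = 1.0, 1.1        # 10% level mismatch between the two sources
    print(single_pump_crossing(t_up, i_a, i_b))  # shifted away from t_up
    print(dual_pump_crossing(t_up, i_a, i_b))    # stays at t_up
```

In this idealized model the single pumping result scales with the ratio of the two currents, while the dual pumping result returns the up-period length for any pair of levels.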
B. Multiple Folding Scheme
The basic ASMD uses a two-way folding scheme to control left and right signals in an interleaving manner, and the delay of the DM block contains a replica of the conversion time.
Since the effective value of the DM delay is taken modulo the clock period, the effective delay may be any value from zero to one clock period if we consider the unknown clock driver delay and a high clock frequency, over 500 MHz, for example. So the eq-period may be even smaller than its minimum value if the effective delay is small, or the up-period may be too short to reach the minimum input voltage range of the comparator if the effective delay is near the clock period. Moreover, input clock jitter makes things worse. Fig. 6 shows the variation of the eq-period and up-period with the effective DM delay.
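The two failure cases can be made concrete with a short check. This is a minimal sketch, assuming that in the basic two-way scheme the eq-period equals the effective DM delay and the up-period equals the rest of the clock period. The thresholds `min_eq` and `min_up` are hypothetical design limits, not values from the paper.

```python
def folding_margins(total_dm_delay, t_clk, min_eq, min_up):
    """Report whether the basic two-way timing leaves enough eq- and up-time."""
    d_eff = total_dm_delay % t_clk      # effective DM delay
    eq_period = d_eff                   # time available for equalization
    up_period = t_clk - d_eff           # time available for pumping up
    return {
        "eq_period": eq_period,
        "up_period": up_period,
        "eq_ok": eq_period >= min_eq,
        "up_ok": up_period >= min_up,
    }


if __name__ == "__main__":
    t_clk = 2.0  # ns, i.e., 500 MHz
    for delay in (3.0, 4.1, 3.9):
        print(delay, folding_margins(delay, t_clk, min_eq=0.3, min_up=0.3))
```

An effective delay near zero fails the eq-period check and one near the clock period fails the up-period check, which is the situation the extra periods of the multiple folding scheme are meant to avoid.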
To guarantee a stable operation of ASMD to be used as a general locking system, we have inserted the extra eq-period and up-period in the basic timing chart as shown in Fig. 7. These extra periods may take up more than one each. A larger timing margin can ensure stable operation of ASMD at higher frequencies, but it also increases the complexity of the control circuitry, the hardware size, and the total power consumption.
As a marginal solution, we propose a four-way folding scheme, as shown in Fig. 7, which uses the clock high and low periods as extra eq-timing and up-timing, respectively. Since the total period takes up to four clock cycles, the four-way folding scheme requires four clock phases derived from the clock input, named phase_0, phase_1, phase_2, and phase_3, instead of the two phases of the basic two-way folding scheme. We also have to define another four-phase clock from the output of the DM block, named ph_0, ph_1, ph_2, and ph_3.
The four-way folding ASMD architecture can share the comparator as in the timing of Fig. 7 and as shown in Fig. 8, so the area penalty is only on the pumping block. The reshaping block may be used to correct the output duty cycle if necessary.
Each phase_n and ph_n pair (n = 0 to 3) is input to its own pumping block followed by a comparator to generate each output pulse, which is the same action as in basic ASMD. Care must be taken to prohibit the generation of the initial ph_0 pulse in an extra eq-period, so the initial phase correction circuit shown in Fig. 9 is used to provide proper timing. Fig. 9 also shows the four-way clock generation blocks. We use an ASMD enable signal, en_org in Fig. 9, to initiate the operation of ASMD.
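The sequencing of the four phases can be pictured with a simple behavioral model. This sketch does not reproduce the circuit of Fig. 9. It assumes, purely for illustration, that exactly one of the four phases is active in each clock cycle after the enable and that the sequence starts at phase 0. The function name and the `enable_cycle` parameter are hypothetical.

```python
def four_way_phases(n_cycles, enable_cycle=0):
    """Return one tuple (phase_0, phase_1, phase_2, phase_3) per clock cycle.

    Before enable_cycle no phase is active. Afterwards the active phase
    advances by one every clock cycle and wraps around after four cycles.
    """
    phases = []
    for cycle in range(n_cycles):
        if cycle < enable_cycle:
            phases.append((0, 0, 0, 0))
        else:
            active = (cycle - enable_cycle) % 4
            phases.append(tuple(1 if k == active else 0 for k in range(4)))
    return phases


if __name__ == "__main__":
    for cycle, state in enumerate(four_way_phases(6, enable_cycle=1)):
        print(cycle, state)
```

The same generator, driven by the delayed clock instead of the input clock, would model the ph_0 to ph_3 sequence under the same assumptions.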
C. The Effect of Modified Pumping Current Ratio
There is another aspect of ASMD to be considered in practical application. Since the ASMD works in an open-loop architecture, the DM delay cannot exactly track the delay of the clock driver because of variations in the fabrication process and operating temperature, even when using a replica of the related blocks. Also, we have to consider the delay added to the DM block to compensate for the conversion time of ASMD. All of those delay differences cause the static phase error of the ASMD output clock. Let us consider the delay deviation of the DM block from the ideal value. Then the output of ASMD followed by the clock driver has a phase error of 2π times the ratio of this deviation to the clock period [rad]. As the clock frequency increases, so does the resulting phase error.
To minimize this delay deviation problem in an open-loop architecture, a modified pumping current ratio can be used to compensate for the time difference between the DM block delay and the real clock driver delay. Fig. 10 shows the effect on the ASMD output timing when the charging-down current is larger than the ideal value. By increasing the ratio of the down-current level to the up-current level, we can move the crossing point earlier than in the ideal case. In Fig. 10, the DM block delay is taken as 2 ns, for example. The dotted line shows the ideal case of ASMD operation, and the solid line shows the case of the modified current ratio. If "time margin" is defined as the time difference between the original crossing point and the modified crossing point, then the time margin can be represented as a linear function of the clock period.
So if we can confine the range of delay difference of the DM block and clock driver as a finite value, then we can also adjust the current ratio to minimize static phase error of the open-loop system. By choosing the nominal delay of ASMD as the center of the minimum and maximum time margin, as shown in Fig. 10, the static phase error of ASMD output can be reduced by 50% at maximum, compared to that of the uncorrected case.
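The linear dependence of the time margin on the clock period can be reproduced with an idealized model. This is a sketch under assumptions, not the authors' derivation: it assumes constant currents, an up-period equal to the clock period minus the DM delay, and a down-current that is `ratio` times the up-current. The function names are hypothetical, and the phase-error helper uses the standard relation between a time error and a phase in radians.

```python
import math


def time_margin(t_clk, t_dm, ratio):
    """How much earlier the crossing occurs than in the ideal (ratio = 1) case."""
    t_up = t_clk - t_dm
    ideal_down = t_up            # matched currents: fall time equals rise time
    modified_down = t_up / ratio # larger down-current shortens the fall time
    return ideal_down - modified_down


def phase_error_rad(delay_error, t_clk):
    """Phase error in radians for a given timing error and clock period."""
    return 2.0 * math.pi * delay_error / t_clk


if __name__ == "__main__":
    # 2-ns DM delay as in the Fig. 10 example, ratio 1.1, periods in ns
    for t_clk in (3.0, 8.0, 13.0):
        print(t_clk, time_margin(t_clk, 2.0, 1.1))
```

With a 2-ns DM delay and a ratio of 1.1, this model gives a margin that grows from about 0.09 ns at a 3-ns period to 1.0 ns at a 13-ns period, in equal steps for equal increments of the period.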
IV. THE BUILDING BLOCKS
Even though the multiple folding architecture performs better as a general open-loop locking system, the basic ASMD configuration can provide sufficient performance in restricted applications, such as a fixed clock range and a known clock driver delay for a given fabrication process, with minimum area and power consumption. The linearity of the current pumping block and the conversion speed of the comparison block are essential factors in implementing the ASMD scheme. The pumping block is very similar to that of a charge-pump PLL, so we can borrow some design hints from charge-pump PLL's if necessary. The primitive circuit of the pumping block of the ASMD scheme is shown in Fig. 11. Due to the architectural symmetry of the ASMD scheme, MOS gate capacitors of the same size can be used in the left and right nodes to convert the pumping current to a voltage level. So a digital fabrication process is sufficient to implement the ASMD scheme without any device trimming.
In Fig. 11, the transistors M1-M4, M11, and M18 build the pump-up and pump-down current sources. By the action of the switches defined by Fig. 3, the capacitor can equalize to the reference voltage and charge up and down to make the signal left or right. To reduce the voltage jump at the switching time, M7, M8, and the buffer are used to initialize the "upper" and "lower" nodes in Fig. 11. The value of the capacitance and the current level are determined by the maximum and minimum charging time, i.e., the input frequency range. The steeper the slope of the voltage, the smaller the operating frequency range and the conversion time, which means ASMD can operate at higher frequency. For the implementation of a dual pumping scheme, we have to add another circuit similar to M7-M18 for waveform B with an appropriate switching configuration.
Fig. 12 shows the circuit for a high-speed comparator. Since the comparator must have a small conversion delay and high gain to generate an output clock pulse, we use a single-stage differential pair and a CMOS inverter as a high-gain stage. To track process and operating condition variations, we also adopt a replica technique. The single-ended output voltage of the differential stage is controlled to become the logical threshold voltage of M3 and M4 when the inputs of the comparator cross. The effect of process and operating condition variations can be effectively absorbed by adapting the bias current using a replica and negative feedback through the op-amp, as shown in Fig. 12.
V. MEASUREMENT RESULTS
An experimental ASMD with a basic configuration is implemented using 3.3-V, 0.6-μm double-metal CMOS technology. The chip size is 1.1 × 0.7 mm², of which half is an analog block and the other half is a digital block to control the ASMD; its microphotograph is shown in Fig. 13. The digital block of ASMD is composed of seven NAND's, seven NOR's, one inverter, and eight F/F's added to resolve the initial hazard state, and the analog block is composed of two current pumps and two comparators. The representative configuration of DSMD, by contrast, requires several tens of stages of a building block composed of three NAND's and two inverters, according to the frequency range of DSMD, as shown in Fig. 2 [1], which means a bigger implementation area.
Since the ASMD has an open-loop architecture, we employed the modified pumping scheme explained in Section III-C to reduce the delay mismatch between the DM block and the internal real clock driver. Also, as a test condition of the ASMD application, we have chosen the range of the input clock period to be 3-13 ns, and the delay difference of the DM block from the ideal value as 1 ns. So the optimal ratio of down- to up-current is about 1.1. This ratio approaches 1.0 when we set a smaller delay difference, i.e., for finer matching of the DM block to the clock driver.
For experimental purposes, ASMD with a modified single pumping scheme is also implemented using the circuit of Figs. 10 and 11, whose working range is chosen from 80 to 330 MHz in simulation. Also, for the case of a four-way folding structure, the operation frequency can be up to 600 MHz using the same design parameters. To measure the delay characteristics of the experimental ASMD, we use two external inputs, CLKIN and DMCLKIN, as an input clock and a DM buffered clock, respectively. The delay characteristics can be measured as the difference between the clock output and the ASMD output as a function of the delay difference between the CLKIN and DMCLKIN inputs; the measured working range is from 100 to 300 MHz. Fig. 14 shows the delay characteristics of the experimental ASMD. Since we have chosen the down slope to be 1.1 times faster than the up slope, the estimated slope of the delay characteristics should be 1/1.1, and the measured slope is 0.86. The group of measured data has the gap of delay for a given input frequency. By the nature of an analog pumping scheme, ASMD does not show any output phase quantization phenomena as shown in Figs. 2 and 14.
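The expected slope of 1/1.1 follows from an idealized timing model and can be checked numerically. This is a sketch with hypothetical function names, assuming constant currents, an up-period equal to the clock period minus the input delay, and a down slope `ratio` times the up slope.

```python
def output_advance(t_clk, dm_delay, ratio):
    """Lead of the ASMD output before the clock edge two cycles after the start."""
    t_up = t_clk - dm_delay
    t_down = t_up / ratio            # faster down slope shortens the fall time
    return t_clk - t_down            # eq-period plus up-period take one full cycle


def delay_slope(t_clk, ratio, d0=1.0, d1=2.0):
    """Finite-difference slope of output advance versus input delay."""
    a0 = output_advance(t_clk, d0, ratio)
    a1 = output_advance(t_clk, d1, ratio)
    return (a1 - a0) / (d1 - d0)


if __name__ == "__main__":
    # 8-ns clock period (125 MHz), down slope 1.1 times the up slope
    print(delay_slope(8.0, 1.1))
```

In this model the slope is the reciprocal of the ratio, i.e., about 0.91 for a ratio of 1.1, which is the estimate the text compares with the measured 0.86.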
In the experimental ASMD, we have chosen a pumping current level of 200 μA. By changing the current level, we can speed up the conversion time and also increase the maximum working frequency at the expense of a reduced minimum working frequency. As variations of process and operating conditions can change the current level, they also change the output delay characteristics. But as shown in Fig. 15 for the 125-MHz input clock, the delay difference is only 200 ps when the current level changes by 25%. For the case of a nominal 10% process variation, this means that the static phase error due to process variation is less than 0.1 ns from chip to chip. Since the maximum clock frequency of ASMD is determined by the magnitude of the up-down waveform, we have to increase the current level or reduce the left and right node capacitance to get a higher frequency range. Also, the comparator cannot generate a valid output pulse for a small input, which limits the operation frequency, and its input offset voltage directly affects the phase error. Using the current level and capacitance of the experimental design and 100 or 200 mV of input offset voltage of the comparator, the simulation results show a phase error of 50-100 ps.
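The step from the 25% figure to the 10% estimate is a proportional scaling, which can be written out as follows. This is a sketch that assumes the delay difference scales linearly with the relative current change over this range. The function name is hypothetical.

```python
def scaled_delay_error(ref_error_ps, ref_variation, variation):
    """Scale a known delay error linearly to another relative current variation."""
    return ref_error_ps * variation / ref_variation


if __name__ == "__main__":
    # 200 ps at a 25% current change, scaled to a 10% change
    print(scaled_delay_error(200.0, 0.25, 0.10))
```

Under this linear assumption the 10% case gives about 80 ps, which is consistent with the bound of less than 0.1 ns quoted in the text.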
The jitter characteristics of ASMD are affected by the noise of the internal reference voltage and pumping current level. The differential and symmetric configuration of the pumping block and the replica biasing of the comparison block are used to effectively reject common-mode noise in the ASMD scheme. The measured peak-to-peak jitter is about 100 ps without additional supply noise, and 140 ps with additional 200-mV, 1-MHz sinusoidal supply noise, at a clock frequency of 125 MHz. The measured jitter characteristics are shown in Figs. 16 and 17. For supply noise of higher frequency, the internal capacitor reduces the noise amplitude and, with it, the jitter of the ASMD output. Fig. 18 shows the jitter histogram for 400-mV added supply noise at the supply pin, which results in peak-to-peak jitter of 126 ps at 125-MHz input clock frequency. The measured results of the prototype ASMD are summarized in Table I.
VI. CONCLUSION
A new SMD structure of analog timing, the analog synchronous mirror delay scheme, has been discussed. Unlike the conventional DSMD's, where buffer chains as a multiple array and the smaller delay cells are required to implement fine phase characteristics, this ASMD scheme needs a smaller implementation area having fixed architecture compared to the conventional digital SMD's. The ASMD scheme shows better output phase characteristics and also a fast locking time from standby mode. Since the working range of ASMD depends only on the current level and capacitance value, its application can be extended to higher frequencies without changing the internal structure.
We also proposed a dual pumping scheme to reduce undesirable current mismatch errors and a multiple folding scheme to be used as a general locking device at the expense of some circuit complexity.
For an experimental verification, we designed the basic ASMD having a modified single pumping scheme to cope with the delay uncertainty of the DM block with real clock drivers. The measurement reveals the desired proportionality of the input delay to output advancing delay characteristics defined by the design goal.
The experimental ASMD with a basic configuration is designed using a 0.6-μm CMOS process at 3.3-V power supply. The working range is measured from 100 to 300 MHz with measured peak-to-peak jitter of 100 ps without additional supply noise, and peak-to-peak jitter of 140 ps with 200-mV, 1-MHz sinusoidal supply noise. The operating frequency range of ASMD can be increased using the multiple folding scheme. Using a four-way folding scheme with the same device parameters, ASMD can operate up to 600 MHz in simulation.
The application area of the ASMD scheme is not restricted to high-bandwidth DRAM's. It can also be used as a general clock generator or locking system in CPU's and other multimedia devices.
