We propose a transfer gate phase coupler for a low-power multi-phase oscillator (MPOSC). The phase coupler is an nMOS transfer gate, which does not waste charge to the ground and thus achieves low power. The proposed MPOSC can set the number of outputs to an arbitrary number. Test circuits in a 180-nm process and a 65-nm process exhibit 20 phases, including phases that are 90° apart. The designs were fabricated in a 180-nm CMOS process and a 65-nm CMOS process to confirm process scalability; in the respective designs, we observed 36.6% and 38.3% improvements in the power-delay product compared with the conventional MPOSCs using inverters and nMOS latches. In the 65-nm process, the measured DNL and 3σ period jitter are, respectively, less than ±1.22° and 5.82 ps. The power is 284 μW at 1.85 GHz.
Introduction
Ring oscillators that output multiple phases are widely used in signal processing. In particular, ring oscillators that provide multiple phases with a small clock skew are used because the high-resolution phases enable writing to optical disks [1]-[5]. A serial link receiver for a compact disk recordable (CD-R) requires a multi-phase generator and a number of phase rotators to produce data and edge clocks.
There are four methods for generating multiple phases, as shown in Table 1. The first is a ring oscillator consisting of multiple inverter stages. This method is very simple, but it cannot generate multiple phases at high frequency [4]. Second, multiple phases can be generated by an inductor-capacitor voltage-controlled oscillator (LC-VCO) with a phase interpolator. This method has lower power consumption and higher resolution than the ring oscillator. Unfortunately, its frequency range is narrow because of the inductor's high quality factor (Q). The third phase generator is based on a poly-phase filter. This method has lower power and lower cost than the other methods. However, it needs external clocks [5]. The fourth is a multi-phase oscillator (MPOSC) that uses phase-coupled inverter chains [1]-[3], [6], [7]. The MPOSC can achieve a high frequency with a wide-band output. However, the conventional inverter-structured and latch-structured phase couplers draw charge to the ground, meaning that the phase couplers themselves consume unnecessary power [1], [2], [6]. Another option is to use a resistance element as a phase coupler [8]: a resistor, an nMOS transistor, a pMOS transistor, or a CMOS switch. However, such a coupler draws static current in the locked state and thus consumes needless power. To make matters worse, the transistor-type couplers require an external analog input voltage.
In this paper, we propose a transfer gate phase coupler (TGPC) that wastes no charge to the ground, which enables a low-power MPOSC that fully benefits from process scaling. We also explain the charge operation of the TGPC. By using the TGPC, we can set the number of inverter chains to an even number. Implementations in 180-nm and 65-nm processes confirm that the TGPC possesses process scalability.
This paper is organized as follows: Sect. 2 describes the phase couplers in the conventional and the proposed low-power MPOSCs. Section 3 discusses the layout design. Section 4 presents measurement results for the 180-nm and 65-nm test chips. The final section concludes this paper.
Phase Coupler Design

Relation between Oscillation and Phase in MPOSC
The period of a ring oscillator, T, is determined by the inverter delay; in the standard form,

$$T = 2\,n\,t_d,$$

where t_d denotes the delay of one inverter stage and n is the number of stages in the ring oscillator, which must be an odd number of three or more. The phase resolution, θ_res, signifies the phase difference between the input and the output of two inverters, and it is defined as 360°/n. The oscillating frequency and the phase resolution depend only on the number of inverter stages. For a higher oscillating frequency, large-sized inverters are needed, resulting in larger power.

Copyright © 2011 The Institute of Electronics, Information and Communication Engineers
The MPOSC consists of coupled inverter chains: the frequency and the phases are defined by the inverter chains and the phase couplers, respectively. These are combined into a single ring in which the phase couplers are linked to the inverter chains [2].
The phase-coupled MPOSC obtains n × m phases from m sets of n-stage ring oscillators. It can therefore be considered as one (n × m)-stage ring oscillator: θ_res becomes 360°/(n × m).
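The relations in this subsection can be checked numerically. The following is a minimal sketch, assuming the standard single-ring relation T = 2·n·t_d; the function names and the 50-ps stage delay are hypothetical and serve only as an illustration.

```python
def ring_period(n_stages: int, t_d: float) -> float:
    """Period of a single n-stage ring, assuming T = 2 * n * t_d.

    t_d is the per-stage inverter delay in seconds (hypothetical value in the demo).
    """
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("a single-ended ring needs an odd stage count of three or more")
    return 2 * n_stages * t_d


def phase_resolution_deg(n_stages: int, m_chains: int = 1) -> float:
    """Phase resolution 360/(n*m) in degrees; m_chains = 1 is a plain ring."""
    if n_stages < 1 or m_chains < 1:
        raise ValueError("stage and chain counts must be positive")
    return 360.0 / (n_stages * m_chains)


if __name__ == "__main__":
    t_d = 50e-12  # hypothetical stage delay, not a value from the paper
    period = ring_period(5, t_d)
    print(f"5-stage ring: T = {period * 1e12:.0f} ps, f = {1 / period / 1e9:.2f} GHz")
    print(f"5-stage ring resolution: {phase_resolution_deg(5):.1f} deg")
    print(f"4 x 5 MPOSC resolution:  {phase_resolution_deg(5, 4):.1f} deg")
```

Coupling m = 4 chains of n = 5 stages refines the resolution from 72° to 18° without lengthening any individual ring, which is why the frequency can stay at the five-stage value.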
Phase Couplers in Conventional MPOSC
The conventional current-controlled MPOSC is shown in Fig. 1, with two kinds of phase couplers. The inverter-structured phase coupler in Fig. 1(b) makes a full swing of charge between inverter chains. For this reason, the oscillation frequency depends not only on the inverter chains but also on the ring of phase couplers. Thus, when many phases are generated, the delay of the phase-coupler ring becomes larger, which limits the oscillating frequency.
Another conventional approach is the latch-structured phase coupler shown in Fig. 1(c). The latch of two nMOS transistors makes a half swing of charge between inverter chains. Although the phase couplers are related to one another, they do not affect the oscillation as the inverter couplers do. The oscillating frequency depends only on the inverter chains, so the MPOSC can oscillate at a higher frequency [2].
In this way, the conventional MPOSCs consume power for the swing of charge by the phase couplers between the inverter chains. When the nMOS of the inverter-structured phase coupler is turned on, it discharges through the inverter. The latch-structured phase coupler behaves in the same way. The conventional phase coupler draws current to the ground through itself and consumes power even if the MPOSC is locked. The conventional MPOSC thereby wastes more power than a ring oscillator composed of the same number of stages.

We explain the 5 × 5 MPOSC with the inverter phase coupler (Fig. 1) in detail using Fig. 2. In this MPOSC, P1 is connected to the input of the phase coupler PC1, and its output P14 is connected to the output of the inverter INV1. Additionally, P14 is connected to the input of PC2. The nMOS or pMOS of the phase coupler becomes active depending on the state of P1. When the nMOS becomes active, P14 is discharged. On the other hand, when the pMOS becomes active, P14 is charged. When P14 is delayed, it is discharged by the nMOS of the phase coupler (on the rising edge, P14 is charged after the pMOS becomes active). In contrast, when P14 is advanced, it is locked for the same reason.
The other conventional approach, which uses the latch as a phase coupler, is explained in Fig. 3. P1 and P14 are cross-coupled by the latch. When P1 is "high", the charge of P14 flows to the ground; on the other hand, if P14 is "high", the charge of P1 flows to the ground. As a result, P1 and P14 settle so that they have inverse phases to each other, which is the basis of the latch-structured phase coupler, as in the inverter phase coupler's case. Note that the phase coupling operation is carried out only when P1 is "high" and active (this is the reason why the latch-structured phase coupler makes "the half swing of charge"). Then, P14, which is cross-coupled with P2, is processed in the same way. Consequently, P14 is locked to the opposite phase of the middle phase between P1 and P2.
Proposed Transfer Gate Phase Coupler
Now, consider the m sets of n-stage ring oscillators in Fig. 1. Because the inverter or latch in Fig. 1(b) or 1(c) is used as a phase coupler in the conventional MPOSC, it inverts the phase signal, which means that m must be an odd number; otherwise, the ring settles into a stable state and does not oscillate. In contrast, our MPOSC scheme adopts the nMOS transfer gate presented in Fig. 4, which performs no inverting operation as a phase coupler. Consequently, m can be set to an arbitrary number in the proposed scheme; in particular, m can be an even number, which enables even-numbered phase outputs.
For instance, the minimal configuration that creates I/Q signals (signals with a 90° phase difference) is m = 4 and n = 3 in our MPOSC. The phases in this type of MPOSC are shown in Fig. 5. In the practical design, we chose m = 4 and n = 5 (θ_res = 18°), increasing the number of stages in the inverter chain for finer phase resolution and lower jitter. A 72°-delayed signal is used to turn on an nMOS TGPC. A detailed schematic of our proposed design is presented in Fig. 6.
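The choice of m and n for I/Q generation can be illustrated with a short search. This is a sketch under the assumption of ideal, equally spaced phases, for which two outputs are exactly 90° apart only when the total phase count n × m is divisible by four; the function names and the search bound max_n are hypothetical.

```python
def has_quadrature(n_stages: int, m_chains: int) -> bool:
    """True if ideal, equally spaced outputs include a pair exactly 90 degrees apart."""
    return (n_stages * m_chains) % 4 == 0


def smallest_iq_config(allowed_m, max_n: int = 9):
    """Return the I/Q-capable configuration with the fewest phases, or None.

    n is restricted to odd values of three or more, as required for each ring.
    """
    candidates = [
        (n * m, m, n)
        for m in allowed_m
        for n in range(3, max_n + 1, 2)
        if has_quadrature(n, m)
    ]
    if not candidates:
        return None
    total, m, n = min(candidates)
    return {"m": m, "n": n, "phases": total, "step_deg": 360.0 / total}


if __name__ == "__main__":
    # Non-inverting coupler: any chain count is allowed.
    print(smallest_iq_config(range(1, 10)))
    # Inverting couplers: only odd chain counts oscillate.
    print(smallest_iq_config([1, 3, 5, 7, 9]))
```

With odd n and odd m the product n × m is odd, so no quadrature pair exists; allowing even m makes m = 4, n = 3 the smallest I/Q-capable case, in agreement with the text.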
The proposed TGPC uses an nMOS controlled by the outputs of the inverter chains. The waveforms of the proposed TGPC, which further explain the operation principle, are portrayed in Fig. 7. P4 and P8 are derived from the same ring oscillator, as shown in Fig. 4. Their mutual phase relation is a stable 72°, and P8 controls the phase coupler connected to P4. Because P8 stably lags P4 by 72°, P8 can always turn on the nMOS TGPC at the falling edge of P4. The TGPC equalizes P3 and P4, and also P4 and P5, if they are not locked. The time needed to lock varies with the voltages at the drain and the source of the TGPC.
When P4 is delayed behind P3, current from P4 is drawn through TGPC M1 to P3, which advances the falling edge of P4. In contrast, if P4 is advanced, P4 is delayed by P5. For these reasons, P4 is locked to the middle phase between P3 and P5. The TGPCs using P3, P5, P8 and P9 thus determine the phase of P4. Once P3, P4 and P5 are locked, less current flows because the voltage difference between them is quite small. The current also decreases as the number of phases increases because the voltage difference becomes smaller. Consequently, our proposed MPOSC is highly effective for building the oscillator, and the power overhead of the TGPC can be minimized.
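The locking behavior described above can be pictured with a purely behavioral toy model. It is not a circuit simulation: it assumes that the neighboring phases stay fixed and that each cycle corrects the middle phase by a fixed fraction (the hypothetical parameter gain) of its distance from the midpoint of its neighbors.

```python
def relax_to_midpoint(p_prev: float, p_mid: float, p_next: float,
                      gain: float = 0.5, cycles: int = 20) -> list:
    """Toy model of phase locking, all phases in degrees.

    Each cycle moves p_mid a fraction `gain` of the way toward the midpoint
    of its two neighbours. `gain` is a hypothetical coupling strength.
    """
    history = [p_mid]
    target = 0.5 * (p_prev + p_next)
    for _ in range(cycles):
        p_mid += gain * (target - p_mid)
        history.append(p_mid)
    return history


if __name__ == "__main__":
    # Neighbours at 36 and 72 degrees (18-degree spacing); middle phase starts late.
    trace = relax_to_midpoint(36.0, 60.0, 72.0)
    for cycle, phase in enumerate(trace[:6]):
        print(f"cycle {cycle}: middle phase = {phase:.3f} deg")
```

In this model the correction shrinks with the remaining phase error, which mirrors the observation that less current flows once the neighboring phases are locked.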
Layout Design
Because the goal of the MPOSC is to align the phase outputs, its layout must be carefully designed for high phase resolution. The differential nonlinearity (DNL), which represents the phase error between the nearest two phases, is affected by the loads of the outputs, the parasitic resistance and capacitance (RC) of wires and vias, and peripheral circuits. For this reason, a detailed RC extraction from the layout, followed by circuit simulation using the extracted parasitics, has to be conducted for an MPOSC. In particular, the proposed MPOSC has more ports because of the TGPCs, resulting in complicated wiring.
In the proposed MPOSC, the circuits can be divided into two blocks: one block includes the inverters, and the other comprises the TGPCs. The phase outputs from the inverter block are input to the TGPC block, in which they are connected to the gates, drains and sources of the TGPCs. We have to consider the layouts of these two circuit blocks and of the level shifters (LSs, which serve as output buffers) for external output.
We simulated the two kinds of circuit topologies shown in Fig. 8. Method 1 in Fig. 8(a) is a TGPC-centered design. Load variation of the inverters' outputs is alleviated by placing the inverters near the LSs. Method 2 is an inverter-centered design, as shown in Fig. 8(b). We separate the LS block from the inverters. The outputs of the inverter block and the inputs of the TGPC block are located in the same places; thus the parasitic resistance and capacitance, and the via resistance, can be reduced by simple wiring. The distance between the LSs and the inverters is, however, long, which delays signal transmission to the LSs.

Figure 9 shows simulation results for the two layout methods: Figs. 9(a) and 9(b) correspond to the cases in which the inverters have no load and have LSs as loads, respectively. With no load, Method 2 is better than Method 1 in terms of DNL. The respective DNLs in Methods 1 and 2 are 0.044° and 0.028° on average. This is because the wiring resistance and capacitance, and the via resistance, to the phase couplers are lower in Method 2. In fact, the oscillating frequency of Method 2 is higher.
In contrast, when an LS is considered as the inverter's load, the oscillating frequencies are lowered by 20.4% in both methods. The DNLs exhibit different results from the no-load case: the average DNL in Method 1 decreases to 0.012° from 0.044° in the no-load case, whereas that in Method 2 increases from 0.028° to 0.051°. This is because the long wiring distance between the inverters and the LSs has an adverse effect. The LSs are prepared for driving external circuits and are also connected to source followers to measure waveforms. This affects the load of the inverters, which may cause variation in the DNL performance. Consequently, we chose Method 1 as the MPOSC layout.
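The DNL figures above can be reproduced from a list of output phases. The sketch below assumes that the DNL of each adjacent pair is the measured phase step minus the ideal step 360°/N, evaluated around the ring; this formula and the function name are assumptions based on the description of DNL as the phase error between the nearest two phases.

```python
def dnl_deg(phases_deg: list) -> list:
    """DNL of each adjacent phase pair in degrees (assumed definition).

    Entry i is (phase[i+1] - phase[i]) minus the ideal step 360/N,
    with the last entry wrapping from the final phase back to the first.
    """
    count = len(phases_deg)
    ideal_step = 360.0 / count
    result = []
    for i in range(count):
        step = (phases_deg[(i + 1) % count] - phases_deg[i]) % 360.0
        result.append(step - ideal_step)
    return result


if __name__ == "__main__":
    # 20 ideal phases at 18-degree spacing, with one output shifted by a
    # hypothetical 0.5-degree error for illustration.
    phases = [18.0 * k for k in range(20)]
    phases[4] += 0.5
    errors = dnl_deg(phases)
    print("max |DNL| =", max(abs(e) for e in errors), "deg")
    print("mean |DNL| =", sum(abs(e) for e in errors) / len(errors), "deg")
```

A single shifted output produces two DNL entries of opposite sign, since it lengthens one step and shortens the next.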
VLSI Implementation and Measurement Results
We implemented the proposed MPOSC in 180-nm and 65-nm process technologies, using the layout described in the previous section.
180-nm Test Chip
Figure 10(a) portrays a test chip of the proposed MPOSC core in the 180-nm CMOS process technology. The output from the four sets of five-stage ring oscillators is 20 phases, including I/Q signals. The core layout is shown in Fig. 10(b), and its area is 40.6 × 54.2 μm², including the LSs that drive the external source followers. At a supply voltage of 1.5 V, our designed MPOSC outputs 433 MHz, which complies with ISO/IEC 18000-7.
65-nm Test Chip
Figure 11(a) portrays a test chip layout in the 65-nm CMOS process technology. The structure is the same as the above. The core layout is shown in Fig. 11(b), and its area is 15.92 × 7.69 μm², including the LSs that drive the external source followers. This result shows that the proposed MPOSC has process scalability and that its chip area can be reduced through process scaling.
Performances
The measurement results for the 180-nm and 65-nm CMOS process technologies are shown in Table 2. We confirmed that the oscillating frequencies in the 180-nm and 65-nm processes reach 581 MHz at a supply voltage of 1.8 V and 1.85 GHz at a supply voltage of 1.2 V, respectively. The measured 3σ period jitter in the 180-nm node is 12.0 ps at 581 MHz, and that in the 65-nm node is 5.82 ps at 1.85 GHz. Figures 12 and 13 show waveforms from the source followers as measured examples (P3 and P4, which differ by 18°). Figure 14 shows the DNL measured at the same frequency. The maximum DNL is less than ±1.50° in the 180-nm CMOS process. In the 65-nm CMOS process, the maximum DNL is less than ±1.22°.

For the 180-nm CMOS process, a comparison of the power-delay (PD) products of the conventional MPOSCs and the proposed MPOSC is depicted in Fig. 15. The oscillators in the figure are the simple one-set ring oscillator (Ring OSC) and the MPOSCs using the inverter-type phase couplers (Inverter [1]), the nMOS-latch-type ones (nMOS latch [2]), and the proposed TGPC. Note that the MPOSCs have the same number of sets and the same number of stages, from 3 × 3 stages to 9 × 7 stages, and differ only in the type of phase coupler; the simple one-set ring oscillator has the same number of inverters. The results show that the proposed MPOSC with the TGPCs is superior to the other MPOSCs. As the number of phases increases, it becomes more power-efficient. It is comparable to the ring oscillator that does not use phase coupling. We observed that 36.6% and 38.3% improvements can be achieved, respectively, compared with the conventional MPOSCs with the inverters and the nMOS latches. The PD product of the simple ring oscillator becomes higher than that of the MPOSC with the TGPCs at 35 phases or more. This is because, in the simple ring oscillator, the oscillating frequency decreases as the number of phases, and hence the number of inverters, increases. To obtain a higher oscillating frequency and a greater number of phases, large-sized inverters are needed, which results in larger power dissipation.

Figure 16 confirms the process scalability between the 180-nm CMOS process and the 65-nm CMOS process. The PD product was measured with the oscillation frequency set to 450 MHz. The proposed TGPC oscillator is superior to the conventional MPOSC in terms of low power consumption. In particular, when the same frequency is set in each process, the supply voltage can be lowered by scaling.
As a result, we confirmed that the PD product can be made lower than that of the conventional MPOSC by lowering the supply voltage in this way.
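The reported operating points can be put on a common footing with a few lines of arithmetic. The sketch below makes two assumptions that are not stated in the text: it takes power multiplied by the oscillation period (energy per cycle) as a stand-in for the PD product, and it expresses an improvement as the relative reduction with respect to the conventional value. The function names are hypothetical; the power, frequency and jitter figures are those reported for the two test chips.

```python
def energy_per_cycle(power_w: float, freq_hz: float) -> float:
    """Power times oscillation period in joules (assumed stand-in for the PD product)."""
    return power_w / freq_hz


def jitter_fraction(jitter_s: float, freq_hz: float) -> float:
    """Period jitter expressed as a fraction of the oscillation period."""
    return jitter_s * freq_hz


def improvement_percent(conventional: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `conventional`, in percent."""
    return (conventional - proposed) / conventional * 100.0


if __name__ == "__main__":
    chips = {
        "180 nm": {"power_w": 920e-6, "freq_hz": 581e6, "jitter_s": 12.0e-12},
        "65 nm": {"power_w": 284e-6, "freq_hz": 1.85e9, "jitter_s": 5.82e-12},
    }
    for name, c in chips.items():
        energy_pj = energy_per_cycle(c["power_w"], c["freq_hz"]) * 1e12
        jitter_pct = jitter_fraction(c["jitter_s"], c["freq_hz"]) * 100.0
        print(f"{name}: {energy_pj:.3f} pJ per cycle, 3-sigma jitter = {jitter_pct:.2f}% of period")
```

Under these assumptions the 65-nm chip spends roughly a tenth of the energy per cycle of the 180-nm chip, which is consistent with the scaling trend discussed above.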
Conclusion
In this paper, we proposed a low-power MPOSC with single-ended inverters and TGPCs. The number of inverter chains can be set to an arbitrary number. The proposed architecture is implemented simply with transistors and does not use any analog elements; the architecture can therefore benefit from process scaling. We implemented the proposed MPOSCs in a 180-nm CMOS process and a 65-nm CMOS process, which respectively consumed 920 μW at 581 MHz and 284 μW at 1.85 GHz, indicating that our proposed MPOSC is suitable for process scaling.
