# TRUE-SINGLE-PHASE ALL-N-LOGIC DIFFERENTIAL LOGIC (TADL) FOR VERY HIGH-SPEED COMPLEX VLSI

<sup>†</sup>Hong-Yi Huang

<sup>‡</sup>Kuo-Hsin Chena

<sup>†</sup>Yuan-Hua Chu

<sup>§</sup>Chung-Yu Wu

<sup>†</sup>Comp. & Comm. Research Labs., Industrial Tech. Research Inst., Chutung, Taiwan 310, R.O.C. Email: hyhuang@vlsi.ccl.itri.org.tw

<sup>‡</sup>Dept. of Electrical Eng., Tam-Kang University, Tam-Shei, Taiwan, R.O.C.

<sup>§</sup>Institute of Electronics, National Chiao-Tung University, Hsinchu, Taiwan 300, R.O.C.

### ABSTRACT

A family of new logic circuits, called true-single-phase all-N-logic differential logic (TADL), is proposed and analyzed. The logic circuits are designed with only NMOS devices in the logic tree. Two kinds of sensing techniques are used to improve the operating speed, namely, the balanced sense amplifier for the differential-input TADL and the unbalanced sense amplifier for the single-input TADL. A complex function can be implemented in a single TADL gate, and high operating speed can be achieved without dc power dissipation. Only a true-single-phase clock is required to form fully pipelined systems. Simulation results show that circuits designed with the TADL have the advantages of high-speed operation and a low power-delay product.

### 1. INTRODUCTION

CMOS dynamic circuits have been widely used for high-speed VLSI [1]. Some design effort is required to prevent the related problems, such as race, clock skew, clock slope sensitivity, and charge sharing. The true-single-phase clock (TSPC) system has been shown to be suitable for very high-speed applications [2],[3]. However, the conventional TSPC circuits require a PMOS logic tree, which may limit the speed of the entire system. An all-N-logic single-phase logic with only an NMOS logic tree was proposed in [4] and was shown to be 2-3 times faster than the conventional circuits.

The implementation of logic functions using CMOS differential logic has certain advantages over implementation using single-ended logic [5]. Differential logic increases the logic flexibility by providing both complementary outputs simultaneously. The packing density and speed performance are improved because the logic function is implemented using an NMOS logic tree. Moreover, complex logic can be implemented in a single gate so that power dissipation can be reduced.

The enable/disable CMOS differential logic (ECDL) [6] and the latched CMOS differential logic (LCDL) [7],[8] have higher operating speed than the conventional dynamic DCVS circuits [1] owing to the operation of the balanced sense amplifier. The devices in the NMOS differential network can be designed with smaller dimensions, and a complex logic function can be implemented in a single gate. The latched domino (Ldomino) [9] alleviates the inversion problem inherent in domino logic by adding an unbalanced sense amplifier to a basic domino gate. To avoid evaluation failures caused by the operation of the sense amplifier, the input signals of the above differential logic circuits have to be stable before the logic circuits enter the evaluation phase. The Ldomino circuits can only be used in the first stage or at the interface between static CMOS logic and domino logic [9]. Direct cascading of the circuits is not allowed when all logic gates are controlled by a global clock. The ECDL provides a solution for direct cascading by generating an asynchronous clock in each logic gate [6]. However, the generation of the locally asynchronous clock has to be carefully designed, and extra hardware is required.

To further improve the circuit techniques for very high-speed operation and to increase the design flexibility, new differential logic circuits, called true-single-phase all-N-logic differential logic (TADL), are introduced and analyzed in this paper. The TADL combines the circuit techniques of ECDL, LCDL, and Ldomino with some new circuit techniques for more flexible design.

### 2. CIRCUIT STRUCTURES AND OPERATING PRINCIPLES

Figs. 1(a) and 1(b) show the general structure and clock timing of the true-single-phase clocking scheme, respectively. When the clock rises (falls), the N-section (P-section) is in the evaluation phase. When the clock falls (rises), the logic in the N-section (P-section) is in the precharge phase, and the previous evaluation result is held at the output of the latch. Thus, a pipelined operation is formed in this way.



Fig. 1 (a) The general structure and (b) clock timing of the true-single-phase pipelined systems.
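The clocking scheme of Fig. 1 can be illustrated with a purely behavioral sketch. The function name `simulate_pipeline`, the use of `None` for "no data yet", and the two example stage functions are assumptions made for illustration; the model ignores all analog behavior (precharge levels, sense amplification, delays) and only shows how alternating N- and P-sections pass data on opposite clock levels while each latch holds its value during precharge.

```python
def simulate_pipeline(inputs, stage_functions, half_cycles):
    """Behavioral model of alternating N- and P-sections under one clock.

    Even-indexed sections are treated as N-sections (evaluate while clk = 1),
    odd-indexed sections as P-sections (evaluate while clk = 0). A section
    that is precharging leaves its latch output unchanged.
    """
    latches = [None] * len(stage_functions)  # latch outputs, None = no data yet
    feed = iter(inputs)
    trace = []
    for t in range(half_cycles):
        clk = 1 - (t % 2)          # clock starts high, then toggles
        previous = list(latches)   # latch outputs seen at the start of this half cycle
        for i, logic in enumerate(stage_functions):
            evaluating = (i % 2 == 0) == (clk == 1)
            if not evaluating:
                continue           # precharge phase: the latch holds its value
            source = next(feed, None) if i == 0 else previous[i - 1]
            latches[i] = None if source is None else logic(source)
        trace.append((clk, tuple(latches)))
    return trace


# Hypothetical two-section example: an N-section followed by a P-section.
trace = simulate_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 2], half_cycles=6)
for clk, outputs in trace:
    print(clk, outputs)
```

In this sketch a new input enters the N-section on every high half cycle and the P-section produces a result on every low half cycle, which is the sense in which one clock phase suffices for pipelining.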

In the conventional TSPC system [2], PMOS logic circuits are required in the P-section. The PMOS logic can be restricted to very simple functions (NOT, 2-input NOR, or 2-input NAND) to overcome the speed limitation. However, this may result in more hardware and more power dissipation. The all-N-logic single-phase logic is designed with only NMOS logic circuits in both the N-section and the P-section [4]. The clock slope sensitivity problem is alleviated and the speed performance is improved. However, there is still a limitation on the noninverting and inverting functions in each pipeline section.

0-7803-3073-0/96/\$5.00 ©1996 IEEE

Figs. 2(a)-2(d) show the circuit structures of the differential-input TADL proposed in this paper. (The circuits in Figs. 2(a) and 2(d) are called LCDL [7] and ECDL [6] in their original papers; they are renamed here so that they can be described consistently with the new circuit modes.) The precharge-high differential-input mode-1 (PreH-Diff-1) circuit and the precharge-high differential-input mode-2 (PreH-Diff-2) circuit are shown in Figs. 2(a) and 2(b), respectively. The NMOS differential network of the PreH-Diff-1 is controlled by a clocked NMOS device N4 connected to GND, whereas that of the PreH-Diff-2 is connected to VDD without a clocked device. When the clock is low, the PreH-Diff-1 and PreH-Diff-2 are in the precharge phase and the outputs are precharged to VDD. When the clock makes a low-to-high transition, the PreH-Diff-1 and PreH-Diff-2 enter the evaluation phase. Depending on the input signals of the NMOS differential network, a small voltage difference develops between the differential output nodes. This voltage difference is amplified by the balanced sense amplifier composed of P1, P2, N1, and N2. In the evaluation phase, the NMOS differential network of the PreH-Diff-1 provides a pull-down path to GND, whereas that of the PreH-Diff-2 does not. However, the PreH-Diff-2 does not require a clocked device to control its NMOS differential network.

Figs. 2(c) and 2(d) show the precharge-low differential-input mode-1 (PreL-Diff-1) circuit and the precharge-low differential-input mode-2 (PreL-Diff-2) circuit, respectively. The NMOS differential network of the PreL-Diff-1 is controlled by a clocked PMOS device P4 connected to VDD, whereas that of the PreL-Diff-2 is connected to GND without a clocked device. When the clock is high, the outputs of the PreL-Diff-1 and PreL-Diff-2 are precharged to GND. When the clock makes a high-to-low transition, the PreL-Diff-1 and PreL-Diff-2 enter the evaluation phase. Depending on the input signals of the NMOS differential network, a small voltage difference develops between the differential output nodes. This voltage difference is amplified by the balanced sense amplifier. The NMOS differential network of the PreL-Diff-1 provides a pull-high path to VDD, whereas that of the PreL-Diff-2 does not. However, the PreL-Diff-2 does not require a clocked device to control its NMOS differential network.
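The precharge and evaluation behavior of the differential-input gates can be summarized in a small logic-level sketch. The function name `diff_tadl_outputs` and the choice of which output carries the true value of the function are assumptions for illustration; the sketch models only the ideal settled output levels, not the small voltage difference or its amplification by the sense amplifier.

```python
VDD, GND = 1, 0  # ideal logic levels


def diff_tadl_outputs(mode, clk, f):
    """Ideal settled outputs (out, out_bar) of a differential-input TADL gate.

    mode: "PreH" (precharge to VDD while clk = 0, evaluate while clk = 1) or
          "PreL" (precharge to GND while clk = 1, evaluate while clk = 0).
    f:    Boolean value realized by the NMOS differential network.
    Assumption: `out` carries f and `out_bar` its complement after evaluation.
    """
    if mode == "PreH":
        precharging, level = (clk == 0), VDD
    elif mode == "PreL":
        precharging, level = (clk == 1), GND
    else:
        raise ValueError(f"unknown mode: {mode}")
    if precharging:
        return (level, level)  # both outputs sit at the precharge level
    return (VDD, GND) if f else (GND, VDD)  # sense amplifier resolves the pair


print(diff_tadl_outputs("PreH", 0, True))   # precharge phase of a PreH gate
print(diff_tadl_outputs("PreL", 0, False))  # evaluation phase of a PreL gate
```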

Figs. 3(a)-3(d) show the circuit structures of the single-input TADL circuits. The NMOS logic tree is connected to only one of the differential outputs. Following design techniques similar to those of the Ldomino [9], unbalanced sense amplifiers composed of P1, P2, N1, and N2 are used. The design of the unbalanced sense amplifier and the operation of the circuits are listed in Table I.

The precharge-high single-input mode-1 (PreH-Sing-1) and precharge-high single-input mode-2 (PreH-Sing-2) circuits are shown in Figs. 3(a) and 3(b), respectively. In the precharge phase, the output nodes H and $\bar{H}$ are precharged to VDD. When the PreH-Sing-1 is in the evaluation phase and the NMOS logic is turned off, the pull-down current of N1 is less than that of N2. The output H is discharged to GND



Fig. 2 The circuit structures of (a) PreH-Diff-1, (b) PreH-Diff-2, (c) PreL-Diff-1, and (d) PreL-Diff-2 TADL.

through the sense amplifier while the output $\bar{H}$ remains high. When the NMOS logic is turned on, the sum of the pull-down currents of N1 and the NMOS logic is greater than the pull-down current of N2. The output $\bar{H}$ is discharged to GND through the NMOS logic and the sense amplifier while the output H remains high. When the PreH-Sing-2 is in the evaluation phase and the NMOS logic is turned off, the pull-down current of N1 is greater than that of N2. The output $\bar{H}$ is discharged to GND through the sense amplifier. When the NMOS logic is turned on, there is a pull-high current in the NMOS logic. The pull-down current of N1 minus the pull-high current of the NMOS logic is less than the pull-down current of N2. The output H is discharged to GND while the output $\bar{H}$ remains high.

The precharge-low single-input mode-1 (PreL-Sing-1) and the precharge-low single-input mode-2 (PreL-Sing-2) circuits are shown in Figs. 3(c) and 3(d), respectively. In the precharge phase, the output nodes L and $\bar{L}$ are precharged to GND. When the PreL-Sing-1 is in the evaluation phase and the NMOS logic is turned off, the pull-high current of P2 is greater than that of P1. The output L is charged to VDD while the output $\bar{L}$ remains low. When the NMOS logic is turned on, the sum of the pull-high currents of the NMOS logic and P1 is greater than the pull-high current of P2. The output $\bar{L}$



Fig. 3 The circuit structures of (a) PreH-Sing-1, (b) PreH-Sing-2, (c) PreL-Sing-1, and (d) PreL-Sing-2 TADL.

is charged to VDD while the output L remains low. When the PreL-Sing-2 is in the evaluation phase and the NMOS logic is turned off, the pull-high current of P1 is greater than that of P2. The output $\bar{L}$ is charged to VDD while the output L remains low. When the NMOS logic is turned on, the pull-high current of P1 minus the pull-down current of the NMOS logic is less than the pull-high current of P2. The output L is charged to VDD while the output $\bar{L}$ remains low.
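The current comparisons that decide the single-input outputs (summarized in Table I) can be checked with a short sketch. The numeric drive strengths `WEAK`, `STRONG`, and `LOGIC` are hypothetical relative values chosen only so that the stated inequalities hold; they are not device sizes from the paper, and the function name `single_input_outputs` is likewise an illustrative assumption.

```python
# Hypothetical relative currents (arbitrary units), not values from the paper.
WEAK, STRONG, LOGIC = 1.0, 2.0, 1.5

# mode: (current of device 1, current of device 2,
#        sign of the logic-tree current on side 1, precharge level)
# Device 1/2 are N1/N2 for the PreH modes and P1/P2 for the PreL modes.
MODES = {
    "PreH-Sing-1": (WEAK, STRONG, +1, 1),
    "PreH-Sing-2": (STRONG, WEAK, -1, 1),
    "PreL-Sing-1": (WEAK, STRONG, +1, 0),
    "PreL-Sing-2": (STRONG, WEAK, -1, 0),
}


def single_input_outputs(mode, logic_on):
    """Return (true output, complement output) after evaluation.

    Side 1 (device 1 plus the NMOS logic tree) acts on the complement output
    and side 2 (device 2) on the true output. The side with the larger current
    moves its node away from the precharge level; the other node stays put.
    """
    i1, i2, sign, precharge = MODES[mode]
    side1 = i1 + (sign * LOGIC if logic_on else 0.0)
    moved = 1 - precharge
    if side1 > i2:
        return (precharge, moved)  # complement output switches
    return (moved, precharge)      # true output switches


for mode in MODES:
    for logic_on in (False, True):
        print(mode, "ON" if logic_on else "OFF", single_input_outputs(mode, logic_on))
```

Each printed pair is (H, $\bar{H}$) for the precharge-high modes and (L, $\bar{L}$) for the precharge-low modes.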

The true-single-phase pipelined system is shown in Fig. 1(a). The logic circuits in the N-section can be implemented by the PreH-Diff-1, PreH-Diff-2, PreH-Sing-1, and PreH-Sing-2. The logic circuits in the P-section can be implemented by the PreL-Diff-1, PreL-Diff-2, PreL-Sing-1, and PreL-Sing-2. The modified C<sup>2</sup>MOS latches used in the conventional TSPC systems may result in output dips and lead to more power dissipation at the beginning of the evaluation phase. Moreover, sharp clock slopes are required to prevent race problems. The self-timed DCVS latches (SDL) [10] shown in Figs. 4(a) and 4(b) have been shown to be clock-slope insensitive, and the output data is latched without the control of the clock. The clock buffer can be easily designed when the SDL is used. Moreover, the speed performance can be improved. When the N-section is in the precharge phase, the inputs of the N-latch shown in Fig. 4(a) are precharged to VDD. The PMOS devices of the N-latch are turned off and the results of the previous evaluation are held at the

Table I The design of the unbalanced sense amplifier in the single-input TADL.

| Circuit     | Sense<br>Amplifier | NMOS<br>Logic | Initial                       | Result                    |
|-------------|--------------------|---------------|-------------------------------|---------------------------|
| PreH-Sing-1 | P1 = P2<br>N1 < N2 | OFF           | $i_{N1} < i_{N2}$             | $H = 0, \overline{H} = 1$ |
|             |                    | ON            | $i_{N1} + i_{Logic} > i_{N2}$ | $H = 1, \overline{H} = 0$ |
| PreH-Sing-2 | P1 = P2<br>N1 > N2 | OFF           | $i_{N1} > i_{N2}$             | $H = 1, \overline{H} = 0$ |
|             |                    | ON            | $i_{N1} - i_{Logic} < i_{N2}$ | $H = 0, \overline{H} = 1$ |
| PreL-Sing-1 | P1 < P2<br>N1 = N2 | OFF           | $i_{P1} < i_{P2}$             | $L = 1, \overline{L} = 0$ |
|             |                    | ON            | $i_{P1} + i_{Logic} > i_{P2}$ | $L = 0, \overline{L} = 1$ |
| PreL-Sing-2 | P1 > P2<br>N1 = N2 | OFF           | $i_{P1} > i_{P2}$             | $L = 0, \overline{L} = 1$ |
|             |                    | ON            | $i_{P1} - i_{Logic} < i_{P2}$ | $L = 1, \overline{L} = 0$ |

outputs. When the P-section is in the precharge phase, the inputs of the P-latch are precharged to GND. The NMOS devices of the P-latch, shown in Fig. 4(b), are turned off and the results of the previous evaluation are held at the outputs.



Fig. 4 The self-timed DCVS latch (SDL): (a) N-latch and (b) P-latch.

## 3. SIMULATION RESULTS AND COMPARISONS

The total number of devices required to implement complex functions can be reduced by using differential logic circuits because of the shared logic tree [5]. However, more devices may be required to implement some simple logic functions, for example, NAND, NOR, AOI, and OAI, with differential logic. In these cases, the logic functions can be implemented using the single-input TADL. Only a single-input logic tree is required, and a pair of differential outputs is still obtained simultaneously. Thus the design flexibility is increased.

The 3-, 5-, 7-, and 9-input differential XOR/XNOR gates are designed with the differential-input TADL and the conventional dynamic DCVS logic for comparison. The 3-, 5-, 7-, and 9-input NAND gates are designed with the single-input TADL and the conventional single-ended dynamic logic. Since the precharge-low TADL circuits are implemented in the P-section of the pipelined systems, the conventional p-type dynamic circuits are also designed for comparison. The conventional dynamic circuits are additionally designed with larger dimensions, which gives less delay time but more power dissipation. All of the circuits are designed with an SDL [10] as the output load. The 0.6 µm single-poly triple-metal CMOS model parameters are used for HSPICE simulation. The power supply is 5 V and the clock frequency is 50 MHz. The simulation results of the differential-input TADL and the single-input TADL are listed in Tables II and III, respectively. It is seen that the TADL circuits have a much higher operating speed and a lower power-delay product. The advantages become more evident as the logic complexity increases. (The PreH-Diff-2 and PreH-Sing-2 have longer delay times and their data are not shown.)
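To make the comparison concrete, the following sketch computes the delay ratio and the relative power-delay saving of the PreH-Diff-1 circuit over the normal-size n-type DCVS circuit, using the XOR entries reported in Table II. The dictionary layout and the names `speedup` and `pdp_saving` are illustrative assumptions; since only ratios are computed, the result does not depend on the units of the table.

```python
# (delay, power-delay product) pairs copied from the n-type half of Table II.
N_TYPE_DCVS = {"XOR3": (0.57, 8.82), "XOR5": (0.78, 28.04),
               "XOR7": (1.38, 79.54), "XOR9": (2.18, 172.08)}
PREH_DIFF_1 = {"XOR3": (0.31, 8.61), "XOR5": (0.51, 25.92),
               "XOR7": (0.88, 63.03), "XOR9": (1.36, 142.17)}


def compare(reference, candidate):
    """Return per-gate speedup and fractional power-delay saving."""
    rows = {}
    for gate, (ref_delay, ref_pdp) in reference.items():
        cand_delay, cand_pdp = candidate[gate]
        rows[gate] = {
            "speedup": ref_delay / cand_delay,    # > 1 means the candidate is faster
            "pdp_saving": 1.0 - cand_pdp / ref_pdp,  # > 0 means a lower power-delay product
        }
    return rows


rows = compare(N_TYPE_DCVS, PREH_DIFF_1)
for gate, r in rows.items():
    print(f"{gate}: speedup {r['speedup']:.2f}x, PDP saving {100 * r['pdp_saving']:.1f}%")
```

With these entries the delay ratio stays above 1.5 for every gate, while the power-delay saving is small for the 3-input gate and considerably larger for the 7- and 9-input gates.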

|      | n-type DCVS   |                             | n-type DCVS<br>(larger size) |                             | PreH-Diff-1   |                             | PreH-Diff-2   |                             |
|------|---------------|-----------------------------|------------------------------|-----------------------------|---------------|-----------------------------|---------------|-----------------------------|
|      | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns)                | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) |
| XOR3 | 0.57          | 8.82                        | 0.44                         | 9.53                        | 0.31          | 8.61                        | -             | -                           |
| XOR5 | 0.78          | 28.04                       | 0.69                         | 31.96                       | 0.51          | 25.92                       | -             | -                           |
| XOR7 | 1.38          | 79.54                       | 1.20                         | 93.17                       | 0.88          | 63.03                       | -             | -                           |
| XOR9 | 2.18          | 172.08                      | 1.84                         | 236.86                      | 1.36          | 142.17                      | -             | -                           |

Table II The comparisons of differential-input TADL circuits with conventional dynamic DCVS circuits.

|      | p-type DCVS   |                             | p-type DCVS<br>(larger size) |                             | PreL-Diff-1   |                             | PreL-Diff-2   |                             |
|------|---------------|-----------------------------|------------------------------|-----------------------------|---------------|-----------------------------|---------------|-----------------------------|
|      | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns)                | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) |
| XOR3 | 1.06          | 19.23                       | 0.88                         | 32.56                       | 0.44          | 16.62                       | 0.63          | 24.13                       |
| XOR5 | 1.91          | 40.72                       | 1.64                         | 109.23                      | 0.64          | 34.87                       | 0.98          | 55.03                       |
| XOR7 | 3.23          | 157.35                      | 2.73                         | 303.72                      | 0.95          | 78.32                       | 1.56          | 138.24                      |
| XOR9 | 6.09          | 373.62                      | 4.18                         | 695.64                      | 1.24          | 133.76                      | 2.07          | 239.58                      |

Table III The comparisons of single-input TADL circuits with conventional dynamic circuits.

|       | n-type logic  |                             | n-type logic<br>(larger size) |                             | PreH-Sing-1   |                             | PreH-Sing-2   |                             |
|-------|---------------|-----------------------------|-------------------------------|-----------------------------|---------------|-----------------------------|---------------|-----------------------------|
|       | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns)                 | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) |
| NAND3 | 0.40          | 5.02                        | 0.27                          | 5.34                        | 0.22          | 5.02                        | -             | -                           |
| NAND5 | 0.65          | 10.25                       | 0.36                          | 13.63                       | 0.30          | 9.36                        | -             | -                           |
| NAND7 | 0.61          | 27.94                       | 0.51                          | 30.24                       | 0.38          | 22.15                       | -             | -                           |
| NAND9 | 0.81          | 53.13                       | 0.76                          | 73.64                       | 0.66          | 50.21                       | -             | -                           |

|       | p-type logic  |                             | p-type logic<br>(larger size) |                             | PreL-Sing-1   |                             | PreL-Sing-2   |                             |
|-------|---------------|-----------------------------|-------------------------------|-----------------------------|---------------|-----------------------------|---------------|-----------------------------|
|       | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns)                 | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) | delay<br>(ns) | power-delay<br>product (uJ) |
| NAND3 | 0.75          | 10.13                       | 0.59                          | 13.94                       | 0.38          | 12.34                       | 0.58          | 11.64                       |
| NAND5 | 1.26          | 30.72                       | 1.03                          | 44.72                       | 0.89          | 39.52                       | 0.69          | 18.43                       |
| NAND7 | 1.84          | 71.03                       | 1.57                          | 113.02                      | 1.38          | 88.52                       | 1.07          | 40.02                       |
| NAND9 | 2.45          | 165.64                      | 2.25                          | 289.83                      | 1.98          | 186.14                      | 1.51          | 81.95                       |

### 4. CONCLUSIONS

In this paper, new CMOS differential logic circuits, called true-single-phase all-N-logic differential logic (TADL), are proposed and analyzed. The logic circuits are implemented with only NMOS devices in the logic tree and are controlled by a true-single-phase clock to form pipelined systems. A complex function can be implemented in a single TADL gate. Moreover, the design flexibility is increased by generating the differential outputs simultaneously. Simulation results show that the TADL has the advantages of high-speed operation and low power-delay product. The TADL circuits are therefore suitable for very high-speed complex VLSI.

### REFERENCES

- [1] R.H. Krambeck, C.M. Lee, and H.S. Law, "High-speed compact circuits with CMOS," IEEE J. Solid-State Circuits, vol. SC-17, pp. 614-619, June 1982.
- [2] J. Yuan and C. Svensson, "High-Speed CMOS circuit technique," IEEE J. Solid-State Circuits, vol. SC-24, pp. 62-70, Feb. 1989.
- [3] D.W. Dobberpuhl et al., "A 200-MHz 64-b dual-issue CMOS microprocessor," IEEE J. Solid-State Circuits, vol. SC-27, pp. 1555-1565, Nov. 1992.
- [4] R.X. Gu and M.I. Elmasry, "An all-N-logic high-speed single-phase dynamic CMOS logic," in Proc. IEEE ISCAS, pp. 7-10, 1994.
- [5] L.G. Heller, W.R. Griffin, J.W. Davis, and N.G. Thomas, "Cascode voltage switch logic: A differential CMOS logic family," in IEEE ISSCC Dig. Tech. Papers, pp. 16-17, 1984.
- [6] S.L. Lu, "Implementation of iterative network with CMOS differential logic," IEEE J. Solid-State Circuits, vol. SC-23, pp. 1013-1017, 1988.
- [7] C.Y. Wu and K.H. Cheng, "Latched CMOS differential logic (LCDL) for complex high-speed VLSI," IEEE J. Solid-State Circuits, vol. SC-26, pp. 1325-1328, Sep. 1991.
- [8] H.Y. Huang, K.S. Cheng, J.S. Wang, Y.H. Chu, T.S. Wu, and C.Y. Wu, "Low-voltage low-power CMOS true-single-phase clocking scheme with locally asynchronous logic circuits," in Proc. IEEE ISCAS, pp. 1572-1575, 1995.
- [9] J.A. Pretorius, A.S. Shubat, and C.A. Salama, "Latched domino CMOS logic," IEEE J. Solid-State Circuits, vol. SC-21, pp. 514-522, Aug. 1986.
- [10] H.Y. Huang and C.Y. Wu, "Clock-slope-insensitive self-timed DCVS latch (SDL) for true-single-phase pipelined systems," to appear in IEEE Trans. Circuits & Systems, Part II, Jan. 1996.