True-single-phase all-N-logic differential logic (TADL) for very high-speed complex VLSI by Huang, Hong-yi
TRUE-SINGLE-PHASE ALL-N-LOGIC DIFFERENTIAL LOGIC (TADL)
FOR VERY HIGH-SPEED COMPLEX VLSI 
†Hong-Yi Huang, ‡Kuo-Hsin Cheng, †Yuan-Hua Chu, §Chung-Yu Wu
†Comp. & Comm. Research Labs., Industrial Tech. Research Inst., Chutung, Taiwan 310, R.O.C.
Email: hyhuang@vlsi.ccl.itri.org.tw
‡Dept. of Electrical Eng., Tam-Kang University, Tam-Shei, Taiwan, R.O.C.
§Institute of Electronics, National Chiao-Tung University, Hsinchu, Taiwan 300, R.O.C.
ABSTRACT 
A family of new logic circuits, called true-single-phase
all-N-logic differential logic (TADL), is proposed and
analyzed. The logic circuits are designed with only NMOS
devices in the logic tree. Two kinds of sensing techniques
are used to improve the operating speed, namely, the
balanced sense amplifier for the differential-input TADL and
the unbalanced sense amplifier for the single-input TADL.
A complex function can be implemented in a single TADL gate,
and high operating speed can be achieved without dc power
dissipation. Only a true-single-phase clock is required to
form fully pipelined systems. Simulation results show
that circuits designed with the TADL have the advantages of
high-speed operation and a low power-delay product.
1. INTRODUCTION 
CMOS dynamic circuits have been widely used for high-speed
VLSI [1]. Some design effort is required to prevent the
related problems, such as race, clock skew, clock slope,
and charge sharing. The true-single-phase clock (TSPC)
system has been proved suitable for very high-speed
applications [2],[3]. However, the conventional TSPC
circuits require a PMOS logic tree, which may limit the
speed of the entire system. An all-N-logic single-phase
logic was proposed with only an NMOS logic tree [4] and was
proved to be 2-3 times faster than the conventional ones.
The implementation of logic functions using CMOS
differential logic has certain advantages over that using
single-ended logic [5]. The differential circuits increase
the logic flexibility by offering both complementary
outputs simultaneously. The packing density and speed
performance are improved because the logic function is
implemented using the NMOS logic tree. Moreover, complex
logic can be implemented in a single gate so that power
dissipation can be reduced.
The enable/disable CMOS differential logic (ECDL) [6]
and the latched CMOS differential logic (LCDL) [7],[8]
have higher operating speed than the conventional dynamic
DCVS circuits [1] due to the operation of the balanced sense
amplifier. The devices in the NMOS differential network
can be designed with smaller dimensions, and a complex
logic function can be implemented in a single gate.
The latched domino (Ldomino) [9] can alleviate the inversion
problem inherent in domino logic by the addition of
an unbalanced sense amplifier to a basic domino gate. To
avoid evaluation failures caused by the operation of the
sense amplifier, the input signals of the above differential
logic circuits have to be stable before the logic circuits
enter the evaluation phase. The Ldomino circuits can only
be used in the first stage or at the interface between static
CMOS logic and domino logic [9]. Direct cascading of the
circuits is not allowed when all logic gates are controlled by
a global clock. The ECDL provides a solution for direct
cascading by generating an asynchronous clock in each logic
gate [6]. However, the generation of the locally asynchronous
clock has to be carefully designed, and extra hardware is
also required.
0-7803-3073-0/96/$5.00 ©1996 IEEE
To further improve the circuit techniques for very
high-speed operation and to increase the design flexibility,
this paper introduces and analyzes new differential logic
circuits, called true-single-phase all-N-logic differential
logic (TADL). The TADL combines the circuit techniques of
ECDL, LCDL, and Ldomino with some new circuit techniques
for more flexible design.
2. CIRCUIT STRUCTURES AND OPERATING PRINCIPLES
Figs. 1(a) and 1(b) show the general structure and clock
timing of the true-single-phase clocking scheme, respectively.
When the clock rises (falls), the N-section (P-section) is in
the evaluation phase. When the clock falls (rises), the logic
in the N-section (P-section) is in the precharge phase. The
previous evaluation result is held at the output of the latch.
Thus, a pipelined operation is formed in this way.
Fig. 1 (a) The general structure and (b) clock timing of the
true-single-phase pipelined systems.
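As an illustration only, the clocking scheme described above can be
sketched behaviorally in Python. The function name section_phases and
the dictionary keys are assumptions introduced for this sketch and do
not appear in the original design; the sketch only restates which
section evaluates and which latch holds at each clock level.

```python
def section_phases(clk: int) -> dict:
    """Map a clock level (0 or 1) to the phase of each pipeline section."""
    if clk not in (0, 1):
        raise ValueError("clk must be 0 or 1")
    n_evaluates = clk == 1  # the N-section evaluates while the clock is high
    return {
        "N-section": "evaluate" if n_evaluates else "precharge",
        "P-section": "precharge" if n_evaluates else "evaluate",
        # The latch of the precharging section holds its previous result.
        "holding latch": "P-latch" if n_evaluates else "N-latch",
    }


if __name__ == "__main__":
    # Walk through two clock periods (hypothetical stimulus).
    for clk in (0, 1, 0, 1):
        print(clk, section_phases(clk))
```

On every clock level exactly one section evaluates while the other
precharges, which is what allows a single clock to drive the pipeline.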
In the conventional TSPC system [2], PMOS logic circuits
are required in the P-section. The PMOS logic can be
restricted to very simple functions (NOT, 2-input NOR, or
2-input NAND) to break the speed limitation. However, this
may result in more hardware and more power dissipation.
The all-N-logic single-phase logic is designed with only
NMOS logic circuits in both the N-section and the P-section
[4]. The clock-slope sensitivity problem is mitigated and
the speed performance is improved. However, there is still
a limitation on the noninverting and inverting functions
in each pipeline section.
Authorized licensed use limited to: Tamkang University. Downloaded on March 23,2010 at 22:45:57 EDT from IEEE Xplore.  Restrictions apply.
Figs. 2(a)-2(d) show the circuit structures of the
differential-input TADL proposed in this paper. (The
circuits in Figs. 2(a) and 2(d) are called LCDL [7] and
ECDL [6] in the original papers. They are renamed for a
better description together with the new circuit modes in
this paper.) The precharge-high differential-input mode-1
(PreH-Diff-1) circuit and the precharge-high
differential-input mode-2 (PreH-Diff-2) circuit are shown
in Figs. 2(a) and 2(b), respectively.
The NMOS differential network of the PreH-Diff-1 is
controlled by a clocked NMOS N4 connected to GND, and that
of the PreH-Diff-2 is connected to VDD without the control
of a clocked device. When the clock is low, the PreH-Diff-1
and PreH-Diff-2 are in the precharge phase and the outputs
are precharged to VDD. When the clock makes a low-to-high
transition, the PreH-Diff-1 and PreH-Diff-2 enter the
evaluation phase. Depending on the input signals of the
NMOS differential network, a small voltage difference
appears on the differential output nodes. The voltage
difference is amplified by the balanced sense amplifier
composed of P1, P2, N1, and N2. In the evaluation phase, the
NMOS differential network of the PreH-Diff-1 provides a
pull-down path to GND and that of the PreH-Diff-2 does not.
However, the PreH-Diff-2 does not require the clocked
device used to control the NMOS differential network.
Figs. 2(c) and 2(d) show the precharge-low
differential-input mode-1 (PreL-Diff-1) circuit and the
precharge-low differential-input mode-2 (PreL-Diff-2)
circuit, respectively.
The NMOS differential network of the PreL-Diff-1 is
controlled by a clocked PMOS P4 connected to VDD, and
that of the PreL-Diff-2 is connected to GND without the
control of a clocked device. When the clock is high, the
outputs of the PreL-Diff-1 and PreL-Diff-2 are precharged
to GND. When the clock makes a high-to-low transition,
the PreL-Diff-1 and PreL-Diff-2 enter the evaluation
phase. Depending on the input signals of the NMOS
differential network, a small voltage difference appears
on the differential output nodes. The voltage difference is
amplified by the balanced sense amplifier. The NMOS
differential network of the PreL-Diff-1 provides a pull-high
path to VDD and that of the PreL-Diff-2 does not. However,
the PreL-Diff-2 does not require the clocked device used to
control the NMOS differential network.
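The precharge and evaluation behavior of the two mode-1
differential-input circuits can be illustrated with an idealized
Python model. This is a sketch under stated assumptions, not the
authors' circuit: the sense amplifier is treated as an ideal
regenerator, the 0.1 V initial difference (delta) is a hypothetical
value, and the choice of which output is attached to which branch of
the differential network is assumed here so that the output equals
the logic function f.

```python
VDD = 5.0  # supply voltage quoted in the simulation section
GND = 0.0


def precharge(precharge_high: bool) -> tuple:
    """Both outputs sit at VDD (PreH) or GND (PreL) before evaluation."""
    level = VDD if precharge_high else GND
    return level, level


def sense_amplify(out: float, out_bar: float) -> tuple:
    """Idealized balanced sense amplifier: restore a small difference to full rails."""
    if out == out_bar:
        raise ValueError("no voltage difference to amplify")
    return (VDD, GND) if out > out_bar else (GND, VDD)


def evaluate_mode1(precharge_high: bool, f: bool, delta: float = 0.1) -> tuple:
    """Return (out, out_bar) after evaluation of a mode-1 gate.

    f is the value of the logic function realized by the NMOS network.
    delta is an assumed initial voltage difference, not a value from the paper.
    """
    out, out_bar = precharge(precharge_high)
    if precharge_high:
        # PreH-Diff-1: the conducting branch pulls its node toward GND.
        if f:
            out_bar -= delta
        else:
            out -= delta
    else:
        # PreL-Diff-1: the conducting branch pulls its node toward VDD.
        if f:
            out += delta
        else:
            out_bar += delta
    return sense_amplify(out, out_bar)


if __name__ == "__main__":
    for high in (True, False):
        for f in (True, False):
            print("PreH" if high else "PreL", f, evaluate_mode1(high, f))
```

In this model both precharge polarities settle to the same
complementary output pair for a given f; only the starting level and
the direction of the initial disturbance differ.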
Figs. 3(a)-3(d) show the circuit structures of the
single-input TADL circuits. The NMOS logic tree is connected
to only one of the differential outputs. Following design
techniques similar to those of the Ldomino [9], unbalanced
sense amplifiers composed of P1, P2, N1, and N2 are used. The
design techniques of the unbalanced sense amplifier and the
operation of the circuits are listed in Table I.
Fig. 2 The circuit structures of (a) PreH-Diff-1, (b)
PreH-Diff-2, (c) PreL-Diff-1, and (d) PreL-Diff-2 TADL.
The precharge-high single-input mode-1 (PreH-Sing-1)
and precharge-high single-input mode-2 (PreH-Sing-2)
circuits are shown in Figs. 3(a) and 3(b), respectively. In
the precharge phase, the output nodes H and H̄ are
precharged to VDD. When the PreH-Sing-1 is in the evaluation
phase, the pull-down current of N1 is less than that of N2
while the NMOS logic is turned off. The output H is
discharged to GND through the sense amplifier while the
output H̄ remains high. When the NMOS logic is turned on, the
sum of the pull-down currents of N1 and the NMOS logic is
more than that of N2. The output H̄ is discharged to GND
through the NMOS logic and the sense amplifier while the
output H remains high. When the PreH-Sing-2 is in the
evaluation phase, the pull-down current of N1 is more than
that of N2 while the NMOS logic is turned off. The output H
is discharged to GND through the sense amplifier. When the
NMOS logic is turned on, there exists a pull-high current
in the NMOS logic. The pull-down current of N1 minus
the pull-high current of the NMOS logic is less than that of
N2. The output H̄ is discharged to GND while the output
H remains high.
Fig. 3 The circuit structures of (a) PreH-Sing-1, (b)
PreH-Sing-2, (c) PreL-Sing-1, and (d) PreL-Sing-2 TADL.
The precharge-low single-input mode-1 (PreL-Sing-1)
and the precharge-low single-input mode-2 (PreL-Sing-2)
circuits are shown in Figs. 3(c) and 3(d), respectively. In
the precharge phase, the output nodes L and L̄ are precharged
to GND. When the PreL-Sing-1 is in the evaluation phase,
the pull-high current of P2 is more than that of P1 while the
NMOS logic is turned off. The output L is charged to VDD
while the output L̄ remains low. When the NMOS logic is
turned on, the sum of the pull-high currents of the
NMOS logic and P1 is more than that of P2. The output L̄
is charged to VDD while the output L remains low. When
the PreL-Sing-2 is in the evaluation phase, the pull-high
current of P1 is more than that of P2 while the NMOS logic is
turned off. The output L is charged to VDD while the
output L̄ remains low. When the NMOS logic is turned on,
the pull-high current of P1 minus the pull-down current
of the NMOS logic is less than that of P2. The output L̄ is
charged to VDD while the output L remains low.
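The current comparisons that decide the outcome of the four
single-input circuits can be summarized in a small Python sketch.
It is illustrative only: the function name winning_device, the
argument names, and the current values in arbitrary units are all
hypothetical. Device 1 stands for N1 (precharge-high circuits) or P1
(precharge-low circuits), device 2 for N2 or P2, and the node driven
by the winning device is the one that leaves its precharge level.

```python
MODE_1 = ("PreH-Sing-1", "PreL-Sing-1")  # logic current adds to device 1
MODE_2 = ("PreH-Sing-2", "PreL-Sing-2")  # logic current opposes device 1


def winning_device(mode: str, i1: float, i2: float,
                   i_logic: float, logic_on: bool) -> int:
    """Return 1 or 2: the sense-amplifier device with the larger net drive."""
    i_tree = i_logic if logic_on else 0.0
    if mode in MODE_1:
        net1 = i1 + i_tree
    elif mode in MODE_2:
        net1 = i1 - i_tree
    else:
        raise ValueError(f"unknown mode: {mode}")
    if net1 == i2:
        raise ValueError("no current imbalance; the outcome is undefined")
    return 1 if net1 > i2 else 2


if __name__ == "__main__":
    # Hypothetical sizing: device 1 weaker in mode-1, stronger in mode-2.
    cases = {"PreH-Sing-1": (1.0, 2.0), "PreH-Sing-2": (2.0, 1.0),
             "PreL-Sing-1": (1.0, 2.0), "PreL-Sing-2": (2.0, 1.0)}
    for mode, (i1, i2) in cases.items():
        for logic_on in (False, True):
            print(mode, logic_on, winning_device(mode, i1, i2, 2.0, logic_on))
```

In each mode the winner flips when the NMOS logic turns on, which is
the condition the unbalanced sizing has to guarantee for every input
combination of the logic tree.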
The true-single-phase pipelined system is shown in Fig.
1(a). The logic circuits in the N-section can be implemented
by the PreH-Diff-1, PreH-Diff-2, PreH-Sing-1, and
PreH-Sing-2. The logic circuits in the P-section can be
implemented by the PreL-Diff-1, PreL-Diff-2, PreL-Sing-1, and
PreL-Sing-2. The modified C2MOS latches used in the
conventional TSPC systems may result in output dips and lead
to more power dissipation at the beginning of the evaluation
phase. Moreover, sharp clock slopes are required to prevent
race problems. The self-timed DCVS latches (SDL) [10]
shown in Figs. 4(a) and 4(b) have been proved to be
clock-slope insensitive, and the output data is latched
without the control of the clock. The clock buffer can be
easily designed using the SDL. Moreover, the speed
performance can be improved. When the N-section is in the
precharge phase, the inputs of the N-latch shown in Fig. 4(a)
are precharged to VDD. The PMOS devices of the N-latch are
turned off and the results of the previous evaluation are
held at the outputs. When the P-section is in the precharge
phase, the inputs of the P-latch are precharged to GND. The
NMOS devices of the P-latch shown in Fig. 4(b) are turned off
and the results of the previous evaluation are held at the
outputs.
Table I The design of the unbalanced sense amplifier in the
single-input TADL.
Fig. 4 The self-timed DCVS latch (SDL): (a) N-latch and
(b) P-latch.
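The hold behavior of the SDL described above can be mimicked by a
short behavioral model in Python. This is a hedged sketch, not the
circuit of [10]: the class name SelfTimedLatch is hypothetical, the
outputs are assumed to follow the inputs directly when the inputs are
complementary, and the initial stored value is arbitrary.

```python
class SelfTimedLatch:
    """Behavioral model of a clockless latch fed by a differential gate."""

    def __init__(self, kind: str):
        if kind not in ("N", "P"):
            raise ValueError("kind must be 'N' or 'P'")
        # N-latch inputs idle at VDD (1); P-latch inputs idle at GND (0).
        self.idle_level = 1 if kind == "N" else 0
        self.q, self.q_bar = 0, 1  # arbitrary initial state (assumption)

    def apply(self, d: int, d_bar: int) -> tuple:
        """Apply one pair of input levels and return (q, q_bar)."""
        if d == d_bar:
            if d != self.idle_level:
                raise ValueError("both inputs at the active level is not modeled")
            return self.q, self.q_bar  # precharged inputs: hold the old data
        self.q, self.q_bar = d, d_bar  # complementary inputs: take new data
        return self.q, self.q_bar


if __name__ == "__main__":
    n_latch = SelfTimedLatch("N")
    print(n_latch.apply(1, 0))  # evaluation result arrives
    print(n_latch.apply(1, 1))  # preceding gate precharges to VDD: hold
```

No clock argument appears anywhere in the model; the latch decides
between holding and updating from its data inputs alone, which is the
property the text attributes to the SDL.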
3. SIMULATION RESULTS AND COMPARISONS
The total number of devices required to implement complex
functions can be reduced by using the differential logic
circuits because of the shared logic tree [5]. However, more
devices may be required to implement some simple logic
functions, for example, NAND, NOR, AOI, and OAI, by using the
differential logic. In these cases, the logic functions can be
implemented by using the single-input TADL. Only a
single-input logic is required and a pair of differential
outputs can be acquired simultaneously. Thus the design
flexibility is increased.
The 3-, 5-, 7-, and 9-input differential XOR/XNOR gates are
designed with the differential-input TADL and the
conventional dynamic DCVS logic for comparison. The 3-, 5-,
7-, and 9-input NAND gates are designed with the single-input
TADL and the conventional single-ended dynamic logic.
Since the precharge-low TADL circuits are implemented in
the P-section of the pipelined systems, the conventional
p-type dynamic circuits are designed for comparison. The
conventional dynamic circuits are also designed with larger
dimensions for less delay time but more power dissipation.
All of the circuits are designed with an SDL [10] as the
output load. The 0.6-μm single-poly triple-metal CMOS model
parameters are used for HSPICE simulation. The power
supply is 5 V and the clock frequency is 50 MHz. The
simulation results of the differential-input TADL and
single-input TADL are listed in Tables II and III,
respectively. It is seen that the TADL circuits have a much
higher operating speed and a lower power-delay product. The
advantages become more evident as the logic complexity
increases. (The PreH-Diff-2 and PreH-Sing-2 have longer
delay times and the data are not shown.)
Table II The comparisons of differential-input TADL
circuits with conventional dynamic DCVS circuits.
Table III The comparisons of single-input TADL circuits
with conventional dynamic circuits.
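The power-delay product used as the figure of merit in the
comparisons is simply average power multiplied by delay. A minimal
Python sketch follows; the function names and the numbers in the
usage lines are hypothetical and are not taken from Tables II or III.

```python
def power_delay_product(power_mw: float, delay_ns: float) -> float:
    """Return the power-delay product in pJ (1 mW x 1 ns = 1 pJ)."""
    if power_mw < 0 or delay_ns < 0:
        raise ValueError("power and delay must be non-negative")
    return power_mw * delay_ns


def pdp_ratio(candidate: tuple, reference: tuple) -> float:
    """Ratio of two power-delay products; below 1 favors the candidate.

    Each argument is a (power_mw, delay_ns) pair.
    """
    return power_delay_product(*candidate) / power_delay_product(*reference)


if __name__ == "__main__":
    # Hypothetical gates: a faster gate that draws more power, and a
    # slower reference gate that draws less.
    fast_gate = (2.0, 0.5)
    slow_gate = (1.0, 2.0)
    print(power_delay_product(*fast_gate), "pJ")
    print(pdp_ratio(fast_gate, slow_gate))
```

A gate can therefore dissipate more power than a reference design
and still have the lower power-delay product, provided its delay is
reduced by a larger factor.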
4. CONCLUSIONS 
In this paper, new CMOS differential logic circuits, called
true-single-phase all-N-logic differential logic (TADL), are
proposed and analyzed. The logic circuits are implemented
with only NMOS devices in the logic tree and controlled
by a true-single-phase clock to form pipelined systems. A
complex function can be implemented in a single TADL gate.
Moreover, the design flexibility is increased by generating
the differential outputs simultaneously. Simulation results
show that the TADL has the advantages of high-speed
operation and a low power-delay product. The TADL circuits
can be implemented in very high-speed complex VLSI.
REFERENCES
[1] R.H. Krambeck, C.M. Lee, and H.S. Law, “High-speed
compact circuits with CMOS,” IEEE J. Solid-State
Circuits, vol. SC-17, pp. 614-619, June 1982.
[2] J. Yuan and C. Svensson, “High-speed CMOS circuit
technique,” IEEE J. Solid-State Circuits, vol. SC-24,
pp. 62-70, Feb. 1989.
[3] D.W. Dobberpuhl et al., “A 200-MHz 64-b dual-issue
CMOS microprocessor,” IEEE J. Solid-State Circuits, vol.
SC-27, pp. 1555-1565, Nov. 1992.
[4] R.X. Gu and M.I. Elmasry, “An all-N-logic high-speed
single-phase dynamic CMOS logic,” in Proc. IEEE ISCAS,
pp. 7-10, 1994.
[5] L.G. Heller, W.R. Griffin, J.W. Davis, and N.G.
Thomas, “Cascode voltage switch logic: A differential
CMOS logic family,” in IEEE ISSCC Dig. Tech.
Papers, pp. 16-17, 1984.
[6] S.L. Lu, “Implementation of iterative network with
CMOS differential logic,” IEEE J. Solid-State Circuits,
vol. SC-23, pp. 1013-1017.
[7] C.Y. Wu and K.H. Cheng, “Latched CMOS differential
logic (LCDL) for complex high-speed VLSI,” IEEE J.
Solid-State Circuits, vol. SC-26, pp. 1325-1328, Sep.
1991.
[8] H.Y. Huang, K.S. Cheng, J.S. Wang, Y.H. Chu, T.S.
Wu, and C.Y. Wu, “Low-voltage low-power CMOS
true-single-phase clocking scheme with locally
asynchronous logic circuits,” in Proc. IEEE ISCAS, pp.
1572-1575, 1995.
[9] J.A. Pretorius, A.S. Shubat, and C.A. Salama,
“Latched domino CMOS logic,” IEEE J. Solid-State
Circuits, vol. SC-21, pp. 514-522, Aug. 1986.
[10] H.Y. Huang and C.Y. Wu, “Clock-slope-insensitive
self-timed DCVS latch (SDL) for true-single-phase
pipelined systems,” to appear in IEEE Trans. Circuits
& Systems, Part II, Jan. 1996.
