A suggestion for low-power current-sensing complementary pass-transistor logic interconnection by 鄭國興
1997 IEEE International Symposium on Circuits and Systems, June 9-12, 1997, Hong Kong
A Suggestion for Low-Power Current-Sensing Complementary 
Pass-Transistor Logic Interconnection 
Kuo-Hsing Cheng*, Liow Yu Yee (Yii-Yih Liaw) and Jiaii-Hung Chen#
Dept. of Electrical Engineering, Tamkang University, Taipei Hsien, Taiwan, R.O.C. China 
TEL: 886-2-6215656 Ext. 731 FAX: 886-2-62215658 
EMAIL: cheng@ee.tku.edu.tw*
Abstract 
In this paper, a new circuit interconnection scheme for the low-power
current-sensing complementary pass-transistor logic (LCSCPTL) is
proposed and analyzed. The proposed circuit scheme uses both full-swing
and non-full-swing output signals to control the NMOS pass-transistor
logic tree network. Owing to the non-full-swing outputs and the
current-sensing scheme, the new logic circuit scheme reduces power
dissipation and improves operation speed. The non-full-swing LCSCPTL is
applied to the design of a parallel multiplier. 4-2 compressors and the
conditional carry selection scheme are used in this design to achieve a
regular layout and improve the operation speed. Moreover, the 1.2 V
8*8-bit parallel multiplier can be fabricated without changing the
conventional 5 V CMOS process. The operation time of the parallel
multiplier is 32 ns at a 1.2 V supply voltage.
I. Introduction 
For low-voltage digital systems, the various CPL-like pass-transistor
logic styles are promising for low-power and high-speed applications.
Many logic families and sensing schemes have been proposed to achieve
high operation speed and reduce power dissipation. The latched
complementary pass-transistor logic (CPL) [1], shown in Fig. 1, has been
proven to have potential in low-power digital circuit design [2].
However, the voltage delivered by the NMOS PTL logic tree of the CPL is
only Vdd-Vth and its voltage slope is slow, so the CPL cannot maintain
its speed under low-voltage operation. The CSCPTL circuit, shown in
Fig. 2, was proposed [3] to improve the speed performance under
low-voltage operation, but the CSCPTL has higher dynamic power
dissipation than the latched CPL. The LCSCPTL circuit [4] has been
proposed to reduce the power dissipation while keeping the speed
performance. In this paper, an improved circuit scheme of the LCSCPTL,
called the non-full-swing LCSCPTL, is proposed and analyzed. 1.2 V
8*8-bit multipliers are also given as a design comparison of the various
logic circuits.
december@ee.tku.edu.tw
Fig. 1 The circuit diagram of the latched CPL.
Fig. 2 The circuit diagram of the CSCPTL. (Labelled inputs of the PTL
logic tree: drain input, gate input, complementary drain input,
complementary gate input.)
II. The LCSCPTL and the Non-Full-Swing LCSCPTL
In the following, the architecture and operating principles of the
LCSCPTL circuit and the non-full-swing LCSCPTL circuit are described.
A. The Architecture of the LCSCPTL
Fig. 3 shows the current-sensing buffer of the LCSCPTL. The nodes IN and
INb are the output nodes of the pass-transistor logic tree. The internal
nodes S1 and S2 are the cross-coupled storage nodes.
0-7803-3583-X/97 $10.00 ©1997 IEEE PM8 Authorized licensed use limited to: Tamkang University. Downloaded on March 24,2010 at 03:18:03 EDT from IEEE Xplore.  Restrictions apply. 
The PMOS transistors MP1-MP4 are the transconductance amplifiers, which
convert voltage to current. MN2 and MN3 are the current-sensing devices.
MN1 and MN4 are used to mirror and amplify the sensed current from the
PMOS transconductance amplifiers. If INb=1 and IN=0, the PMOS
transistors MP1 and MP3 are turned off and MP2 and MP4 are turned on.
Node S1 then starts to be charged by MP2 while S2 is discharged. MN5 and
MN6 are used to cut off the dc current path after evaluation. The NMOS
transistors MN7 and MN8 are used as gate-to-source/drain capacitances,
shunted with MN2 and MN3, to reduce the negative-feedback effect of MN2
and MN3 during logic switching. Thus the switching speed of the nodes S1
and S2 is improved, and as a result the dynamic power dissipation of the
LCSCPTL circuit is decreased.
B. The Non-Full-Swing LCSCPTL
The LCSCPTL circuit uses the non-full-swing internal nodes S1 and S2 to
accelerate logic switching. The MOS transistors MP1, MN1, MP4 and MN4
are used as the output stage of the LCSCPTL to provide the full-swing
output voltage. A new circuit interconnection scheme, called the
non-full-swing LCSCPTL, is shown in Fig. 4, where the MOS transistors
MP1, MN1, MP4 and MN4 are removed. The nodes S1 and S2 are connected to
OUT and OUTb, respectively, as the output nodes. As a result, the output
voltages of the new logic circuit swing from Vtn to VDD rather than over
the full supply range. Removing the output-stage MOS transistors also
removes the output-stage delay. Moreover, as shown in Fig. 4, the PTL
logic tree has two different types of input signals: the complementary
drain inputs and the complementary gate inputs. The non-full-swing
output signals are used as the complementary gate inputs of the next
logic gate, where they control the gates of the PTL logic tree. Owing to
the reduced output voltage swing, this accelerates the logic gate
operation and reduces the power dissipation.
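To make the effect of the reduced swing concrete, the following minimal sketch estimates the energy drawn from the supply when a node is charged through a swing from Vtn to VDD instead of from 0 to VDD. It assumes the first-order model E = C * VDD * dV for a capacitive node charged from the supply; the 50 fF load and the function name are hypothetical, and the static current of the sensing buffer is ignored. The voltages are the 1.2 V supply and the 0.75 V NMOS threshold used in the simulations of this paper.

```python
def charge_energy(c_load, vdd, v_low, v_high):
    """Energy drawn from the supply to charge c_load from v_low to v_high.

    First-order model (an assumption of this sketch): E = C * VDD * (v_high - v_low).
    """
    return c_load * vdd * (v_high - v_low)


VDD = 1.2         # supply voltage in volts (from the paper)
VTN = 0.75        # NMOS threshold voltage in volts (from the paper)
C_LOAD = 50e-15   # hypothetical node capacitance in farads

full_swing = charge_energy(C_LOAD, VDD, 0.0, VDD)
reduced_swing = charge_energy(C_LOAD, VDD, VTN, VDD)
ratio = reduced_swing / full_swing

print(f"full swing:    {full_swing * 1e15:.1f} fJ")
print(f"reduced swing: {reduced_swing * 1e15:.1f} fJ")
print(f"ratio:         {ratio:.3f}")
```

Under this model the ratio is simply (VDD - Vtn) / VDD and does not depend on the assumed capacitance.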
III. Comparison of the Interconnection in PTL Tree
The PTL logic tree has two different types of input signals, which makes
the gate delay comparison complex. In order to compare the gate delay
and power dissipation, the fixed gate connection and the fixed drain
connection of the pass-transistor logic tree are considered. For
example, in the SUM circuit shown in Fig. 5, if the gate signal Y, Yb, Z
or Zb comes from the critical delay signal (the longest delay signal),
the connection is called the fixed drain connection type. On the other
hand, if the drain signal X or Xb comes from the critical delay signal
to control the outputs, it is called the fixed gate connection type.
Based upon HSPICE simulation of a 0.8 um CMOS process, the gate delay
and power dissipation comparison results for the SUM circuit of the CPL,
LCSCPTL and non-full-swing LCSCPTL are shown in Fig. 6. Because the
non-full-swing LCSCPTL output can only be used as a complementary gate
input, it has no fixed gate connection type. It is seen that, in the
fixed gate connection type, the LCSCPTL is about 2.5 times faster than
the CPL (2.18 ns versus 5.46 ns). In the fixed drain connection type,
the non-full-swing LCSCPTL has the lowest dc power dissipation and the
shortest gate delay of the three.
Fig. 3 The LCSCPTL
Fig. 4 The non-full-swing LCSCPTL
Fig. 5 The pass-transistor logic tree of the SUM circuit
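The distinction between drain inputs and gate inputs can be illustrated with a small behavioural model. The sketch below assumes a two-level complementary pass-transistor XOR network for SUM = X xor Y xor Z, with X and Xb applied to the drains and Y, Yb, Z and Zb applied to the gates, as the text describes. The exact transistor arrangement of Fig. 5 is not reproduced here, and the function names are illustrative. Each NMOS pass transistor is modelled as an ideal switch that passes its drain value when its gate is 1.

```python
from itertools import product


def pass_pair(gate, gate_b, drain_if_gate, drain_if_gate_b):
    """Two NMOS pass transistors sharing one output node.

    Exactly one of gate / gate_b is 1, so exactly one drain value is passed.
    """
    assert gate != gate_b, "gate inputs must be complementary"
    return drain_if_gate if gate else drain_if_gate_b


def sum_tree(x, y, z):
    """Return (SUM, SUMb) using complementary signals derived from x, y, z."""
    xb, yb, zb = 1 - x, 1 - y, 1 - z
    # First level: Y/Yb on the gates select between the drain inputs X and Xb.
    t = pass_pair(y, yb, xb, x)     # t  = X xor Y
    tb = pass_pair(y, yb, x, xb)    # tb = not (X xor Y)
    # Second level: Z/Zb on the gates select between t and tb.
    s = pass_pair(z, zb, tb, t)     # s  = X xor Y xor Z
    sb = pass_pair(z, zb, t, tb)    # sb = not s
    return s, sb


for x, y, z in product((0, 1), repeat=3):
    s, sb = sum_tree(x, y, z)
    print(x, y, z, "->", s, sb)
```

In this model a late-arriving Y or Z enters through a gate (the fixed drain connection type), whereas a late-arriving X enters through a drain (the fixed gate connection type).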
Connection type   Logic family              Delay     Average power
fixed gate        CPL                       5.46 ns   16 uW
fixed gate        LCSCPTL                   2.18 ns   22 uW
fixed drain       CPL                       2.82 ns   20 uW
fixed drain       LCSCPTL                   3.84 ns   20 uW
fixed drain       non-full-swing LCSCPTL    1.51 ns   7 uW
Fig. 6 The SUM circuit delay in different logic families and its
average power dissipation
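The speed comparison can be checked directly from the Fig. 6 values. The short sketch below assumes that the delay/power pairs belong to the logic styles in the order in which they are listed in the figure, and computes each style's speed-up relative to the CPL of the same connection type. The dictionary layout and function name are illustrative.

```python
# (delay in ns, average power in uW), read from Fig. 6 in listed order
FIG6 = {
    ("fixed gate", "CPL"): (5.46, 16.0),
    ("fixed gate", "LCSCPTL"): (2.18, 22.0),
    ("fixed drain", "CPL"): (2.82, 20.0),
    ("fixed drain", "LCSCPTL"): (3.84, 20.0),
    ("fixed drain", "non-full-swing LCSCPTL"): (1.51, 7.0),
}


def speedup_vs_cpl(connection, style):
    """CPL delay divided by the delay of `style`, for one connection type."""
    cpl_delay = FIG6[(connection, "CPL")][0]
    style_delay = FIG6[(connection, style)][0]
    return cpl_delay / style_delay


for connection, style in FIG6:
    ratio = speedup_vs_cpl(connection, style)
    print(f"{connection:12s} {style:24s} speed-up vs CPL: {ratio:.2f}")
```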
IV. Multiplier Architecture 
The Baugh-Wooley algorithm [5] is used to realize the 2's-complement
parallel multiplier. The structure of this parallel multiplier, shown in
Fig. 7, is divided into two parts. The first part is the partial-product
generation and carry-save addition array. The second part is the carry
look-ahead adder (CLA) circuit. Since this multiplier performs 8*8-bit
multiplication, eight partial products are generated. In order to
improve the operation speed of the carry-save addition array, the eight
partial products are divided into two groups: the upper four partial
products and the lower four partial products. Each group of four partial
products is reduced to two rows, a sum and a carry, by one 4-2
compressor addition stage, so only two 4-2 compressor addition stages
are needed to reduce all eight partial products to two rows. Finally,
the two rows are added by the 12-bit CLA to form the final result. As
shown in Fig. 8, the 4-bit conditional carry selection (CCS) scheme
[6] [7] is used to implement the 12-bit CLA design. Fig. 9 shows the
structure of the 12-bit CLA. The final result is generated by the
conditional-sum selection (CSS) circuits. Fig. 10 shows the
pass-transistor logic trees used in the multiplier. They include the
AND/NAND gate, the OR/NOR gate, the sum part, the carry part, and the
multiplexers.
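The data flow of this architecture can be sketched at word level. The following Python model is only an illustration under several assumptions: it uses the commonly used modified form of the Baugh-Wooley scheme (inverted sign-bit cross terms plus two constant ones), models each 4-2 compressor stage as two cascaded carry-save adders, and replaces the 12-bit CLA with a 16-bit carry-select adder built from 4-bit blocks. None of the function names come from the paper, and the gate-level circuits are not reproduced.

```python
N = 8                          # operand width in bits
MASK = (1 << (2 * N)) - 1      # all arithmetic is kept modulo 2**16


def bit(value, index):
    return (value >> index) & 1


def baugh_wooley_rows(a, b):
    """Eight partial-product rows for N-bit two's-complement patterns a, b."""
    rows = []
    for j in range(N):
        row = 0
        for i in range(N):
            p = bit(a, i) & bit(b, j)
            # Cross terms containing exactly one sign bit are inverted.
            if (i == N - 1) != (j == N - 1):
                p ^= 1
            row |= p << (i + j)
        rows.append(row)
    # The correction constants 2**N and 2**(2N-1) occupy free bit
    # positions of the first and last rows.
    rows[0] |= 1 << N
    rows[N - 1] |= 1 << (2 * N - 1)
    return rows


def carry_save(x, y, z):
    """3:2 carry-save adder: sum + carry == x + y + z (mod 2**16)."""
    total = x ^ y ^ z
    carry = ((x & y) | (y & z) | (x & z)) << 1
    return total & MASK, carry & MASK


def compress_4_2(w, x, y, z):
    """4-2 compressor stage modelled as two cascaded carry-save adders."""
    s, c = carry_save(w, x, y)
    return carry_save(s, c, z)


def carry_select_add(x, y, width=2 * N, block=4):
    """Add in 4-bit blocks; each block prepares results for carry-in 0 and 1."""
    block_mask = (1 << block) - 1
    result, carry = 0, 0
    for low in range(0, width, block):
        xa = (x >> low) & block_mask
        ya = (y >> low) & block_mask
        candidates = (xa + ya, xa + ya + 1)   # carry-in = 0, carry-in = 1
        chosen = candidates[carry]            # selected by the incoming carry
        result |= (chosen & block_mask) << low
        carry = chosen >> block
    return result                             # final carry-out is dropped


def multiply(a, b):
    """Signed 8x8 multiplication through the modelled data path."""
    rows = baugh_wooley_rows(a & 0xFF, b & 0xFF)
    upper = compress_4_2(*rows[:4])           # first stage, upper group
    lower = compress_4_2(*rows[4:])           # first stage, lower group
    s, c = compress_4_2(*upper, *lower)       # second stage
    raw = carry_select_add(s, c)
    # Interpret the 16-bit pattern as a two's-complement value.
    return raw - (1 << (2 * N)) if raw >> (2 * N - 1) else raw


print(multiply(-128, 127), multiply(-7, -9), multiply(100, 3))
```

The two first-stage compressors operate on independent groups and can therefore work in parallel, which is why only two compressor stages lie on the path from the partial products to the final adder.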
V. Simulation Results 
The process times of the 8*8-bit parallel multiplier using the CPL,
LCSCPTL and non-full-swing LCSCPTL are 66.5 ns, 55.6 ns and 31.9 ns,
respectively, as shown in Fig. 11. They are based upon HSPICE simulation
results at a 1.2 V supply, where the threshold voltages of the NMOS and
PMOS transistors are 0.75 V and -0.9 V, respectively. Fig. 12 shows the
power dissipation comparisons. The power dissipation results are
simulated at a 20 MHz operating frequency. From the simulated results,
it is seen that the operation speed of the non-full-swing LCSCPTL is
about 2.1 times higher than that of the latched CPL. Moreover, the
non-full-swing LCSCPTL has lower power dissipation than the latched CPL.
Finally, the characteristics of this multiplier are summarized in
Table I.
Fig. 7 Structure of the parallel multiplier (block labels: Multiplier &
Multiplicand; 8 partial products; Sum and Carry; 4-bit CLA)
Fig. 8 Conditional Carry Selection (CCS) circuit
Fig. 9 Block Diagram of the 12-b CLA
Fig. 10 The Logic Tree of the PTL
Fig. 11 The process time of the 8x8 parallel multiplier (output
waveforms, 0 to 1.2 V over 200 to 300 ns: non-full-swing LCSCPTL
31.9 ns, LCSCPTL 55.6 ns, CPL 66.5 ns)
Fig. 12 The power dissipation of the 8x8 parallel multiplier (fixed
drain CPL, fixed drain LCSCPTL, fixed drain non-full-swing LCSCPTL)
Table I
Process Technology
  MOSFET gate length            0.8 um
  MOSFET gate oxide             19.0 nm
  NMOSFET threshold voltage     0.75 V
  PMOSFET threshold voltage     -0.9 V
Experimental Result
  Logic style               Delay time   Power dissipation   Normalized power-delay product
  CPL                       66.5 ns      48.1 uW             1
  LCSCPTL                   55.6 ns      39.6 uW             0.688
  non-full-swing LCSCPTL    31.9 ns      35.1 uW             0.349
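The normalized power-delay product column can be recomputed from the delay and power columns of Table I. The sketch below (names are illustrative) divides each delay-power product by that of the CPL. With the rounded table entries it gives about 0.688 for the LCSCPTL and about 0.350 for the non-full-swing LCSCPTL, close to the tabulated 0.349.

```python
# (delay in ns, power in uW) from Table I
TABLE_I = {
    "CPL": (66.5, 48.1),
    "LCSCPTL": (55.6, 39.6),
    "non-full-swing LCSCPTL": (31.9, 35.1),
}


def normalized_pdp(table, reference="CPL"):
    """Power-delay product of each entry divided by that of `reference`."""
    ref_delay, ref_power = table[reference]
    ref_pdp = ref_delay * ref_power
    return {
        name: (delay * power) / ref_pdp
        for name, (delay, power) in table.items()
    }


for name, value in normalized_pdp(TABLE_I).items():
    print(f"{name:24s} {value:.3f}")
```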
VI. Conclusion 
This paper describes a 1.2 V 8*8-bit parallel multiplier using the
non-full-swing low-power current-sensing complementary pass-transistor
logic (non-full-swing LCSCPTL) circuit. 4-2 compressors and the
conditional carry selection scheme are used in this design to achieve a
regular layout structure and fast operation speed. The non-full-swing
LCSCPTL circuit is shown to have advantages in both speed and power
dissipation. In summary, the design in this work is quite promising for
low-voltage, low-power, high-speed VLSI applications.
References 
[1] Kazuo Yano, Toshiaki Yamanaka, Takashi Nishida, Masayoshi Saito,
Katsuhiro Shimohigashi, and Akihiro Shimizu, "A 3.8ns CMOS 16x16-b
Multiplier Using Complementary Pass-Transistor Logic," IEEE J.
Solid-State Circuits, vol. 25, pp. 388-395, Apr. 1990.
[2] Abdellatif Bellaouar and Mohamed I. Elmasry, "Low-Power Digital VLSI
Design: Circuits and Systems," Kluwer Academic Publishers, Norwell, MA,
1995.
[3] Chung-Yu Wu, Jr-Houng Lu, and Kuo-Hsing Cheng, "A New CMOS
Current-Sensing Complementary Pass-Transistor Logic (CSCPTL) for
High-Speed Low-Voltage Application," Proc. of 1995 IEEE ISCAS, Seattle,
U.S.A., May 1995, pp. 25-28.
[4] Kuo-Hsing Cheng and Yii-Yih Liaw, "A Low-Power Current-Sensing
Complementary Pass-Transistor Logic (LCSCPTL) for Low-Voltage High-Speed
Applications," Proc. of 1996 Symposium on VLSI Circuits, Tech. Dig.,
Honolulu, Hawaii, June 1996, pp. 16-17.
[5] C. R. Baugh and B. A. Wooley, "A two's complement parallel array
multiplication algorithm," IEEE Trans. Comput., vol. C-22, pp.
1045-1047, Dec. 1973.
[6] M. Suzuki, N. Ohkubo, T. Yamanaka, A. Shimizu, and K. Sasaki, "A
1.5 ns 32b CMOS ALU in Double Pass-Transistor Logic," IEEE ISSCC Digest
of Technical Papers, pp. 90-91, 1993.
[7] M. Suzuki, N. Ohkubo, T. Shinbo, T. Yamanaka, A. Shimizu, K. Sasaki,
and Y. Nakagome, "A 1.5 ns 32b CMOS ALU in Double Pass-Transistor
Logic," IEEE J. Solid-State Circuits, vol. 28, no. 11, pp. 1145-1150,
Nov. 1993.
[8] Kuo-Hsing Cheng and Liow Yu Yee, "The Design of Low Power
Current-Sensing Complementary Pass-Transistor Logic and Its Application
for Low-Voltage High-Speed Multiplier," Proc. of 1996 IEEE ICECS.
