High Performance Logic for Arithmetic Circuits by Das, Neeharika
 
 
    HIGH PERFORMANCE LOGIC FOR 
ARITHMETIC CIRCUITS 
 
 
 
Submitted by: 
 
NEEHARIKA DAS 
 
         
 
Department of Electronics and Communication Engineering, 
National Institute of Technology, Rourkela 
Orissa, 769008 
 
 
 
 
High Performance Logic for Arithmetic Circuits 
 
A Thesis submitted in partial fulfilment of the requirements for the 
degree of 
 
Bachelor of Technology 
in 
Electronics and Instrumentation Engineering 
by 
Neeharika Das 
Roll No. 108EI015 
Under the supervision of 
Dr. Kamalakanta Mahapatra 
Professor 
         
Department of Electronics and Communication Engineering, 
National Institute of Technology, Rourkela 
Session 2011-2012 
 
 
 
 
 
National Institute of Technology, Rourkela 
 
CERTIFICATE 
 
This is to certify that the Thesis entitled, ‘High performance logic for arithmetic circuits’ 
submitted by Neeharika Das in partial fulfilment of the requirements for the award of 
Bachelor of Technology Degree in Electronics and Instrumentation Engineering at the 
National Institute of Technology, Rourkela is an authentic work carried out by her under 
my supervision. To the best of my knowledge and belief, the matter embodied in the Thesis 
has not been submitted by her to any other University/Institute for the award of any 
Degree/Diploma. 
 
 
     Date                      Prof. Kamalakanta Mahapatra 
         Dept. of Electronics and Communication Engg., 
National Institute of Technology, Rourkela 
 
 
 
 
 
ACKNOWLEDGEMENTS 
 
 
This project in itself is an acknowledgement to the inspiration, drive and the technical 
assistance contributed to it by many people. It would have never been possible without the 
help and guidance that it received from them.  
Firstly, I would like to express my sincere thanks and deepest regards to my guide Dr. K K 
Mahapatra, Professor, Department of Electronics and Communication Engineering, 
NIT Rourkela, who has been the driving force behind this work. I thank him for giving me 
the opportunity to work under him by putting a trust in my credentials and capabilities, and 
helping me in exploring my potential to the fullest.  
I am grateful to Prof. Sukadev Meher, Head of the Department of Electronics and 
Communication Engineering, for permitting me to make use of the facilities available in the 
department to carry out the project successfully.  
I am thankful to Mr. Sauvagya Ranjan Sahoo, second year M. Tech student, for discussing 
the project with me throughout its duration and helping me with his expertise in the 
field. I would also like to thank Mr. Jaganath Mohanty and Mr. Ayaskant Swain for their 
generous help in familiarising me with the software and for their continuous encouragement 
towards the completion of this project.  
Finally, I would like to thank all those who have been associated with this project and 
helped me during it.  
      
 Neeharika Das 
 
 
 
ABSTRACT 
The objective of this project is to design high performance arithmetic circuits, which are faster 
and have lower power consumption, using a new dynamic CMOS logic family, and to analyze 
its performance for sequential circuits and the effects upon cascading. This new dynamic logic 
family is known as Feedthrough logic (FTL). It has two basic structures: high speed (HS0) and low 
power (LP0). It allows evaluation in a computational block to commence before its inputs are 
valid, and quickly performs a final evaluation as soon as the inputs are valid. 
This dynamic logic family is best suited to arithmetic circuits, whose critical path is made 
of a long chain of cascaded inverting gates, because its major advantage, higher speed, is 
observed upon cascading. We compare 4-bit and 16-bit ripple carry adders designed in domino 
logic with the same adders designed in the two basic FTL structures. 
Experimental results have shown that the low power structure provides a smaller power 
delay product than domino logic.  
                              
Certain modifications in the logic style are proposed to optimize the performance 
when it is applied to single ended or double ended flip flops. The effects upon cascading are 
analyzed by using a 4-bit register. As delay is not propagated in a register circuit or any other 
synchronous sequential circuit (the circuit being edge triggered), the major advantage of this 
logic, which appears upon cascading, cannot be observed for sequential circuits. So 
even though the circuit can be optimised by feedthrough logic, this logic is not preferred for 
sequential circuits.  
                            
Finally, we have carried out the tape out of a 16-bit adder in LP0 using the 180 nm UMC CMOS 
process flow. 
 
 
 
 
Contents  
List of Figures  
List of Tables 
CHAPTER 1: INTRODUCTION  
1.1 Motivation                                                                                                                          01  
1.2 Objective                                                                                                                            02  
1.3 Organization of the Thesis                                                                                                 02  
CHAPTER 2: KNOWN LOGIC FAMILIES RELEVANT TO PRESENT  
 SCENARIO – AN OVERVIEW  
2.1 Complementary CMOS logic                                                                                            04  
2.2 Pseudo NMOS logic                                                                                                          05  
2.3 Domino logic                                                                                                                     05 
CHAPTER 3: STUDY OF FEEDTHROUGH LOGIC 
3.1 Principle of operation                                                                                                         07  
3.2 Low power structure of FTL                                                                                              09  
CHAPTER 4: COMPARISON OF HS0 AND LP0 WITH KNOWN LOGIC  
STYLES FOR SMALL CIRCUITS  
4.1 2-stage inverter                                                                                                                  10  
4.1.1 Buffer in static CMOS logic                                                                                           10 
4.1.2 Buffer in pseudo NMOS logic                                                                                        11  
 4.1.3 Buffer in Domino logic                                                                                                  12 
 4.1.4 Buffer in HS0 logic                                                                                                        12 
 4.1.5 Buffer in LP0 logic                                                                                                        13                                                                           
4.1.6 Comparison of logic styles for buffer                                                                             14 
 
 
4.2 NAND circuit                                                                                                                    15  
4.2.1 NAND in static CMOS logic                                                                                          15 
4.2.2 NAND in pseudo NMOS logic                                                                                       15  
 4.2.3 NAND in Domino logic                                                                                                 16 
 4.2.4 NAND in HS0 logic                                                                                                       17 
 4.2.5 NAND in LP0 logic                                                                                                        17                                                                           
4.2.6 Comparison of logic styles for NAND                                                                            18 
CHAPTER 5: COMPARISON OF HS0 AND LP0 WITH DOMINO LOGIC  
 FOR ADDER CIRCUITS  
5.1 4-bit ripple carry adder                                                                                                     20  
5.1.1 Domino style 4-bit adder                                                                                               20 
5.1.2 HS0 style 4-bit adder                                                                                                     21    
5.1.3 LP0 style 4-bit adder                                                                                                     22  
5.1.4 Comparison of logic styles for 4-bit adder                                                                    24     
5.2 16-bit ripple carry adder                                                                                                   24  
5.2.1 Domino style 16-bit adder                                                                                             24 
5.2.2 HS0 style 16-bit adder                                                                                                   25    
5.2.3 LP0 style 16-bit adder                                                                                                   26  
5.2.4 Comparison of logic styles for 16-bit adder                                                                  26 
CHAPTER 6: ANALYSIS OF PERFORMANCE OF FTL ON SEQUENTIAL 
CIRCUITS 
6.1   Single ended flip flop                                                                                                     28  
6.1.1 Domino logic flip flop                                                                                                   28 
6.1.2 FTL flip flop                                                                                                                  29    
6.1.3 Modified FTL flip flop                                                                                                    31  
 
 
6.1.4 Comparison of Domino with modified FTL for flip flop                                              32     
6.2   Double ended pulse triggered flip flop                                                                            32  
6.2.1 Domino logic flip flop                                                                                                    32 
6.2.2 FTL flip flop                                                                                                                   33    
6.2.3 Comparison of Domino with FTL for flip flop                                                              34  
CHAPTER 7: EFFECTS OF CASCADING SEQUENTIAL CIRCUITS 
7.1   4-bit domino logic register                                                                                              35  
7.2  4-bit modified FTL register                                                                                              36 
CHAPTER 8: DIGITAL TAPE OUT OF 16-BIT LP0 RIPPLE CARRY 
ADDER 
8.1 Schematic of 16- bit RCA                                                                                               39 
8.2 Layout                                                                                                                             39 
8.2.1 Layout of CMOS inverter                                                                                            40 
8.2.2 Layout of LP0 inverter                                                                                                 40 
8.2.3 Layout of LP0 1-bit adder                                                                                            42 
8.2.4 Layout of LP0 2-bit adder block                                                                                  43 
8.2.5 Layout of 16-bit LP0 adder                                                                                          44 
8.3 Post Layout simulation                                                                                                    45 
8.3.1 Simulation from schematic                                                                                           45 
8.3.2 Post Layout simulation                                                                                                 45 
CHAPTER 9: CONCLUSIONS                                                                                         47 
REFERENCES 
 
 
 
 
 
 
List of Figures  
Figure No.                                      Title                                                                           Page No.  
2.1                                   Static CMOS inverter                                                                          04  
2.2                                Pseudo NMOS inverter                                                              05  
2.3                                Conventional domino structure                                                  05  
3.1                                HS0 structure                                                                             08  
3.2                                Plot of different output stages of inverter                                  08  
3.3                                LP0 structure                                                                              09  
4.1                                Circuit diagram of 2-stage CMOS inverter                                10  
4.2                                Simulation waveforms of 2-stage CMOS inverter                     11  
4.3                                Circuit diagram of 2-stage pseudo NMOS inverter                    11  
4.4                                Simulation waveforms of 2-stage pseudo NMOS inverter         11  
4.5                                Circuit diagram of 2-stage domino inverter                                12  
4.6                                Simulation waveforms of 2-stage domino inverter                     12  
4.7                                Circuit diagram of 2-stage HS0 inverter                                     13  
4.8                                Simulation waveforms of 2-stage HS0 inverter                          13  
4.9                                Circuit diagram of 2-stage LP0 inverter                                      13  
4.10                              Simulation waveforms of 2-stage LP0 inverter                          14  
4.11                             Circuit diagram of CMOS NAND                                               15  
4.12                              Simulation waveforms of CMOS NAND                                   15  
4.13                              Circuit diagram of pseudo NMOS NAND                                  16  
4.14                             Simulation waveforms of pseudo NMOS NAND                       16 
4.15                             Circuit diagram of dynamic NAND                                            16 
4.16                             Simulation waveforms of dynamic NAND                                 17 
 
 
4.17                             Circuit diagram of NAND HS0                                                 17  
4.18                             Simulation waveforms of NAND HS0                                      17 
4.19                             Circuit diagram of NAND LP0                                                 18  
4.20                             Simulation waveforms of NAND LP0                                      18 
5.1                               1-bit RCA in domino logic                                                        20  
5.2                               4-bit RCA structure in domino                                                  21  
5.3                               Simulation waveforms of 4-bit RCA in domino                        21  
5.4                               1-bit RCA in HS0 logic                                                             21  
5.5                               4-bit RCA structure in HS0                                                       22  
5.6                               Simulation waveforms of 4-bit RCA in HS0                            22  
5.7                               1-bit RCA in LP0 logic                                                             23  
5.8                               4-bit RCA structure in LP0                                                       23  
5.9                               Simulation waveforms of 4-bit RCA in LP0                            23  
5.10                               16-bit RCA structure in domino                                             24  
5.11                              Simulation waveforms of 16-bit RCA in domino                    25  
5.12                               16-bit RCA structure in HS0                                                   25  
5.13                              Simulation waveforms of 16-bit RCA in HS0                         25 
5.14                               16-bit RCA structure in LP0                                                   26 
5.15                              Simulation waveforms of 16-bit RCA in LP0                         26 
6.1                               Domino logic master slave flip flop                                          28  
6.2                               Simulation waveforms of Domino flip flop                              29  
6.3                               FTL master slave flip flop                                                         29  
6.4                               Simulation waveforms of FTL flip flop                                    30 
6.5                               FTL latch circuit                                                                        30  
6.6                               Modified FTL master slave flip flop                                         31  
 
 
6.7                               Simulation waveforms of Modified FTL flip flop                    31 
6.8                               Domino logic pulse triggered flip flop                                      32  
6.9                               Simulation waveforms of Domino flip flop                              33  
6.10                              FTL pulse triggered flip flop                                                    33 
6.11                             Simulation waveforms of FTL flip flop                                    34  
7.1                               Domino 4-bit register circuit                                                     35  
7.2                               Simulation waveforms of Domino 4-bit register                      36  
7.3                               Modified FTL 4-bit register                                                       37  
7.4                               Simulation waveforms of modified FTL register                      37 
8.1                               16-bit RCA schematic diagram                                                 39  
8.2                               Layout of CMOS inverter                                                          40  
8.3                               RCX of CMOS inverter                                                             41  
8.4                               Layout of LP0 inverter                                                               41 
8.5                               RCX of LP0 inverter                                                                  42  
8.6                               Layout of 1-bit LP0 adder                                                          42  
8.7                               RCX of 1-bit LP0 adder                                                             43  
8.8                               Layout of 2-bit LP0 adder block                                                43  
8.9                               Layout of 16-bit LP0 adder                                                        44  
8.10                             RCX of 16-bit LP0 adder                                                          44 
8.11                             Simulation waveforms of 16-bit RCA from schematics           45 
8.12                             Post layout simulation waveforms of 16-bit RCA                    45 
 
  
 
 
 
 
 
List of Tables  
 
Table                                                     Title                                                                Page No.  
4.1                                    Comparison results of all logic styles for a buffer                  14  
4.2                                    Comparison results of all logic styles for NAND                   19  
5.1                                    Comparison results of 4-bit RCA                                            24  
5.2                                    Comparison results of 16-bit RCA                                          26 
6.1                                    Comparison of domino and modified FTL flip flop                32 
6.2                                    Comparison of domino and FTL double ended flip flop         34 
8.1                                    Comparison of results from post layout simulation with         46 
                                          Simulation from schematic 
  
 
  
    
 
    
                                                                                           
                                                                                      
                                                                                   
 
 
                            
 
 
          
CHAPTER 1: INTRODUCTION 
1.1: MOTIVATION 
Digital electronic computation started with the introduction of vacuum tubes. In this era of 
vacuum tube based computers, machines like ENIAC and UNIVAC were developed. The ENIAC 
comprised 18,000 vacuum tubes and was 80 feet long, with several feet of height and 
width. This clearly illustrates the low integration density of vacuum tubes, which made the 
implementation of larger engines economically and practically infeasible.  
 
The invention of the transistor, followed by the introduction of the bipolar transistor, led to 
the first successful IC logic family, TTL (Transistor-Transistor Logic). TTL had the 
advantage of a higher integration density, and the first integrated circuit revolution 
was based on it. Ultimately, the large power consumption per gate put a limit on the number 
of devices that could be reliably integrated on a single chip. 
Next came the MOS digital integrated circuit approach. Initially, 
MOS ICs were implemented in PMOS only. As electrons have higher mobility than holes, 
NMOS was preferred later. The second age of the digital integrated circuit revolution began 
with Intel's introduction of the 4004 microprocessor, followed in 1974 by the 8080. These processors 
used NMOS-only logic, with higher speed relative to the PMOS logic. But later, NMOS-only 
logic started suffering from the same problem: power consumption.  
 
Finally, the balance tilted towards CMOS technology, where we still are today. Power 
consumption concerns are now becoming dominant in CMOS design as well. 
Unfortunately, this time there does not seem to be a new technology coming up any time 
soon. What we can do is make slight modifications to the logic style so as to improve 
speed and reduce power consumption.[9] 

In static CMOS, the addition of a single input increases the device count by 2 and 
thus increases the propagation delay. New logic styles were therefore developed to minimise 
the propagation delay and chip area, supplementing static CMOS logic in special 
applications. Dynamic logic, which operates under clock control, then came into the 
picture. It has higher speed as well as lower power but suffers from a cascading 
problem, which led to the Domino and NORA logic styles.  
1.2: OBJECTIVE 
The objective of this project has two parts. The first is to design high performance, low power 
arithmetic circuits using a new CMOS dynamic logic family called feedthrough logic (FTL). The 
second is to analyse the performance of this logic when applied to sequential circuits, and also 
the effects upon cascading. 
1.3: ORGANISATION OF THESIS 
The thesis is divided into nine chapters, including this one. This first chapter introduces the 
project idea and the motivation behind it.  
Chapter 2 gives an overview of the known logic styles.  
Chapter 3 explains the principle of operation of FTL. 
Chapter 4 compares the power and delay of the two basic structures of FTL with 
other known logic styles for small circuits such as a buffer (2-stage inverter) and a NAND gate.  
Chapter 5 deals with the design of 4-bit and 16-bit ripple carry adders in FTL and 
compares their power and delay values with those of Domino logic. 
Chapter 6 analyses the performance of FTL on a single ended master slave flip flop 
and a double ended pulse triggered flip flop. 
Chapter 7 shows the effects of cascading the sequential circuits.  
Chapter 8 discusses the details of the tape out of the 16-bit LP0 adder.  
Chapter 9 presents the conclusions. 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 2: KNOWN LOGIC FAMILIES RELEVANT TO 
PRESENT SCENARIO – AN OVERVIEW 
In this chapter, three logic styles (complementary CMOS, pseudo NMOS and Domino) 
are discussed and their operation is briefly described. 
 
2.1: COMPLEMENTARY CMOS LOGIC 
Figure 2.1 shows the circuit of a CMOS inverter. Each transistor acts like a switch with an infinite 
resistance in the off state and a finite resistance in the on state.[1]
               
                Figure 2.1 static CMOS inverter   
When Vin is high and equal to VDD, the NMOS transistor is on and the PMOS transistor is off, which 
pulls the output node to ground. When the input is low, the PMOS is on and the NMOS is off, which 
pulls the output node high.[8] In steady state there is never a direct path between VDD and Gnd, 
so the static power consumption is essentially zero, apart from leakage currents. Power is consumed 
mainly at the time of switching. Static CMOS logic has other important properties. 
The high and low output levels are VDD and GND, so the voltage swing is equal to the 
supply voltage, resulting in high noise margins. The logic levels do not depend on the 
relative device sizes, so such gates are also known as ratioless. 
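
The power drawn by a static CMOS gate while switching is commonly estimated with the first-order relation P = a * C * VDD^2 * f. The short Python sketch below simply evaluates that relation. The activity factor, load capacitance, supply voltage and clock frequency used here are hypothetical values chosen for illustration; they are not taken from the simulations in this thesis.

```python
def dynamic_power(activity, c_load, vdd, freq):
    """First-order switching power of a CMOS gate in watts: a * C * VDD^2 * f."""
    return activity * c_load * vdd ** 2 * freq


# Hypothetical values for illustration only (not from the thesis):
# activity factor 0.5, load 10 fF, supply 1.8 V, clock 100 MHz.
p = dynamic_power(activity=0.5, c_load=10e-15, vdd=1.8, freq=100e6)
print(f"{p * 1e6:.2f} uW")
```

Because the supply voltage enters as a square, halving VDD in this model cuts the switching power to a quarter, which is why the voltage swing matters so much for power.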
 
2.2: PSEUDO NMOS LOGIC 
Figure 2.2 shows the circuit of a pseudo NMOS inverter. 
 
Figure 2.2 Pseudo NMOS inverter 
This is a ratioed logic style consisting of an active PDN connected to a load device. This 
reduces the gate complexity substantially at the cost of static power consumption. Transistor 
sizing is critical to maintain sufficient noise margins. The most popular approach in this class 
is the pseudo-NMOS technique, where a PMOS transistor with its gate connected to gnd is used as the load. [1] 
Each input is connected to the gate of only one transistor. [3] This reduces the area but results 
in high power consumption due to the direct path between the supply voltage and gnd whenever the PDN conducts. 
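
The area saving can be made concrete by counting transistors. The sketch below is a minimal illustration for simple N-input gates such as NAND or NOR, assuming one NMOS and one PMOS per input in static CMOS, and one NMOS per input plus a single PMOS load in pseudo NMOS. The function names are hypothetical and are used only for this example.

```python
def static_cmos_transistors(n_inputs):
    # One NMOS in the pull-down network and one PMOS in the pull-up network per input.
    return 2 * n_inputs


def pseudo_nmos_transistors(n_inputs):
    # One NMOS per input plus a single always-on PMOS load.
    return n_inputs + 1


for n in (2, 4, 8):
    print(n, static_cmos_transistors(n), pseudo_nmos_transistors(n))
```

Under these assumptions the saving grows with fan-in: a 2-input gate needs 3 transistors instead of 4, while an 8-input gate needs 9 instead of 16.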
 
2.3: DOMINO LOGIC 
Figure 2.3 is the conventional domino circuit structure. 
 
           Figure 2.3 Conventional structure of Domino circuit 
 
Dynamic logic was introduced to reduce the area and gate complexity of static CMOS. It reduces 
the device count and increases the speed. As there is no direct path between the supply and 
ground, static power consumption is very low. However, dynamic logic has a cascading problem, 
which is removed in the Domino and NORA logic styles. 

Domino CMOS reduces device count and chip area and improves performance relative to 
static CMOS. The major drawback of this circuit is power dissipation due to switching activity and 
clock load. For the power dissipation problem, current structures trade power against speed 
(that is, against delay). 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 3: STUDY OF FEEDTHROUGH LOGIC 
 
FTL follows the domino concept for dynamic circuits, with the additional feature that gates 
commence evaluation before their inputs arrive. This results in faster evaluation in the 
computational blocks. In addition, the other problems associated with domino logic 
(implementation of non-inverting logic only, charge redistribution and the need for output 
inverters) are eliminated, reducing the die area and delay.  
 
 
 3.1: PRINCIPLE OF OPERATION 
           
FTL has two basic structures: the high speed structure (HS0) and the low power structure (LP0). [4] 
The structure of the HS0 family is shown in Fig 3.1. It consists of a PDN (pull down network, 
an NMOS block), an NMOS transistor (Tr) for pulling the output node down to zero, and a pull 
up PMOS load transistor (Tp). Transistors Tr and Tp are clock controlled. During the high 
phase of the clock (reset phase), Tr is on, which pulls the output node to gnd. When the clock goes 
low, Tr is turned off, and the output node evaluates to either high or low as per the input 
conditions. If the logic network evaluates to high, the output node is pulled up toward VDD; 
otherwise, it remains low.[2] 
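
The reset and evaluation phases can be summarised with an idealised logic-level model. The sketch below is only a behavioural illustration, not a circuit simulation. It assumes that the PMOS load Tp conducts only while the clock is low and that a conducting PDN holds the output at logic 0. The function names are hypothetical.

```python
def hs0_gate(clk, pdn_conducts):
    """Idealised logic-level model of one HS0 gate.

    clk: 1 = reset phase, 0 = evaluation phase.
    pdn_conducts: True when the NMOS pull-down network has a conducting path.
    """
    if clk == 1:
        return 0  # Tr is on: the output node is reset to ground
    # Evaluation: Tp pulls the output up unless the PDN holds it low.
    return 0 if pdn_conducts else 1


def hs0_inverter(clk, a):
    # A single NMOS in the PDN conducts when the input is high.
    return hs0_gate(clk, pdn_conducts=bool(a))


def hs0_nand(clk, a, b):
    # Two series NMOS transistors conduct only when both inputs are high.
    return hs0_gate(clk, pdn_conducts=bool(a and b))


for clk in (1, 0):
    for a in (0, 1):
        print(f"clk={clk} a={a} out={hs0_inverter(clk, a)}")
```

In this model the output is low throughout the reset phase regardless of the inputs, and the gate behaves as an ordinary inverting gate during evaluation.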
 
 
 
 
 
 
                                                 
Figure 3.1 HS0 structure
Figure 3.2 Plot of output voltages of different stages of inverter
 
Since in this logic family the output is low when the clock is high, there is no need for inverters 
to restore the output node’s polarity. When the clock makes a transition from 1 to 0, the outputs 
of the cascaded gates start rising towards the switching threshold voltage Vth. This feature 
distinguishes FTL from other dynamic logic styles. At Vth, any small change in the input 
signal causes an immediate change in the voltage at the output node. In all other 
logic styles, the inputs have to reach the threshold voltage before the output node starts its 
transition. [2]  

When the inputs arrive, the output voltage needs to make only a partial transition, 
from Vth to VOH or VOL. The higher speed of FTL is due to the resulting reduction in propagation 
delay for both low-to-high and high-to-low transitions. However, this family faces the challenge 
of keeping Vth stable in long cascaded circuit structures, since this intermediate level is the 
main reason behind the fast logic evaluation.  
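
A rough feel for why a partial transition is faster can be obtained from a first-order RC charging model. The sketch below is not the FTL circuit itself. It treats the output node as a capacitor charging toward VDD through a resistance and compares the time to reach a valid high level from 0 V with the time to reach it from Vth. The supply voltage, time constant, threshold and high level are all hypothetical values chosen for illustration.

```python
import math


def rc_rise_time(v_start, v_end, vdd, tau):
    """Time for an RC node charging toward vdd to move from v_start to v_end."""
    return tau * math.log((vdd - v_start) / (vdd - v_end))


VDD = 1.8          # hypothetical supply voltage (V)
TAU = 100e-12      # hypothetical RC time constant (s)
V_TH = VDD / 2     # assumed switching threshold
V_OH = 0.9 * VDD   # level treated here as a valid logic high

t_full = rc_rise_time(0.0, V_OH, VDD, TAU)      # full transition from ground
t_partial = rc_rise_time(V_TH, V_OH, VDD, TAU)  # partial transition from Vth

print(f"from 0 V: {t_full * 1e12:.0f} ps, from Vth: {t_partial * 1e12:.0f} ps")
```

In this simple model the transition that starts at Vth takes tau * ln(5) instead of tau * ln(10), roughly 30 percent less time. The actual gain in an FTL chain depends on the transistor sizing and on how stable Vth remains along the chain.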
 
 
 
 
 
3.2: LOW POWER STRUCTURE OF FTL (LP0) 
 
The HS0 and LP0 families are derived from two basic logic families: HS0 is derived from 
pseudo NMOS logic, whereas LP0 is derived from the static CMOS logic style. In the LP0 structure, 
shown in Figure 3.3, the lower part is the same as in the HS0 structure. [12]  
         
                   
                 Figure 3.3 LP0 structure 
So the principle of operation is the same, except that LP0 has a PUN complementary to the PDN, due 
to which there is no static power consumption. So although the speed is increased relative to 
domino style, its power consumption is very low as there is no direct path from VDD to 
Gnd.[2] 
 
 
 
 
CHAPTER 4: COMPARISON OF HS0 AND LP0 WITH 
KNOWN LOGIC STYLES FOR SMALL CIRCUITS 
 
In this chapter, I have compared the HS0 and LP0 structures with other known logic styles: 
static CMOS, pseudo NMOS and Domino. The parameters compared are 
power and delay. Only small circuits are considered here: a two stage inverter (buffer) 
and a NAND circuit. 
The simulations have been carried out using the 180 nm CMOS process flow from UMC. 
 
4.1: 2-STAGE INVERTER (BUFFER) 
4.1.1: BUFFER IN STATIC CMOS LOGIC 
The circuit diagram is as shown in the Figure 4.1 and its simulation waveforms in Figure 4.2. 
 
  Figure 4.1 circuit diagram for 2-stage static CMOS inverter 
 
 
 
  Figure 4.2 simulation waveform of static CMOS 2-stage inverter  
4.1.2: BUFFER IN PSEUDO NMOS LOGIC 
Figure 4.3 and 4.4 show the circuit and simulation waveform of 2-stage inverter in pseudo 
NMOS logic. 
 
  Figure 4.3 circuit diagram for 2-stage pseudo NMOS inverter 
 
  Figure 4.4 simulation waveform of pseudo NMOS 2-stage inverter  
 
 
4.1.3: BUFFER IN DOMINO LOGIC 
Figure 4.5 and 4.6 show the circuit and simulation waveform of 2-stage inverter in Domino 
logic. 
 
Figure 4.5 circuit diagram for 2-stage Domino inverter 
 
Figure 4.6 simulation waveform of Domino 2-stage inverter  
4.1.4: BUFFER IN HS0 LOGIC 
Figure 4.7 and 4.8 show the circuit and simulation waveform of 2-stage inverter HS0 logic. 
 
 
Figure 4.7 circuit diagram for 2-stage HS0 inverter 
 
Figure 4.8 simulation waveform of HS0 2-stage inverter  
4.1.5: BUFFER IN LP0 LOGIC 
Figure 4.9 and 4.10 show the circuit and simulation waveform of 2-stage inverter in LP0. 
 
Figure 4.9 circuit diagram for 2-stage LP0 inverter 
 
 
Figure 4.10 simulation waveform of LP0 2-stage inverter 
 
4.1.6: COMPARISON OF LOGIC STYLES FOR A 2-STAGE INVERTER 
Table 4.1 below shows the average power and propagation delay of all the logic styles
mentioned above.

                      Average Power    Delay (tp)
Complementary CMOS    0.32 uW          6.6e-10 s
Pseudo NMOS           100.1 uW         4.75e-10 s
Domino                0.99 uW          4.85e-10 s
HS0                   50 uW            4.35e-10 s
LP0                   0.13 uW          1.45e-9 s
 
Table 4.1 Comparison of power and delay of all logic styles for a 2-stage inverter circuit  
 
We can see that the HS0 and pseudo NMOS circuits are the fastest and also consume the most
power. The LP0 structure consumes the least power but is also the slowest of the lot. Apart
from that, the results are more or less comparable. We therefore take another small circuit,
the NAND circuit, for comparison.
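The ordering described above can be checked by sorting the entries of Table 4.1. The short sketch below does this; the variable names are illustrative, and the delay column is assumed to be in seconds since the table does not state a unit.

```python
# Values copied from Table 4.1 as (average power in W, delay in s).
# The delay unit is an assumption: the table gives no unit for this column.
table_4_1 = {
    "Complementary CMOS": (0.32e-6, 6.6e-10),
    "Pseudo NMOS": (100.1e-6, 4.75e-10),
    "Domino": (0.99e-6, 4.85e-10),
    "HS0": (50e-6, 4.35e-10),
    "LP0": (0.13e-6, 1.45e-9),
}

# Rank the logic styles by delay and by power, smallest first.
by_delay = sorted(table_4_1, key=lambda style: table_4_1[style][1])
by_power = sorted(table_4_1, key=lambda style: table_4_1[style][0])

print("fastest to slowest:     ", by_delay)
print("lowest to highest power:", by_power)
```

The two fastest styles are HS0 and pseudo NMOS, and the same two styles sit at the top of the power ranking, while LP0 is at the opposite end of both lists.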
 
 
 
 
4.2: NAND CIRCUIT 
4.2.1: NAND IN STATIC CMOS LOGIC 
Figure 4.11 and 4.12 show the circuit and simulation waveform of NAND circuit in static 
CMOS logic. 
 
Figure 4.11 circuit diagram NAND in static CMOS logic
 
Figure 4.12 simulation waveform of NAND in static CMOS logic 
4.2.2: NAND IN PSEUDO NMOS LOGIC 
Figure 4.13 and 4.14 show the circuit and simulation waveform of NAND circuit in pseudo
NMOS logic.

Figure 4.13 circuit diagram NAND in pseudo NMOS logic

Figure 4.14 simulation waveform of NAND in pseudo NMOS logic
4.2.3: NAND IN DYNAMIC LOGIC 
Figure 4.15 and 4.16 show the circuit and simulation of NAND circuit in dynamic logic. 
 
Figure 4.15 circuit diagram NAND in dynamic logic 
 
 
 
Figure 4.16 simulation waveform of NAND in dynamic logic 
4.2.4: NAND IN HS0 LOGIC 
Figure 4.17 and 4.18 show the circuit and simulation of NAND circuit in HS0 logic. 
 
Figure 4.17 circuit diagram NAND in HS0 logic 
 
Figure 4.18 simulation waveform of NAND in HS0 logic 
4.2.5: NAND IN LP0 LOGIC 
Figure 4.19 and 4.20 show the circuit and simulation of NAND circuit in LP0 logic. 
 
 
Figure 4.19 circuit diagram NAND in LP0 logic 
 
 
Figure 4.20 simulation waveform of LP0 NAND circuit 
 
4.2.6: COMPARISON OF LOGIC STYLES FOR A NAND CIRCUIT
Table 4.2 below shows the average power and propagation delay of all the logic styles
mentioned above.

                      Average Power    Delay (tp)
Complementary CMOS    1.57e-7 W        0.045 ns
Pseudo NMOS           3.22e-5 W        0.028 ns
Domino                2.81e-8 W        0.033 ns
HS0                   1.49e-5 W        0.035 ns
LP0                   1.39e-7 W        0.065 ns
 
Table 4.2 Comparison of power and delay of all logic styles for a NAND circuit  
 
As we can see, the power and delay results are more or less comparable. This is because the
benefit of the output node starting from the threshold voltage builds up only after a certain
number of stages and is negligible in the first 2 stages. As our circuits are small, with a
very low device count and little cascading, we cannot observe any substantial difference in
the power or delay of these logic styles. We therefore design a set of adders in the next
chapter to observe the difference in power and delay values clearly.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 5: COMPARISON OF HS0 AND LP0 WITH 
DOMINO LOGIC STYLE FOR ADDER CIRCUITS 
  
We present the design of a set of adders [6,7] and compare their features with a
corresponding set of adders in domino logic to show the usefulness of FTL in practical
applications. This chapter presents the simulation results of the 4-bit and 16-bit ripple
carry adder (RCA) structures, implemented on the 0.18 um, 1.8 V logic high speed and low
power process from UMC.
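For reference, the logical behaviour of a ripple carry adder can be sketched as a chain of 1-bit full adders, with the carry out of each stage feeding the next. This is a purely behavioural model with illustrative function names; it says nothing about the transistor-level Domino, HS0 or LP0 implementations simulated in this chapter.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out) for single-bit inputs."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(a, b, width, cin=0):
    """Add two width-bit numbers by rippling the carry through 1-bit stages."""
    carry = cin
    result = 0
    for i in range(width):
        # Stage i cannot settle until the carry from stage i - 1 is known,
        # which is why the delay of an RCA grows with its width.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry

print(ripple_carry_add(9, 7, width=4))           # 4-bit example with carry out
print(ripple_carry_add(40000, 30000, width=16))  # 16-bit example with carry out
```

The carry chain is the long cascaded path on which the logic styles are compared in the rest of the chapter.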
5.1: 4-BIT RIPPLE CARRY ADDER 
5.1.1: DOMINO STYLE 4-BIT ADDER 
Figure 5.1 shows the design of 1-bit adder in Domino style. 
 
Figure 5.1 1-bit RCA in Domino logic 
Now the figure given below i.e. figure 5.2 shows the structure of 4-bit RCA in Domino logic 
and figure 5.3 shows its simulation waveforms. 
 
 
 
   Figure 5.2 structure of 4-bit RCA in domino style 
 
 Figure 5.3 Simulation waveforms of 4-bit RCA in domino style 
5.1.2: HS0 LOGIC FOR A 4-BIT ADDER
Figure 5.4 shows the design of 1-bit adder in HS0 logic. 
 
Figure 5.4 1-bit RCA in HS0 logic 
 
 
Now Figure 5.5 shows the structure of 4-bit RCA in HS0 logic and figure 5.6 shows its 
simulation waveforms. 
 
Figure 5.5 structure of 4-bit RCA in HS0 style 
 
Figure 5.6 Simulation waveforms of 4-bit RCA in HS0 style 
5.1.3: LP0 LOGIC FOR A 4-BIT ADDER 
Figure 5.7 shows the design of 1-bit adder in LP0 logic. Low power adders are significant in
special applications. [11]
 
 
 
Figure 5.7 1-bit RCA in LP0 logic 
Now figure 5.8 shows the structure of 4-bit RCA in LP0 logic and figure 5.9 shows its
simulation waveforms.
 
Figure 5.8 structure of 4-bit RCA in LP0 style 
 
Figure 5.9 Simulation waveforms of 4-bit RCA in LP0 style 
 
 
 
5.1.4: COMPARISON OF LOGIC STYLES FOR 4-BIT FULL ADDER 
Table 5.1 below shows the average power and propagation delay of the logic styles mentioned
above.

          Average Power    Delay (tp)
Domino    3.2 uW           1.03 ns
HS0       296 uW           0.26 ns
LP0       3.1 uW           0.8435 ns
 
Table 5.1 Comparison results of 4-bit RCA 
 
The results are clear from this table. HS0 has the maximum power consumption and the
minimum delay. LP0, when compared to domino, has a slightly lower propagation delay and
nearly the same power consumption. So, when the power delay product is compared, the LP0
structure is the most optimised one. For further verification, we can check the results at a
higher level of cascading, i.e. a 16-bit adder.
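The power delay product referred to above is simply the average power multiplied by the propagation delay. The sketch below computes it from the values in Table 5.1; the dictionary and variable names are illustrative.

```python
# Values copied from Table 5.1 as (average power in W, delay in s).
table_5_1 = {
    "Domino": (3.2e-6, 1.03e-9),
    "HS0": (296e-6, 0.26e-9),
    "LP0": (3.1e-6, 0.8435e-9),
}

# Power delay product in femtojoules (1 fJ = 1e-15 J).
pdp_fj = {style: power * delay * 1e15 for style, (power, delay) in table_5_1.items()}

for style, value in pdp_fj.items():
    print(f"{style}: {value:.2f} fJ")

print("lowest power delay product:", min(pdp_fj, key=pdp_fj.get))
```

For the 4-bit adder this gives roughly 3.30 fJ for Domino, 76.96 fJ for HS0 and 2.61 fJ for LP0.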
5.2: 16-BIT RIPPLE CARRY ADDER 
5.2.1: DOMINO STYLE 16-BIT ADDER 
 Figure 5.10 shows the structure of 16-bit RCA in Domino logic and figure 5.11 shows its 
simulation waveforms. 
 
Figure 5.10 structure of 16-bit RCA in Domino style 
 
 
 
Figure 5.11 Simulation waveforms of 16-bit RCA in Domino style 
5.2.2: HS0 LOGIC 16-BIT ADDER 
 Figure 5.12 shows the structure of 16-bit RCA in HS0 logic and figure 5.13 shows its
simulation waveforms.

Figure 5.12 structure of 16-bit RCA in HS0 style
 
Figure 5.13 Simulation waveforms of 16-bit RCA in HS0 style 
 
 
5.2.3: LP0 LOGIC 16-BIT ADDER 
 Figure 5.14 shows the structure of 16-bit RCA in LP0 logic and figure 5.15 shows its
simulation waveforms.
 
Figure 5.14 structure of 16-bit RCA in LP0 style 
 
Figure 5.15 Simulation waveforms of 16-bit RCA in LP0 style 
5.2.4: COMPARISON OF LOGIC STYLES FOR 16-BIT FULL ADDER
Table 5.2 below shows the average power and propagation delay of the logic styles mentioned
above.

          Average Power    Delay (tp)
Domino    11.4 uW          4.02 ns
HS0       1187.84 uW       0.58 ns
LP0       37.2 uW          1.75 ns
 
Table 5.2 Comparison results of 16-bit RCA 
 
 
 
 
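The results are even clearer for the 16-bit adder. Delay is again the minimum for HS0, and
power is again the maximum for HS0. LP0 has a much lower delay than Domino (1.75 ns against
4.02 ns) but consumes more power (37.2 uW against 11.4 uW). From the values in Table 5.2, the
power delay product is therefore about 65.1 fJ for LP0, against about 45.8 fJ for Domino and
about 689 fJ for HS0.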
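The effect of cascading can also be read from how the delay grows when the adder is widened from 4 to 16 bits. The sketch below divides the 16-bit delays of Table 5.2 by the 4-bit delays of Table 5.1; the variable names are illustrative.

```python
# Propagation delays in ns, copied from Table 5.1 (4-bit) and Table 5.2 (16-bit).
delay_4bit_ns = {"Domino": 1.03, "HS0": 0.26, "LP0": 0.8435}
delay_16bit_ns = {"Domino": 4.02, "HS0": 0.58, "LP0": 1.75}

# Factor by which the delay grows when the number of bits is multiplied by 4.
growth = {style: delay_16bit_ns[style] / delay_4bit_ns[style] for style in delay_4bit_ns}

for style, factor in growth.items():
    print(f"{style}: delay grows by a factor of {factor:.2f}")
```

The Domino delay grows almost in proportion to the number of stages (a factor of about 3.9 for four times the bits), whereas the HS0 and LP0 delays grow by a factor of only about 2.2 and 2.1.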
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 6: ANALYSIS OF PERFORMANCE OF FTL ON 
SEQUENTIAL CIRCUITS 
First of all, why is there a need to apply FTL to sequential circuits? The power consumption
of present-day machines is related to their clock rates: a device operating at a higher clock
rate consumes more power. In a CPU, power dissipation is mainly due to the switching action
of the transistors inside it. So, if the power consumption of one flip flop can be reduced, it
will make a huge difference for the entire CPU. [10] We have compared the power and
propagation delay of a single ended edge triggered master slave flip flop and a double ended
pulse triggered flip flop in the domino and FTL styles.
6.1: SINGLE ENDED FLIP FLOP 
6.1.1: DOMINO LOGIC FLIP FLOP 
Figure 6.1 is the circuit for master slave flip flop and figure 6.2 shows its simulation 
waveforms. 
 
Figure 6.1 Domino logic master slave flip flop 
 
 
This is similar to the clocked CMOS circuit, which is insensitive to clock overlap. When the
clock is low, the master stage latch acts as an inverter, sampling the inverted value of input D
on the internal node. This is the evaluation phase for the master stage. When the clock is high,
the master is in hold mode while the slave section enters its evaluation phase.
 The overall circuit acts as a positive edge triggered flip flop.
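The two-phase operation described above can be sketched as a small behavioural model. The class and attribute names are illustrative, and the model captures only the logical behaviour (master samples the inverted D while the clock is low, slave inverts the internal node while the clock is high), not the electrical behaviour of the circuit in figure 6.1.

```python
class MasterSlaveFlipFlop:
    """Behavioural model of a positive edge triggered master slave flip flop."""

    def __init__(self):
        self.internal = 1  # internal node, holds the inverted value of D
        self.q = 0         # flip flop output

    def step(self, clk, d):
        if clk == 0:
            # Master evaluates: the internal node follows NOT D. Slave holds Q.
            self.internal = 1 - d
        else:
            # Master holds the internal node. Slave evaluates: Q = NOT internal.
            self.q = 1 - self.internal
        return self.q

ff = MasterSlaveFlipFlop()
# (clk, d) pairs: D changes while the clock is high in the third step.
stimulus = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
trace = [ff.step(clk, d) for clk, d in stimulus]
print(trace)
```

In the third step D falls while the clock is still high, but Q does not change until the next rising edge, which is the edge triggered behaviour stated above.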
 
  Figure 6.2 Simulation waveforms for Domino logic master slave flip flop 
6.1.2: FEEDTHROUGH LOGIC FLIP FLOP 
 
Figure 6.3 FTL master slave flip flop 
 
 
 
 
Figure 6.4 Simulation waveforms of FTL master slave flip flop 
FTL LATCH  
  
Figure 6.5 FTL latch 
On the left is an HS0 inverter and on the right there is a latch. When the Clock is low, there is 
a high voltage at first node which switches the next transistor on. The lower transistor gets 
the input D and discharges as per the input condition. The output of next stage depends on 
conditional discharging of lower transistor. Output stage transistors enhance the load drive 
capability of the charge storage stage. [5] 
 
 
 
Slight modifications are made to optimise the FTL latch. The output stage transistors are 
removed and W/L ratio of the lower transistor taking input D is increased to stabilise the 
output. The modified FTL master slave flip flop is given in figure 6.6 and its simulation 
waveforms are given in figure 6.7. 
6.1.3: MODIFIED FEEDTHROUGH LOGIC FLIP FLOP 
 
Figure 6.6 circuit of modified FTL master slave flip flop 
  
Figure 6.7 Simulation waveforms of modified FTL master slave flip flop 
 
 
 
6.1.4: COMPARISON OF DOMINO WITH MODIFIED FTL FOR FLIP FLOP 
Table 6.1 shows the comparison of values of power consumption and propagation delays of
Domino logic and FTL.

                Average Power    Delay (tp)
Domino          1.73e-4 W        0.254 ns
Modified FTL    4.05e-6 W        0.215 ns
 
Table 6.1 Comparison table for Domino and modified FTL 
 
From the table it is very clear that modified FTL has greater value of power consumption and 
a slightly lower propagation delay. But as the circuit is 1-bit, comparable results are expected. 
6.2: DOUBLE ENDED PULSE TRIGGERED FLIP FLOP 
6.2.1: DOMINO LOGIC PULSE TRIGGERED FLIP FLOP 
Figure 6.8 shows the circuit of a pulse triggered flip flop. 
 
Figure 6.8 circuit of pulse triggered domino flip flop 
 
 
Pulse triggered circuits operate only for the very short duration of a pulse. They are still
called flip flops, as they do not sample the inputs continuously. In the circuit shown above,
the pulse during which the circuit operates is created by using 3 inverters in series and
applying the input and its delayed, inverted value to two series NMOS transistors. As we can
see, there are 4 transistors in the discharge path, so discharging takes time. Its simulation
waveforms are given in figure 6.9.
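The pulse generation described above can be sketched in discrete time. The model below assumes that the signal fed to the inverter chain is the clock, that each inverter adds a fixed delay of a whole number of time steps, and that the two series NMOS transistors conduct only when both of their gate signals are high (a logical AND). The function name and the sample waveform are illustrative.

```python
def pulse_window(clk, inverter_delay_steps, n_inverters=3):
    """Return the pulse obtained from clk AND (delayed, inverted clk)."""
    total_delay = n_inverters * inverter_delay_steps
    pulse = []
    for t, level in enumerate(clk):
        # Clock as seen at the end of the inverter chain; assumed low before t = 0.
        delayed = clk[t - total_delay] if t >= total_delay else 0
        # An odd number of inverters delivers the inverted, delayed clock.
        delayed_inverted = 1 - delayed
        # Two series NMOS transistors conduct only when both gates are high.
        pulse.append(level & delayed_inverted)
    return pulse

clk = [0] * 5 + [1] * 10 + [0] * 10 + [1] * 10
pulse = pulse_window(clk, inverter_delay_steps=1)
print([t for t, p in enumerate(pulse) if p])
```

A short pulse appears after every rising edge of the clock, and its width equals the total delay of the inverter chain.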
 
Figure 6.9 Simulation waveforms of pulse triggered domino flip flop 
6.2.2: FTL PULSE TRIGGERED FLIP FLOP 
 
Figure 6.10 circuit of pulse triggered FTL flip flop 
 
 
Figure 6.10 shows the circuit of the FTL double ended pulse triggered flip flop. [13] Here the
pulse window is applied to the 2 PMOS transistors connected to VDD. As the PMOS transistors
are on for a very short duration, this circuit consumes minimal power. Also, the circuit has
only 2 NMOS transistors in the discharge path, which speeds up the circuit. Simulation
waveforms of this circuit are shown in Figure 6.11.
 
Figure 6.11 Simulation waveforms of pulse triggered FTL flip flop 
6.2.3: COMPARISON OF DOMINO WITH FTL FOR DOUBLE ENDED FLIP FLOP 
Table 6.2 shows the comparison of values of power consumption and propagation delays of
Domino logic and FTL for the double ended pulse triggered flip flop.

          Average Power    Delay (tp)
Domino    8.91e-6 W        0.2165 ns
FTL       5.64e-6 W        0.1443 ns

Table 6.2 Comparison table for Domino and FTL
From the results, we can see that FTL has lower power consumption and lower propagation
delay as well. The lower delay is because of the smaller number of transistors in the FTL
discharge path, and the lower power is because the PMOS transistors remain on only for the
duration of a short pulse. Though the circuit is optimised, we have to see the effects of
cascading before deciding to opt for FTL in the case of sequential circuits.
 
CHAPTER 7: EFFECTS OF CASCADING SEQUENTIAL 
CIRCUITS  
 
To analyse the effects of cascading, I have taken a 4-bit register made of single ended master 
slave flip flops implemented in Domino and modified FTL. 
 
7.1: 4-BIT DOMINO LOGIC REGISTER  
Figure 7.1 shows the circuit diagram of a 4-bit register implemented in domino logic. 
 
Figure 7.1 circuit of 4-bit register in domino logic 
Each block of single ended flip flop has a circuit which is given in figure 6.1. 
Figure 7.2 shows the simulation waveforms of a 4-bit register implemented in domino logic. 
 
 
Figure 7.2 Simulation waveforms of 4-bit domino register 
 
The topmost waveform here is the input signal D, followed by the clock signal. The lower 4
waveforms are the outputs of flip flops 1, 2, 3 and 4 of the register circuit. We can see that
the output of the 1st flip flop is as expected. Since the same clock is applied to all the flip
flops, the output of a previous stage flip flop does not reach the next stage in time to be
evaluated at the same positive edge of the clock. As it is a synchronous sequential circuit, the
delay is not propagated through the stages. So the advantage of reduced collective propagation
delay is not observed.
 
7.2: 4-BIT MODIFIED FTL REGISTER  
Figure 7.3 shows the circuit diagram of a 4-bit register implemented in modified FTL logic. 
 
 
 
 
Figure 7.3 circuit of 4-bit register in modified FTL logic 
Each block of single ended FTL flip flop has a circuit which is given in figure 6.6. 
Figure 7.4 shows the simulation waveforms of a 4-bit register implemented in modified FTL. 
 
Figure 7.4 Simulation waveforms of 4-bit modified FTL register 
 
The topmost waveform here is the clock signal, followed by the input signal D. The lower 4
waveforms are the outputs of flip flops 1, 2, 3 and 4 of the register circuit. We can see that
the output of the 1st flip flop is as expected. Since the same clock is applied to all the flip
flops, the output of a previous stage flip flop does not reach the next stage in time to be
evaluated at the same positive edge of the clock. As it is a synchronous sequential circuit, the
delay is not propagated through the stages. So the advantage of reduced collective propagation
delay is not observed.
The major advantage of FTL, which is observed upon cascading, therefore cannot be observed in
synchronous sequential circuits. So, even if a circuit implemented in FTL is optimised, i.e. has
lower power consumption and higher speed, FTL is not preferred in the case of sequential circuits.
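The reason the per-stage delay does not accumulate can be sketched with a behavioural model of the register. The model assumes the four flip flops are connected as a chain (each stage fed by the previous one) and share one clock; the clock period of 10 ns is hypothetical, the delays of Table 6.1 are treated as clock-to-output delays, and the function names are illustrative.

```python
def shift_register_outputs(d_per_cycle, n_stages=4):
    """Return the outputs of all stages after each rising clock edge."""
    state = [0] * n_stages
    history = []
    for d in d_per_cycle:
        # At each edge, every stage samples the value its predecessor held
        # before that edge, so data advances by exactly one stage per cycle.
        state = [d] + state[:-1]
        history.append(list(state))
    return history

def output_time_ns(stage, clock_period_ns, clk_to_q_ns):
    """Time at which data sampled at edge 0 appears at the given stage (1-based)."""
    return (stage - 1) * clock_period_ns + clk_to_q_ns

CLOCK_PERIOD_NS = 10.0  # hypothetical clock period
print(shift_register_outputs([1, 0, 0, 0, 0]))
print(output_time_ns(4, CLOCK_PERIOD_NS, 0.254))  # Domino delay from Table 6.1
print(output_time_ns(4, CLOCK_PERIOD_NS, 0.215))  # modified FTL delay from Table 6.1
```

The arrival time at the 4th stage is set by three full clock periods plus a single flip flop delay, so under these assumptions the faster FTL flip flop gains only 0.039 ns at the 4th stage instead of a saving that grows with the number of stages.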
 
Now that we know FTL is most suitable for arithmetic circuits, we carry out the digital tape
out of the 16-bit LP0 full adder.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 8: DIGITAL TAPE OUT OF 16-BIT LP0 RIPPLE 
CARRY ADDER 
After the analysis in the last two chapters, we have come to the conclusion that FTL is most
suitable for cascaded arithmetic circuits. So, we go on to carry out the tape out of the 16-bit
LP0 RCA, which in chapter 5 (Table 5.2) was found to be much faster than Domino while
consuming far less power than HS0.
8.1: SCHEMATIC OF 16-BIT LP0 RCA 
Figure 8.1 shows the block/circuit diagram of the 16-bit RCA in LP0. Each block contains 2 LP0
RCA, 2 CMOS inverters and 1 LP0 inverter.
 
Figure 8.1 circuit of 16-bit RCA in LP0 
8.2: LAYOUT 
A 16-bit RCA has the following components:
• 1-bit LP0 full adder
• CMOS inverter to invert inputs
• LP0 inverter to invert outputs

To carry out the layout of the 16-bit adder, we need to lay out each of the components
mentioned. After layout, certain steps have to be followed for each layout:
• DRC (Design Rule Check): checks for errors in the layout of metals, such as misalignment,
violations of minimum spacing rules or metal overlap.
• LVS (Layout Versus Schematic): checks whether the circuit created by the layout is the same
as the schematic. It checks for pin, net and device mismatches.
• RCX: when DRC and LVS are successful, we run RCX, which shows the parasitic resistances
and parasitic capacitances introduced into the circuit by the layout of metals.
So, the layout and RCX of each of the components are shown below.
8.2.1 LAYOUT OF CMOS INVERTER 
Figure 8.2 shows the layout of CMOS inverter and figure 8.3 shows its RCX output 
 
      Figure 8.2 layout of CMOS inverter 
 
 
    Figure 8.3 RCX of CMOS inverter 
8.2.2 LAYOUT OF LP0 INVERTER 
Figure 8.4 shows the layout of LP0 inverter and figure 8.5 shows its RCX output 
 
   Figure 8.4 Layout of LP0 inverter 
 
 
Figure 8.5 RCX of LP0 inverter 
8.2.3 LAYOUT 1-BIT LP0 ADDER 
Figure 8.6 shows the layout of LP0 1 bit adder and figure 8.7 shows its RCX output 
 
   Figure 8.6 Layout of LP0 1-bit full adder 
 
 
Figure 8.7 RCX of LP0 1-bit full adder 
8.2.4 LAYOUT 2-BIT LP0 ADDER BLOCK 
Figure 8.8 shows the layout of 2-bit adder block 
 
Figure 8.8 Layout of 2-bit LP0 adder block 
 
 
8.2.5 LAYOUT OF 16-BIT LP0 ADDER 
Figure 8.9 shows the layout of 16-bit LP0 adder and figure 8.10 shows its RCX output 
 
Figure 8.9 Layout of 16-bit LP0 adder  
 
Figure 8.10 RCX of 16-bit LP0 adder  
 
 
8.3 POST LAYOUT SIMULATION 
Figure 8.11 shows the simulation waveforms of schematic and figure 8.12 shows the post 
layout simulation waveforms. 
8.3.1 SIMULATION FROM SCHEMATIC 
 
Figure 8.11 Simulation waveforms from 16-bit LP0 schematics  
8.3.2 POST LAYOUT SIMULATION 
 
Figure 8.12 Post Layout simulation waveforms of 16-bit LP0 adder 
 
Comparing the two sets of waveforms, we observe a slight variation in the post layout
simulation, although the waveforms look almost the same. This variation is due to the parasitic
resistances and capacitances of the metals used in the layout. It will differ from layout to
layout, depending on the optimisation achieved during layout.
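The way wire parasitics add delay can be sketched with the Elmore delay of an RC ladder, a standard first-order estimate. The driver resistance, wire segment values and load capacitance below are hypothetical and are not taken from the RCX output of this design; the function name is illustrative.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder: each capacitor times the resistance upstream of it."""
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r
        delay += upstream_r * c
    return delay

# Hypothetical values, for illustration only.
R_DRIVER = 5e3   # driver resistance (ohms)
C_LOAD = 5e-15   # gate load at the far end (farads)
R_SEG = 50.0     # resistance of one metal segment (ohms)
C_SEG = 2e-15    # capacitance of one metal segment (farads)

# Schematic view: the driver charges the load directly.
ideal = elmore_delay([R_DRIVER], [C_LOAD])

# Extracted view: three metal segments between the driver and the load.
extracted = elmore_delay(
    [R_DRIVER + R_SEG, R_SEG, R_SEG],
    [C_SEG, C_SEG, C_SEG + C_LOAD],
)

print(f"without parasitics: {ideal * 1e12:.2f} ps")
print(f"with parasitics:    {extracted * 1e12:.2f} ps")
```

Even modest wire resistance and capacitance increase the estimate noticeably, which is the kind of change the post layout simulation captures.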
Table 8.1 shows the comparison between post layout simulation and simulation from schematics.

               Average Power    Delay (tp)
Schematics     3.72e-5 W        1.75 ns
Post layout    11.09e-5 W       3.7 ns
 
Table 8.1 Comparison of post layout simulation with simulation from schematic 
It is clear from the table that both power and delay increase after layout. This is because of
the parasitics of the metals used in the layout. But the post layout delay of 3.7 ns is still
less than the delay of domino, which was 4.02 ns.
After post layout simulation, placement and routing are done and finally the GDS file is
created, which can be sent to the foundry for the fabrication of the chip.
 
 
 
 
 
 
 
 
 
 
 
 CHAPTER 9: CONCLUSIONS 
 
Propagation delay and power consumption are comparable for small circuits (i.e. circuits with
a small number of inputs) in CMOS, pseudo NMOS, Domino and FTL, because the device count is
low and the structure is not cascaded.
 
The HS0 and LP0 family is best suited to circuits with a long chain of cascaded inverting
structures, as the benefit of the output node starting from Vth builds up only after a certain
number of stages and is negligible in the initial stages.
 
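For the 4-bit adder, the power delay product of HS0 is greater and that of LP0 is smaller than
that of domino. For the 16-bit adder, LP0 is more than twice as fast as domino, at a higher
power. So, LP0 offers the best balance of speed and power among the three logic families
compared here.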
 
In case of single ended flip flop, the designed circuit has a higher speed but also consumes 
more power. In case of double ended flip flop our design has higher speed and lower power 
too.  
 
 However, the major advantage of FTL, i.e. the speed-up upon cascading, is not applicable to
sequential circuits, as observed in the simulation results of the register circuit. So FTL is
preferred for arithmetic circuits only.
 
 
 
 
 
 
REFERENCES 
[1] Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic, Digital Integrated Circuits: A
Design Perspective, 2nd edition, Pearson Prentice Hall, 2011.
[2] Victor Navarro-Botello, Juan A. Montiel-Nelson, Saeid Nooshabadi, High performance
low power CMOS dynamic logic for arithmetic circuits, Microelectronics Journal, May 2007.
[3] Adel S. Sedra, Kenneth C. Smith, Microelectronic Circuits, 5th edition, Oxford University
Press, 2003.
[4] Juan A. Montiel-Nelson, Saeid Nooshabadi, Fast feedthrough logic: A high performance
logic family for GaAs, November 2004.
[5] Victor Navarro-Botello, Juan A. Montiel-Nelson, Saeid Nooshabadi, Analysis of
high-performance fast feedthrough logic families in CMOS, June 2007.
[6] N. Weste, K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective,
Addison Wesley, MA, 1988.
[7] C. Fang, C. Huang, J. Wang, C. Yeh, Fast and compact dynamic ripple carry adder
design, Proceedings of IEEE Asia Pacific Conference on ASIC, APASIC 2002, August 2002,
Taipei, Taiwan, pp. 25-28.
[8] S. M. Kang, Y. Leblebici, CMOS Digital Integrated Circuits: Analysis & Design, Tata
McGraw-Hill, 3rd edition, 2003.
[9] K.S. Yeo, K. Roy, Low-Voltage, Low-Power VLSI Subsystems.
[10] H. Mahmoodi, V. Tirumalashetty, M. Cooke, K. Roy, Ultralow power clocking scheme
using energy recovery and clock gating, IEEE Trans. VLSI Syst., vol. 17, pp. 33-44, Jan. 2009.
[11] Y. Jiang, A. Al-Sheraidah, Y. Wang, E. Sha, J. Chung, A novel multiplexer based low
power full adder, IEEE Trans. Circuits Syst.-II, vol. 52, 2004, pp. 345-348.
[12] V. Navarro-Botello, J.A. Montiel-Nelson, S. Nooshabadi, Low power arithmetic circuits
in feedthrough dynamic CMOS logic, Proceedings of 49th IEEE International Midwest
Symposium on Circuits and Systems, MWSCAS-2006, San Juan, Puerto Rico,
August 2006.
[13] S.H. Rasouli, A. Khademzadeh, A. Afzali-Kusha, M. Nourani, Low-power single and
double edge triggered flip flop for high speed application, Proc. Inst. Electr. Eng.-Circuits
Devices Syst., vol. 152, no. 2, pp. 118-122, Apr. 2005.
 
 
 
 
 
 
 
 
 
 
 
 
