High-performance low-power microprocessor circuits by Lu, Shih-Lien et al.
AN ABSTRACT OF THE THESIS OF
Steven K. Hsu for the degree of Master of Science in Electrical and Computer




This thesis discusses two critical components of a digital system:
domino logic styles and flip-flops. In today's microprocessors, both domino logic
and flip-flops are essential to high-performance and low-power design. Two new
domino logic styles are presented and analyzed: Double Edge Triggered (DET) and
Double Data Rate (DDR) domino. Using a CMOS 0.25 µm MOSIS model,
HSPICE simulations show that DET and DDR 8-bit adders have maximum
throughputs of 1.6 Gops (giga-operations per second) and 2 Gops, respectively,
while a conventional domino adder reaches only 1 Gops. In addition, this thesis
proposes a novel master-slave flip-flop, which is compared with other
known flip-flop structures. Using a CMOS 0.35 µm MOSIS model, the new
flip-flop was simulated to have an optimal power-delay product. The proposed flip-flop
consumes very low power while having a very small clock load and data load.
Copyright by Steven K. Hsu
February 2, 2001





in partial fulfillment of
the requirements for the
degree of
Master of Science
Presented February 2, 2001
Commencement June 2001
Master of Science thesis of Steven K. Hsu presented on February 2, 2001.
APPROVED:
Major Professor, representing Electrical and Computer Engineering
Head of Department of Electrical and Computer Engineering
Dean of
I understand that my thesis will become part of the permanent collection of Oregon
State University libraries. My signature below authorizes release of my thesis to
any reader upon request.





I would first of all like to thank my advisor, Dr. Shih-Lien Lu, for his
support and encouragement during my graduate years.
In addition, I would like to thank Dr. Ram Krishnamurthy for his fruitful
comments, motivation, and suggestions. I would also like to thank the rest of my
graduate committee, Dr. John F. Wager and Dr. Bruce D'Ambrosio,
for their time.
Lastly, I would like to thank my parents for their advice and encouragement
in my education.
TABLE OF CONTENTS
1. INTRODUCTION
1.1 Background
1.2 Scope
2. DOMINO LOGIC
2.1 Introduction
2.2 Conventional Domino Logic
2.3 Double Edge Triggered Domino Logic
2.4 Double Data Rate Domino Logic
2.5 Qualitative Comparison
2.6 Simulation
2.7 Full Adder Circuits
2.9 Full Adder Comparison
2.10 8-bit Ripple Carry Adders
3. FLIP-FLOPS
3.1 Introduction
3.2 Flip-Flop Circuits
3.3 Simulation
3.4 Flip-Flop Test Bench
3.5 Results






LIST OF FIGURES
1. Conventional Domino CMOS Logic
2. Conventional Domino Logic Pipeline
3. Domino Timing Diagram
4. DET Domino Logic Block
5. DET Domino Logic Pipeline
6. DET Domino Timing Diagram
7. DDR Domino CMOS Logic
8. DDR Domino Logic Pipeline
9. DDR Domino Timing Diagram
10. Conventional Domino Full Adder
11. DET Domino Full Adder
12. DDR Domino Full Adder
13. Full Adder Total Power Comparison
14. Full Adder PTP Comparison
15. Conventional Ripple Carry Adder
16. DET Domino Ripple Carry Adder
17. DDR Domino Ripple Carry Adder
18. Total Transistor Count
19. Ripple Carry Worst Case Power Comparison
20. Ripple Carry Maximum Throughput Comparison
21. Ripple Carry PTP Comparison
22. Conventional Master Slave Flip-Flop
23. Modified C2MOS Flip-Flop
24. PowerPC Flip-Flop
25. Proposed Flip-Flop
26. Flip-Flop Test Bench
27. Flip-Flop Maximum Clocking Frequency
28. Overall Delay Comparison
29. Total Power Consumption
30. Total Power Range vs. Delay
31. Ranges of PDPtotal
32. Constant Voltage Scaling
LIST OF TABLES
1. NAND Precharge Buffer Truth Table
2. Qualitative Comparison
3. HSPICE Simulation Model
4. Full Adder Truth Table
5. Full Adder Power Comparison
6. Full Adder PTP Comparison
7. Comparison of 8-bit Adders
8. Ripple Carry Adder Power Analysis
9. 8-bit Ripple Carry Adder Simulation
10. Flip-Flop Simulation Parameters
11. Flip-Flop Comparison
LIST OF APPENDICES
A. 0.35 µm CMOS MOSIS HSPICE Model
B. 0.25 µm CMOS MOSIS HSPICE Model
C. Conventional Domino 8-bit Ripple Carry Adder HSPICE Netlist
D. DET Domino 8-bit Ripple Carry Adder HSPICE Netlist
E. DDR Domino 8-bit Ripple Carry HSPICE Netlist
F. Example HSPICE Code for Conventional Domino Measurements
G. Example HSPICE Code for DDR/DET Domino Measurements
H. Example HSPICE Netlist Conventional Flip-Flop
I. Example HSPICE Netlist PowerPC Flip-Flop
J. Example HSPICE Netlist C2MOS Flip-Flop
K. Example HSPICE Netlist SN-SN Flip-Flop
L. Example HSPICE Flip-Flop Optimization Code
M. Example HSPICE Code for Flip-Flop Power Measurements
N. Example HSPICE Code for Maximum Frequency
High-Performance Low-Power Microprocessor Circuits
1. INTRODUCTION
1.1 Background
Power dissipation in a microprocessor can be divided into three components:
static, short-circuit, and dynamic power. Currently, the major component of
power dissipation in a CMOS logic based digital system is the charging and
discharging of the load capacitance of circuit nodes, otherwise known as dynamic
power [1]. This dynamic power can be expressed by the well-known equation:
Power = α * f * C * Vdd * Vswing (1)
In this equation, α is the activity factor, f is the frequency, C is the total node
capacitance, Vdd is the supply voltage, and Vswing is the switching voltage. A
constant, direct path between the supply voltage and ground causes static power.
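Returning to equation (1), a short numerical sketch may help. It assumes a full rail-to-rail swing (Vswing = Vdd) when no swing is given, and the operating point (activity 0.5, 1 GHz, 1 pF) is hypothetical, chosen only for illustration.

```python
def dynamic_power(activity, freq_hz, cap_farads, vdd, vswing=None):
    """Equation (1): P = activity * f * C * Vdd * Vswing.

    If vswing is omitted, a full rail-to-rail swing (Vswing = Vdd) is assumed.
    """
    if vswing is None:
        vswing = vdd
    return activity * freq_hz * cap_farads * vdd * vswing


# Hypothetical operating point (illustrative values, not taken from the thesis).
p_nominal = dynamic_power(activity=0.5, freq_hz=1e9, cap_farads=1e-12, vdd=2.5)
p_scaled = dynamic_power(activity=0.5, freq_hz=1e9, cap_farads=1e-12, vdd=1.25)
ratio = p_nominal / p_scaled

print(f"P at 2.50 V: {p_nominal * 1e3:.3f} mW")
print(f"P at 1.25 V: {p_scaled * 1e3:.3f} mW")
print(f"ratio: {ratio:.1f}x")
```

With a full swing, halving the supply voltage divides the dynamic power by four, because Vdd enters equation (1) twice.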
Another component of static power that is becoming more important is transistor
leakage. Leakage, or standby current, consumes power even when the transistors
are cut off. Short-circuit current arises when the transistors of a CMOS logic stage are in
the linear region because of slow rise and fall times of the input. This short-circuit
current flows for a short amount of time and is caused by a resistive path from
power to ground. Today, decreasing power is as important as increasing
performance. One simple solution to decrease the power is to decrease the supply
voltage. The power is reduced quadratically by reducing the supply voltage. Starting from
equation (2), solving for dt and setting dQ = C*dV yields
equation (3). However, by decreasing the supply voltage, a designer sacrifices
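A sketch of the relations this passage refers to, assuming equation (2) is the usual definition of current and that equations (3) and (4) follow from it with $dQ = C\,dV$ and a full voltage swing of $V_{dd}$. Both the forms and the numbering shown here are assumptions.

```latex
\begin{align}
I &= \frac{dQ}{dt} \tag{2} \\
dt &= \frac{dQ}{I} = \frac{C\,dV}{I} \tag{3} \\
t_{delay} &\approx \frac{C\,V_{dd}}{I} \tag{4}
\end{align}
```

Under these assumed forms, the drive current $I$ falls faster than $V_{dd}$ as the supply approaches the threshold voltage, so the delay grows when the supply is lowered.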




In equation (4), C is the total switched node capacitance, Vdd is the supply voltage,
and I is the current. Because power and performance are both critical, it is crucial
to design the two most important parts of a digital system, the logic and flip-flops,
with high-performance and low-power techniques. Delay is not the only figure of merit for
performance. In a critical path, the delay may be the crucial factor; however, in
some applications, the throughput of a circuit is the real performance metric.
Throughput can be defined as the amount of work done in a given amount of time
[2]. From equation (5), we see that as frequency increases, the throughput
increases. Typically, the higher the frequency at which a digital system can be clocked, the
shorter the delay between pipeline stages.
Throughput = # of operations * frequency (5)
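Equation (5) can be illustrated with a short sketch. It assumes that "# of operations" means operations completed per clock cycle (1 for conventional domino, 2 for a style that evaluates on both clock phases), and it uses the fixed 800 Mops throughput from the full adder comparison as the target. The function names are hypothetical.

```python
def throughput_ops_per_s(ops_per_cycle, freq_hz):
    """Equation (5): throughput = operations per cycle * frequency."""
    return ops_per_cycle * freq_hz


def required_frequency_hz(target_throughput, ops_per_cycle):
    """Clock frequency needed to sustain a target throughput."""
    return target_throughput / ops_per_cycle


# Assumed operations per clock cycle for each logic style.
styles = {"conventional": 1, "DET": 2, "DDR": 2}
target = 800e6  # 800 Mops

required = {name: required_frequency_hz(target, ops) for name, ops in styles.items()}
for name, f_hz in required.items():
    print(f"{name:12s} needs {f_hz / 1e6:.0f} MHz for {target / 1e6:.0f} Mops")
```

The same throughput is reached at half the clock frequency when two operations complete per cycle, which is the trade the DET and DDR styles rely on.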
Pipelining is a technique that allows higher throughput at the expense of latency.
Increasing the number of pipeline stages also increases the maximum frequency of a
circuit. To compare circuits, this thesis will use two metrics. The first
metric is the power throughput product (PTP), which will be used to compare
domino logic styles. This metric is defined in equation (6).
PTP = Power * (1/throughput) (6)
The second metric is the power delay product (PDP), which will be used to compare
flip-flops. This metric is defined by equation (7).
PDP = Power * delay (7)
Using these two metrics, this thesis will be able to quantify the performance and
power trade-offs of logic styles and flip-flops.
Dynamic logic has been used mainly for critical paths and functional unit
blocks where throughput is essential. In these high-throughput applications,
domino has been the logic of choice since it has 30% or more raw performance
improvement over static CMOS, especially in high fan-in gates. However, in
certain transistor processes, logic only continues at maximum performance until
increasing transistor widths becomes ineffective. In addition, due to higher leakage
and an increase in keeper size, the performance benefit of domino logic over static
logic has been diminishing with every process generation [3]. Using a dual-threshold process
helps alleviate the problem temporarily, but it does not address the long-term
performance issue, since both the low and high Vt must scale (thus increasing the
leakage) [4]. Thus, designers need to use high-performance logic families or find
alternative circuit styles or micro-architectural speed-ups. In all domino schemes,
one of the phases is used for pre-charge while the other phase is used for
evaluation. This thesis will explore and discuss alternative circuit logic techniques
to exploit both phases of the clock in domino circuit design. Two new domino
logic styles will be introduced: double edge triggered (DET) domino and double
data rate (DDR) domino. DET and DDR domino yield higher maximum
throughput, approximately 38% and 50%, respectively, than conventional domino.
In addition, DDR and DET domino enable lower clock frequencies for the same
throughput as domino. Needing a much lower clock frequency for the same
throughput allows a more flexible clock distribution, which eases the clock distribution design
issues in today's microprocessors [5, 6].
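The 38% and 50% figures can be related to the maximum 8-bit adder throughputs quoted in the abstract (1.6, 2 and 1 Gops). The sketch below computes the gain two ways. Reading the percentages as gains relative to the new throughput, rather than to the conventional baseline, is an assumption that happens to reproduce them.

```python
# Maximum 8-bit adder throughputs quoted in the abstract, in Gops.
max_throughput = {"conventional": 1.0, "DET": 1.6, "DDR": 2.0}
base = max_throughput["conventional"]


def gain_over_baseline(new, old):
    """Increase expressed as a fraction of the baseline throughput."""
    return (new - old) / old


def gain_over_new(new, old):
    """Increase expressed as a fraction of the new throughput."""
    return (new - old) / new


for style in ("DET", "DDR"):
    t = max_throughput[style]
    print(
        f"{style}: {t / base:.1f}x the baseline, "
        f"{gain_over_baseline(t, base):.0%} of baseline, "
        f"{gain_over_new(t, base):.1%} of new throughput"
    )
```

Measured against the conventional baseline, the same throughputs correspond to speed-ups of 1.6x and 2x.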
For high-performance designs where high throughput is needed, logic is not
the only important factor. Flip-flops account for a large percentage of the area as
well. Many master-slave structures have been reported, yet due to higher demands
for performance and power, new flip-flop structures must be realized. This thesis
will also compare the conventional flip-flop, the modified C2MOS flip-flop [7], the
PowerPC 603 flip-flop [8], and a proposed flip-flop [9] in terms of delay and
power.
1.2 Scope
This thesis addresses circuit design issues and presents simulation
results. The scope of this thesis is not to cover new architectures, but to
showcase the new circuit techniques introduced. Chapter 2 covers domino logic,
including conventional, DET, and DDR domino. Comparisons and
simulation results are also covered in chapter 2. In chapter 3, flip-flops are
compared and a novel flip-flop is proposed. Simulation results are presented for
the flip-flops in chapter 3. Chapter 4 presents conclusions and recommendations
for the proposed circuits.
2. DOMINO LOGIC
2.1 Introduction
There have been many techniques used to increase frequency and
throughput in CMOS circuit design. Many dynamic logic styles improve the
performance of the circuit. Dynamic logic replaces the slow PMOS logic
transistors with a single clocked PMOS transistor and incorporates a pre-charge and
evaluate stage. Domino logic was first introduced by [10] as an alternative
dynamic logic style with a simplified clocking scheme and the ability to cascade
gates. Today, domino logic is a standard in high-performance microprocessor
design. Domino has distinct characteristics: it is non-inverting and can
only make a low-to-high transition on the output. Another type of dynamic logic
is NORA (No Race) [11], which prevents race conditions. This technique
also improves upon conventional dynamic logic, but it needs PMOS transistors in
some logic stages. Due to the elimination of the static stage in NORA, the
performance increases. However, this performance comes at the expense of
additional noise and reduced robustness. Reduced robustness causes false
discharges that would be amplified through a direct concatenation of dynamic
stages.
Domino also has many different clocking schemes. Skew-tolerant domino
has been proposed by [12]. It removes latch delays by staggering the clocks
of domino. In industry design, it is common to see a traditional 2-phase or a higher
performance 4-phase clocking scheme for domino design. Clock-Delayed Domino
[13] has also been introduced. In this type of domino, the clock is delayed using a
delay element. All of these high-performance techniques are useful in domino
design.
2.2 Conventional Domino Logic
In this section, the operation of the conventional domino logic will be
described. This thesis will use the term conventional domino for 4 phase clocking.
The thesis will not address 2 phase textbook domino.
In a conventional high-performance domino logic gate, there is an NMOS
logic part and a PMOS pre-charge part, which is a single clocked PMOS. The













Figure 1. Conventional Domino CMOS Logic
Figure 2. Conventional Domino Logic Pipeline (Stage 1 through Stage 4)





Figure 3. Domino Timing Diagram
For clarity, conventional domino operation is briefly described. As shown
in Figure 1, conventional domino consists of an NMOS logic stage and a
precharge/keeper output. Domino gates primarily have a speed advantage, as well
as an area advantage. The speed advantage is due to the low input capacitance. All
of the logic is evaluated with an NMOS pull-down network, so, unlike static CMOS,
there is no need to drive additional PMOS gate capacitance on the input.
Another advantage of domino is that the area is small compared to an equivalent
static logic function, since the PMOS stage is replaced with a precharge PMOS. In
conventional domino, there is a precharge phase and then an evaluate phase. The
output of a domino gate can make only a low-to-high transition during the
evaluation phase (monotonic). During the precharge phase, typically all inputs
must be low to prevent precharge contention.
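The precharge and evaluate behavior described above can be sketched as a small behavioral model. This is a logic-level illustration only: the class and method names are hypothetical, and the keeper, timing, and noise effects are ignored.

```python
class DominoGate:
    """Behavioral sketch of one domino stage: dynamic node plus output inverter."""

    def __init__(self, pulldown):
        # pulldown(inputs) is truthy when the NMOS network conducts.
        self.pulldown = pulldown
        self.dynamic_node = 1  # precharged high

    @property
    def output(self):
        # Static inverter on the dynamic node.
        return 1 - self.dynamic_node

    def precharge(self):
        self.dynamic_node = 1

    def evaluate(self, inputs):
        # The node can only be discharged here; nothing in this phase
        # pulls it back up, so the output is monotonic (low to high).
        if self.pulldown(inputs):
            self.dynamic_node = 0
        return self.output


and2 = DominoGate(lambda x: x[0] and x[1])
and2.precharge()
out_after_rise = and2.evaluate((1, 1))  # pull-down conducts, output rises
out_after_fall = and2.evaluate((0, 1))  # inputs change, output cannot fall
and2.precharge()
out_after_precharge = and2.output
```

In this model the output stays high once it has risen, until the next precharge resets the dynamic node.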
In Figure 2, a conventional domino logic pipeline is shown. The clocks are
delayed phases of each other. The foot transistors can be removed aggressively in
stages 2-4 if the static input node does not transition from low to high before
evaluation occurs. If the low-to-high transition occurs during the precharge phase,
then there will be static power consumption. Figure 3 shows an example timing
diagram for 4-phase domino clocking with 50% duty cycle. The term domino is
used because as stage 1 evaluates, the result ripples through stage 2, and then stage
3 and stage 4. This clocking scheme consists of a clock and delayed versions of the
clock. Clk2 is a delayed version of clk1, and clk3 is actually the complement of
clk1. Since there are 4 phases of the clock, there is no need for intermediate latches
and the data can ripple through, eliminating clock skew entirely from the cycle
time. In this timing scheme, a stage is always evaluating when the data arrives. As
shown, when clk1 is low, mp1 is on and precharges node A to the supply voltage in
stage 1. When clk1 switches high, me1 switches on to allow the NMOS logic
section to evaluate. Mk1 and minvk are part of the keeper circuitry. If node A needs
to be kept high during evaluation, node B is held low, and mk1 switches on and
sources current to keep node A high. If the leakage is too great, such as in wide
ORs, the keeper can be replaced with a conditional keeper [14] to allow a fast
recovery time without fighting the pull-down logic. Although domino provides
many advantages, it has its disadvantages. Since there is a precharge phase,
domino logic consumes a substantial amount of power. As explained above,
domino logic also needs complex clocking schemes and circuitry to function correctly.
Also shown in Figure 1, minvs is the static output inverter, which can be replaced with
a static logic stage. Typically, this inverter is skewed high to increase
performance. Usually, a static inverter is sized with a typical ratio of PMOS
transistor width = 2.5 * NMOS transistor width. If an inverter is skewed high, the
PMOS transistor width is increased, thus moving the low input trip point higher.
However, due to noise issues in domino design, skewing this inverter trades off
against the noise tolerance of node A, since noise on node A can be amplified onto the output.
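The 4-phase clocking described for Figure 3 can be sketched with ideal 50% duty cycle clocks. The quarter-period spacing between successive phases is an assumption; the text only states that clk2 is a delayed version of clk1 and that clk3 is the complement of clk1. The period and helper names are hypothetical.

```python
def clock_level(t, period, delay=0.0, duty=0.5):
    """Return 1 if an ideal clock delayed by `delay` is high at time t, else 0."""
    phase = ((t - delay) % period) / period
    return 1 if phase < duty else 0


PERIOD = 8  # arbitrary time units
# Assumed quarter-period spacing between successive clock phases.
delays = {f"clk{i + 1}": i * PERIOD / 4 for i in range(4)}


def waveform(name, steps=16):
    """Sample one clock at integer time steps and return it as a bit string."""
    return "".join(str(clock_level(t, PERIOD, delays[name])) for t in range(steps))


for name in delays:
    print(name, waveform(name))
```

With this spacing, clk3 comes out as the exact complement of clk1, which matches the relationship stated above.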
2.3 Double Edge Triggered Domino Logic
Figure 4. DET Domino Logic Block
Figure 5. DET Domino Logic Pipeline





Figure 6. DET Domino Timing Diagram
The DET domino logic style consists of an NMOS logic stage and a multiple
output latch. A high-level diagram of DET domino logic is shown in Figure 4. As
shown, it is very similar to conventional domino. DET domino has many of the
same advantages as domino. DET domino provides low input capacitance for high
performance. However, since there is a complementary block on the output stage,
there is less of an area win over static. The main advantage of DET domino is that
the logic can evaluate every phase of the clock. The NMOS evaluate transistors are
moved to the top of the NMOS logic stack, creating a mirror of conventional
precharge/keeper outputs. By moving this NMOS evaluation transistor, the circuit
trades off performance against noise such as charge sharing. The circuit must be sized
appropriately or use a dual-rail domino style with cross-coupled PMOS transistors to reduce
this charge sharing noise. Stages A and B are mirrors of each other and consist of
the precharge/keeper output. Also shown in Figure 4, instead of an inverter, the
circuit uses a latch (mlatch) to latch the output. The latch
prevents race-through conditions from occurring when the next phase of data rolls
into the logic gate. DET domino also requires the complement of the clock and
obtains it through a local inverter or a separate clock if necessary. Instead of
having only an evaluation phase or only a precharge phase like conventional
domino, DET domino will always have one stage evaluating while the other stage
is precharging. For example, when clk1 is high, stage B can be precharging while
stage A is evaluating. Stage B is precharged by mp2, and me2 turns off to isolate
the logic from the precharge. When this happens, me1 is on and allows the logic to
evaluate the logic function (assuming that stage A was precharged in the previous
clock cycle). When the next phase switches and clk1 is low, stage B is evaluating
(since it precharged last phase) and stage A is precharging for when clk1 switches
high next. All stages of a DET domino pipeline evaluate on both phases of the clock.
In this circuit configuration, DET domino uses both edges of the clock to evaluate
and uses only one NMOS evaluation stack. However, like latched domino logic,
the data inputs must be set up before evaluation occurs. This is a disadvantage
since it does not allow time-borrowing. Unlike conventional domino, the inputs do
not have to be low when the logic is precharging since they are isolated.
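The alternation between stage A and stage B described above can be sketched at the level of clock phases. This is a bookkeeping illustration with hypothetical function names; it counts evaluations per phase and says nothing about timing or electrical behavior.

```python
def det_roles(clk1):
    """Roles of the two precharge/keeper stages of a DET gate for a clk1 level."""
    if clk1:
        return {"A": "evaluate", "B": "precharge"}
    return {"A": "precharge", "B": "evaluate"}


def conventional_role(clk):
    """A conventional domino gate evaluates only while its clock is high."""
    return "evaluate" if clk else "precharge"


clk_phases = [1, 0, 1, 0]  # two full clock cycles

det_evaluations = sum(
    list(det_roles(c).values()).count("evaluate") for c in clk_phases
)
conventional_evaluations = sum(
    conventional_role(c) == "evaluate" for c in clk_phases
)

print("DET evaluations:", det_evaluations)
print("conventional evaluations:", conventional_evaluations)
```

Over the same two clock cycles the DET gate evaluates in every phase, while the conventional gate evaluates in every other phase.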
In Figure 5, a DET domino logic pipeline shows the interface
between stage 4 and stage 1. Each alternating stage uses a DET domino logic block.
Stage 2 is just a DDR domino logic block. The reason that a designer must
alternate between the two types of blocks is that there is a potential race-through
condition when two DET domino stages are cascaded. In Figure 6, the clk timing
diagram is shown. As shown, the 4-phase clk timing sequence is similar to domino.
Although it is 4-phase, the local inverters in each logic section create twice the phases,
making the total 8. This is because each logic block needs the clk signal and its















Figure 7. DDR Domino CMOS Logic
2.4 Double Data Rate Domino Logic
Double Data Rate (DDR) domino logic is also very similar to conventional
domino. The concept includes having a duplicate logic function that can be used in
parallel. The output can be used as a NAND to be latched, or can be kept separate
with just inverters feeding into another domino stage. A circuit diagram of DDR
domino is shown in Figure 7. The input is fed into both logic functions.
Figure 8. DDR Domino Logic Pipeline (Stage 1 through Stage 4)
case, shown in Figure 9, clk1 and clk3 are non-overlapping 50% duty cycle clocks.
The input will see twice the capacitance; however, if the output and input are kept
separate, the input capacitance will be the same as conventional domino. To
describe the operation, as shown in Figure 3, assume that node A has precharged
already in the previous clk1 phase. When data arrives, stage A is ready and
evaluates when clk1 goes high. However, during this same phase, clk3 is low and
node C is precharged. Stages A and B are NANDed and the correct output is
propagated.
The NAND logic is summarized in Table 1. When clk1 goes low to
precharge node A, clk3 goes high and evaluates stage C. Therefore, during every
phase logic is done and propagated through. There is no wasted precharge time.
The throughput is effectively doubled; however, the critical path delay will still be
the same or slightly slower since the static inverter is now replaced with a static
NAND. The throughput increase must be justified against the area and power impact
that DDR domino brings. Case 1 is not applicable since node A or C will be
precharged during a phase of the clock. Cases 2 and 3 are the same situation. In this
situation, node A or C is evaluated as 0 while the opposite node is precharged. In
either case, the node that evaluates to 0 controls the output to be a 1. In the final
case, case 4, both nodes A and C are 1. In this situation, either node A or C is evaluated to
1, and the opposite node is precharged. Since the evaluation is at 1, the output is





Case  Node A  Node C  Output  Description
1     0       0       1       This case is not applicable since one of the nodes will always be precharged.
2     0       1       1       Node A is evaluated as 0 and node C is precharged.
3     1       0       1       Node C is evaluated as 0 and node A is precharged.
4     1       1       0       Node C or A is evaluated as 1 and the other node is precharged.
Table 1. NAND Precharge Buffer Truth Table
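The NAND precharge buffer behavior in Table 1 can be sketched as follows. The sketch assumes that node A evaluates while clk1 is high and node C evaluates while clk1 is low (clk3 high), and that a precharged node sits at logic 1. The function names are hypothetical.

```python
PRECHARGED = 1  # a precharged dynamic node sits at logic 1


def nand(a, c):
    return 0 if (a and c) else 1


def ddr_output(clk1, result_a, result_c):
    """Output of the NAND buffer for one clock phase.

    result_a and result_c are the values each dynamic node would take if it
    evaluated (0 = pull-down conducted, 1 = node stayed high). Only the node
    whose clock is high evaluates; the other node is held precharged.
    """
    node_a = result_a if clk1 == 1 else PRECHARGED
    node_c = PRECHARGED if clk1 == 1 else result_c
    return nand(node_a, node_c)


# Applicable rows of Table 1: (case, node A, node C, output).
table1 = [(2, 0, 1, 1), (3, 1, 0, 1), (4, 1, 1, 0)]
rows_match = all(nand(a, c) == out for _, a, c, out in table1)
print("Table 1 rows reproduced:", rows_match)
```

Because the idle node is always at 1, the NAND output in each phase is simply the inverse of the node that is evaluating, so the precharged node never masks a result.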
Since domino logic discharges the output node (while in precharge phase), the
activity factor for the output node is very high. For high fan-out gates with an
activity factor of 1, there is an excess of power dissipation in conventional domino.
In order to increase the throughput, the frequency also must be increased.
2.5 Qualitative Comparison
Overall, Table 2 qualitatively compares conventional, DET, and DDR domino.
In terms of area, conventional domino is less constraining. Both DET and DDR
domino require approximately double the number of transistors. Power
consumption is typically larger if area is larger. DDR domino consumes the largest
amount of power since the dynamic precharge dominates in domino and it has twice
the area. Domino and DDR domino both enable very high frequencies. DET
domino is slightly slower because latches are necessary at the outputs. Although
clock frequencies are approximately within range of each other, DET and DDR


















Adding more transistors only creates more complexity; therefore, conventional
domino is the simplest. Robustness of conventional domino and DDR domino is
approximately the same. However, in DDR and DET domino, strict 50% duty
cycles must be obeyed and any overlapping clocks (which should otherwise be
non-overlapping) will create contention. This is typical in any double-edge
triggered timing such as double-edge triggered flip-flops. The robustness of DET
domino is slightly less, due to the charge sharing that occurs during evaluation.
2.6 Simulation
For the simulation, a modified TSMC 0.25 µm CMOS process model from
MOSIS [15] was used. A small modification to the process file introduced
ACM=3 and HDIF=0.35 µm to allow the simulation to calculate
the source and drain area automatically. This reduced the time spent writing
the SPICE deck because AD, AS, PD, and PS are removed from the SPICE input
file and are replaced with a single GEO parameter. To exploit this ACM
parameter, GEO could be set to values 0 through 3. The following are the
associated meanings:
GEO = 0 (default) drain and source are not shared
GEO = 1 drain of device is shared with another device
GEO = 2 source of device is shared with another device
GEO = 3 source and drain are shared with another device
The 0.25 µm process contained a single Vt: the NMOS Vt was 0.44 V, while the
PMOS Vt was 0.66 V. The supply voltage used was 2.5 V. All simulations were
done at a nominal 25°C. Table 3 shows the breakdown of the process model and
simulation parameters.






Table 3. HSPICE Simulation Model
2.7 Full Adder Circuits
The circuit used to test the logic styles was a full adder. Full adders are the
principal building block for arithmetic units in microprocessors. A full adder is
composed of a carry and a sum stage. Equations 8 and 9 show the logic of a full
adder.
Sum = A ⊕ B ⊕ C (8)
Carry = (A ∧ B) ∨ (A ∧ C) ∨ (B ∧ C) (9)
The logic has three inputs, A, B, and C, and the truth table of this full adder is
shown in Table 4. In conventional domino logic, the output is low during
precharge; therefore, it is infinitely skewed low when evaluation occurs. In DDR and
DET domino, the output is skewed infinitely towards the previous value of the
output.
Conventional and DET domino full adders were designed and sized for
performance, power, and noise. Figure 10 shows the design of the conventional
domino full adder. The conventional domino full adder is composed of 26
transistors. In Figure 11, the DET domino full adder is shown. The DET domino
full adder is composed of 55 transistors, whereas the DDR domino full adder
consists of 52. The DDR domino full adder is shown in Figure 12. The carry logic
creates part of the sum logic with carry_bar. Therefore, the sum transistor logic can
be simplified as shown in equation 10.




Table 4. Full Adder Truth Table
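The full adder logic of equations (8) and (9) can be enumerated with a short sketch that prints the truth table. It also checks one commonly used way of building the sum from carry_bar, Sum = A·B·C + (A + B + C)·carry_bar. Whether that form matches equation 10 is an assumption.

```python
from itertools import product


def full_adder(a, b, c):
    """Equations (8) and (9): sum is the 3-input XOR, carry is the majority."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def sum_from_carry_bar(a, b, c):
    """Assumed simplification: sum = a.b.c + (a + b + c).carry_bar."""
    _, carry = full_adder(a, b, c)
    carry_bar = 1 - carry
    return (a & b & c) | ((a | b | c) & carry_bar)


print(" A B C | Sum Carry")
for a, b, c in product((0, 1), repeat=3):
    s, carry = full_adder(a, b, c)
    # The two outputs together encode the arithmetic sum of the inputs.
    assert 2 * carry + s == a + b + c
    assert sum_from_carry_bar(a, b, c) == s
    print(f" {a} {b} {c} |  {s}    {carry}")
```

Reusing carry_bar this way lets the sum stage share work with the carry stage, which is the kind of simplification the text describes.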
Figure 10. Conventional Domino Full Adder
Figure 11. DET Domino Full Adder
Figure 12. DDR Domino Full Adder
2.9 Full Adder Comparison
Each full adder was simulated in a small test bench to validate functionality
and performance. All three inputs to the full adder were toggled so that all possible
input combinations are exercised. Therefore, all inputs have an equal probability of switching.
The carry and sum outputs have a fan-out of 5 µm PMOS capacitance. This
setup allowed the full adder to be optimized for the 8-bit ripple carry adder. Power
was measured by taking the average current consumption over one clock cycle. The
table shows the clock power and logic power for a fixed throughput of 800 Mops
(mega-operations per second). In Table 5, there is a comparison of power for the
conventional, DET, and DDR domino full adders. As shown, the DET clock power
is slightly larger than the conventional and DDR domino clock power, due to the extra
load capacitance associated with the latches. Since the comparison is at a fixed
throughput, the DET and DDR domino full adders are operating at half the
frequency of the conventional full adder. As a result, the clock power of
conventional and DDR are approximately equal, since conventional runs at 2x the
frequency while DDR has double the load capacitance. The overall power









                        Clock Power   Logic Power   Total Power
Domino (Fig. 10)            31.6         118.3         149.9
DET Domino (Fig. 11)        41.8         132.5         174.3
DDR Domino (Fig. 12)        32.5         183.2         215.7
Table 5. Full Adder Power Comparison
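The clock-power argument rests on dynamic power scaling with the product of switched capacitance and frequency at a fixed supply voltage. A minimal sketch, assuming normalized (hypothetical) capacitance and frequency units, and using only the clock-power entries listed in Table 5 for the comparison:

```python
def relative_clock_power(load_cap, frequency):
    """Clock power at a fixed supply is proportional to C * f (clock activity is 1)."""
    return load_cap * frequency

# Normalized, hypothetical units: conventional domino clocks one load at 2f,
# while DDR domino clocks twice the load at f for the same throughput.
conventional = relative_clock_power(load_cap=1.0, frequency=2.0)
ddr = relative_clock_power(load_cap=2.0, frequency=1.0)
print("equal C*f product:", conventional == ddr)

# Clock-power entries from Table 5 for conventional and DDR domino.
measured_conventional, measured_ddr = 31.6, 32.5
mismatch = measured_ddr / measured_conventional - 1.0
print(f"simulated DDR clock power is {mismatch:.1%} above conventional")
```

The small residual difference in the simulated values is consistent with the two designs not having exactly a 2:1 ratio of clocked capacitance.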
Table 6 shows a comparison of total power and PTP for conventional, DET, and
DDR domino. As shown in Figure 13, the power consumption of DDR domino is
the highest among all three styles. DET domino power consumption falls between
that of conventional and DDR domino. Figure 14 shows that for a fixed throughput
of 800Mops, conventional














              # of transistors   Frequency (MHz)   Total Power   PTP
Domino               26                800             149.9     187.4
DET Domino           55                400             174.3     217.9
DDR Domino           52                400             215.7     269.6
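The PTP entries above are numerically consistent with total power divided by throughput at the fixed 800Mops operating point. The short sketch below reproduces them under that assumption; the dictionary layout and function name are illustrative, and the units are left as in the table.

```python
THROUGHPUT_GOPS = 0.8  # fixed throughput of 800 Mops

# Total power per full adder, as listed in Tables 5 and 6.
total_power = {"Domino": 149.9, "DET Domino": 174.3, "DDR Domino": 215.7}

def ptp(power, throughput_gops):
    """Power per unit throughput; lower means less energy per operation."""
    return power / throughput_gops

ptp_values = {name: round(ptp(p, THROUGHPUT_GOPS), 1) for name, p in total_power.items()}
for name, value in ptp_values.items():
    print(f"{name}: PTP = {value}")
```

Because the throughput is identical for all three styles here, the PTP ranking simply follows the total-power ranking.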






















Figure 14. Full Adder PTP Comparison
2.10 8-bit Ripple Carry Adders
The architecture was a simple 8-bit pipelined ripple carry adder. In a ripple
carry adder, the carry output of each full adder becomes the carry input of the next
full adder. The sum outputs are tapped off the ripple carry chain and de-skewed so
that all sums are available at the same time. Figure 15 shows the architecture of
the conventional domino 8-bit adder. Figure 16 shows the architecture of the DET
domino 8-bit adder. Figure 17 shows the architecture of the DDR domino 8-bit
adder. As shown for the DDR adder, NAND gates collect the two sets of sum
outputs. These architectures are very similar since they all use 4 clock phases.
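The carry chain described above can be modeled behaviorally in a few lines. This is only a logical sketch of an 8-bit ripple carry adder; it does not model the four clock phases, the pipelining, or the de-skewing of the sum outputs, and the function names are illustrative.

```python
def full_adder(a, b, cin):
    """One-bit full adder returning (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a | b))
    return s, cout

def ripple_carry_add(x, y, width=8, cin=0):
    """Add two unsigned integers bit by bit, passing the carry from stage to stage."""
    carry = cin
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry

total, carry_out = ripple_carry_add(200, 100)
print(total, carry_out)  # the 8-bit sum wraps; carry_out holds the ninth bit
```

Each loop iteration corresponds to one full adder (FA0 to FA7), which is why the latency of the hardware grows with the number of stages the carry must ripple through.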
[Figure 15: conventional domino 8-bit ripple carry adder, full adders FA0 to FA7 with sum outputs S0 to S7]
Figure 16. DET Domino Ripple Carry Adder
Figure 17. DDR Domino Ripple Carry Adder
In Table 7, the ripple carry adder characteristics are compared. The conventional
domino adder occupied the least area, with only 208 transistors. DDR and DET
domino had approximately twice the transistor count, with 416 and 452,
respectively. The number of full adders refers to the number of physical full
adders (FA). In DET domino's case, a full adder is counted as 1.5 FA since either
two carry stages or two sum stages are needed. Figure 18 summarizes the
transistor count of each type of adder. Although DET domino was composed of




              # of transistors   # of phases   # of full adders
Domino              208               4               8
DET Domino          452               4              12















Figure 18. Total Transistor Count
In Table 8, the clock and logic power are shown. Logic power is measured
for an activity of 1 and an activity of 0. With an activity of 0, power is measured
for both cases, where the output is always 1 or always 0. In conventional domino,
the worst-case power occurs when the output is always 1, since in every cycle the
gate precharges, forcing the output to be 0. When the evaluation phase occurs, the
domino node is discharged, forcing the output to be 1. If the fan-out dominates the
power, this could force conventional domino to consume excess power. In DDR
domino, the output of the NAND will always stay at the correct value, thus saving
some additional load power. Figure 20 shows the worst-case power consumption
for all adders. The DET domino adder has the greatest power consumption when
activity is 1, while conventional and DDR domino have the most power
consumption when activity is 0 and the output is always 1 (since they both always
discharge). However, in DET domino, since there is shared logic, the range of
power consumption with activity is smaller than in conventional and DDR domino.
When the domino gate does not evaluate every clock cycle, the minimum amount
of power is consumed because the domino node is not discharged.
In Table 9, the ripple carry adder simulation results are shown for the
maximum clocking frequency. Figure 21 shows that DDR domino has the highest
maximum throughput, 2Gops (Giga operations per second), twice that of
conventional domino. Since conventional domino evaluates only once per clock
cycle, its maximum throughput is half that of DDR domino. DET domino reaches
1.6Gops since its maximum clocking frequency is lower than that of DDR and
conventional domino. PTP is a valuable metric because DDR and DET domino use
approximately twice as many transistors. As shown in Figure 22, DET domino has
a PTP almost independent of activity factor and output value. DDR domino fares
worse than conventional domino in terms of PTP. The highest points on the graph
refer to the maximum power consumed, while the lowest points are where the
lowest power is consumed.
In a highly pipelined design where conventional domino is not sufficient, a
designer may choose to use DDR domino if area and energy are not a concern.
However, DET domino could be used if the energy needs to be reduced and the
data activity has a




Nominal        Clock Power   Logic Power       Logic Power   Logic Power
Conditions        (mW)       α=0, always 1        α=1        α=0, always 0
                                 (mW)             (mW)           (mW)
Domino            0.81           3.66             3.19           2.56
DET Domino        1.63           3.41             4.63           3.15
DDR Domino        1.61           7.81             6.16           6.10
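The worst-case total power of each adder can be derived from these entries by adding the clock power to the largest logic-power case. The sketch below assumes the column order implied by the text (clock power, then logic power for α=0 with the output always 1, α=1, and α=0 with the output always 0); the data structure and function name are illustrative.

```python
# Power in mW: clock power plus logic power for each activity case.
table8 = {
    "Domino":     {"clock": 0.81, "a=0, always 1": 3.66, "a=1": 3.19, "a=0, always 0": 2.56},
    "DET Domino": {"clock": 1.63, "a=0, always 1": 3.41, "a=1": 4.63, "a=0, always 0": 3.15},
    "DDR Domino": {"clock": 1.61, "a=0, always 1": 7.81, "a=1": 6.16, "a=0, always 0": 6.10},
}

def worst_case(row):
    """Return the activity case with the highest logic power and the resulting total."""
    logic = {case: power for case, power in row.items() if case != "clock"}
    case = max(logic, key=logic.get)
    return case, round(row["clock"] + logic[case], 2)

for name, row in table8.items():
    case, total = worst_case(row)
    print(f"{name}: worst case at {case}, total = {total} mW")
```

Under this reading, conventional and DDR domino peak when the output is always 1, while DET domino peaks at an activity of 1, matching the discussion of worst-case power.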





Conditions     Max. Frequency   Latency   Throughput   Worst Case Total Power   PTP
                   (MHz)          (ns)      (Gops)             (mW)             (pJ)
Domino              1000          2.44       1.0               4.47             4.47
DET Domino           800          3.31       1.6               6.26             4.14
DDR Domino          1000          2.51       2.0               9.42             4.71
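The maximum throughput figures follow from the maximum clock frequency and the number of evaluations per clock cycle. A minimal sketch, assuming one evaluation per cycle for conventional domino and two for DET and DDR domino; the names are illustrative.

```python
def max_throughput_gops(f_max_mhz, evaluations_per_cycle):
    """Operations per second (in Gops) for a given maximum clock frequency."""
    return f_max_mhz * evaluations_per_cycle / 1000.0

# (maximum clock frequency in MHz, evaluations per clock cycle)
adders = {"Domino": (1000, 1), "DET Domino": (800, 2), "DDR Domino": (1000, 2)}

throughput = {name: max_throughput_gops(f, e) for name, (f, e) in adders.items()}
baseline = throughput["Domino"]
for name, gops in throughput.items():
    gain = gops / baseline - 1.0
    print(f"{name}: {gops} Gops ({gain:+.0%} relative to conventional domino)")
```

DET domino gives up some clock frequency but still gains throughput because it evaluates on both clock edges.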

































Figure 21. Ripple Carry PTP Comparison
3. FLIP-FLOPS
3.1 Introduction
In today's clocked synchronous circuits, a large percentage (up to 30-40%)
of the power loss is due to the clock load. Therefore, it is very important to design
the registers (flip-flops and latches) in a synchronous digital system for optimal
performance and power consumption. These flip-flops need to have a small
sampling window, consume low power, and have a small clock load. A novel
high-performance low-power CMOS master-slave flip-flop is proposed here. The
proposed flip-flop consumes very low power, while having a very small clock load
and data load. In the HSPICE simulations, the new flip-flop has a better
power-delay product than other reported master-slave structures. The proposed
flip-flop is compared to other reported master-slave flip-flops.
A flip-flop has a total power composed of internal power, data power, and
clock power. As mentioned before, a flip-flop's dynamic power dissipation
depends on the switching activity. For data power, an α of 0 could equate to
data signals of all 0s or all 1s for the period of time over which power is measured.
Similarly, an α of 1 equates to maximum switching activity, where the data input
alternates between 0 and 1 every cycle. On the other hand, clock power always has
an activity factor of 1 when there is no clock gating. Therefore, clock power has
become a more important issue in designing an efficient flip-flop. In addition, from
the power equation we know that lowering the supply voltage is the most effective
way to reduce the power consumption [16, 17] of a CMOS circuit. Any new
flip-flop must work well at low supply voltages and will need to be able to
complement new logic styles, incorporating logic into them without penalty.
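The dependence on activity factor and supply voltage can be illustrated with the usual first-order switching-power model, P = α·C·Vdd²·f. The sketch below assumes that model and treats α as a simple multiplier; the 10fF node capacitance and the 3.3V comparison point are hypothetical values chosen only for illustration.

```python
def dynamic_power(alpha, cap_farads, vdd_volts, freq_hz):
    """First-order switching power of a node: P = alpha * C * Vdd^2 * f."""
    return alpha * cap_farads * vdd_volts ** 2 * freq_hz

CAP = 10e-15   # hypothetical 10 fF node capacitance
VDD = 2.0      # supply voltage in volts
FREQ = 100e6   # 100 MHz clock

clock_power = dynamic_power(1.0, CAP, VDD, FREQ)  # clock: activity factor of 1
data_power = dynamic_power(0.5, CAP, VDD, FREQ)   # data at an activity of 0.5
idle_power = dynamic_power(0.0, CAP, VDD, FREQ)   # constant data: no switching power

# Quadratic dependence on supply: hypothetical scaling from 3.3 V down to 2.0 V.
scaling = dynamic_power(1.0, CAP, 2.0, FREQ) / dynamic_power(1.0, CAP, 3.3, FREQ)

print(f"clock: {clock_power * 1e6:.2f} uW, data: {data_power * 1e6:.2f} uW")
print(f"power at 2.0 V relative to 3.3 V: {scaling:.2f}")
```

For equal capacitance, the clock node always dissipates at least as much as a data node, which is why reducing the clocked capacitance matters so much for a flip-flop.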
3.2 Flip-Flop Circuits
The previously reported flip-flops are shown in Figures 22-24. The
conventional flip-flop is a standard 16-transistor structure. This robust flip-flop can
be seen in many textbooks and is the most popular master-slave structure. The
modified C2MOS flip-flop did not include a local clock buffer, contrary to [7]. We
removed this to see the "real" clock power, because any of these master-slave
flip-flop structures could use the same technique. However, the modified C2MOS
flip-flop is suitable for low-power applications with a local clock buffer, due to its
small internal power dissipation. The PowerPC 603 flip-flop contains 18
transistors and consumes very low power.
The proposed flip-flop, shown in Figure 25, has many interesting properties.
To lower clock power consumption, the flip-flop uses only NMOS clocked
transistors. The gate of an NMOS transistor effectively has less capacitance than
that of a PMOS, yielding faster switching and lower power for the clock circuitry.
Although the proposed flip-flop uses only NMOS pass transistors, we remove the
threshold drop and static power consumption by the cross-coupled PMOS [18] that
is formed in the latch circuitry. The latch circuitry is created by two cross-coupled
True Single Phase Clocking (TSPC) SN stages [19]; however, our structure is
pseudo-static and needs the complement of the clock. The proposed flip-flop is an
SN-SN configuration, and a race-limited SP-SN flip-flop can also be formed. There
is no fighting within the internal circuitry, and the flip-flop can be optimized if the
complement of the signal is available.
Figure 22. Conventional Master Slave Flip-Flop
Figure 23. Modified C2MOS Flip-Flop
Figure 24. PowerPC Flip-Flop
Figure 25. Proposed Flip-Flop
3.3 Simulation
The simulation methodology was very similar to the method proposed in [7].
Table 10 lists the technology and the HSPICE simulation parameters. The
simulations used a 0.35µm MOSIS [15] BSIM Level 49 model at a supply voltage
of 2.0V. Each flip-flop was optimized in HSPICE with the embedded optimizer
for total average power and delay.
For clarity, this thesis will review the definitions of internal power, clock
power, and data power. The internal power is the power consumed by the internal
circuitry only and does not include the output load capacitance. The clock
power is the power consumed by only the clocked transistors in the circuit. The
data power is the power consumed by driving the flip-flop. The simulation
measured the average power using a test bench similar to that in [7] at 100MHz,
an α of 0.5 for 16 clock cycles, and a load of 200fF.
3.4 Flip-Flop Test Bench
Figure 26. Flip-Flop Test Bench
To measure performance, we used a metric different from the one proposed in [7].
Although an optimal data-output delay is a very suitable metric, our delay metric
was the maximum clocking frequency of a ring oscillator composed of one inverter
and one flip-flop [20]. We used the same inverter size for each flip-flop, and as
long as the oscillation period was twice the clock period, the circuit was considered
operational.
Technology:








(1) Data/Clock slopes of ideal signals: 100ps
(2) Clock duty-cycle: 50%
(3) Delay Calculation: between 50% points
(4) Data sequences: 16 clock cycles
(5) Clock Frequency: 100MHz
Table 10. Flip-Flop Simulation Parameters
Upon meeting the oscillation requirements, we assumed:
1 / Max. Freq. = T_delay = T_setup + T_clkQ + [term illegible] + T_skew.    (11)
This T_delay metric, in (11), shows the performance of the flip-flop in normal




Figure 27. Flip-Flop Maximum Clocking Frequency
3.5 Results
In Table 11, the simulation results are shown for an activity factor of 0.5.
One can see that the proposed flip-flop consumes very low power and has a
suitable delay. The delay is also shown in Figure 28. In Figure 29, we can see the
total power consumed. The conventional flip-flop shows robust results in terms of
speed and power. The modified C2MOS consumes the most power, largely
because of its clock power, since no local clock generator is used. However, it
consumes very low internal power compared to the other structures. The PowerPC
603 consumes low power, but has limited performance. The proposed flip-flop
consumes the lowest average power, while having a small clock load and a very
small data load. The data load reduction from the other circuits is due to the









Conditions       # of          Total    Internal   Clock    Data     Total    Delay     PDP
                 transistors   width    power      power    power    power    (ps)      (fJ)
                               (µm)     (µW)       (µW)     (µW)     (µW)
Conventional         16        40.6     13.1        6.4      4.6     24.1     877.2     21.1
Modified C2MOS       20        62.4     10.6       15.3      2.3     28.2     909.1     25.6
PowerPC 603          18        45.6     13.3        5.1      4.1     22.5    1063.8     23.9
Proposed             22        39.7     12.6        5.6      1.6     19.8    1000.0     19.8
(this work)
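The totals and power-delay products in Table 11 can be recomputed from the individual entries. The sketch below assumes the three power columns are internal, clock, and data power in µW and that the delay is the reciprocal of the maximum clocking frequency; the variable and function names are illustrative.

```python
# (internal, clock, data) power in uW and delay in ps, from Table 11.
flip_flops = {
    "Conventional":   ((13.1, 6.4, 4.6), 877.2),
    "Modified C2MOS": ((10.6, 15.3, 2.3), 909.1),
    "PowerPC 603":    ((13.3, 5.1, 4.1), 1063.8),
    "Proposed":       ((12.6, 5.6, 1.6), 1000.0),
}

def summarize(powers_uw, delay_ps):
    """Return (total power in uW, power-delay product in fJ, max frequency in MHz)."""
    total_uw = sum(powers_uw)
    pdp_fj = total_uw * delay_ps * 1e-3   # uW * ps = 1e-18 J, scaled to fJ
    f_max_mhz = 1e6 / delay_ps            # delay taken as 1 / (max clock frequency)
    return round(total_uw, 1), round(pdp_fj, 1), round(f_max_mhz)

results = {name: summarize(p, d) for name, (p, d) in flip_flops.items()}
for name, (total, pdp, f_max) in results.items():
    print(f"{name}: total = {total} uW, PDP = {pdp} fJ, f_max = {f_max} MHz")

best = min(results, key=lambda name: results[name][1])
print("lowest power-delay product:", best)
```

The proposed flip-flop is not the fastest of the four, but its low total power gives it the lowest power-delay product.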

















Figure 28. Overall Delay Comparison





From Figure 30, one can see the total power vs. delay for each optimized
flip-flop. This graph shows the power for different activity factors ranging from 0
to 1. The marked areas represent α = 0.5. Figure 31 shows the power-delay product
for each flip-flop; the proposed flip-flop has a better power-delay product than the
previous structures.
In Figure 32, constant voltage scaling was performed on the conventional
and the proposed flip-flop. Although the proposed flip-flop consumes low power,
it does not operate under the condition V1>Vdd-Vt. To obtain operation at
extremely low voltages, low-threshold pass transistors must be used. However,
our proposed flip-flop does considerably well with the




















Figure 30. Total Power Range vs. delay























4. RECOMMENDATIONS & CONCLUSION
4.1 Recommendations
Although this thesis has showcased DDR and DET logic as domino logic
solutions, they are both general-purpose solutions applicable to any dynamic logic.
Examples include Current Sensing Differential Logic [21], Differential
Current Switch Logic [22], Pre-Charged Pass Transistor Logic [23], and N-Logic
True Single Phase Logic [24]. These dynamic logic families are candidates for
applying the double data rate and double edge triggered concepts. Further
investigation and experimentation would be needed to see how much impact these
techniques would have. DET domino charge sharing could be removed by using
DET Pre-Charged Pass Transistor Logic. In addition, another appropriate solution
would be dual-rail DET domino with cross-coupled PMOS transistors for charge
sharing and leakage control.
The proposed flip-flop could work very well with Complementary Pass Logic
(CPL) styles [25, 26, 27]. Since the complement of the data input is produced and
needed, the extra input inverter can be moved to the output to produce the
complement. Typically, in order to have low voltage operation, CPL must have
low-threshold NMOS pass-transistors. With this type of process, the proposed
flip-flop also increases in performance. A very typical application could be
CPL with dual-Vt NMOS only. The second Vt for NMOS is used for the pass
transistors, and is achieved via substrate bias. A dual-Vt CMOS process yields more
performance; however, it is more expensive since it requires a twin tub. Either
multiple-Vt scheme will work well with the proposed flip-flop. However, we must
point out that the other flip-flops do not benefit as much as the proposed flip-flop
from the dual-Vt NMOS only scheme.
4.2 Conclusion
High-performance low-power CMOS circuits were presented. DET and DDR
domino were presented and compared with conventional domino. Using a 0.25µm
CMOS MOSIS model, HSPICE simulation confirmed that DET and DDR domino
yield higher maximum throughput than conventional domino (1.6Gops and 2Gops
versus 1Gops, so that conventional domino is approximately 38% and 50% lower,
respectively) and thus enable lower clock frequencies for a given
throughput. Domino logic styles were presented, and other parts of a digital system
were investigated.
Logic is not the only critical design issue; flip-flops also need to be
investigated. A novel CMOS master-slave flip-flop was presented. The proposed
flip-flop consumes very low power, while having a very small clock load and data
load. The proposed flip-flop was compared to other reported master-slave
flip-flops. Using a 0.35µm CMOS MOSIS model, the new flip-flop was simulated
to have a better power-delay product than other reported master-slave
structures.
REFERENCES
[1] N. Weste and K. Eshraghian, Principles of CMOS VLSI Design. Menlo Park,
California: Addison Wesley Publishing Company, 1993.
[2] J. Hennessy and D. Patterson, Computer Architecture: A Quantitative
Approach. San Francisco, California: Morgan Kaufmann, 1996.
[3] S. Thompson, I. Young, J. Greason, and M. Bohr, "Dual Threshold Voltages
and Substrate Bias: Keys to High Performance, Low Power, 0.1µm Logic
Designs," 1997 Symposium on VLSI Technology Digest of Technical Papers,
pp. 69-70.
[4] V. De and S. Borkar, "Technology and Design Challenges for Low Power and
High Performance," 1999 International Symposium of Low Power Electronic
Devices, pp. 163-168.
[5] Y. Liu, S. Nassif, L. Pileggi, and A. Strojwas, "Impact of Interconnect
Variation on the Clock Skew of a Gigahertz Microprocessor," Design
Automation Conference 2000, pp. 168-171.
[6] S. Rusu and S. Tam, "Clock Generation and Distribution for the First IA-64
Microprocessor," International Solid State Circuits Conference 2000, pp. 176-
177.
[7] V. Stojanovic and V. Oklobdzija, "Comparative Analysis of Master-Slave
Latches and Flip-Flops for High-performance and Low-power Systems," IEEE
J. Solid State Circuits, vol. 34, no. 4, pp. 536-548, April 1999.
[8] G. Gerosa, S. Gary, C. Dietz, D. Pham, K. Hoover, J. Alvarez, H. Sanchez, P.
Ippolito, T. Ngo, S. Litch, J. Eno, J. Golab, N. Vanderschaaf, and J. Kathle,
"2.2W, 80MHz Superscalar RISC processor," IEEE J. Solid State Circuits, vol.
29, no. 12, pp. 1440-1454, Dec. 1994.
[9] S. Hsu and S.L. Lu, "A Novel High-Performance Low Power CMOS Master-
Slave Flip-Flop," Proceedings of 12th Annual IEEE International ASIC/SOC
Conference, September 1999.
[10] R. H. Krambeck, C. Lee, and H. Law, "High-Speed Compact Circuits with
CMOS," IEEE J. Solid State Circuits, vol. 17, no. 3, pp. 614-619, June 1982.
[11] N. Goncalves and H. De Man, "NORA: A Racefree Dynamic CMOS
Technique for Pipelined Logic Structures," IEEE J. Solid State Circuits, vol.
SC-18, no. 3, pp. 261-266, June 1983.
[12] D. Harris and M. Horowitz, "Skew-Tolerant Domino Circuits," IEEE J. Solid
State Circuits, vol. 32, no. 11, pp. 1072-1711, November 1997.
[13] G. Yee, et al., "Clock-Delayed Domino for Adder and Combinational Logic
Design," Proceedings of the 1996 International Conference on Computer
Design, Austin, Texas, pp. 332-337.
[14] A. Alvandpour, P. Larsson-Edefors, and C. Svensson, "A Leakage Tolerant
Multi-Phase Keeper for Wide Domino OR's," The 6th IEEE International
Conference on Electronics, Circuits, and Systems 1999, vol. 1, pp. 209-212.
[15] <http//:www.mosis.edu>.
[16] S. Mutoh, S. Shigematsu, Y. Matsuya, H. Fukuda, and J. Yamada, "A 1V
Multi-threshold Voltage CMOS DSP with an Efficient Power Management
Technique for Mobile Phone Application," ISSCC Digest of Technical Papers,
FA 10.4, Feb. 1996.
[17] R. Gonzalez, B. Gordon, and M. Horowitz, "Supply and Threshold Voltage
Scaling for Low-power CMOS," IEEE J. Solid State Circuits, vol. 32, no. 8,
pp. 1210-1216, August 1997.
[18] M. Afghahi, "A Robust Single Phase Clocking for Low-power, High-speed
VLSI Applications," IEEE J. Solid State Circuits, vol. 31, no. 2, pp. 247-254,
Feb. 1996.
[19] J. Yuan and C. Svensson, "New Single-clock CMOS Latches and Flip-flops
with Improved Speed and Power Savings," IEEE J. Solid State Circuits, vol.
32, no. 1, pp. 62-69, Jan. 1997.
[20] M. Afghahi and J. Yuan, "Double Edge Triggered D Flip-flops for High Speed
Low-power Applications," IEEE J. Solid State Circuits, vol. 26, no. 8, pp.
1168-1170, August 1991.
[21] J. Park, J. Lee, and W. Kim, "Current Sensing Differential Logic: A CMOS
Logic for High Reliability and Flexibility," IEEE J. Solid State Circuits, vol.
34, no. 6, pp. 904-908, June 1999.
[22] D. Somasekhar and K. Roy, "Differential Current Switch Logic: A Low Power
DCVS Logic Family," IEEE J. Solid State Circuits, vol. 31, no. 7, pp. 981-991,
July 1996.
[23] M. Hanawa, K. Kaneko, T. Kawashimo, and H. Maruyama, "A 4.3ns 0.3µm
CMOS 54x54b Multiplier Using Precharged Pass-Transistor Logic," 1996
IEEE International Solid-State Circuits Conference, pp. 364-365.
[24] R. Gu and M. Elmasry, "All N-Logic High-Speed True-Single-Phase Dynamic
CMOS Logic," IEEE J. Solid State Circuits, vol. 31, no. 2, pp. 221-229,
February 1996.
[25] A. Parameswar, H. Hara, and T. Sakurai, "A Swing Restored Pass-transistor
Logic Based Multiply and Accumulate Circuit for Multimedia Applications,"
IEEE J. Solid State Circuits, vol. 31, no. 6, pp. 804-809, June 1996.
[26] K. Yano, T. Yamanaka, T. Nishida, M. Saito, K. Shimohigashi, and A.
Shimizu, "A 3.8-ns CMOS 16x16-b Multiplier using Complementary Pass-
Transistor Logic," IEEE J. Solid State Circuits, vol. 25, no. 2, pp. 388-395,
April 1990.
[27] K. Yano, Y. Sasaki, K. Rikino, and K. Seki, "Top-down Pass-transistor
Design," IEEE J. Solid State Circuits, vol. 31, no. 6, pp. 792-803, June 1996.
APPENDICES
Appendix A: 0.35µm CMOS MOSIS HSPICE Model
*DATE: Oct 30/98














































































LLN =1 LW =0 LWN =1 LWL =0
+CAPMOD=2 CGDO =1.96E-10 CGSO =1.96E-10
+CGBO =0 CJ =9.276962E-4 PB =0.8157962
+MJ =0.3557696 CJSW =3.181055E-10 PBSW =0.6869149
+MJSW =0.1 PVTH0 =-0.0252481 PRDSW =-96.4502805












































































+WWL =0 LL =0 LLN =1
+LW =0 LWN =1 LWL =0
+CAPMOD=2 CGDO =2.307E-10 CGSO =2.307E-10
+CGBO =0 CJ =1.420282E-3 PB =0.99
+MJ =0.5490877 CJSW =4.773605E-10 PBSW =0.99
+MJSW =0.1997417 PVTH0 =6.58707E-3 PRDSW =-93.5582228
+PK2 =1.011593E-3 WKETA =-0.0101398 LKETA =6.027967E-3
Appendix B: 0.25µm CMOS MOSIS HSPICE Model
*DATE: Dec 6/99




















































































+CAPMOD=2 XPART =0.4 CGDO =3.11E-10
+CGSO =3.11E-10 CGBO =1E-11 CJ =1.758521E-3
+PB =0.99 MJ =0.457547 CJSW =4.085057E-10
+PBSW =0.8507757 MJSW =0.3374073 PVTH0 =7.147521E-5
+PRDSW =-67.2161633 PK2 =-1.344599E-3 WKETA =3.035972E-3










































































WL =0
+WLN =1 WW =0 WWN =1
+WWL =0 LL =0 LLN =1
+LW =0 LWN =1 LWL =0
+CAPMOD= 2 XPART= 0.4 CGDO = 2.68E-10
+CGSO = 2.68E-10 CGBO = 1E-11 CJ = 1.902493E-3
+PB = 0.9810285 MJ = 0.4644362 CJSW = 3.142741E-10
+PBSW = 0.9048624 MJSW = 0.3304452 PVTH0 = 4.952976E-3
+PRDSW= 29.8169373 PK2 = 3.383373E-3 WKETA= -7.913501E-3
+LKETA = -0.0208318
Appendix C: Conventional Domino 8-bit Ripple Carry Adder HSPICE Netlist
* Conventional Domino CMOS Logic * Loaded Full Adder
.options node list post=2
.param mL = 0.25u
X0 Vdd in 0 clk1 C1 C1_ FAC
X1 Vdd 0 C1 clk2 C2 C2_ FAC
X2 Vdd 0 C2 clk3 C3 C3_ FAC
X3 Vdd 0 C3 clk4 C4 C4_ FAC
X4 Vdd 0 C4 clk1 C5 C5_ FAC
X5 Vdd 0 C5 clk2 C6 C6_ FAC
X6 Vdd 0 C6 clk3 C7 C7_ FAC
X7 Vdd 0 C7 clk4 C8 C8_ FAC
X8 Vdd Vdd 0 C1_ clk2 S1 FAS
X9 Vdd Vdd 0 C2_ clk3 S2 FAS
X10 Vdd Vdd 0 C3_ clk4 S3 FAS
X11 Vdd Vdd 0 C4_ clk1 S4 FAS
X12 Vdd Vdd 0 C5_ clk2 S5 FAS
X13 Vdd Vdd 0 C6_ clk3 S6 FAS
X14 Vdd Vdd 0 C7_ clk4 S7 FAS
X15 Vdd Vdd 0 C8_ clk1 S8 FAS
Mns1 0 S1 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns2 0 S2 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns3 0 S3 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns4 0 S4 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns5 0 S5 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns6 0 S6 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns7 0 S7 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns8 0 S8 0 Vss CMOSN L='mL' W='5u' GEO=3
** Double Pumped Domino Full Adder
.SUBCKT FAC Ain Bin Cin clk Cout L
* Carry Logic
Mn10 L Ain J Vss CMOSN L='mL'
Mn11 L Bin K Vss CMOSN L='mL'
Mn12 J Cin M Vss CMOSN L='mL'
Mn13 J Bin M Vss CMOSN L='mL'
Mn14 K Cin M Vss CMOSN L='mL'
Mn15 M clk Vss Vss CMOSN L='mL'









.SUBCKT FAS Ain Bin Cin Cout_bar clk Sum_out
* Sum Logic
Mn16 W Cout_bar T Vss CMOSN L='mL' W='4u' GEO=3
Mn17 T Ain X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn18 T Bin X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn19 T Cin X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn20 W Ain V Vss CMOSN L='mL' W='4u' GEO=3
Mn21 V Bin U Vss CMOSN L='mL' W='4.8u' GEO=3
Mn22 U Cin X Vss CMOSN L='mL' W='5.76u' GEO=3
Mn24 X clk Vss Vss CMOSN L='mL' W='6.912u' GEO=1
XS W Sum_out clk OUTS
.ic V(W)=0
.ENDS FAS
* Double Pumped Output Logic
.SUBCKT OUTC M O clk
Mp9 M clk Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp10 N M Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn25 N M Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp11 M N Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mp12 O M Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mn23 O M Vss Vss CMOSN L='mL' W='2.5u' GEO=1
.ENDS OUTC
.SUBCKT OUTS M O clk
Mp13 M clk Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp14 N M Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn26 N M Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp15 M N Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mp16 O M Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mn27 O M Vss Vss CMOSN L='mL' W='2.5u' GEO=1
.ENDS OUTS
.SUBCKT CLKINV A B
Mp1 B A Vee Vee CMOSP L='mL' W='40u' GEO=1
Mn1 B A Vss Vss CMOSN L='mL' W='20u' GEO=1
.ENDS CLKINV
.SUBCKT CLKINVd A B
Mp1 B A Vff Vff CMOSP L='mL' W='40u' GEO=1
Mn1 B A Vss Vss CMOSN L='mL' W='20u' GEO=1
.ENDS CLKINVd
* Clock Buffers
Xclk1 clk1_ clk1 clkinv
Xclk2 clk2_ clk2 clkinv
Xclk3 clk3_ clk3 clkinv
Xclk4 clk4_ clk4 clkinv
Xclk1d clk1_ clk1d clkinvd
Xclk2d clk2_ clk2d clkinvd
Xclk3d clk3_ clk3d clkinvd
Xclk4d clk4_ clk4d clkinvd
Appendix D: DET Domino 8-bit Ripple Carry Adder HSPICE Netlist
* Double Data Rate Domino CMOS Logic w/ NAND
* Loaded Full Adder
.options node list post=2
.param mL = 0.25u
X0 Vdd Vss Vss clk1 clk3 clk4 clk2 C1a C1b FAC
X1a C1a Vdd Vss clk2 C2a_ FACD
X1b C1b Vdd Vss clk4 C2b_ FACD
X1 C2a_ C2b_ C2 NAND
X2 C2 Vdd Vss clk1 clk3 clk4 clk2 C3a C3b FAC
X3a C3a Vdd Vss clk2 C4a_ FACD
X3b C3b Vdd Vss clk4 C4b_ FACD
X3 C4a_ C4b_ C4 NAND
X4 C4 Vdd Vss clk1 clk3 clk4 clk2 C5a C5b FAC
X5a C5a Vdd Vss clk2 C6a_ FACD
X5b C5b Vdd Vss clk4 C6b_ FACD
X5 C6a_ C6b_ C6 NAND
X6 C6 Vdd Vss clk1 clk3 clk4 clk2 C7a C7b FAC
X7a C7a Vdd Vss clk2 C8a_ FACD
X7b C7b Vdd Vss clk4 C8b_ FACD
X7 C8a_ C8b_ C8 NAND
X8a Vdd Vss Vss C1a clk2 S1a_ FASD
X8b Vdd Vss Vss C1b clk4 S1b_ FASD
X8 S1a_ S1b_ S1 NAND
X9 Vdd Vss Vss C2 clk1 clk3 clk4 clk2 S2a S2b FAS
X10a Vdd Vss Vss C3a clk2 S3a_ FASD
X10b Vdd Vss Vss C3b clk4 S3b_ FASD
X10 S3a_ S3b_ S3 NAND
X11 Vdd Vss Vss C4 clk1 clk3 clk4 clk2 S4a S4b FAS
X12a Vdd Vss Vss C5a clk2 S5a_ FASD
X12b Vdd Vss Vss C5b clk4 S5b_ FASD
X12 S5a_ S5b_ S5 NAND
X13 Vdd Vss Vss C6 clk1 clk3 clk4 clk2 S6a S4b FAS
X14a Vdd Vss Vss C7a clk2 S7a_ FASD
X14b Vdd Vss Vss C7b clk4 S7b_ FASD
X14 S7a_ S7b_ S7 NAND
X15 Vdd Vss Vss C8 clk1 clk3 clk4 clk2 S8a S8b FAS
Mns2a 0 S2a 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns4a 0 S4a 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns6a 0 S6a 0 Vss CMOSN L='mL' W='5u' GEO=3




Mns8b 0 S8b 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns1 0 S1 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns3 0 S3 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns5 0 S5 0 Vss CMOSN L='mL' W='5u' GEO=3
Mns7 0 S7 0 Vss CMOSN L='mL' W='5u' GEO=3
** Double Pumped Domino Full Adder
.SUBCKT FAC Ain Bin Cin clk clk_bar clk1 clk1_ Couta Coutb
* Carry Logic
Mn12 L Cin J Vss CMOSN L='mL' W='6u' GEO=3
Mn13 L Bin K Vss CMOSN L='mL' W='6u' GEO=3
Mn14 J Ain Vss Vss CMOSN L='mL' W='7.2u' GEO=1
Mn15 J Bin Vss Vss CMOSN L='mL' W='7.2u' GEO=1
Mn16 K Ain Vss Vss CMOSN L='mL' W='7.2u' GEO=1
XC L Couta Coutb clk clk_bar clk1 clk1_ DPOUTC
.ic V(L)=0 V(Cout_bar)=0
.ENDS FAC
.SUBCKT FAS Ain Bin Cin Cout clk clk_bar clk1 clk1_ Suma Sumb
* Sum Logic
Mn17 W Ain V Vss CMOSN L='mL' W='6u' GEO=3
Mn18 V Bin U Vss CMOSN L='mL' W='7.2u' GEO=3
Mn19 U Cin Vss Vss CMOSN L='mL' W='8.64u' GEO=1
Mn20 T Ain Vss Vss CMOSN L='mL' W='7.2u' GEO=1
Mn21 T Bin Vss Vss CMOSN L='mL' W='7.2u' GEO=1
Mn22 T Cin Vss Vss CMOSN L='mL' W='7.2u' GEO=1
Mn23 W Cout_bar T Vss CMOSN L='mL' W='6u' GEO=3
XS W Suma Sumb clk clk_bar clk1 clk1_ DPOUTS
Mp24 Cout_bar Cout Vdd Vdd CMOSP L='mL' W='2.5u' GEO=1
Mn25 Cout_bar Cout Vss Vss CMOSN L='mL' W='1u' GEO=1
.ENDS FAS




Mn12 J Cin M Vss
Mn13 J Bin N Vss
Mn14 K Cin M Vss














Mp10 N L Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1




.SUBCKT FASD Ain Bin Cin Cout clk W
* Sum Logic
Mn16 W Cout_bar T Vss CMOSN L='mL' W='4u' GEO=3
Mn17 T Ain X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn18 T Bin X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn19 T Cin X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn20 W Ain V Vss CMOSN L='mL' W='4u' GEO=3
Mn21 V Bin U Vss CMOSN L='mL' W='4.8u' GEO=3
Mn22 U Cin X Vss CMOSN L='mL' W='5.76u' GEO=3
Mn24 X clk Vss Vss CMOSN L='mL' W='6.912u' GEO=1
Mp13 W clk Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp14 N W Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn26 N W Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp15 W N Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mp30 Cout_bar Cout Vdd Vdd CMOSP L='mL' W='2.5u' GEO=1
Mn31 Cout_bar Cout Vss Vss CMOSN L='mL' W='1u' GEO=1
.ic V(W)=0
.ENDS FASD
* Double Pumped Output Logic
.SUBCKT DPOUTC X P D clk clk_bar clk1 clk1_
Mn24 X clk_bar M Vss CMOSN L='mL' W='5u' GEO=2
Mp12 M clk_bar Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp25 N M Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn25 N M Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp26 M N Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
Mn26 X clk Q Vss CMOSN L='mL' W='5u' GEO=2
Mp27 Q clk Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp28 R Q Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn27 R Q Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp29 Q R Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
Mp30 G Q Vdd Vdd CMOSP L='mL' W='12u' GEO=1
Mpc1 P clk1_ G Vdd CMOSP L='mL' W='10u' GEO=2
Mnc1 P clk1 Z Vss CMOSN L='mL' W='5u' GEO=2
Mn51 Z Q Vss Vss CMOSN L='mL' W='6u' GEO=1
Mp31 H M Vdd Vdd CMOSP L='mL' W='12u' GEO=1
Mpc2 D clk1 H Vdd CMOSP L='mL' W='10u' GEO=2
Mnc2 D clk1_ Y Vss CMOSN L='mL' W='5u' GEO=2
Mn50 Y M Vss Vss CMOSN L='mL' W='6u' GEO=1
*Mp32 N clk_bar Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mp33 N M Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mn30 N clk_bar S Vss CMOSN L='mL' W='1.0u' GEO=2
*Mn31 S M Vss Vss CMOSN L='mL' W='1.0u' GEO=1
*Mp34 M N Vdd Vdd CMOSP L='mL' W='3.0u' GEO=1
*Mp35 R clk Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mp36 R Q Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mn32 R clk T Vss CMOSN L='mL' W='1.0u' GEO=2
*Mn33 T Q Vss Vss CMOSN L='mL' W='1.0u' GEO=1
*Mp37 Q R Vdd Vdd CMOSP L='mL' W='3.0u' GEO=1
.ic V(X)=0 V(D)=0 V(P)=0
.ENDS DPOUTC
* Double Pumped Output Logic
.SUBCKT DPOUTS X P D clk clk_bar clk1 clk1_
Mn24 X clk_bar M Vss CMOSN L='mL' W='5u' GEO=2
Mp12 M clk_bar Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp25 N M Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn25 N M Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp26 M N Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
Mn26 X clk Q Vss CMOSN L='mL' W='5u' GEO=2
Mp27 Q clk Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp28 R Q Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn27 R Q Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp29 Q R Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
Mp40 E Q Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mp41 P clk1_ E Vdd CMOSP L='mL' W='10u' GEO=2
Mn43 P clk1 H Vss CMOSN L='mL' W='7u' GEO=2
Mn54 H Q Vss Vss CMOSN L='mL' W='7u' GEO=1
Mp44 F M Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mp45 D clk1 F Vdd CMOSP L='mL' W='10u' GEO=2
Mn47 D clk1_ I Vss CMOSN L='mL' W='7u' GEO=2
Mn55 I M Vss Vss CMOSN L='mL' W='7u' GEO=1
*Mp32 N clk_bar Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mp33 N M Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mn30 N clk_bar S Vss CMOSN L='mL' W='1.0u' GEO=2
*Mn31 S M Vss Vss CMOSN L='mL' W='1.0u' GEO=1
*Mp34 M N Vdd Vdd CMOSP L='mL' W='3.0u' GEO=1
*Mp35 R clk Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mp36 R Q Vdd Vdd CMOSP L='mL' W='1.0u' GEO=1
*Mn32 R clk T Vss CMOSN L='mL' W='1.0u' GEO=2
*Mn33 T Q Vss Vss CMOSN L='mL' W='1.0u' GEO=1
*Mp37 Q R Vdd Vdd CMOSP L='mL' W='3.0u' GEO=1
.ic V(X)=0 V(D)=0 V(P)=0
.ENDS DPOUTS
.SUBCKT NAND A B D
Mp1 D A Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mp2 D B Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mn1 D A C Vss CMOSN L='mL' W='5u' GEO=1
Mn2 C B Vss Vss CMOSN L='mL' W='5u' GEO=1
.ENDS NAND
.SUBCKT CLKINV A B
Mp1 B A Vee Vee CMOSP L='mL' W='40u' GEO=1
Mn1 B A Vss Vss CMOSN L='mL' W='20u' GEO=1
.ENDS CLKINV
.SUBCKT CLKINVd A B
Mp1 B A Vff Vff CMOSP L='mL' W='40u' GEO=1



















clk4d clkinvd
Appendix E: DDR Domino 8-bit Ripple Carry HSPICE Netlist
* DDR Conventional Domino CMOS Logic * Loaded Full Adder
.options node list post=2
.param mL = 0.25u
X0a Vdd in 0 clk1 C1a C1a_ FACout
X1a Vdd 0 C1a clk2 C2a C2a_ FACout
X2a Vdd 0 C2a clk3 C3a C3a_ FACout
X3a Vdd 0 C3a clk4 C4a C4a_ FACout
X4a Vdd 0 C4a clk1 C5a C5a_ FACout
X5a Vdd 0 C5a clk2 C6a C6a_ FACout
X6a Vdd 0 C6a clk3 C7a C7a_ FACout
X7a Vdd 0 C7a clk4 C8a C8a_ FACout
X0b Vdd in 0 clk3 C1b C1b_ FACout
X1b Vdd 0 C1b clk4 C2b C2b_ FACout
X2b Vdd 0 C2b clk1 C3b C3b_ FACout
X3b Vdd 0 C3b clk2 C4b C4b_ FACout
X4b Vdd 0 C4b clk3 C5b C5b_ FACout
X5b Vdd 0 C5b clk4 C6b C6b_ FACout
X6b Vdd 0 C6b clk1 C7b C7b_ FACout
X7b Vdd 0 C7b clk2 C8b C8b_ FACout
X8a Vdd 0 0 C1a_ clk2 S1a FASout
X9a Vdd 0 0 C2a_ clk3 S2a FASout
X10a Vdd 0 0 C3a_ clk4 S3a FASout
X11a Vdd 0 0 C4a_ clk1 S4a FASout
X12a Vdd 0 0 C5a_ clk2 S5a FASout
X13a Vdd 0 0 C6a_ clk3 S6a FASout
X14a Vdd 0 0 C7a_ clk4 S7a FASout
X15a Vdd 0 0 C8a_ clk1 S8a FASout
X8b Vdd 0 0 C1b_ clk4 S1b FASout
X9b Vdd 0 0 C2b_ clk1 S2b FASout
X10b Vdd 0 0 C3b_ clk2 S3b FASout
X11b Vdd 0 0 C4b_ clk3 S4b FASout
X12b Vdd 0 0 C5b_ clk4 S5b FASout
X13b Vdd 0 0 C6b_ clk1 S6b FASout
X14b Vdd 0 0 C7b_ clk2 S7b FASout
X15b Vdd 0 0 C8b_ clk3 S8b FASout
XNAND1 S1a S1b S1 NAND
XNAND2 S2a S2b S2 NAND
XNAND3 S3a S3b S3 NAND
XNAND4 S4a S4b S4 NAND
XNAND5 S5a S5b S5 NAND
XNAND6 S6a S6b S6 NAND
XNAND7 S7a S7b S7 NAND










.SUBCKT FACout Ain Bin Cin clk O L
Mn10 L Ain J Vss CMOSN L='mL' W='4u' GEO=3
Mn11 L Bin K Vss CMOSN L='mL' W='4u' GEO=3
Mn12 J Cin M Vss CMOSN L='mL' W='4.8u' GEO=3
Mn13 J Bin M Vss CMOSN L='mL' W='4.8u' GEO=3
Mn14 K Cin M Vss CMOSN L='mL' W='4.8u' GEO=3
Mn15 M clk Vss Vss CMOSN L='mL' W='5.76u' GEO=1
Mp13a L clk Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp14a N L Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn26a N L Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp15a L N Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mp27 O L Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mn27 O L Vss Vss CMOSN L='mL' W='5u' GEO=1
.ic V(L)=0 V(P)=volt V(O)=volt
.ENDS FACout
.SUBCKT FASout Ain Bin Cin Cout_bar clk W
* Sum Logic
Mn16 W Cout_bar T Vss CMOSN L='mL' W='4u' GEO=3
Mn17 T Ain X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn18 T Bin X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn19 T Cin X Vss CMOSN L='mL' W='4.8u' GEO=3
Mn20 W Ain V Vss CMOSN L='mL' W='4u' GEO=3
Mn21 V Bin U Vss CMOSN L='mL' W='4.8u' GEO=3
Mn22 U Cin X Vss CMOSN L='mL' W='5.76u' GEO=3
Mn24 X clk Vss Vss CMOSN L='mL' W='6.912u' GEO=1
Mp13a W clk Vdd Vdd CMOSP L='mL' W='7u' GEO=1
Mp14a N W Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
Mn26a N W Vss Vss CMOSN L='mL' W='0.25u' GEO=1
Mp15a W N Vdd Vdd CMOSP L='mL' W='0.25u' GEO=1
.ic V(W)=0 V(P)=volt V(O)=volt
.ENDS FASout
.SUBCKT NAND A B C
Mp1 C A Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mp2 C B Vdd Vdd CMOSP L='mL' W='10u' GEO=1
Mn3 C A D Vss CMOSN L='mL' W='5u' GEO=2
Mn4 D B Vss Vss CMOSN L='mL' W='5u' GEO=1
.ENDS NAND
.SUBCKT CLKINV A B
Mp1 B A Vee Vee CMOSP L='mL' W='80u' GEO=1
Mn1 B A Vss Vss CMOSN L='mL' W='40u' GEO=1
.ENDS CLKINV
.SUBCKT CLKINVd A B
Mp1 B A Vff Vff CMOSP L='mL' W='80u' GEO=1
Mn1 B A Vss Vss CMOSN L='mL' W='40u' GEO=1
.ENDS CLKINVd
* Clock Buffers
Xclk1 clk1_ clk1 clkinv
Xclk2 clk2_ clk2 clkinv
Xclk3 clk3_ clk3 clkinv
Xclk4 clk4_ clk4 clkinv
* Clock Buffers
Xclk1d clk1_ clk1d clkinvd
Xclk2d clk2_ clk2d clkinvd
Xclk3d clk3_ clk3d clkinvd
Xclk4d clk4_ clk4d clkinvd
Appendix F: Example HSPICE Code for Conventional Domino Measurements
* sources
.param volt=2.5
Vcc Vdd 0 DC volt
.global vdd
Vee Vee 0 DC volt
.global vee
Vff Vff 0 DC volt
.global vff
.param gnd=0










Vclk1 clk1_ 0 pulse gnd volt delay1 rise fall level clkperiod
Vclk2 clk2_ 0 pulse gnd volt delay2 rise fall level clkperiod
Vclk3 clk3_ 0 pulse gnd volt delay3 rise fall level clkperiod
Vclk4 clk4_ 0 pulse gnd volt delay4 rise fall level clkperiod







.meas tran avg_curr_logic avg I(Vcc) from='clkperiod' to='5.0*clkperiod'
.meas avg_power_logic param='(-avg_curr_logic*2.5)/4'
.meas tran avg_curr_clk avg I(Vee) from='clkperiod' to='5.0*clkperiod'
.meas avg_power_clk param='(-avg_curr_clk*2.5)/4'
.meas tran avg_curr_clkd avg I(Vff) from='clkperiod' to='5.0*clkperiod'
.meas avg_power_clkd param='(-avg_curr_clkd*2.5)/4'
.TRAN 0.1ns tran_end
Appendix G: Example HSPICE Code for DDR/DET Domino Measurements
* sources
.param volt=2.5
Vcc Vdd 0 DC volt
.global vdd
Vee Vee 0 DC volt
.global vee
Vff Vff 0 DC volt
.global vff
.param gnd=0











Vclk1 clk1_ 0 pulse gnd volt delay1 rise fall level clkperiod
Vclk2 clk2_ 0 pulse gnd volt delay2 rise fall level clkperiod
Vclk3 clk3_ 0 pulse gnd volt delay3 rise fall level clkperiod
Vclk4 clk4_ 0 pulse gnd volt delay4 rise fall level clkperiod
Vin in 0 pulse gnd volt delay4 rise fall '1*length' '1*dataperiod'
VinA0 A0 0 pulse gnd volt delay4 rise fall '4*length+4*rise' '4*dataperiod'
VinB0 B0 0 pulse gnd volt delay4 rise fall '2*length+2*rise' '2*dataperiod'
VinC0 C0 0 pulse gnd volt delay4 rise fall '1*length' '1*dataperiod'
VinA0d A0d 0 pulse gnd volt delay1 rise fall '4*length+4*rise' '4*dataperiod'
VinB0d B0d 0 pulse gnd volt delay1 rise fall '2*length+2*rise' '2*dataperiod'










.meas tran avg_curr_logic avg I(Vcc) from='1*clkperiod' to='5*clkperiod'
.meas avg_power_logic param='(-avg_curr_logic*2.5)/4'
.meas tran avg_curr_clk avg I(Vee) from='1*clkperiod' to='5*clkperiod'
.meas avg_power_clk param='(-avg_curr_clk*2.5)/4'
.meas tran avg_curr_clkd avg I(Vff) from='1*clkperiod' to='5*clkperiod'
.meas avg_power_clkd param='(-avg_curr_clkd*2.5)/4'
.TRAN 0.1ns tran_end
Appendix H: Example HSPICE Netlist Conventional Flip-Flop
* Standard DQ Flip Flop
.options node list post=2 relv=1e-4 absvar=.3 relvar=0.1
.param mL = 0.35u
* Flip Flop Net List
Mp1 D clk A Vdd CMOSP L='mL' W='wp1' AD='wp1*(mL/2)*6' AS='wp1*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn1 D clk_bar A Vss CMOSN L='mL' W='wn1' AD='wn1*(mL/2)*6' AS='wn1*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn2 B A Vss Vss CMOSN L='mL' W='wn2' AD='wn2*(mL/2)*6' AS='wn2*(mL/2)*6' PD='6*mL' PS='6*mL'
Mp2 B A Vdd Vdd CMOSP L='mL' W='wp2' AD='wp2*(mL/2)*6' AS='wp2*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn3 B clk C Vss CMOSN L='mL' W='wn3' AD='wn3*(mL/2)*6' AS='wn3*(mL/2)*6' PD='6*mL' PS='6*mL'
Mp3 B clk_bar C Vdd CMOSP L='mL' W='wp3' AD='wp3*(mL/2)*6' AS='wp3*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn4 Q C Vss Vss CMOSN L='mL' W='wn4' AD='wn4*(mL/2)*5' AS='wn4*(mL/2)*5' PD='5*mL+wn4' PS='5*mL+wn4'
Mp4 Q C Vdd Vdd CMOSP L='mL' W='wp4' AD='wp4*(mL/2)*5' AS='wp4*(mL/2)*5' PD='5*mL+wp4' PS='5*mL+wp4'
Mp5 E Q Vdd Vdd CMOSP L='mL' W='wp5' AD='wp5*(mL/2)*6' AS='wp5*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn5 E Q Vss Vss CMOSN L='mL' W='wn5' AD='wn5*(mL/2)*6' AS='wn5*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn6 E clk_bar C Vss CMOSN L='mL' W='wn6' AD='wn6*(mL/2)*6' AS='wn6*(mL/2)*6' PD='6*mL' PS='6*mL'
Mp6 E clk C Vdd CMOSP L='mL' W='wp6' AD='wp6*(mL/2)*6' AS='wp6*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn7 F clk A Vss CMOSN L='mL' W='wn7' AD='wn7*(mL/2)*6' AS='wn7*(mL/2)*6' PD='6*mL' PS='6*mL'
Mp7 F clk_bar A Vdd CMOSP L='mL' W='wp7' AD='wp7*(mL/2)*6' AS='wp7*(mL/2)*5' PD='6*mL' PS='6*mL'
Mp8 F B Vdd Vdd CMOSP L='mL' W='wp8' AD='wp8*(mL/2)*6' AS='wp8*(mL/2)*6' PD='6*mL' PS='6*mL'
Mn8 F B Vss Vss CMOSN L='mL' W='wn8' AD='wn8*(mL/2)*6' AS='wn8*(mL/2)*6' PD='6*mL' PS='6*mL'
C1 Q 0 200fF
C2 D 0 200fF
Appendix J: Example HSPICE Netlist C2MOS Flip-Flop
* Modified C2MOS Flip Flop
.options node list post=2 relv=1e-4 absvar=.3 relvar=0.1
.param mL = 0.35u
* Flip Flop Net List
Mp1 A clk C Vdd CMOSP L='mL' W='wp1' AD='wp1*mL' AS='wp1*mL' PD='2*mL' PS='2*mL'
Mn1 A clk_bar B Vss CMOSN L='mL' W='wn1' AD='wn1*mL' AS='wn1*mL' PD='2*mL' PS='2*mL'
Mp2 C D Vdd Vdd CMOSP L='mL' W='wp2' AD='wp2*mL' AS='wp2*mL' PD='2*mL' PS='2*mL'
Mn2 B D Vss Vss CMOSN L='mL' W='wn2' AD='wn2*mL' AS='wn2*mL' PD='2*mL' PS='2*mL'
Mp3 A clk_bar G Vdd CMOSP L='mL' W='wp3' AD='wp3*mL' AS='wp3*mL' PD='2*mL' PS='2*mL'
Mn3 A clk F Vss CMOSN L='mL' W='wn3' AD='wn3*mL' AS='wn3*mL' PD='2*mL' PS='2*mL'
Mp4 G E Vdd Vdd CMOSP L='mL' W='wp4' AD='wp4*mL' AS='wp4*mL' PD='2*mL' PS='2*mL'
Mn4 F E Vss Vss CMOSN L='mL' W='wn4' AD='wn4*mL' AS='wn4*mL' PD='2*mL' PS='2*mL'
Mp5 E A Vdd Vdd CMOSP L='mL' W='wp5' AD='wp5*mL' AS='wp5*mL' PD='2*mL' PS='2*mL'
Mn5 E A Vss Vss CMOSN L='mL' W='wn5' AD='wn5*mL' AS='wn5*mL' PD='2*mL' PS='2*mL'
Mp6 Q clk_bar H Vdd CMOSP L='mL' W='wp6' AD='wp6*mL' AS='wp6*mL' PD='2*mL' PS='2*mL'
Mn6 Q clk I Vss CMOSN L='mL' W='wn6' AD='wn6*mL' AS='wn6*mL' PD='2*mL' PS='2*mL'
Mp7 H A Vdd Vdd CMOSP L='mL' W='wp7' AD='wp7*mL' AS='wp7*mL' PD='2*mL' PS='2*mL'
Mn7 I A Vss Vss CMOSN L='mL' W='wn7' AD='wn7*mL' AS='wn7*mL' PD='2*mL' PS='2*mL'
Mp8 Q clk K Vdd CMOSP L='mL' W='wp8' AD='wp8*mL' AS='wp8*mL' PD='2*mL' PS='2*mL'
Mn8 Q clk_bar L Vss CMOSN L='mL' W='wn8' AD='wn8*mL' AS='wn8*mL' PD='2*mL' PS='2*mL'
Mp9 K J Vdd Vdd CMOSP L='mL' W='wp9' AD='wp9*mL' AS='wp9*mL' PD='2*mL' PS='2*mL'
Mn9 L J Vss Vss CMOSN L='mL' W='wn9' AD='wn9*mL' AS='wn9*mL' PD='2*mL' PS='2*mL'
Mp10 J Q Vdd Vdd CMOSP L='mL' W='wp10' AD='wp10*mL' AS='wp10*mL' PD='2*mL' PS='2*mL'
Mn10 J Q Vss Vss CMOSN L='mL' W='wn10' AD='wn10*mL' AS='wn10*mL' PD='2*mL' PS='2*mL'
C1 Q 0 200fF
C2 D 0 200fF
Appendix L: Example HSPICE Flip-Flop Optimization Code
* sources
.param volt=2
Vcc Vdd 0 DC volt
Vxx Vxx 0 DC volt
.global vdd
.param gnd=0









Vclk clk_in 0 pulse gnd volt level rise fall level clk_period
Vclk_test test_clk_in 0 pulse gnd volt level rise fall level clk_period
Vclk_blk Vdd_blk 0 volt
Vdata_gry Vdd_gry 0 volt
Vclk_blk_test Vdd_blk_test 0 DC volt
Vdata_gry_test Vdd_gry_test 0 DC volt
* Loaded Inverters actual inverters
Mp1data data_in_bar data_in Vxx Vxx CMOSP L='mL' W='wp1data' AD='wp1data*(mL/2)*5' AS='wp1data*(mL/2)*5' PD='5*mL+wp1data' PS='5*mL+wp1data'
Mn1data data_in_bar data_in Vss Vss CMOSN L='mL' W='wn1data' AD='wn1data*(mL/2)*5' AS='wn1data*(mL/2)*5' PD='5*mL+wn1data' PS='5*mL+wn1data'
Mp2data D data_in_bar Vdd_gry Vdd_gry CMOSP L='mL' W='wp2data' AD='wp2data*(mL/2)*5' AS='wp2data*(mL/2)*5' PD='5*mL+wp2data' PS='5*mL+wp2data'
Mn2data D data_in_bar Vss Vss CMOSN L='mL' W='wn2data' AD='wn2data*(mL/2)*5' AS='wn2data*(mL/2)*5' PD='5*mL+wn2data' PS='5*mL+wn2data'
Mp1clk clk_bar clk_in Vdd_blk Vdd_blk CMOSP L='mL' W='wp1clk' AD='wp1clk*(mL/2)*5' AS='wp1clk*(mL/2)*5' PD='5*mL+wp1clk' PS='5*mL+wp1clk'
Mn1clk clk_bar clk_in Vss Vss CMOSN L='mL' W='wn1clk' AD='wn1clk*(mL/2)*5' AS='wn1clk*(mL/2)*5' PD='5*mL+wn1clk' PS='5*mL+wn1clk'
Mp2clk clk clk_bar Vdd_blk Vdd_blk CMOSP L='mL' W='wp2clk' AD='wp2clk*(mL/2)*5' AS='wp2clk*(mL/2)*5' PD='5*mL+wp2clk' PS='5*mL+wp2clk'
Mn2clk clk clk_bar Vss Vss CMOSN L='mL' W='wn2clk' AD='wn2clk*(mL/2)*5' AS='wn2clk*(mL/2)*5' PD='5*mL+wn2clk' PS='5*mL+wn2clk'
*Unloaded Inverters dummy inverters
Mp1data_test test_data_in_bar test_data_in Vxx Vxx CMOSP L='mL' W='wp1data' AD='wp1data*(mL/2)*5' AS='wp1data*(mL/2)*5' PD='5*mL+wp1data' PS='5*mL+wp1data'
Mn1data_test test_data_in_bar test_data_in Vss Vss CMOSN L='mL' W='wn1data' AD='wn1data*(mL/2)*5' AS='wn1data*(mL/2)*5' PD='5*mL+wn1data' PS='5*mL+wn1data'
Mp2data_test test_D test_data_in_bar Vdd_gry_test Vdd_gry_test CMOSP L='mL' W='wp2data' AD='wp2data*(mL/2)*5' AS='wp2data*(mL/2)*5' PD='5*mL+wp2data' PS='5*mL+wp2data'
Mn2data_test test_D test_data_in_bar Vss Vss CMOSN L='mL' W='wn2data' AD='wn2data*(mL/2)*5' AS='wn2data*(mL/2)*5' PD='5*mL+wn2data' PS='5*mL+wn2data'
Mp1clk_test test_clk_bar test_clk_in Vdd_blk_test Vdd_blk_test CMOSP L='mL' W='wp1clk' AD='wp1clk*(mL/2)*5' AS='wp1clk*(mL/2)*5' PD='5*mL+wp1clk' PS='5*mL+wp1clk'
Mn1clk_test test_clk_bar test_clk_in Vss Vss CMOSN L='mL' W='wn1clk' AD='wn1clk*(mL/2)*5' AS='wn1clk*(mL/2)*5' PD='5*mL+wn1clk' PS='5*mL+wn1clk'
Mp2clk_test test_clk test_clk_bar Vdd_blk_test Vdd_blk_test CMOSP L='mL' W='wp2clk' AD='wp2clk*(mL/2)*5' AS='wp2clk*(mL/2)*5' PD='5*mL+wp2clk' PS='5*mL+wp2clk'
Mn2clk_test test_clk test_clk_bar Vss Vss CMOSN L='mL' W='wn2clk' AD='wn2clk*(mL/2)*5' AS='wn2clk*(mL/2)*5' PD='5*mL+wn2clk' PS='5*mL+wn2clk'
.param length=4ns
.param delay='((clk_period-4n)/2)'
.param data_period='2*clk_period'
*Data Activity of 1 1010101...
Vin data_in 0 pulse gnd volt delay rise fall length data_period


































.TRAN 0.05ns tran_end sweep optimize=opt1 results=tpopt,tpclkopt,tpclk_baropt,tpdataopt,pdp model=optmod
.model optmod opt itropt=30 max=1e+5




.meas tran avg_curr avg I(Vcc) from='0.5*tran_end' to='tran_end'
.meas tran avg_curr_clk avg I(Vclk_blk) from='0.5*tran_end' to='tran_end'
.meas tran avg_curr_data avg I(Vdata_gry) from='0.5*tran_end' to='tran_end'
.meas tran avg_curr_clk_test avg I(Vclk_blk_test) from='.5*tran_end' to='tran_end'





.meas avg_power_data_test param='-avg_curr_data_test*volt'
.meas tran tropt trig v(clk) val='volt/2' rise=3 targ v(Q) val='volt/2' rise=2
.meas tran tfopt trig v(clk) val='volt/2' rise=4 targ v(Q) val='volt/2' fall=2
.meas tran trclkopt trig v(clk) val='.1*volt' rise=1 targ v(clk) val='.9*volt' rise=1
.meas tran tfclkopt trig v(clk) val='.9*volt' fall=1 targ v(clk) val='.1*volt' fall=1
.meas tran trclk_baropt trig v(clk_bar) val='.1*volt' rise=1 targ v(clk_bar) val='.9*volt' rise=1
.meas tran tfclk_baropt trig v(clk_bar) val='.9*volt' fall=1 targ v(clk_bar) val='.1*volt' fall=1
.meas tran trdataopt trig v(D) val='.1*volt' rise=2 targ v(D) val='.9*volt' rise=2
.meas tran tfdataopt trig v(D) val='.9*volt' fall=2 targ v(D) val='.1*volt' fall=2
.meas tpopt param='(tropt+tfopt)/2' goal=100ps
.meas tpclkopt param='(trclkopt+tfclkopt)/2' goal=100ps
.meas tpclk_baropt param='(trclk_baropt+tfclk_baropt)/2' goal=100ps
.meas tpdataopt param='(trdataopt+tfdataopt)/2' goal=100ps
.meas avg_power_eff param='avg_power-40u'
.meas avg_power_clk_eff param='avg_power_clk-avg_power_clk_test'
.meas avg_power_data_eff param='avg_power_data-avg_power_data_test'
.meas PDP param='(avg_power_eff+avg_power_clk_eff+avg_power_data_eff)*tpopt' goal=10f minval=50f
.model
.option post
.END
Appendix M: Example HSPICE Code for Flip-Flop Power Measurements
* sources
.param volt=2
Vcc Vdd 0 DC volt
Vxx Vxx 0 DC volt
.global vdd
.param gnd=0









Vclk clk_in 0 pulse gnd volt level rise fall level clk_period
Vclk_test test_clk_in 0 pulse gnd volt level rise fall level clk_period
Vclk_blk Vdd_blk 0 DC volt
Vdata_gry Vdd_gry 0 DC volt
Vclk_blk_test Vdd_blk_test 0 DC volt
Vdata_gry_test Vdd_gry_test 0 DC volt
* Loaded Inverters actual inverters
Mp1data data_in_bar data_in Vxx Vxx CMOSP L='mL' W='wp1data' AD='wp1data*(mL/2)*5' AS='wp1data*(mL/2)*5' PD='5*mL+wp1data' PS='5*mL+wp1data'
Mn1data data_in_bar data_in Vss Vss CMOSN L='mL' W='wn1data' AD='wn1data*(mL/2)*5' AS='wn1data*(mL/2)*5' PD='5*mL+wn1data' PS='5*mL+wn1data'
Mp2data D data_in_bar Vdd_gry Vdd_gry CMOSP L='mL' W='wp2data' AD='wp2data*(mL/2)*5' AS='wp2data*(mL/2)*5' PD='5*mL+wp2data' PS='5*mL+wp2data'
Mn2data D data_in_bar Vss Vss CMOSN L='mL' W='wn2data' AD='wn2data*(mL/2)*5' AS='wn2data*(mL/2)*5' PD='5*mL+wn2data' PS='5*mL+wn2data'
Mp1clk clk_bar clk_in Vdd_blk Vdd_blk CMOSP L='mL' W='wp1clk' AD='wp1clk*(mL/2)*5' AS='wp1clk*(mL/2)*5' PD='5*mL+wp1clk' PS='5*mL+wp1clk'
Mn1clk clk_bar clk_in Vss Vss CMOSN L='mL' W='wn1clk' AD='wn1clk*(mL/2)*5' AS='wn1clk*(mL/2)*5' PD='5*mL+wn1clk' PS='5*mL+wn1clk'
Mp2clk clk clk_bar Vdd_blk Vdd_blk CMOSP L='mL' W='wp2clk' AD='wp2clk*(mL/2)*5' AS='wp2clk*(mL/2)*5' PD='5*mL+wp2clk' PS='5*mL+wp2clk'
Mn2clk clk clk_bar Vss Vss CMOSN L='mL' W='wn2clk' AD='wn2clk*(mL/2)*5' AS='wn2clk*(mL/2)*5' PD='5*mL+wn2clk' PS='5*mL+wn2clk'
*Unloaded Inverters dummy inverters
Mp1data_test test_data_in_bar test_data_in Vxx Vxx CMOSP L='mL' W='wp1data' AD='wp1data*(mL/2)*5' AS='wp1data*(mL/2)*5' PD='5*mL+wp1data' PS='5*mL+wp1data'
Mn1data_test test_data_in_bar test_data_in Vss Vss CMOSN L='mL' W='wn1data' AD='wn1data*(mL/2)*5' AS='wn1data*(mL/2)*5' PD='5*mL+wn1data' PS='5*mL+wn1data'
Mp2data_test test_D test_data_in_bar Vdd_gry_test Vdd_gry_test CMOSP L='mL' W='wp2data' AD='wp2data*(mL/2)*5' AS='wp2data*(mL/2)*5' PD='5*mL+wp2data' PS='5*mL+wp2data'
Mn2data_test test_D test_data_in_bar Vss Vss CMOSN L='mL' W='wn2data' AD='wn2data*(mL/2)*5' AS='wn2data*(mL/2)*5' PD='5*mL+wn2data' PS='5*mL+wn2data'
Mp1clk_test test_clk_bar test_clk_in Vdd_blk_test Vdd_blk_test CMOSP L='mL' W='wp1clk' AD='wp1clk*(mL/2)*5' AS='wp1clk*(mL/2)*5' PD='5*mL+wp1clk' PS='5*mL+wp1clk'
Mn1clk_test test_clk_bar test_clk_in Vss Vss CMOSN L='mL' W='wn1clk' AD='wn1clk*(mL/2)*5' AS='wn1clk*(mL/2)*5' PD='5*mL+wn1clk' PS='5*mL+wn1clk'
Mp2clk_test test_clk test_clk_bar Vdd_blk_test Vdd_blk_test CMOSP L='mL' W='wp2clk' AD='wp2clk*(mL/2)*5' AS='wp2clk*(mL/2)*5' PD='5*mL+wp2clk' PS='5*mL+wp2clk'
Mn2clk_test test_clk test_clk_bar Vss Vss CMOSN L='mL' W='wn2clk' AD='wn2clk*(mL/2)*5' AS='wn2clk*(mL/2)*5' PD='5*mL+wn2clk' PS='5*mL+wn2clk'
.param length=4ns
.param delay='((clk_period-4n)/2)'
.param data_period='2*clk_period'
*Data Activity of 1 1010101...
.param activity=1
Vin data_in 0 pulse gnd volt delay rise fall length data_period
Vin_test test_data_in 0 pulse gnd volt delay rise fall length data_period
*Data Activity of 0.5 1011001...
*.param activity=0.5
*Vin data_in 0 pwl 0 volt '4.25*clk_period-rise' volt '4.25*clk_period' 0 '5.25*clk_period-rise' 0
*+ '5.25*clk_period' volt '6.25*clk_period-rise' volt '6.25*clk_period' 0 '7.25*clk_period-rise' 0
*+ '7.25*clk_period' volt '8.75*clk_period-rise' volt '8.75*clk_period' 0 '11.25*clk_period-rise' 0
*+ '11.25*clk_period' volt '11.75*clk_period-rise' volt '11.75*clk_period' 0 '16*clk_period-rise' 0
*+ '16*clk_period' volt '20.25*clk_period-rise' volt '20.25*clk_period' 0 '21.25*clk_period-rise' 0
*+ '21.25*clk_period' volt '22.25*clk_period-rise' volt '22.25*clk_period' 0 '23.25*clk_period-rise' 0
*+ '23.25*clk_period' volt '24.75*clk_period-rise' volt '24.75*clk_period' 0 '27.25*clk_period-rise' 0
*+ '27.25*clk_period' volt '27.75*clk_period-rise' volt '27.75*clk_period' 0
*Vin_test test_data_in 0 pwl 0 volt '4.25*clk_period-rise' volt '4.25*clk_period' 0 '5.25*clk_period-rise' 0
*+ '5.25*clk_period' volt '6.25*clk_period-rise' volt '6.25*clk_period' 0 '7.25*clk_period-rise' 0
*+ '7.25*clk_period' volt '8.75*clk_period-rise' volt '8.75*clk_period' 0 '11.25*clk_period-rise' 0
*+ '11.25*clk_period' volt '11.75*clk_period-rise' volt '11.75*clk_period' 0 '16*clk_period-rise' 0
*+ '16*clk_period' volt '20.25*clk_period-rise' volt '20.25*clk_period' 0 '21.25*clk_period-rise' 0
*+ '21.25*clk_period' volt '22.25*clk_period-rise' volt '22.25*clk_period' 0 '23.25*clk_period-rise' 0
*+ '23.25*clk_period' volt '24.75*clk_period-rise' volt '24.75*clk_period' 0 '27.25*clk_period-rise' 0































.param tran_end='(32*clk_period)'
.TRAN 0.1ns tran_end
.model optmod opt itropt=30 max=1e+5




.meas tran avg_curr avg I(Vcc) from='.5*tran_end' to='tran_end'
.meas tran avg_curr_clk avg I(Vclk_blk) from='.5*tran_end' to='tran_end'
.meas tran avg_curr_data avg I(Vdata_gry) from='.5*tran_end' to='tran_end'
.meas tran avg_curr_clk_test avg I(Vclk_blk_test) from='.5*tran_end' to='tran_end'
.meas tran avg_curr_data_test avg I(Vdata_gry_test) from='.5*tran_end' to='tran_end'
.meas avg_power param='-avg_curr*volt'
.meas avg_power_clk param='-avg_curr_clk*volt'
.meas avg_power_data param='-avg_curr_data*volt'
.meas avg_power_clk_test param='-avg_curr_clk_test*volt'
.meas avg_power_data_test param='-avg_curr_data_test*volt'
.meas tran tropt trig v(clk) val='volt/2' rise=3 targ v(Q) val='volt/2' rise=2
.meas tran tfopt trig v(clk) val='volt/2' rise=4 targ v(Q) val='volt/2' fall=2
.meas tran trclkopt trig v(clk) val='.1*volt' rise=1 targ v(clk) val='.9*volt' rise=1
.meas tran tfclkopt trig v(clk) val='.9*volt' fall=1 targ v(clk) val='.1*volt' fall=1
.meas tran trclk_baropt trig v(clk_bar) val='.1*volt' rise=1 targ v(clk_bar) val='.9*volt' rise=1
.meas tran tfclk_baropt trig v(clk_bar) val='.9*volt' fall=1 targ v(clk_bar) val='.1*volt' fall=1
.meas tran trdataopt trig v(D) val='.1*volt' rise=2 targ v(D) val='.9*volt' rise=2
.meas tran tfdataopt trig v(D) val='.9*volt' fall=2 targ v(D) val='.1*volt' fall=2
.meas tpopt param='(tropt+tfopt)/2' goal=100ps
.meas tpclkopt param='(trclkopt+tfclkopt)/2' goal=100ps
.meas tpclk_baropt param='(trclk_baropt+tfclk_baropt)/2' goal=100ps
.meas tpdataopt param='(trdataopt+tfdataopt)/2' goal=100ps
.param activity_power='(frequency*200f*volt*volt*activity)/2'
.meas avg_power_eff param='avg_power-activity_power'
.meas avg_power_clk_eff param='avg_power_clk-avg_power_clk_test'
.meas avg_power_data_eff param='avg_power_data-avg_power_data_test'
.meas avg_power_total_eff param='(avg_power_eff+avg_power_clk_eff+avg_power_data_eff)'
.meas PDP param='avg_power_total_eff*tpopt' goal=20f
Appendix N: Example HSPICE Code for Maximum Frequency
*sources
.param volt=3.3
Vcc Vdd 0 DC volt
Vxx Vxx 0 DC volt
.global vdd
.param gnd=0
Vgnd Vss 0 DC gnd
.global vss
*Clock Parameters






Vclk clk_in 0 pulse gnd volt level rise fall level clk_period
*Vclk_bar clk_bar 0 pulse volt gnd level rise fall level clk_period
Vclk_blk Vdd_blk 0 volt
Vdata_gry Vdd_gry 0 volt
*Mp1data data_in_bar data_in Vxx Vxx CMOSP L='mL' W='wp1data' AD='wp1data*(mL/2)*5' AS='wp1data*(mL/2)*5' PD='5*mL+wp1data' PS='5*mL+wp1data'
*Mn1data data_in_bar data_in Vss Vss CMOSN L='mL' W='wn1data' AD='wn1data*(mL/2)*5' AS='wn1data*(mL/2)*5' PD='5*mL+wn1data' PS='5*mL+wn1data'
*Mp2data D data_in_bar Vdd_gry Vdd_gry CMOSP L='mL' W='wp2data' AD='wp2data*(mL/2)*5' AS='wp2data*(mL/2)*5' PD='5*mL+wp2data' PS='5*mL+wp2data'
*Mn2data D data_in_bar Vss Vss CMOSN L='mL' W='wn2data' AD='wn2data*(mL/2)*5' AS='wn2data*(mL/2)*5' PD='5*mL+wn2data' PS='5*mL+wn2data'
Mp1clk clk_bar clk_in Vdd_blk Vdd_blk CMOSP L='mL' W='wp1clk' AD='wp1clk*(mL/2)*5' AS='wp1clk*(mL/2)*5' PD='5*mL+wp1clk' PS='5*mL+wp1clk'
Mn1clk clk_bar clk_in Vss Vss CMOSN L='mL' W='wn1clk' AD='wn1clk*(mL/2)*5' AS='wn1clk*(mL/2)*5' PD='5*mL+wn1clk' PS='5*mL+wn1clk'
Mp2clk clk clk_bar Vdd_blk Vdd_blk CMOSP L='mL' W='wp2clk' AD='wp2clk*(mL/2)*5' AS='wp2clk*(mL/2)*5' PD='5*mL+wp2clk' PS='5*mL+wp2clk'













































.meas tran avg_curr avg I(Vcc) from='.5*tran_end' to='tran_end'




.meas tran tropt trig v(clk) val='volt/2' rise=3 targ v(Q) val='volt/2' rise=3
.meas tran tfopt trig v(clk) val='volt/2' rise=4 targ v(Q) val='volt/2' fall=4
.meas tran trclkopt trig v(clk) val='.1*volt' rise=1 targ v(clk) val='.9*volt' rise=1
.meas tran tfclkopt trig v(clk) val='.9*volt' fall=1 targ v(clk) val='.1*volt' fall=1
.meas tran trclk_baropt trig v(clk_bar) val='.1*volt' rise=1 targ v(clk_bar) val='.9*volt' rise=1
.meas tran tfclk_baropt trig v(clk_bar) val='.9*volt' fall=1 targ v(clk_bar) val='.1*volt' fall=1
.meas tran trdataopt trig v(D) val='.1*volt' rise=2 targ v(D) val='.9*volt' rise=2
.meas tran tfdataopt trig v(D) val='.9*volt' fall=2 targ v(D) val='.1*volt' fall=2
.meas tpopt param='(tropt+tfopt)/2' goal=100ps
.meas tpclkopt param='(trclkopt+tfclkopt)/2' goal=100ps
.meas tpclk_baropt param='(trclk_baropt+tfclk_baropt)/2' goal=100ps
.meas tpdataopt param='(trdataopt+tfdataopt)/2' goal=100ps
.meas PDP param='(avg_power+avg_power_clk+avg_power_data)*tpopt' goal=20f