Linearization of The Timing Analysis and Optimization of Level-Sensitive Circuits by Taskin, Baris
LINEARIZATION OF THE TIMING ANALYSIS AND OPTIMIZATION OF LEVEL-SENSITIVE
SYNCHRONOUS CIRCUITS
by
Baris Taskin
B.S. in E.E., Middle East Technical University, Turkey, 2000
Submitted to the Graduate Faculty of
the School of Engineering in partial fulfillment
of the requirements for the degree of
Master of Science
University of Pittsburgh
2003
UNIVERSITY OF PITTSBURGH
SCHOOL OF ENGINEERING
This thesis was presented
by
Baris Taskin
It was defended on
March 28, 2003
and approved by
Steven P. Levitan, Professor, Electrical Engineering Department
Marlin H. Mickle, Professor, Electrical Engineering Department
Thesis Advisor: Ivan S. Kourtev, Assistant Professor, Electrical Engineering Department
ABSTRACT
LINEARIZATION OF THE TIMING ANALYSIS AND OPTIMIZATION OF LEVEL-SENSITIVE
SYNCHRONOUS CIRCUITS
Baris Taskin , M.S.
University of Pittsburgh, 2003
This thesis describes a linear programming (LP) formulation applicable to the static timing analysis of large-scale synchronous circuits with level-sensitive latches. The automatic timing analysis procedure presented here consists of deriving the connectivity information, constructing the LP model and solving the clock period minimization problem of synchronous digital VLSI circuits. In synchronous circuits with level-sensitive latches, operation at a reduced clock period (higher clock frequency) is possible by taking advantage of both non-zero clock skew scheduling and time borrowing. Clock skew scheduling is performed in order to exploit the benefits of nonidentical clock signal delays on circuit timing. The time borrowing property of level-sensitive circuits permits higher operating frequencies compared to edge-sensitive circuits. Considering time borrowing in the timing analysis, however, introduces non-linearity into the analysis. The modified big M (MBM) method is defined in order to transform the non-linear constraints arising in the problem formulation into solvable linear constraints. Equivalent LP model problems for single-phase clock synchronization of the ISCAS’89 benchmark circuits are generated and these problems are solved by the industrial LP solver CPLEX. Through the simultaneous application of time borrowing and clock skew scheduling, improvements of up to 63% in the minimum clock period are demonstrated with respect to zero-skew edge-sensitive synchronous circuits. The timing constraints governing the level-sensitive synchronous circuit operation not only enable the solution of the clock period minimization problem but also provide a common framework for the general timing analysis of such circuits. The inclusion of additional constraints into the problem formulation in order to meet the timing requirements imposed by specific application environments is discussed.

Bracketed references placed superior to the line of text refer to the bibliography.
DESCRIPTORS
Digital synchronous VLSI circuits
Linear Programming
Optimization
Static timing analysis
TABLE OF CONTENTS
NOMENCLATURE
1.0 INTRODUCTION
2.0 SYNCHRONOUS DIGITAL VLSI SYSTEMS
    2.1 Operation of a Synchronous System with Registers
    2.2 Graph Model of a Synchronous Digital VLSI System
    2.3 Single-Phase Synchronization
3.0 TIMING PROPERTIES OF SYNCHRONOUS DIGITAL SYSTEMS
    3.1 Parameters of an Edge-Sensitive Synchronous Circuit
    3.2 Parameters of a Level-Sensitive Synchronous Circuit
4.0 TIMING ANALYSIS OF LEVEL-SENSITIVE SYNCHRONOUS CIRCUITS
    4.1 Problem Formulation
        4.1.1 Latching Constraints
        4.1.2 Synchronization Constraints
        4.1.3 Propagation Constraints
        4.1.4 Skew Constraints
    4.2 Iterative Solution Approach
    4.3 LP Problem Approach
        4.3.1 Validity Constraints
        4.3.2 Initialization Constraints
5.0 PROBLEM FORMULATION AND THE PROPOSED SOLUTION PROCEDURE
    5.1 Modified Big M (MBM) Method
    5.2 LP Model
6.0 AN EXAMPLE AND EXPERIMENTAL RESULTS
    6.1 Digital Synchronous Circuit State of Operation
    6.2 Performance Results of the Procedure on the ISCAS’89 Benchmark Circuits
    6.3 Verification and Interpretation of Results
        6.3.1 Parameter Data Distributions
        6.3.2 Skew Analysis
    6.4 Further Considerations
7.0 CONCLUSIONS AND FUTURE WORK
APPENDIX A
APPENDIX B
APPENDIX C
BIBLIOGRAPHY
LIST OF TABLES
Table No.
5.1 Modified Big M transformations.
5.2 The transformed constraints for the ‘Modified big M’ method.
6.1 ISCAS’89 benchmark circuits results showing the number of registers and paths (before modification), together with the optimal clock periods, improvements and calculation times. Subscripts distinguish the circuit topologies for flip-flop based and latch-based circuits. Superscripts indicate zero or non-zero clock skew and the restricted circuit (for clock periods only), as well as time borrowing, clock skew scheduling and both.
LIST OF FIGURES
Figure No.
2.1 Finite state machine model of a synchronous system.
2.2 A local data path in a globally clocked synchronous circuit network.
2.3 Effects of time borrowing on circuit operation. The timing diagrams for the edge-sensitive circuit $S_{FF}$ and the level-sensitive circuit $S_L$ are shown. The variables $D_P^{ij}$ and $D_P^{jk}$ represent data propagation times on the local data paths $R_i \to R_j$ and $R_j \to R_k$, respectively. Data propagation is represented by the arrows. Note that in the local data path $R_i \to R_j$ of $S_L$, the data signal arrival at $R_j$ occurs during the transparent phase of $R_j$, borrowing time from the adjacent local data path $R_j \to R_k$. For identical data propagation times on adjacent local data paths, a smaller clock period (higher operating frequency) is possible for $S_L$, that is, $T_L < T_{FF}$.
2.4 A graph representation of a synchronous system. The graph vertices are four different registers, connected by five local data paths.
2.5 Single and multi-phase synchronization of a synchronous circuit. In multi-phase synchronization, the non-overlapping clock phases are defined with identical on-times ($C_W^L$). The clock signal at the originating clock source and the individual clock phases are indicated. Note that multi-phase clock synchronization is defined for more than one clock phase.
3.1 An edge-triggered flip-flop or register symbol.
3.2 Typical operation of the edge-triggered flip-flop shown in Figure 3.1.
3.3 Timing properties of an edge-sensitive flip-flop in a circuit with clock period $T$. The operation of the final flip-flop $R_f$ of a local data path is illustrated.
3.4 A level-sensitive latch or register symbol.
3.5 Typical operation of the level-sensitive latch shown in Figure 3.4.
3.6 Timing properties of a level-sensitive latch in a circuit with clock period $T$. The operation of the final latch $R_f$ of a local data path is illustrated.
4.1 Propagation of the data signal in a simple circuit. Note that two local data paths starting at the latches $R_{i_1}$ and $R_{i_2}$ and ending at $R_f$ are considered. The time intervals for the arrival and departure times of the data signal are illustrated by the upper and lower parallel dotted lines, respectively. The lengths of the white and black rectangular boxes correspond to the clock-to-output and data-to-output latch delays, respectively.
4.2 Possible cases for the timing relationships among arrival and departure times for the data signal at the latch $R_i$. The time intervals for the arrival and departure times are illustrated by the upper and lower parallel dotted lines, respectively (the left and right ends of these dotted lines correspond to the earliest and latest times, respectively). The lengths of the white and black rectangular boxes correspond to the clock-to-output and data-to-output latch delays, respectively. Note that cases V through VIII may exhibit clocking hazards as explained in the text.
4.3 The iterative algorithm offered in [18]. The vectors $a$, $d$, $A$ and $D$ are the earliest arrival, earliest departure, latest arrival and latest departure times, respectively, where a superscript identifies the value of a variable in the previous clock cycle. Additional variables hold the timing violation information for each register.
6.1 A simple synchronous circuit.
6.2 Zero clock skew and non-zero clock skew clocking schedules for the synchronous circuit in Figure 6.1. The clocking schedule for the zero clock skew circuit is shown on the left, and the schedule resulting from non-zero clock skew scheduling is shown on the right. The arrows represent data signal propagation on the respective critical paths. Note that unlike the presented case, the critical paths for zero and non-zero clock skew scheduling need not be identical.
6.3 The optimized timing schedule for the benchmark circuit s27. Note that $D_{Pm}^{if} = D_{PM}^{if}$ for every local data path $R_i \to R_f$ and $S_i = H_i = D_{CQ}^{i} = D_{DQ}^{i} = 0$ are considered.
6.4 Distribution of data propagation times for s938 with 32 registers and 496 data paths. The height of each bar corresponds to the number of paths within a given delay range. For example, there are nine (9) paths with delays between 4 and 5 time units.
6.5 Distribution of the maximum effective path delays in data paths of s938 for zero clock skew. The target clock period is $T = 20.6$. The height of each bar corresponds to the number of paths with an effective path delay within a given range.
6.6 Distribution of the maximum effective path delays in data paths of s938 for non-zero clock skew. The target clock period is $T = 9.085714$. The height of each bar corresponds to the number of paths with an effective path delay within a given range.
6.7 Distribution of the clock skew values of the non-zero clock skew case for s938. The target clock period is $T = 9.085714$. The height of each bar corresponds to the number of paths formed by sequentially adjacent pairs of registers which have a clock skew within the given range.
6.8 Distribution of the clock delay values of the non-zero clock skew case for s938. The target clock period is $T = 9.085714$. The height of each bar corresponds to the number of latches being driven by a clock signal with a time delay within the given range.
A-1 A simple synchronous circuit.
B-1 A simple synchronous circuit.
NOMENCLATURE
$S$  A synchronous digital VLSI circuit.
$G_S$  The graph representation of the synchronous circuit $S$.
$R$  A register in a synchronous circuit network.
$R_i$  The initial register of a local data path.
$R_f$  The final register of a local data path.
$R_i \to R_f$  The local data path formed by the sequentially adjacent registers $R_i$ and $R_f$.
$C$  A clock signal synchronizing a circuit network.
$C_i$  The clock signal driving the register $R_i$ of a circuit network.
$C_f$  The clock signal driving the register $R_f$ of a circuit network.
$T$  The period of the clock signal $C$.
$C_W^{L}$  The width of the active phase of the clock signal in a level-sensitive circuit.
$C_W^{F}$  The width of the active phase of the clock signal in an edge-sensitive circuit.

0 The latency of the clock signal " 0 with respect to common clock signal.

[ The latency of the clock signal "C[ with respect to common clock signal.

0
[ The phase shift operator:

0
[U

0
X

[@

where  is the number of clock cycles occuring between respective cycles of the clock
signals " 0 and "C[ .

$t_i$  The delay of the clock signal $C_i$ from the clock source to the register $R_i$.
$t_f$  The delay of the clock signal $C_f$ from the clock source to the register $R_f$.
$T_{Skew}(i,f)$  The clock skew between registers $R_i$ and $R_f$: $T_{Skew}(i,f) = t_i - t_f$.
$D_P^{if}$  The propagation delay of the data signal on the local data path $R_i \to R_f$.
$D_{PM}^{if}$  The maximum propagation delay of the data signal on $R_i \to R_f$.
$D_{Pm}^{if}$  The minimum propagation delay of the data signal on $R_i \to R_f$.
$H_f$  The hold time of the latch $R_f$.
$S_f$  The setup time of the latch $R_f$.
$D_{CQ}$  The clock-to-output delay of a latch.
$D_{DQ}$  The data-to-output delay of a latch.
$\max(\cdot)$  The maximum function. Evaluates to the value of the variable with the highest value given in the function.
$\min(\cdot)$  The minimum function. Evaluates to the value of the variable with the lowest value given in the function.
	d£XjL   The function to return the number of incoming local data paths of a register.
 The number of registers in a synchronous circuit.
 The number of local data paths in a synchronous circuit.
¤
DJB 9 Kp
+-+
The minimum clock period of an edge-sensitive synchronous circuit under zero-clock
skew.

B
9
Kp,KG¥
+-+
The minimum clock period of an edge-sensitive synchronous circuit under non-zero
clock skew scheduling.
¤
DJB
9
Kp
<
The minimum clock period of a level-sensitive synchronous circuit under zero-clock
skew.

B
9
Kp,KG¥
<
The minimum clock period of a level-sensitive synchronous circuit under non-zero clock
skew scheduling.


O
+-+
The improvement of the minimum clock period of an edge sensitive synchronous circuit
through clock skew scheduling.


O
<
The improvement of the minimum clock period of a level-sensitive synchronous circuit
through clock skew scheduling.
|¦§
<
The improvement of the minimum clock period of a level-sensitive synchronous circuit
through time borrowing.
¦§

O
<
The improvement of the minimum clock period of a level-sensitive synchronous circuit
through time borrowing and clock skew scheduling.

H
<
The minimum clock period of an additionally restricted level-sensitive synchronous cir-
cuit with latches.

H
<
The improvement of the minimum clock period of an additionally restricted level-sensitive
synchronous circuit.
ABBREVIATIONS
LP Linear Programming
NLP Non-Linear Programming
MBM Modified Big M Method
1.0 INTRODUCTION
The timing analysis of a digital VLSI synchronous circuit is subject to the structural and application-specific timing constraints of the circuit. Various algorithms have been proposed to model the timing scheduling problem of synchronous circuits as both linear and non-linear continuous optimization problems. A high-level categorization of such timing analyses is by register type: analyses targeting edge-sensitive (flip-flop based) and level-sensitive (latch based) digital synchronous circuits. The timing scheduling problem of edge-sensitive synchronous circuits has been successfully addressed by both linear and quadratic programming approaches in [ ]. Furthermore, it is known that the timing constraints for level-sensitive synchronous digital circuits contain non-linearity due to time borrowing. Previous studies on level-sensitive circuits have focused on relaxing these non-linear constraints by using iterative solution methodologies. In [ ], the effects of time borrowing on the circuit operation are considered using an iterative approach that starts with a bound for the minimum clock period and checks for timing violations. The formulation in [ ] provides a limited consideration of non-zero clock skew in the timing analysis and is modified in [ ] to accommodate local clock skew as well as global clock skew. Both of the analyses offered in [ ] and [ ] consider clock skew as a fixed numerical constant. Considering clock skew as a variable within a permissible range, instead of a fixed value, permits operation at higher operating frequencies (through clock skew scheduling). In this work, the linear timing analysis problem generated by the modified big M (MBM) method not only accounts for time borrowing but also applies clock skew scheduling in order to calculate the minimum clock period of a level-sensitive synchronous circuit.
This thesis integrates the ongoing research efforts at the Electrical Engineering Department of the University of Pittsburgh in order to develop a novel timing analysis and optimization procedure for high-performance level-sensitive synchronous circuits. The formulation and solution procedures proposed herein are also introduced in [ ]–[ ]. Note that the static timing analysis of level-sensitive synchronous circuits is pursued in both the previous and the present work.
The objective of this work is to formulate the clock period minimization problem of level-sensitive synchronous circuits under non-zero clock skew as a Linear Programming (LP) problem. In this type of timing analysis, time borrowing is accounted for and clock skew scheduling is applied in a single-phase synchronization scheme of operation. The resulting problem for level-sensitive circuits is inherently non-linear. Thus, the MBM method is introduced in order to linearize the problem formulation. The proposed formulation and calculation procedures are completely automated. The linearization procedure presented here differs from the iteration-based algorithms in that it provides a stand-alone LP model with a specific objective function and a distinct set of constraints that can be solved by any standard LP solver. The LP problems formulated in this work are solved using the industrial LP solver CPLEX, which implements a rich set of memory-efficient and speed-efficient optimization algorithms. As the CPLEX solvers implement variants of the simplex method to solve the LP model problem, the expected computational complexity of the procedure is polynomially bounded.
The rest of this document is organized as follows. The fundamentals of synchronous system operation and system modeling for computational analysis are summarized in Chapter 2. The abstract operation of system storage elements and the parameterization of the relevant timing properties and problem variables are presented in Chapter 3. Chapter 4 summarizes the current state of static timing analysis of level-sensitive synchronous circuits and introduces the timing constraints. The development of the proposed problem formulation and the generated LP model problem are presented in Chapter 5. The results obtained by applying the presented procedure to the suite of ISCAS’89 benchmark circuits are analyzed and validated in Chapter 6. Comments on the experimental results and concluding remarks are offered in Chapter 7.
2.0 SYNCHRONOUS DIGITAL VLSI SYSTEMS
VLSI is an acronym that stands for very large scale integration. The term refers to a broad area of electrical and computer engineering in which the focus is on the design and analysis of dense electronic integrated circuits. High-performance computational elements, high-density memory elements, sequential and combinational control logic units and sub-micron analog circuits are among the many topics investigated in fields related to VLSI circuit design.
VLSI circuits can be classified according to their typical field of application, the details of the manufacturing process, operational characteristics and other features. In this study, digital VLSI circuit design is of particular interest, and in the rest of the document the term VLSI design refers to this particular group of circuits unless otherwise indicated.
The term digital VLSI circuit design implies the design of sequential and combinational circuits. Sequential circuits consist of logic and register (or storage) elements, while combinational circuits consist only of the former. A detailed discussion of sequential and combinational circuits is presented in [ ]. All sequential circuits have a well-defined ordering of switching events that ensures the correct ordering of data propagation between register elements and ultimately between the input and output pins. The majority of the sequential circuits currently on the market are synchronous circuits, in which the temporal ordering of the data transfers is provided by a globally distributed synchronization signal. This signal is called the clock signal and defines the timing scheme or timing discipline of the synchronous circuit. Furthermore, the distribution of the clock signal throughout the circuit, which generates and delivers the time reference to each register, is accomplished through a highly specialized structure called the clock distribution network or clock tree network.
A digital synchronous circuit is a network of combinational logic elements and globally clocked registers. The combinational logic elements implement the functionality of the circuit, while the clocked registers store the computation results at each clock cycle. The logical order of the computation and storage processes is synchronized by the globally distributed clock signal. An overall representation of a typical synchronous circuit is shown in Figure 2.1.
[Figure 2.1 block labels: Combinational Logic (computation), Clocked Storage (Registers), Clock Distribution Network (synchronization), Input Data, Output Data, Clock Signal.]
Figure 2.1 Finite state machine model of a synchronous system.
The operation of a synchronous system involves the gradual propagation of data signals from the input pins towards the output pins of the circuit. The gradual advancement (and computation) of the data signal is orchestrated by the clock signal. Typically, the active level or transition of the clock signal initiates the propagation of data from the input terminals and storage elements towards the output terminals and the next stages of storage elements. Each clock cycle constitutes a computational cycle, in which the data signal departs from the source, is processed in the combinational logic between the registers, and is finally either stored in the destination register or delivered to the output terminals of the synchronous circuit.
In order to analyze the operational characteristics of a synchronous circuit, local data paths are defined. The overall operation of a synchronous circuit is the sequential execution of a large set of simple computations that occur at each local data path. A local data path is the building block of a synchronous circuit and is formed by a collection of combinational logic blocks between two clocked storage elements. The data signal is processed within a local data path once per clock cycle: the data signal originating from a storage element is processed in the combinational logic block and is stored in the next stage of storage elements, ready for the next clock cycle.
[Figure 2.2 labels: registers $R_i$ and $R_f$, each with data input D, clock input C and output Q; data in $X_i$ and output $Q_i$ of $R_i$; combinational logic; data input $X_f$ and data out $Q_f$ of $R_f$; clocks $C_i$ and $C_f$.]
Figure 2.2 A local data path in a globally clocked synchronous circuit network.
Definition 1: Local data path. Let $R_i$ and $R_f$ be two registers in the clocked circuit network, where $i$ and $f$ stand for initial and final, respectively. Let the input and output terminals of these registers, and the signals present at these terminals, be defined as shown in Figure 2.2. A local data path is the circuit architecture formed by a sequentially adjacent pair of registers and the combinational logic block between them. The output $Q_i$ of the initial register propagates through the combinational logic block, evaluating to the data signal $X_f$, before $X_f$ arrives at the final register $R_f$. For proper operation of the sequentially adjacent pair of registers $R_i$ and $R_f$, the data stored in $R_i$ must be manipulated by the logic and stored into $R_f$ during the next cycle of the clock signal $C_f$. Note that the clock signals $C_i$ and $C_f$ denote the two synchronization signals at the clock input terminals of the registers $R_i$ and $R_f$, respectively. The clock signals $C_i$ and $C_f$ differ by the nonidentical clock signal delays at the respective registers.
The modeling and behavior of synchronous digital systems are described in the rest of this section.
Specifically, the operation of a synchronous system with level-sensitive latches is presented in Section 2.1.
The modeling of a sequential circuit as a graph is briefly reviewed in Section 2.2. Different synchronization
schemes adopted for synchronous circuits are briefly introduced in Section 2.3.
2.1 Operation of a Synchronous System with Registers
Because of interconnect delays and process parameter variations, there may be significant time delays between the clock signals originating at the clock source and arriving at the destination registers. The algebraic difference between the delays of the synchronizing clock signals of the initial and final registers of a local data path is defined as clock skew:
Definition 2: Clock Skew. Let $R_i$ and $R_f$ be a sequentially adjacent pair of registers (only combinational delay between the registers) synchronized by the clock signals $C_i$ and $C_f$, respectively. The clock skew between $R_i$ and $R_f$ is defined as
$$T_{Skew}(i,f) = t_i - t_f, \qquad (2\text{-}1)$$
where $t_i$ and $t_f$ are the delays of the clock signals $C_i$ and $C_f$ from a common clock source to the registers $R_i$ and $R_f$, respectively.
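As a small illustration of Definition 2, the following sketch evaluates Equation (2-1) for a few register pairs. The register names, clock delay values and the helper `clock_skew` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple


def clock_skew(delays: Dict[str, float], r_i: str, r_f: str) -> float:
    """Return T_Skew(i, f) = t_i - t_f for a sequentially adjacent register pair."""
    return delays[r_i] - delays[r_f]


# Hypothetical clock signal delays (time units) from the common clock source.
clock_delays: Dict[str, float] = {"R1": 1.5, "R2": 0.5, "R3": 2.0}

# Hypothetical local data paths as (initial register, final register) pairs.
local_paths: List[Tuple[str, str]] = [("R1", "R2"), ("R2", "R3"), ("R3", "R1")]

# The skew is positive, negative or zero depending on the two clock delays.
skews = {path: clock_skew(clock_delays, *path) for path in local_paths}
```

Because each skew is a difference of two delays, the skews along any closed loop of local data paths sum to zero, which is one reason why skews cannot be chosen independently on every path.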
The clock skew is an algebraic difference which may evaluate to a negative, zero or positive value depending on the values of $t_i$ and $t_f$. Positive clock skew has a limiting effect on the maximum operating frequency of a synchronous circuit. Negative clock skew, on the other hand, may effectively reduce the minimum clock period of a circuit. Precise engineering of the clock tree network enables the utilization of negative skew on critical paths and permits positive clock skew on less-critical paths in order to speed up the data propagation. This approach is called clock skew scheduling.
Definition 3: Clock skew scheduling. Clock skew scheduling is a methodology to determine the optimal values of the clock signal delays $t_k$ to each register $R_k$ in order to obtain the maximum operating frequency. Clock skew scheduling provides shorter minimum clock periods on critical paths. Note that, in general, the term clock skew scheduling refers to non-zero clock skew scheduling.
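The idea behind Definition 3 can be sketched for the simpler edge-sensitive case. The example below is not the LP formulation developed in this thesis. It assumes the commonly used simplified flip-flop constraints $t_i - t_f \le T - D_{max}$ (no zero clocking) and $t_i - t_f \ge -D_{min}$ (no double clocking), with setup, hold and register delays neglected. Under that assumption the constraints are difference constraints, so feasibility for a given period can be checked with Bellman-Ford relaxation and the minimum period found by bisection. All names and delay values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (initial register, final register, minimum path delay, maximum path delay)
Path = Tuple[str, str, float, float]


def feasible_delays(regs: List[str], paths: List[Path],
                    period: float) -> Optional[Dict[str, float]]:
    """Return clock delays meeting the simplified constraints, or None."""
    # A constraint t_a - t_b <= w becomes an edge b -> a with weight w.
    edges = []
    for i, f, d_min, d_max in paths:
        edges.append((f, i, period - d_max))  # t_i - t_f <= T - D_max
        edges.append((i, f, d_min))           # t_f - t_i <= D_min
    dist = {r: 0.0 for r in regs}  # implicit source connected to every register
    for _ in range(len(regs)):
        changed = False
        for b, a, w in edges:
            if dist[b] + w < dist[a] - 1e-12:
                dist[a] = dist[b] + w
                changed = True
        if not changed:
            return dist  # all difference constraints are satisfied
    return None  # still relaxing after |V| passes: negative cycle, infeasible


def min_period(regs: List[str], paths: List[Path],
               tol: float = 1e-6) -> Tuple[float, Dict[str, float]]:
    """Bisection on the clock period; zero skew is feasible at max D_max."""
    lo, hi = 0.0, max(p[3] for p in paths)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if feasible_delays(regs, paths, mid) is not None:
            hi = mid
        else:
            lo = mid
    schedule = feasible_delays(regs, paths, hi)
    assert schedule is not None
    return hi, schedule


regs = ["R1", "R2", "R3"]
paths = [("R1", "R2", 2.0, 6.0), ("R2", "R3", 1.0, 2.0), ("R3", "R1", 1.0, 4.0)]
period, delays = min_period(regs, paths)
zero_skew_period = max(p[3] for p in paths)
```

For this hypothetical loop of three registers, the zero-skew period is set by the slowest path (6 time units), while the scheduled period approaches the average of the maximum path delays around the loop (4 time units). Only differences between the returned delays matter, so they can be shifted by a constant to make them non-negative.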
Excessive negative or positive clock skew may lead to timing hazards in the circuit. Negative skew may cause data to be latched into the final register $R_f$ during an earlier clock cycle than intended, thereby overwriting the data latched during the earlier clock cycle. This type of hazard is known as double clocking. Similarly, positive skew may cause data to be lost by arriving late at the final register. This phenomenon is known as zero clocking. The double clocking and zero clocking hazards are also called hold and setup time violations, respectively.
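The two hazards can be illustrated with a short check for a single local data path between edge-triggered registers. The inequalities used here are the usual simplified setup and hold conditions, assumed for illustration only (register internal delays are neglected), and are not the level-sensitive constraints derived later in this thesis. The function name and the numbers are hypothetical.

```python
from typing import List


def clocking_hazards(t_i: float, t_f: float, d_min: float, d_max: float,
                     period: float, setup: float = 0.0,
                     hold: float = 0.0) -> List[str]:
    """List the hazards on one local data path under simplified conditions."""
    hazards = []
    # Zero clocking (setup violation): the latest data arrives after the
    # capturing clock edge of the next cycle at the final register.
    if t_i + d_max + setup > t_f + period:
        hazards.append("zero clocking")
    # Double clocking (hold violation): the earliest new data arrives before
    # the final register has finished capturing the previous data.
    if t_i + d_min < t_f + hold:
        hazards.append("double clocking")
    return hazards


# Path delays between 1 and 4 time units, clock period of 5 time units.
positive_skew = clocking_hazards(t_i=2.0, t_f=0.0, d_min=1.0, d_max=4.0, period=5.0)
negative_skew = clocking_hazards(t_i=0.0, t_f=2.0, d_min=1.0, d_max=4.0, period=5.0)
zero_skew = clocking_hazards(t_i=0.0, t_f=0.0, d_min=1.0, d_max=4.0, period=5.0)
```

With these values, a skew of +2 time units produces zero clocking, a skew of -2 time units produces double clocking, and zero skew is hazard-free.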
Data propagation in level-sensitive circuits is substantially different compared to edge-sensitive syn-
chronous circuits due to the transparency property of latches. In level-sensitive circuits, the data signal
arriving at a latch in the transparent phase is immediately propagated through the latch. This fact leads to
the phenomenon called time borrowing.
Definition 4: Time borrowing. Time borrowing (also called cycle stealing) refers to the time
sharing phenomenon between consecutive clock cycles of adjacent local data paths due to the transparency
of level-sensitive latches. Let $R_i \rightsquigarrow R_j$ and $R_j \rightsquigarrow R_k$ be two local data paths in a
synchronous circuit $S$. Let $S_{FF}$ and $S_L$ denote the edge-sensitive (flip-flop-based) and level-sensitive
(latch-based) synchronous circuits, respectively. In the edge-sensitive synchronous circuit $S_{FF}$, the upper
bound on the data propagation time on the local data paths $R_i \rightsquigarrow R_j$ and $R_j \rightsquigarrow R_k$ is the
minimum clock period $T_{FF}$. In the level-sensitive circuit $S_L$, the transparency property of the latch $R_j$
permits data propagation times higher than the minimum clock period $T_L$ on the local data path
$R_i \rightsquigarrow R_j$ by borrowing time from the propagation on the next local data path (in the next clock cycle),
$R_j \rightsquigarrow R_k$. It is known that unless the circuit topology of the edge-sensitive circuit $S_{FF}$ or the
level-sensitive circuit $S_L$ is modified, the data propagation times in both circuits are identical. Therefore,
a shorter minimum clock period $T_L \le T_{FF}$ is feasible for the level-sensitive circuit $S_L$. This fact is
illustrated in Figure 2.3, where the timing diagrams for $S_L$ and $S_{FF}$ are shown on the left and right,
respectively. A detailed investigation of the time borrowing phenomenon is available in the literature.
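As a rough numerical illustration of Definition 4, the following Python sketch compares a flip-flop-based and a latch-based version of the two-path chain $R_i \rightsquigarrow R_j \rightsquigarrow R_k$. It rests on simplifications that are not part of the definition: register internal delays, setup and hold times and clock skew are all taken as zero, the clock cycle is opaque-phase-first, and the data is assumed to leave $R_i$ at the leading edge of its clock. The function names and the example delays are hypothetical.

```python
def ff_chain_feasible(T, d_ij, d_jk):
    """Flip-flops: every local data path must fit within one clock period."""
    return max(d_ij, d_jk) <= T


def latch_chain_feasible(T, c_l, d_ij, d_jk):
    """Latches: R_j may pass late data on during its transparent phase.

    All times are measured from the start of the local clock cycle, which
    begins with the opaque phase; the leading edge occurs at T - c_l.
    """
    lead = T - c_l                    # leading edge inside a cycle
    arrive_j = lead + d_ij - T        # arrival at R_j, next-cycle reference
    if arrive_j > T:                  # missed the trailing edge of R_j
        return False
    depart_j = max(arrive_j, lead)    # wait for the leading edge if early
    arrive_k = depart_j + d_jk - T    # arrival at R_k, next-cycle reference
    return arrive_k <= T


# Assumed delays: a long path (12) followed by a short path (6).
print(ff_chain_feasible(10, 12, 6))        # flip-flops need T >= 12
print(latch_chain_feasible(10, 5, 12, 6))  # latches tolerate T = 10
```

With these assumed numbers the data reaches $R_j$ two time units into its transparent phase, and the short second path absorbs that borrowed time.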
2.2 Graph Model of a Synchronous Digital VLSI System
A graph model is often used for computer representation of a synchronous digital system. For con-
venience, the graph model representing a synchronous circuit, where each vertex represents a register and
each edge represents a local data path, is called a circuit graph. A circuit graph provides a common abstract
framework for the automated analysis of circuits.
Figure 2.3 Effects of time borrowing on circuit operation. The timing diagrams for the edge-sensitive
circuit $S_{FF}$ and the level-sensitive circuit $S_L$ are shown. The variables $D_P^{ij}$ and $D_P^{jk}$ represent the
data propagation times on the local data paths $R_i \rightsquigarrow R_j$ and $R_j \rightsquigarrow R_k$, respectively. Data
propagation is represented by the arrows. Note that on the local data path $R_i \rightsquigarrow R_j$ of $S_L$, the data
signal arrives at $R_j$ during the transparent phase of $R_j$, borrowing time from the adjacent local data path
$R_j \rightsquigarrow R_k$. For identical data propagation times on adjacent local data paths, a smaller clock period
(higher operating frequency) is possible for $S_L$, that is, $T_L \le T_{FF}$.
Definition 5: Circuit graph. A fully synchronous digital circuit $S$ is represented as the connected
undirected simple graph $G_S$. The graph $G_S$ is the ordered six-tuple
$G_S = \langle V^{(S)}, E^{(S)}, A^{(S)}, h_l^{(S)}, h_u^{(S)}, h_d^{(S)} \rangle$, where
$V^{(S)} = \{v_1, \ldots, v_r\}$ is the set of vertices of the graph $G_S$,
$E^{(S)} = \{e_1, \ldots, e_p\}$ is the set of edges of the graph $G_S$, and
$A^{(S)} = [a_{ij}^{(S)}]_{r \times r}$ is the symmetric adjacency matrix of $G_S$.
Note that $r$ is the number of registers and $p$ is the number of data paths in the synchronous circuit.
Each vertex from $V^{(S)}$ represents a register of the circuit $S$. The mappings
$h_l^{(S)} : E^{(S)} \to \mathbb{R}$ and $h_u^{(S)} : E^{(S)} \to \mathbb{R}$ to the set of real numbers $\mathbb{R}$ assign the
lower and upper permissible bounds $D_{Pm}^{k}$ and $D_{PM}^{k}$, respectively. The lower and upper bounds of the
data propagation time, $D_{Pm}^{k}$ and $D_{PM}^{k}$, are defined between the sequentially-adjacent pair of registers
(on the local data path) represented by the edge $e_k \in E$. The edge labeling $h_d^{(S)}$ defines a direction of
signal propagation for each edge $v_x, e_z, v_y$, where the subscripts $x$ and $y$ denote two arbitrary registers
forming the local data path indicated by the subscript $z$.
The described undirected graph is used as the basis for a directed graph with the same vertex set $V$
and edge set $E$. The topology of the graph is preserved, while the edge set is modified in order to
accommodate the direction of data flow between the vertices. The generated directed graph, where each
vertex corresponds to a register and each directed edge corresponds to a data path, represents a sequential
circuit. The directed graph representation of a sample network is presented in Figure 2.4.

[Figure 2.4 shows the vertices v1, v2, v3, v4 and the edges e12, e32, e42, e34, e13.]
Figure 2.4 A graph representation of a synchronous system. The graph vertices are four different registers,
with five local data paths.
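A circuit graph of this kind maps directly onto a small data structure. The sketch below is one possible Python representation of the sample network of Figure 2.4. The class and function names are hypothetical, the delay bounds are placeholder values rather than data from the text, and the direction of each edge is assumed to follow its label (edge e12 is taken to run from v1 to v2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPath:
    """One local data path (directed edge) with its delay bounds."""
    src: int      # index of the initial register
    dst: int      # index of the final register
    d_min: float  # lower bound D_Pm (placeholder value)
    d_max: float  # upper bound D_PM (placeholder value)


def adjacency(num_registers, paths):
    """Symmetric r x r adjacency matrix of the underlying undirected graph."""
    a = [[0] * num_registers for _ in range(num_registers)]
    for p in paths:
        a[p.src - 1][p.dst - 1] = 1
        a[p.dst - 1][p.src - 1] = 1
    return a


# Edges e12, e32, e42, e34, e13 of Figure 2.4, directions assumed from labels.
PATHS = [
    DataPath(1, 2, 1.0, 3.0),
    DataPath(3, 2, 1.0, 2.0),
    DataPath(4, 2, 0.5, 2.5),
    DataPath(3, 4, 1.0, 4.0),
    DataPath(1, 3, 0.5, 1.5),
]
A = adjacency(4, PATHS)
for row in A:
    print(row)
```

Keeping the delay bounds on the edge objects plays the role of the two bound mappings in Definition 5, while the `src` and `dst` fields play the role of the edge labeling that fixes the direction of signal propagation.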
2.3 Single-Phase Synchronization
As mentioned earlier, the operation of a synchronous circuit is orchestrated by a globally distributed
clock signal. The clock signal ensures the correct ordering of operations on local data paths. Synchronization
in such circuits can be provided by a single-phase or multi-phase clock signal. A single-phase clock signal is a
periodic signal where the phases or cycles are indistinguishable instances referred to a fixed reference cycle.
However, the easy-to-implement and easy-to-analyze single-phase scheme has several shortcomings in the
synchronization of state-of-the-art VLSI circuits. At nanometer-scale feature sizes, the wire sizes shrink
disproportionately with the feature size. Thus, only a certain percentage of the chip is reachable during a
single clock cycle. For instance, for a 0.1 µm feature size, only 16% of the die is predicted to be reachable
within a single clock cycle. A multi-clock domain approach is advantageous in terms of increasing the
reachability of circuit registers, routing and creating less skew within physically neighboring local clock
domains and saving power. Furthermore, individual domains can be designed and operated independently,
which provides finer granularity for frequency and voltage scaling.
Even though multi-phase synchronization is advantageous in many aspects, the analysis of such clocking
schemes presents many challenges. Identifying the clock tree and partitioning it into various clock
domains, identifying signals that work across different clock domains, and identifying data stability and
combinatorial input isolations are common concerns. In the presented work, only the single-phase
synchronization of synchronous circuits is investigated. Figure 2.5 presents a generic representation of
single-phase and multi-phase symmetric clock signals.

[Figure 2.5 panels: single-phase clock signal; multi-phase clock signal with phase latencies
$\phi_1 = 0$, $\phi_2 = T/n$, ..., $\phi_{n-1} = T(n-2)/n$, $\phi_n = T(n-1)/n$.]
Figure 2.5 Single and multi-phase synchronization of a synchronous circuit. In multi-phase synchronization,
the non-overlapping clock phases are defined with identical on-times ($C_L$). The parameter $C^{source}$
denotes the clock signal at the originating clock source. The superscripts $1, \ldots, n$ represent individual
clock phases. Note that the multi-phase clock synchronization is defined for $n > 1$, where $n$ represents the
number of clock phases.
The generic formulation for multi-phase clocking is defined for $n > 1$. The latency of each clock cycle
with respect to the common clock signal is unique and is denoted here by the parameter $\phi_i$. The parameter
$\phi_i$ is the latency of the clock signal at register $R_i$, which is synchronized by clock phase $C^{p_i}$. The symbol
$\phi_{if}$ is called the phase shift operator, which is used to transform timing variables between different cycles
of the clock signal. The phase shift operator $\phi_{if}$ is defined by the algebraic equation
$\phi_{if} = \phi_i - \phi_f + kT$, where $k$ is the number of clock cycles occurring between the clock phases $C^{p_i}$
and $C^{p_f}$. Note that for the single-phase clock methodology, the phase shift operator evaluates to
$\phi_{if} = T$. The extension of the presented timing analysis to accommodate multi-phase or multi-domain
clock synchronization is among the future directions of this research.
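The phase shift operator is a one-line computation, sketched below in Python. The function names are hypothetical, and the helper `phase_latency` assumes the evenly spaced latencies $\phi_p = (p-1)T/n$ that Figure 2.5 appears to indicate for an $n$-phase clock.

```python
def phase_latency(p, n, T):
    """Latency of phase p (1-based) of an n-phase clock, assuming the
    evenly spaced phases phi_p = (p - 1) * T / n."""
    return (p - 1) * T / n


def phase_shift(phi_i, phi_f, k, T):
    """Phase shift operator phi_if = phi_i - phi_f + k * T."""
    return phi_i - phi_f + k * T


# Single-phase clocking: both registers share one phase and k = 1,
# so the operator reduces to the clock period T.
print(phase_shift(0.0, 0.0, 1, 10.0))

# Assumed four-phase example: R_i on phase 1, R_f on phase 3, k = 1.
T = 8.0
print(phase_shift(phase_latency(1, 4, T), phase_latency(3, 4, T), 1, T))
```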
3.0 TIMING PROPERTIES OF SYNCHRONOUS DIGITAL SYSTEMS
The general structure and principles of operation of a fully-synchronous digital VLSI system were de-
scribed in Chapter 2. In an abstract overview, a synchronous circuit is identified by the storage elements
and the synchronization scheme of the circuit. As presented in Section 2.3, the operation of a digital VLSI
circuit is heavily dependent on the synchronization scheme. Furthermore, the registers directly contribute to
the operation of the circuits by storing the data values at the end of each computational cycle and providing
stable data signals.
Registers can be classified into two categories: (edge-triggered) flip-flops are sensitive to changes
at their data input terminals when the clock signal makes a low-to-high or high-to-low transition, while
(level-sensitive) latches are sensitive when the clock signal is at a certain level or value. In level-sensitive
circuits, the active level of the synchronizing clock signal defines the transparent phase and the inactive level
of the clock signal defines the opaque phase of latch operation. The transition of the clock signal which starts
the transparent phase is called the leading edge, while the trailing edge is the transition of the clock signal
which concludes the transparent phase and marks the beginning of the opaque phase. This chapter begins with
the introduction of two different types of storage elements, flip-flops and latches. The principles of operation
of these registers are discussed and the parameters defining their operational characteristics are introduced. In
particular, the operation of edge-triggered flip-flops is discussed in Section 3.1 and the operation of
level-sensitive latches is discussed in Section 3.2.
3.1 Parameters of an Edge-Sensitive Synchronous Circuit
The specific circuit design or electrical implementation of an edge-triggered flip-flop need not be con-
sidered in this work. At a higher level of abstraction, the timing properties of flip-flops are encapsulated by
certain timing parameters. These parameters connect the events on the input, output and clock terminals of
a flip-flop.
A flip-flop is a type of register which is sensitive to the transition of the synchronizing clock signal.
Therefore, a flip-flop is commonly referred to as an edge-triggered flip-flop or edge-triggered register. A
typical edge-triggered flip-flop with a clock signal $C$, input signal $D$ and output signal $Q$ is shown in
Figure 3.1. The operation of a flip-flop is presented in Figure 3.2. Note that the presented flip-flop is a
positive-edge triggered flip-flop, whose data output latches the input signal when the clock signal makes a
low-to-high transition. The sensitive region for a flip-flop, where the register latches the data, is indicated
by the shaded region in Figure 3.2.

Figure 3.1 An edge-triggered flip-flop or register symbol.
Figure 3.2 Typical operation of the edge-triggered flip-flop shown in Figure 3.1.
The parameters $H_f$, $S_f$, $D_{CQ}$ and $C_L$, which stand for the hold time, setup time, clock-to-output delay
and clock on-time, respectively, are introduced briefly. Hold time is the minimum time that the data signal $D$
must remain stable after the latching edge of the clock signal so that it is registered in the intended clock
cycle. In Figure 3.3, the value $H_f = t_4 - t_3$ labels the hold time for the given clock cycle on $R_f$. Setup time
is the minimum time between a change in $D_f$ and the latching edge of the respective clock cycle such that
the new data value can be registered in the intended clock cycle. The setup time on $R_f$ is illustrated with
$S_f = t_3 - t_2$. The propagation delay of the data signal from the input terminal to the output terminal after
the active transition of the clock signal, the clock-to-output delay, is shown as $t_5 - t_3$. The subscripts m and
M appended to the parameter $D_{CQ}$ stand for the minimum and maximum delay values, respectively.

Figure 3.3 Timing properties of an edge-sensitive flip-flop in a circuit with a clock period $T$.
The operation of the final flip-flop $R_f$ of a local data path is illustrated.

A typical clock cycle of a clock signal is shown in Figure 3.3. The length of the clock period is denoted
by the parameter $T$. The minimum time interval between the leading and trailing edges of the clock signal
is represented by the parameter $C_L$. In Figure 3.3, the leading and trailing edges occur at times $t_3$ and $t_6$,
respectively. As pointed out in Section 2.2, the data propagation time $D_P$ through the combinational logic
block of a local data path $R_i \rightsquigarrow R_f$ is bounded as $D_{Pm} \le D_P \le D_{PM}$. The subscripts m and M stand
for the minimum and maximum values, respectively.
3.2 Parameters of a Level-Sensitive Synchronous Circuit
The specific circuit design or electrical implementation of a level-sensitive latch need not be considered
in this work. At a higher level of abstraction, the timing properties of such latches are encapsulated by
certain timing parameters. These parameters connect the events on the input, output and clock terminals of
a level-sensitive latch.
A latch is a type of register which is sensitive to the level of the synchronizing clock signal. Therefore, a
latch is commonly referred to as a level-sensitive latch or level-sensitive register. A typical level-sensitive
latch with a clock signal $C$, input signal $D$ and output signal $Q$ is shown in Figure 3.4. The operation of
the level-sensitive latch is presented in Figure 3.5. Note that the presented latch is a positive level-sensitive
latch, whose data output follows any change in the input signal while the clock signal remains at its positive
value or level. As stated before, the state of the level-sensitive latch in which any change in the input signal
is propagated to the output is called the transparent state or phase. In Figure 3.5, the transparent state is
indicated by the shaded region on the timing diagram.

Figure 3.4 A level-sensitive latch or register symbol.

Figure 3.5 Typical operation of the level-sensitive latch shown in Figure 3.4.
Figure 3.6 Timing properties of a level-sensitive latch in a circuit with a clock period $T = t_8 - t_1$. The
operation of the final latch $R_f$ of a local data path is illustrated.

The parameters $H_f$, $S_f$, $D_{DQ}$, $D_{CQ}$ and $C_L$, which stand for the hold time, setup time, data-to-output
delay, clock-to-output delay and clock on-time, respectively, are introduced briefly. Hold time is the minimum
time that the data signal $D$ must remain stable after the trailing edge of the clock signal so that it is latched
during the intended clock cycle. In Figure 3.6, the value $H_f = t_2 - t_1$ labels the hold time for the given clock
cycle on the final register $R_f$ of a local data path. Setup time is the minimum time between a change in $D_f$
and the trailing edge of the respective clock cycle such that the new data value can be latched in the
intended clock cycle. The setup time on $R_f$ is illustrated with $S_f = t_8 - t_7$. The propagation delay of the
latch from the data input terminal to the output terminal, the data-to-output delay, on $R_f$ is shown as
$t_6 - t_5$. The propagation delay of the latch from the clock input terminal to the output terminal, the
clock-to-output delay, is shown as $t_4 - t_3$. The subscripts m and M appended to the parameters $D_{DQ}$ and
$D_{CQ}$ stand for the minimum and maximum delay values, respectively.
Without affecting the generality of the presented work, the formulation of the timing constraints is
derived for a specific reference clock cycle. The reference clock cycle can be selected as starting with the
inactive value of the clock signal followed by the active value (opaque-phase-first) or vice versa
(transparent-phase-first). In this work, the timing constraints are formulated considering an opaque-phase-first
clock signal driving positive level-sensitive latches. A typical clock cycle of such a clock signal is shown in
Figure 3.6. The length of the clock period is denoted by the parameter $T$. The minimum time interval
between the leading and trailing edges of the clock signal, commonly called the on-time and defining the
transparent phase, is represented by the parameter $C_L$. In Figure 3.6, the leading and trailing edges occur
at times $t_3$ and $t_8$, respectively.
The data propagation time $D_P$ through the combinational logic block of a local data path $R_i \rightsquigarrow R_f$
was defined in Section 2.2 (recall that $D_{Pm} \le D_P \le D_{PM}$). The propagation of the data signal through
the data path $R_i \rightsquigarrow R_f$ is formulated during two consecutive clock cycles, generically called the $k$-th and
$(k+1)$-th. The data signal departs from $R_i$ during the $k$-th clock cycle, is processed in the combinational
logic block and arrives at the destination register $R_f$ during the $(k+1)$-th clock cycle. Note that the departure
of the data signal from $R_i$ occurs during the transparent phase of the $k$-th cycle. The arrival of the data signal
at $R_f$ can occur during either the transparent or the opaque phase of the $(k+1)$-th cycle. If the data signal
arrives during the transparent phase, it is immediately propagated through the latch $R_f$. If the data signal
arrives during the opaque phase of $R_f$, the data signal has to remain stable until the latching edge of the
clock signal (beginning of the transparent phase) to propagate through the latch $R_f$.
4.0 TIMING ANALYSIS OF LEVEL-SENSITIVE SYNCHRONOUS CIRCUITS
Digital VLSI synchronous circuits are subject to different types of timing analyses. The presence of
a globally distributed synchronization signal constitutes a merit of comparison between any given sets of
synchronous circuits. This fact has been widely used in the electronics industry. General timing analysis
of such circuits, however, spans more detailed analyses of the circuit performance and has been utilized
for various engineering purposes. Timing analysis of synchronous circuits has traditionally been studied
through three different problems: clock period minimization, clock period verification and clock retiming.
Clock period minimization is the analysis of a synchronous circuit in order to solve for the minimum clock
period, in other words the maximum operating frequency, of the circuit. Clock period verification is the
analysis to ensure that a synchronous circuit is fully operational for a given clock period. Clock retiming,
also called circuit retiming, is the analysis of a synchronous circuit aiming to achieve higher operating
frequencies by modifying the circuit architecture.
Even though there are different types of timing analysis problems, the operation of the synchronous
circuit under scrutiny is generally identical in all cases (possibly except for retiming problems). Thus, in
the formulation of the timing analysis problem, a framework of constraints identifying synchronous circuit
operation is essential. For instance the setup and hold time requirements of each register element in a
synchronous circuit pose certain constraints on circuit operation, thus they are represented in all types of
timing analysis problems.
The generation of a general framework for the timing analysis of level-sensitive circuits is discussed in
this chapter. The clock period minimization problem is modeled and the generated problem is later solved
in Chapter 5. In Section 4.1, the operational constraints governing level-sensitive operation are introduced.
In Section 4.2, a brief overview of previously offered algorithms for the clock period minimization
problem is given. The constructional constraints defined for the novel linear programming model of the
clock period minimization problem are presented in Section 4.3.
4.1 Problem Formulation
Certain conditions must be satisfied for every sequentially adjacent pair of registers in a synchronous
circuit in order to prevent timing hazards. These conditions are encapsulated by four sets of operational
constraints and two sets of constructional constraints. The operational constraints are the constraints
that model the operation of a synchronous circuit. The constructional constraints are defined to
ensure the correctness and completeness of the proposed model of the optimization problem. The
definitions for the first three sets of operational constraints, called the latching, synchronization and
propagation constraints, respectively, are borrowed from the literature. The fourth set of operational
constraints, called the skew constraints, is derived from the skew definitions for edge-sensitive synchronous
circuits. The latching, synchronization, propagation and skew constraints are described in
Sections 4.1.1, 4.1.2, 4.1.3 and 4.1.4, respectively.
4.1.1 Latching Constraints
Latching constraints bound the arrival time of the data signal $D_f$ (recall the local data path in Figure 2.2)
in order to ensure that $D_f$ is latched during the intended clock cycle. The earliest arrival time of $D_f$ at the
data input terminal of $R_f$ is denoted by the parameter $a_f$. Similarly, the latest arrival time of $D_f$ is denoted
by $A_f$. Both parameters are defined in the frame of reference of the native clock cycle, that is, relative to the
beginning of the current clock cycle. In Figure 4.1 for instance, the earliest data arrival at $R_f$ occurs at time
$t_f + kT + a_f$ with respect to global time zero ($t_f + kT$ is the beginning of the current clock cycle). The
interval for the data arrival time is characterized by the hold time and the setup time requirements of $R_f$ as
follows:

$$H_f \le a_f$$    (4-1)
$$A_f \le T - S_f.$$    (4-2)

Eq. (4-1) above constrains the earliest arrival of $D_f$ at $R_f$. The earliest data arrival time must be no earlier
than the hold time after the trailing edge of the previous clock cycle ($t_2$ in Figure 3.6). Suppose the $(k+1)$-th
clock cycle at latch $R_f$ is illustrated in Figure 3.6, where $t_1 = t_f + kT$. The hold time is defined by the
difference $t_2 - t_1$. If data arrives at $R_f$ earlier than the hold time, a double-clocking hazard occurs.

Figure 4.1 Propagation of the data signal in a simple circuit. Note that two local data paths starting at
the latches $R_{i_1}$ and $R_{i_2}$ and ending at $R_f$ are considered. The time intervals for the arrival and departure
times of the data signal are illustrated by the upper and lower parallel dotted lines, respectively. The lengths
of the white and black rectangular boxes correspond to the clock-to-output and data-to-output latch delays,
respectively.

Similarly, Eq. (4-2) represents the setup constraint on $R_f$. As shown in Figure 3.6, the data must arrive
at the final latch at least the setup time prior to the trailing edge of the clock cycle. Assuming the $(k+1)$-th
clock cycle is illustrated in Figure 3.6, the trailing edge of the clock cycle occurs at $t_8 = t_f + (k+1)T$.
Thus, data cannot be latched into $R_f$ during the $(k+1)$-th cycle if the data arrives later than
$t_7 = t_f + (k+1)T - S_f$. Late arrival of the data signal results in a zero clocking hazard
as previously explained.
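The two latching constraints translate into a pair of comparisons. The following Python sketch is a minimal check of Eqs. (4-1) and (4-2) for one latch; the function name and the example values are hypothetical.

```python
def latching_violations(a_f, A_f, T, S_f, H_f):
    """Check Eqs. (4-1) and (4-2) for the final latch R_f.

    a_f, A_f: earliest and latest arrival times, relative to the start
              of the current clock cycle.
    Returns a dict flagging hold (double clocking) and setup (zero
    clocking) violations.
    """
    return {
        "hold": a_f < H_f,        # Eq. (4-1) violated: data arrives too early
        "setup": A_f > T - S_f,   # Eq. (4-2) violated: data arrives too late
    }


print(latching_violations(a_f=3, A_f=8, T=10, S_f=1, H_f=1))
print(latching_violations(a_f=0.5, A_f=9.5, T=10, S_f=1, H_f=1))
```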
4.1.2 Synchronization Constraints
Synchronization constraints define the departure time of the data signal $Q_i$ from the initial latch of a
local data path as illustrated in Figure 4.2. The departure time from a latch depends on the state of the
latch, transparent or opaque. Implementation-specific register internal delays, $D_{DQ}$ and $D_{CQ}$, affect the
departure times in the transparent and opaque states of operation, respectively. The earliest departure time
$d_i$ of $Q_i$ from $R_i$ is defined in Eq. (4-3). The latest departure time $D_i$ is defined by Eq. (4-4):

$$d_i = \max\left( a_i + D_{DQm}^{i},\; T - C_L + D_{CQm}^{i} \right)$$    (4-3)
$$D_i = \max\left( A_i + D_{DQM}^{i},\; T - C_L + D_{CQM}^{i} \right).$$    (4-4)

An exhaustive inspection of all possible cases of earliest and latest departure times during the $k$-th clock
cycle is shown in Figure 4.2.
Consider Eq. (4-3), which describes the earliest departure time of the data signal $Q_i$ from latch $R_i$. The
first term of the max function, $(a_i + D_{DQm}^{i})$, describes the case in which the input data arrives at its
earliest time during the active phase of the clock signal $C_i$. The data signal immediately propagates
through the latch (as illustrated in cases I and VIII of Figure 4.2). In these cases, the earliest departure time
$d_i$ from $R_i$ depends on the earliest arrival time $a_i$ of the data signal and the time $D_{DQ}^{i}$ it takes for the
data to appear at the output terminal of $R_i$.
The second term of the max function, $(T - C_L + D_{CQm}^{i})$, refers to the case when the earliest data
arrival time occurs during the opaque phase of $R_i$. In the opaque phase of operation, the departure of
the data signal from the initial latch occurs a clock-to-output delay $D_{CQ}^{i}$ later than the leading edge of the
clock signal. Such data propagation is illustrated in cases II-VII of Figure 4.2. The max function is used to
combine these cases and to define the earliest departure time $d_i$ from the initial latch $R_i$. Similar reasoning
applies to the derivation of the latest departure time $D_i$ defined by Eq. (4-4).
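Eqs. (4-3) and (4-4) can be evaluated directly, as in the Python sketch below. The function name, the tuple layout of the delay arguments and the example numbers are assumptions made for illustration only.

```python
def departure_times(a_i, A_i, T, C_L, d_dq, d_cq):
    """Earliest and latest departure times from latch R_i.

    d_dq, d_cq: (min, max) data-to-output and clock-to-output delays.
    All times are relative to the start of the local clock cycle, which
    begins with the opaque phase; the leading edge is at T - C_L.
    """
    lead = T - C_L
    d_i = max(a_i + d_dq[0], lead + d_cq[0])  # Eq. (4-3)
    D_i = max(A_i + d_dq[1], lead + d_cq[1])  # Eq. (4-4)
    return d_i, D_i


# Earliest arrival (3) falls in the opaque phase, so the leading edge at
# T - C_L = 6 governs d_i; latest arrival (7) falls in the transparent
# phase, so the data-to-output delay governs D_i.
print(departure_times(a_i=3, A_i=7, T=10, C_L=4, d_dq=(1, 2), d_cq=(1, 2)))
```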
4.1.3 Propagation Constraints
Propagation constraints define the arrival time of the data signal $D_f$ at the final latch $R_f$ of a local data
path. These constraints are as follows:

$$a_f = \min_i \left[ d_i + D_{Pm}^{if} + T_{skew}(i,f) \right] - T$$    (4-5)
$$A_f = \max_i \left[ D_i + D_{PM}^{if} + T_{skew}(i,f) \right] - T.$$    (4-6)

[Figure 4.2 panels: Case I through Case VIII, each showing the $k$-th clock cycle between $t_i + (k-1)T$ and $t_i + kT$.]
Figure 4.2 Possible cases for the timing relationships among arrival and departure times for the data signal
at the latch $R_i$. The time intervals for the arrival and departure times are illustrated by the upper and lower
parallel dotted lines, respectively (the left and right ends of these dotted lines correspond to the earliest and
latest times, respectively). The lengths of the white and black rectangular boxes correspond to the
clock-to-output and data-to-output latch delays, respectively. Note that cases V through VIII may exhibit
clocking hazards as explained in the text.

For each incoming path to latch $R_f$, the lower bound for $a_f$ is individually calculated using the expression
$\left[ d_i + D_{Pm}^{if} + T_{skew}(i,f) - T \right]$. The minimum of the arrival times among the incoming data paths is
assigned as the earliest arrival time at $R_f$. The latest arrival time $A_f$ for the data signal is defined similarly.
In case of multiple data paths fanning into $R_f$, the maximum of the arrival times among the incoming data
paths is the latest arrival time of the data signal at $R_f$. These two facts are implied in the formulation by the
inclusion of the min and max functions in Eqs. (4-5) and (4-6), respectively.
The propagation constraints are illustrated on a sample synchronous circuit in Figure 4.1. The earliest
arrival time is illustrated on the data path $R_{i_1} \rightsquigarrow R_f$. The data signal departs from $R_{i_1}$ at time $d_{i_1}$
and propagates on the data path $R_{i_1} \rightsquigarrow R_f$ for a time period of $D_{Pm}^{i_1 f}$. On this data path, the recorded
earliest data arrival time $\left[ d_{i_1} + D_{Pm}^{i_1 f} + T_{skew}(i_1,f) - T \right]$ is earlier than the arrival time
$\left[ d_{i_2} + D_{Pm}^{i_2 f} + T_{skew}(i_2,f) - T \right]$ recorded on the only other incoming path to $R_f$,
$R_{i_2} \rightsquigarrow R_f$. Hence, the earliest data arrival time $a_f$ at $R_f$ is defined by the propagation on the
$R_{i_1} \rightsquigarrow R_f$ data path. Similarly, on the data path $R_{i_2} \rightsquigarrow R_f$, a maximum data propagation time of
$D_{PM}^{i_2 f}$ elapses, yielding the latest data arrival time at $R_f$,
$\left[ A_f = D_{i_2} + D_{PM}^{i_2 f} + T_{skew}(i_2,f) - T \right]$.
The departure of $Q_i$ and the arrival of $D_f$ must occur during two consecutive clock cycles for proper
circuit operation. The phase shift operator $\phi_{if} = T$ is subtracted from the calculated arrival time in order to
change the point of reference of the data arrival time at $R_f$ from the beginning of the previous ($k$-th) clock
cycle to the beginning of the current ($(k+1)$-th) clock cycle.
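The fan-in handling of Eqs. (4-5) and (4-6) can be sketched in Python as follows. The function name and the tuple layout describing each incoming path are hypothetical, and the numbers are arbitrary example values rather than data from Figure 4.1.

```python
def arrival_times(fanin, T):
    """Earliest and latest arrival times at the final latch R_f.

    fanin: one tuple per incoming data path R_i -> R_f, laid out as
           (d_i, D_i, D_Pm, D_PM, T_skew); this layout is an assumption.
    Subtracting T moves the reference to the start of the next cycle.
    """
    a_f = min(d + p_min + sk for d, _, p_min, _, sk in fanin) - T  # Eq. (4-5)
    A_f = max(D + p_max + sk for _, D, _, p_max, sk in fanin) - T  # Eq. (4-6)
    return a_f, A_f


# Two incoming paths: the first sets the earliest arrival time,
# the second sets the latest arrival time.
paths_into_f = [
    (7, 9, 4, 6, 0),  # from R_i1
    (6, 8, 5, 9, 1),  # from R_i2
]
print(arrival_times(paths_into_f, T=10))
```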
4.1.4 Skew Constraints
Skew constraints introduce lower and upper bounds on clock skew on a local data path:

$$A_i + D_{DQM}^{i} + D_{PM}^{if} \le 2T - T_{skew}(i,f) - S_f$$    (4-7)
$$D_{CQM}^{i} + D_{PM}^{if} \le T + C_L - T_{skew}(i,f) - S_f$$    (4-8)
$$\max\left( a_i + D_{DQm}^{i},\; T - C_L + D_{CQm}^{i} \right) + D_{Pm}^{if} \ge T - T_{skew}(i,f) + H_f.$$    (4-9)

Presence of clock skew in level-sensitive synchronous circuits significantly affects the system timing. The
latching [Eqs. (4-1) and (4-2)], synchronization [Eqs. (4-3) and (4-4)] and propagation [Eqs. (4-5) and
(4-6)] constraints presented previously are derived considering the presence of non-zero clock skew in the
clock tree network. These three sets of constraints naturally impose lower and upper bounds on clock skew.
Thus, the skew constraints are redundant if a typical clock period minimization problem is pursued.
However, if implementation-specific constraints modify or suppress any of the given constraints such that
the bounds on clock skew are invalidated, the skew constraints are essential to the correct analysis of the
circuit. In this sense, the skew constraints complete the timing analysis framework for level-sensitive
circuits.
The effects of clock skew on synchronous circuit operation can be derived from the latching [Eqs. (4-1)
and (4-2)], synchronization [Eqs. (4-3) and (4-4)] and propagation [Eqs. (4-5) and (4-6)] constraints. Note
that, for a given incoming data path $R_i \rightsquigarrow R_f$, the variable $A_f$ constrained in Eq. (4-2) can be expressed
as follows:

$$A_f = D_i + D_{PM}^{if} + T_{skew}(i,f) - T.$$    (4-10)

Substituting Eq. (4-4) in Eq. (4-10), then substituting the result into Eq. (4-2) leads to

$$\max\left( A_i + D_{DQM}^{i},\; T - C_L + D_{CQM}^{i} \right) + D_{PM}^{if} + T_{skew}(i,f) - T \le T - S_f,$$    (4-11)

which, since an upper bound on a max function must hold for each of its arguments, is conveniently
represented by the first two sets of skew constraints [Eqs. (4-7) and (4-8)].
Eq. (4-1) must hold to prevent the early arrival of the data signal, where $a_f$ depends on $d_i$ as implied by
Eq. (4-5):

$$a_f = d_i + D_{Pm}^{if} + T_{skew}(i,f) - T.$$    (4-12)

Eq. (4-12) also depends upon whether the data signal arrives before or during the transparent state of the
latch. Substituting Eq. (4-3) into Eq. (4-12), Eq. (4-12) into Eq. (4-1) and rearranging the terms lead to the
last set of the skew constraints, Eq. (4-9). Eq. (4-9), re-written in Eq. (4-13), is a non-linear skew constraint,
as the elimination of the max function is not straightforward:

$$H_f \le \max\left( a_i + D_{DQm}^{i},\; T - C_L + D_{CQm}^{i} \right) + D_{Pm}^{if} + T_{skew}(i,f) - T.$$    (4-13)

As stated before, the skew constraints [Eqs. (4-7), (4-8) and (4-9)] are redundant in the formulation of a
typical clock period minimization problem, as these constraints are derived from the existing set of constraints
[Eqs. (4-1)–(4-6)]. The skew constraints are not included in the LP model presented in Section 5.2, but are
used in the verification of the proposed solution method in Section 6.3.
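The three skew constraints can be checked numerically for a single local data path, as in the following Python sketch. The function name, argument layout and example values are hypothetical; the comparisons follow Eqs. (4-7), (4-8) and (4-9) as written above.

```python
def skew_constraints(a_i, A_i, T, C_L, d_dq, d_cq, d_p, t_skew, S_f, H_f):
    """Evaluate Eqs. (4-7), (4-8) and (4-9) for one path R_i -> R_f.

    d_dq, d_cq, d_p: (min, max) pairs for the data-to-output delay, the
    clock-to-output delay and the combinational propagation delay.
    Returns one boolean per constraint (True means satisfied).
    """
    c47 = A_i + d_dq[1] + d_p[1] <= 2 * T - t_skew - S_f
    c48 = d_cq[1] + d_p[1] <= T + C_L - t_skew - S_f
    earliest_departure = max(a_i + d_dq[0], T - C_L + d_cq[0])
    c49 = earliest_departure + d_p[0] >= T - t_skew + H_f
    return c47, c48, c49


common = dict(a_i=3, A_i=7, T=10, C_L=4, d_dq=(1, 2), d_cq=(1, 2),
              d_p=(4, 6), S_f=1, H_f=1)
print(skew_constraints(t_skew=0, **common))   # all three hold
print(skew_constraints(t_skew=-2, **common))  # Eq. (4-9) is violated
```

In the second call the negative skew tightens only the hold-related bound of Eq. (4-9), which matches the earlier observation that negative skew is associated with double clocking.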
4.2 Iterative Solution Approach
The operational constraints (Section 4.1) provide a system of equations defining the abstract operation
of a level-sensitive synchronous circuit. Different versions of the constraints presented in Section 4.1 have
been used by designers in order to develop timing analysis models for level-sensitive circuits.
The most widely adopted and studied timing analysis approach is presented in a series of papers. The
timing analysis approach presented in this series of papers involves several algorithms targeting clock period
verification and minimization problems, all based on the framework of equations described in Section 4.1.
// initialize the latch arrival times
for i = 1 to |r| {
    A_i^prev = a_i^prev = -inf;
}
// iterate the evaluation of the departure and arrival time equations
// until convergence or a maximum of |r| iterations
iter = 0;
repeat {
    iter = iter + 1;
    // update the latch departure times based on the latch arrival times
    // computed in the previous iteration
    for i = 1 to |r| {
        D_i = max( A_i^prev, phi_i + D_i );
        d_i = max( a_i^prev, phi_i + d_i );
    };
    // update the latch arrival times based on the just-computed
    // latch departure times
    for i = 1 to |r| {
        A_i = max_j ( D_j + D_PM );
        a_i = min_j ( d_j + D_Pm );
    };
} until ( ( ( A_i = A_i^prev ) && ( a_i = a_i^prev ) ) || ( iter + 1 > |r| ) );
// check and record setup and hold violations
for i = 1 to |r| {
    SetupVio[i] = A_i > T - S_i + d_i;
    HoldVio[i] = a_i < H_i + D_i;
};

Figure 4.3 The iterative algorithm offered in the literature. Note that $r$ is the number of registers in the
synchronous circuit. The $a$, $d$, $A$ and $D$ vectors are the earliest arrival/departure and latest
arrival/departure times, respectively, where the superscript prev identifies the value of a variable in the
previous clock cycle. The variables SetupVio and HoldVio hold the timing violation information for each
register.
The algorithms proposed in these papers are iterative algorithms. In particular, very small values are
assigned to the timing variables of a circuit and the circuit is investigated for timing violations by iteratively
incrementing the values of the timing variables. The iterative algorithm offered in [18] for the clock period
minimization problem of level-sensitive circuits is presented in Figure 4.3. In this algorithm, the arrival
times are initialized to $a_i = A_i = -\infty$, where the algorithm simulates the start-up timing of the circuit.
At each iteration step, the execution of the circuit at a clock cycle is simulated. Finally, once the arrival
and departure times of the latches are determined, the algorithm checks for potential setup and hold time
violations.
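The iteration described above can be sketched in a few lines of Python. This is a minimal illustration rather than the published algorithm: it assumes zero setup and hold times and zero internal latch delays, assumes that each latch becomes transparent at $T - C_W^L$ within its clock cycle, and uses hypothetical names (`iterate_latch_times`, `paths`, `skew`, `c_w`) that do not appear in the original work.

```python
import math

def iterate_latch_times(num_latches, paths, T, c_w, skew=None):
    """Fixed-point iteration of latch arrival/departure times (illustrative sketch).

    paths maps (i, f) -> (d_min, d_max), the propagation delays of the local
    data path from latch i to latch f; skew maps (i, f) -> T_Skew(i, f).
    Setup/hold times and internal latch delays are assumed to be zero.
    """
    skew = skew or {}
    open_edge = T - c_w                       # assumed opening edge of every latch
    A = [-math.inf] * num_latches             # latest arrivals (start-up state)
    a = [-math.inf] * num_latches             # earliest arrivals (start-up state)
    converged = False
    for _ in range(num_latches):              # cap of |r| iterations
        # Data departs on arrival, but never before the latch opens.
        D = [max(x, open_edge) for x in A]
        d = [max(x, open_edge) for x in a]
        new_A, new_a = list(A), list(a)
        for f in range(num_latches):
            late = [D[i] + dmax + skew.get((i, f), 0.0) - T
                    for (i, g), (dmin, dmax) in paths.items() if g == f]
            early = [d[i] + dmin + skew.get((i, f), 0.0) - T
                     for (i, g), (dmin, dmax) in paths.items() if g == f]
            if late:                           # latches without fan-in keep -inf
                new_A[f], new_a[f] = max(late), min(early)
        converged = (new_A == A and new_a == a)
        A, a = new_A, new_a
        if converged:
            break
    D = [max(x, open_edge) for x in A]
    d = [max(x, open_edge) for x in a]
    return {"a": a, "A": A, "d": d, "D": D, "converged": converged}
```

For a two-latch chain the sketch settles within the iteration cap, whereas a two-latch loop with long path delays keeps increasing its arrival times and is reported as not converged.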
The algorithm presented in Figure 4.3 has been shown to converge to solutions quite rapidly
 
. The
algorithm complexity is reported as $O(|r||p|)$, where $|r|$ is the number of latches in a circuit and $|p|$ is the
number of edges of a circuit graph (recall from Section 2.2 that the number of edges of a circuit graph is the
number of local data paths). However, it has been proven in [24] that in case of data path loops (sequential
feedback) in the synchronous circuit, the arrival and departure times might increase without bound. This
leads to a setup violation and the described algorithm fails to provide reasonable run-times. In [24], a fix
is offered to the algorithm. The fix is based on the assumption that a data path loop in the circuit can be
detected in $|r|$ iterations. Thus, the algorithm is modified to artificially limit the number of iteration steps to
$|r|$. The worst case complexity of the resulting algorithm is cubic in the number of registers
$r$, as each iteration involves examining up to $|p|$ edges, and $|p|$ is at most $|r|^2$ [24].
The iterative algorithm presented in Figure 4.3 is later modified in
 
and
 ff
in order to account for
multiple clock domains and crosstalk, respectively. Briefly, even though the iterative algorithm provides an
initial and useful formulation for the timing analysis of level-sensitive circuits, the algorithm has fallacies in
presence of data path loops and is insufficient to provide a common framework for general timing analysis.
In the presented work, a novel model for the timing analysis of level-sensitive synchronous circuits is de-
veloped. The developed model constitutes a well-defined framework for general timing analysis problems.
Furthermore, the integration of clock skew scheduling into the timing analysis problem is introduced, which
permits operation at higher operating frequencies. Section 4.3 describes the necessary set of constraints in
order to derive the LP model formulation of the clock period minimization problem. The entirety of the
constraints presented in Section 4.1 (operational constraints) and Section 4.3 (constructional constraints)
form the necessary set of constraints for the derivation of the LP model problem.
4.3 LP Problem Approach
As mentioned in Section 4.1, certain conditions must be satisfied for every sequentially adjacent pair of
registers in a synchronous circuit in order to prevent timing hazards. These conditions are encapsulated by
the four sets of operational constraints and two sets of constructional constraints. The operational constraints
defined in Section 4.1 build an introductory model for the timing analysis problem of level-sensitive circuits.
The iterative solution methodology discussed in Section 4.2 is built using this model.
In the presented work, a novel timing framework for level-sensitive circuits is proposed. Furthermore,
a novel linear programming model for the clock period minimization problem is derived using the referred
framework. This section introduces the constructional constraints, which are introduced to fulfill the com-
pleteness of the timing framework. The constructional constraints, called validity and initialization con-
straints, are required to ensure the correctness of the proposed LP formulation. The first type of construc-
tional constraints, the validity constraints, are presented in Section 4.3.1. The second type of constructional
constraints, the initialization constraints, are presented in Section 4.3.2.
4.3.1 Validity Constraints
The definitions of the parameters $a_f$, $A_f$, $d_f$ and $D_f$ require the value of $a_f$ ($d_f$) to be smaller than or
equal to the value of $A_f$ ($D_f$):
$A_f \ge a_f$ (4-14)
$D_f \ge d_f.$ (4-15)
While the four sets of operational constraints introduced in the preceding sections summarize the timing
properties of the circuit, the required sequentiality in time of the referred variables is not explicitly enforced.
Consistency in the definitions of $a_f$, $A_f$, $d_f$ and $D_f$ must be maintained through post-solution checks or
by including additional constraints. A solution leading to a result where $a_f > A_f$, for instance, is incorrect
and must be disregarded.
Introducing the validity constraints [Eqs. (4-14) and (4-15)] in the LP model is preferred over performing
post-solution checks for two significant reasons. The first reason is to gain the ability to easily detect the
feasibility of the problem. The second reason is to preserve the automation of the solution procedure.
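For comparison, the post-solution alternative mentioned above could look like the following sketch. The function name and the list-based data layout are assumptions made for illustration only; the check simply tests Eqs. (4-14) and (4-15) for every register.

```python
def validity_violations(a, A, d, D, tol=1e-9):
    """Return the indices f for which A_f >= a_f or D_f >= d_f fails.

    a, A, d, D are equal-length lists holding the earliest/latest arrival
    and earliest/latest departure times of each register (assumed layout).
    """
    return [f for f in range(len(a))
            if A[f] < a[f] - tol or D[f] < d[f] - tol]
```

An empty result means the solution respects the validity constraints; any returned index marks a register whose timing values must be disregarded.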
4.3.2 Initialization Constraints
Recall that the procedure proposed in this work is developed in order to minimize the clock period of a
synchronous circuit. Besides the minimum clock period, it may prove essential to accurately calculate the
nominal data arrival and departure times for each register. The initialization constraints are introduced in
order to generate a consistent timing schedule for the data signal propagation in a synchronous circuit.
The sensitivity ranges of the parameters included in the LP model are not crucially heeded in the given
formulation. Due to the slack
 ff
on data propagation times, the feasible solution set for some variables
can be a range of values rather than a specific value. For instance, suppose that the earliest arrival time of
a data signal at an arbitrary latch $R_k$ can take any value within a given interval without changing
the minimum clock period of the circuit. For consistency, it is preferable to assign the smallest value in this
interval to the earliest arrival time $a_k$. In general, it is better to assign the smallest possible values to the earliest
arrival and departure time variables and the largest possible values to the latest arrival and departure time
variables (where applicable). Identification of such sensitivity information is essential for the consistency of
the generated timing schedule for any given circuit.
Note that the earliest and latest data arrival times at all registers except the input registers are set to their
lowest and highest possible values, respectively, by the propagation constraints [Eqs. (4-5) and (4-6)]. The
values assigned to the earliest and latest data arrival times $(a, A)$ at the input registers do not affect the
minimum clock period unless the assigned values cause the departure times to change. It may even be
considered unimportant to define earliest and latest arrival time variables $(a, A)$ at the input registers as
the non-local data paths do not affect the circuit timing directly. For consistency and completeness of
the generated timing schedule, the data arrival times at the input registers are defined and the following
constraints are included in the LP formulation for each input register $R_l$:
$A_l = d_l - \left( D^{l}_{CQm} \text{ or } D^{l}_{DQm} \right), \quad \forall R_l : |\mathrm{Fan\text{-}in}(R_l)| = 0.$ (4-16)
Note that Eq. (4-16) is only valid for input registers.
5.0 PROBLEM FORMULATION AND THE PROPOSED SOLUTION PROCEDURE
The non-linear $\max$ and $\min$ functions in the constraints shown in Eqs. (4-3), (4-4), (4-5) and (4-6)
present a major challenge in solving the problem of minimizing the clock period. The MBM method
introduced in this work is used to replace the non-linear constraints with equivalent linear constraints. The
equivalence between the non-linear programming (NLP) model formulation and the re-formulated linear
programming (LP) model problem is preserved.
The proposed linearization method is described in Section 5.1, and the LP model is offered in Sec-
tion 5.2.
5.1 Modified Big M (MBM) Method
The linearization of the constraints which exhibit non-linear behavior is a commonly applied procedure
in operations research. Non-linear constraints are manipulated to derive equivalent linear constraints, which
are inherently easier to solve. In this work, a collection of known linearization procedures is applied to
the non-linear constraints of the timing analysis problem. The collection of these procedures is named the
Modified big M (MBM) method  
 . It has been considered reasonable to denominate the collection of
linearization procedures the MBM method, as the research is developed by an inspiration from the big M
method
 ffB
. The big M method is a special case of the simplex algorithm
 ff
which has applications in a
completely distinct set of problems with respect to the MBM method. The only similarity between the big M
method and the MBM method is the use of the constant $M$ in both methods. The constant $M$ symbolically
represents a very large positive number used to assign an overwhelmingly large penalty to a variable in the
objective function in order to increase the priority of the variable in the optimization process.
The collection of linearization procedures composing the MBM method is presented in Table 5.1. For a
minimization type LP problem subject to constraints that have $\min$ and $\max$ functions, the transforma-
tions listed in Table 5.1 are applied to replace non-linear constraints with linear constraints. Note that only
relevant constraints and relevant portions of the objective function are included in Table 5.1.
Consider a finite set of variables, all of which are real numbers. The objective function $Z$ is a linear,
real-valued function of these variables. There are no limitations on
variables being inclusive, provided the linearity of the constraints is preserved.
Two different linearization scenarios are presented in Table 5.1. In the first scenario [linearization of the
$a = \max(b, c)$ expression], the variable $a$ is constrained to be the greater of the variables $b$ and $c$. The
constraint is replaced with two new constraints, explicitly requiring the variable $a$ to be greater than or equal
to the variables $b$ and $c$. The initial constraint and the relaxed constraints are equivalent if either of the
following conditions holds:
1. Equality condition is observed for at least one of the inequalities, while the other inequality operation
returns true.
2. Equality condition is observed for both inequalities.
The cost function denoted by the product $Ma$ is added to the objective function. The product $Ma$ is
overwhelmingly large with respect to other cost functions in the objective function as a result of the highly
weighted cost figure (recall the very large coefficient $M$). Thus, $Ma$ is given the highest priority in the
minimization process. As a result, the greater of the variables $b$ and $c$ is assigned to variable $a$.
The relaxation method in the second scenario [linearization of the $a = \min(b, c)$ expression] is also pre-
sented in Table 5.1. In this case, the cost function $Ma$ is subtracted from the objective function in order to
exploit the maximum value to be assigned to the variable $a$.
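The two scenarios can be mimicked by a small helper that rewrites a $\max$ or $\min$ equality into its relaxed inequalities and the matching penalty term. The sketch below works purely on strings; the function name and the textual output format are illustrative assumptions and not part of the original formulation.

```python
def mbm_linearize(kind, target, operands):
    """Relax `target = max(operands)` or `target = min(operands)`.

    Returns (constraints, objective_term) as strings: a max is replaced by
    '>=' inequalities with +M*target added to the objective, a min by
    '<=' inequalities with M*target subtracted from the objective.
    """
    if kind == "max":
        relation, sign = ">=", "+"
    elif kind == "min":
        relation, sign = "<=", "-"
    else:
        raise ValueError("kind must be 'max' or 'min'")
    constraints = [f"{target} {relation} {op}" for op in operands]
    return constraints, f"{sign} M*{target}"
```

For example, `mbm_linearize("max", "d_i", ["a_i", "T - C_W"])` yields the two inequalities `d_i >= a_i` and `d_i >= T - C_W` together with the objective term `+ M*d_i`.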
Table 5.1 Modified Big M transformations
$\min Z$ → $\min (Z + M a)$
$a = \max(b, c)$ → $a \ge b$, $a \ge c$
$\min Z$ → $\min (Z - M a)$
$a = \min(b, c)$ → $a \le b$, $a \le c$
Similar to its implementation in the big M method, the constant $M$ is defined sufficiently large to ensure
the equivalence of the linear and non-linear problems. The selection of a value for the constant $M$ depends
on the solution space of a specific problem (problem constraints) and the objective function. Typically,
the number $M$ must be chosen significantly larger than the values of any parameter in the problem. However,
selection of an extremely large $M$ may cause the LP solver to fail drastically
 ff
. A value of ¾
= 
was
experimentally found to be sufficiently large for the analysis of circuits with arrival and departure times up to
 (time units), number of registers up to  , and number of data paths up to ff . The interpretation
of value assignment and the derivation of a lower bound on the constant M fall outside the scope of this
thesis and will not be discussed. However, verification of the equivalence between the non-linear problem
and the MBM method-transformed linear problem is a straight-forward post-solution check.
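The post-solution check mentioned above amounts to confirming that each relaxed variable equals the $\max$ (or $\min$) of its operands in the solved LP. A minimal sketch follows, assuming the solved values are available as plain numbers; the function name and the tolerance are hypothetical choices.

```python
def mbm_relaxation_is_tight(value, operands, kind="max", tol=1e-6):
    """Check that a solved variable equals the max/min it was meant to model.

    value    -- value of the relaxed variable taken from the LP solution
    operands -- values of the operands of the original max/min function
    """
    target = max(operands) if kind == "max" else min(operands)
    return abs(value - target) <= tol
```

If the check fails for any variable, the chosen $M$ was presumably too small for that problem instance and the LP solution cannot be taken as a solution of the non-linear problem.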
5.2 LP Model
An equivalent LP model of the clock period minimization problem is generated through the application
of the MBM method. There are five sets of constraints in the finalized equivalent LP model. As explained
in Chapter 4, these sets are the latching [Eqs. (4-1) and (4-2)], synchronization [Eqs. (4-3) and (4-4)], prop-
agation [Eqs. (4-5) and (4-6)], validity [Eqs. (4-14) and (4-15)] and initialization [Eq. (4-16)] constraints.
Note that, for simplicity, the skew constraints [Eqs. (4-7), (4-8) and (4-9)] are not included in the LP model.
The finalized LP model for the clock period minimization problem is shown in Table 5.2.
The latching, validity and initialization constraints exhibit linear behavior. Therefore, these constraints
remain unchanged in both the LP and NLP models as shown in constraints (i-ii , vii-ix) of the formulation.
The synchronization constraints, however, are formed by the $\max$ function and exhibit non-linear behavior.
The MBM method is used on the synchronization constraints in order to generate equivalent linear con-
straints for the LP model problem (constraints iii and iv). For instance, (iii) depicts the replacement of the
non-linear constraint presented in Eq. (4-3) with two linear constraints, where $d_i$ is greater than or equal to
both operands of the $\max$ function, $\left[ a_i + D^{i}_{DQm} \right]$ and $\left[ T - C_W^L + D^{i}_{CQm} \right]$. Note that the cost function
$M d_i$ is added to the objective function. The propagation constraint on the latest data arrival time [Eq. (4-6)]
exhibits non-linearity similar to that of the synchronization constraints, in that the $\max$ function is used. The
Table 5.2 The transformed constraints for the ‘Modified big M’ method.
LP Model
$\min \; T + M \left[ \sum_{j} (d_j + D_j) + \sum_{\forall R_k : |\mathrm{Fan\text{-}in}(R_k)| \ge 1} (A_k - a_k) \right]$
subject to
(i) $a_f \ge H_f$ [Latching-Hold time]
(ii) $A_f \le T - S_f$ [Latching-Setup time]
(iii) $d_i \ge a_i + D^{i}_{DQm}$
      $d_i \ge T - C_W^L + D^{i}_{CQm}$ [Synchronization-Earliest time]
(iv) $D_i \ge A_i + D^{i}_{DQM}$
      $D_i \ge T - C_W^L + D^{i}_{CQM}$ [Synchronization-Latest time]
(v) $a_f \le d_i + D^{i,f}_{Pm} + T_{Skew}(i,f) - T$, one inequality for each fan-in path $R_i \rightarrow R_f$ [Propagation-Earliest time]
(vi) $A_f \ge D_i + D^{i,f}_{PM} + T_{Skew}(i,f) - T$, one inequality for each fan-in path $R_i \rightarrow R_f$ [Propagation-Latest time]
(vii) $A_f \ge a_f$ [Validity-Arrival time]
(viii) $D_f \ge d_f$ [Validity-Departure time]
(ix) $A_l = d_l - \left( D^{l}_{CQm} \text{ or } D^{l}_{DQm} \right)$, $\forall R_l : |\mathrm{Fan\text{-}in}(R_l)| = 0$ [Initialization]
equivalent propagation constraints in the LP model are shown in (vi). In the LP model, the variable $A_f$
is greater than or equal to the expression $\left[ D_i + D^{i,f}_{PM} + T_{Skew}(i,f) - T \right]$, evaluated for each fan-in
path of register $R_f$. In the formulation, each fan-in path of $R_f$ is indexed separately and contributes one
inequality.
Unlike other non-linear constraints in the formulation, the propagation constraint on the earliest arrival
time $a_f$ is modeled by the $\min$ function. In this type of linearization, $a_f$ is set to be less than or equal to each
operand of the $\min$ function. As shown in (v), the expressions $\left( d_i + D^{i,f}_{Pm} + T_{Skew}(i,f) - T \right)$ evaluated
for each fan-in path of register $R_f$ are included in the finalized LP model.
In order to illustrate the derivation of the NLP and LP model formulations for a clock period mini-
mization problem, a simple synchronous circuit is investigated. The NLP model formulation of the clock
period minimization problem for the sample circuit (shown in Figure A-1) is presented in Appendix A. The
non-linear constraints in the NLP problem formulation are linearized using the MBM method described in
Section 5.1. The finalized LP model formulation for the clock period minimization problem is presented in
Appendix B.
6.0 AN EXAMPLE AND EXPERIMENTAL RESULTS
The circuit network shown in Figure 6.1 is analyzed in order to illustrate the application of the proposed
procedure. Without affecting the generality of the solution, zero setup and hold times and zero internal
delays are considered ($S_i = H_i = D_{CQ} = D_{DQ} = 0$).
[Figure 6.1: circuit graph with registers R1, R2, R3 and R4, annotated with the propagation delays of the local data paths]
Figure 6.1 A simple synchronous circuit. Note that the minimum clock period with zero skew and using
flip-flops is $T = 7$ (time units).
Given single-phase synchronization under zero and non-zero clock skew, the clock period minimization
problems of three different synchronous circuits with the same circuit topology are formulated. The simplest (in
terms of timing analysis) circuit, which is used as the basis of comparison for the other circuits, is the zero clock
skew edge-sensitive circuit. The minimum clock period of a zero clock skew edge-sensitive circuit is defined
by the maximum data propagation time in the circuit
 
. Thus, the synchronous circuit network presented
in Figure 6.1 has a minimum clock period of $T = D^{3,2}_{PM} = 7$ (time units) when used with edge-triggered
flip-flops.
The second synchronous circuit of interest is the zero clock skew level-sensitive circuit. In order to
design a level-sensitive synchronous circuit, each flip-flop in the given circuit topology is replaced with a
level-sensitive latch. Zero clock skew level-sensitive circuits exhibit improved circuit performance due to
time borrowing. Finally in the third synchronous circuit, clock skew scheduling is applied. This non-zero
clock skew level-sensitive circuit exhibits performance improvement due to the simultaneous consideration
of time borrowing and clock skew scheduling.
The commercial optimization package CPLEX
 
is used to solve for the clock period minimization
problem of the generated synchronous circuits. The generic LP model constituting a fully-linear optimiza-
tion problem for such timing analysis is presented in Table 5.2. In order to solve this type of LP problem,
CPLEX implements optimizers based on simplex algorithms (both primal and dual simplex). In the ex-
periments, the particular optimizer is automatically selected by the solver. The worst case analysis shows
that the simplex method and its variants may require exponential number of steps to reach an optimal solu-
tion
 ff
. However, a vast amount of practice has confirmed that in most cases, the number of iterations to
reach an optimal solution is a linear function of the number of variables $n$ and a quadratic function of the
number of problem constraints $m$
 ff
. Thus, the expected computational effort of the presented procedure is similar to
the $O(m^2 n)$ of the simplex method.
Note that the number of problem constraints $m$ is proportional to the number of registers $r$ and the num-
ber of local data paths $p$ in the circuit. Let $q$ denote the number of input registers for which the initialization
constraints are defined. In the LP model clock period minimization problem shown in Table 5.2, there are
eight (8) constraints for each register, two (2) constraints for each local data path, and one (1) constraint
for each input register. Thus, the number of constraints in the problem formulation is $m = 8r + 2p + q$.
The minimum clock period $T$ is a problem variable. Also, there are five (5) problem variables defined for
each register, leading to a total of $n = 5r + 1$ variables in the problem formulation. Thus, the
problem complexity is $O\left[ (8r + 2p + q)^2 (5r + 1) \right]$. Note that the exact computational complexity cannot
be determined since the internal presolver, matrix-sparsity checker and large-scale optimizer
 ff
routines
employed within CPLEX are proprietary and unknown.
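The constraint and variable counts stated above are easy to tabulate for a given circuit. The sketch below is an illustration only; the function name is hypothetical, and `q` is an assumed name for the number of input registers.

```python
def lp_model_size(r, p, q):
    """Size of the LP model for r registers, p local data paths, q input registers.

    Eight constraints per register, two per local data path and one per input
    register; five variables per register plus the clock period T.
    """
    m = 8 * r + 2 * p + q          # number of constraints
    n = 5 * r + 1                  # number of variables
    return m, n, m * m * n         # last entry: the m^2 * n effort estimate
```

For a hypothetical circuit with three registers, four local data paths and one input register, this gives 33 constraints and 16 variables.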
In the analysis, the minimum clock period for the zero clock skew level-sensitive circuit is calculated
as 4.66 (time units), which is a 33% improvement over the zero clock skew edge-sensitive synchronous
circuit. Note that the percentage improvement is calculated by the expression $100 \times (T_{old} - T_{new}) / T_{old}$. As
stated before, clock skew scheduling is applied on the level-sensitive circuit in order to generate the non-
zero clock skew level-sensitive circuit. The calculated minimum clock period of 4.05 for the non-zero clock
skew level-sensitive circuit is a 13% improvement over the zero clock skew level-sensitive circuit and a
42% improvement over the zero clock skew edge-sensitive circuit. Note that 13% improvement is only due
to clock skew scheduling, while 42% improvement is due to time borrowing and clock skew scheduling.
Further analysis of the time borrowing and clock skew scheduling effects on circuit timing will be presented
in Section 6.2. The clocking schedules and the data propagation on the critical paths of the circuit in
Figure 6.1 in case of zero and non-zero clock skew scheduling are shown in Figure 6.2.
[Figure 6.2: clock waveforms CLK1 to CLK4 for the zero clock skew schedule (T = 4.66) and the non-zero clock skew schedule (T = 4.05). Critical path annotations for zero skew: R1 → R3 with A3 = 1.66 = D1 + 4 − 4.66, and R3 → R2 with A2 = 4.66 = D3 + 7 − 4.66. Critical path annotations for non-zero skew: A3 = 2.025 = D1 + 4 + (0.05 − 0) − 4.05 and A2 = 4.05 = D3 + 7 + (0 − 0.925) − 4.05.]
Figure 6.2 Zero clock skew and non-zero clock skew clocking schedules for the synchronous circuit in
Figure 6.1. The clocking schedule for the zero clock skew circuit is shown on the left, with a minimum clock
period of $T = 4.66$. Non-zero clock skew scheduling, which results in a minimum clock period of $T = 4.05$, is
shown on the right. For non-zero clock skew scheduling, the optimal clock signal delays at the registers are
$t_{R1} = 0.05$, $t_{R2} = 0.925$, $t_{R3} = 0$ and $t_{R4} = 0.475$. The arrows represent data signal propagation on the
respective critical paths. Note that unlike the presented case, the critical paths for zero and non-zero clock
skew scheduling need not be identical.
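The percentage improvements quoted for this example can be reproduced with a few lines of arithmetic. The sketch assumes a flip-flop based clock period of 7 time units (the largest data path delay of the example) and uses a hypothetical helper name.

```python
def improvement(t_old, t_new):
    """Percentage reduction of the clock period from t_old to t_new."""
    return 100.0 * (t_old - t_new) / t_old

T_FF = 7.0           # zero skew, edge-sensitive (assumed from the example circuit)
T_LATCH = 4.66       # zero skew, level-sensitive
T_LATCH_SKEW = 4.05  # non-zero skew, level-sensitive

time_borrowing_gain = improvement(T_FF, T_LATCH)
skew_gain = improvement(T_LATCH, T_LATCH_SKEW)
combined_gain = improvement(T_FF, T_LATCH_SKEW)
```

Rounded to whole percentages, the three values correspond to the 33%, 13% and 42% figures given in the text.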
6.1 Digital Synchronous Circuit State of Operation
Presence of data path loops (cycles) and transient state errors are two major issues that need to be
identified in the timing analysis of level-sensitive circuits. As discussed in Section 4.2, the iterative solution
procedure offered in [18] was shown to suffer from excessive run-times and produce false negative outputs
in the presence of data path loops [24]. In [24], modifications are offered for the iterative algorithm in order to
detect and handle the effects of data path loops in the circuit. Also in [24], it has been shown that synchronous
circuits are prone to suffer from transient state errors. The transient state errors occur due to the non-unique
solution sets of the problem parameters [24]. In circuits under transient state errors, setup violations occur
in certain registers after the system is initiated from a reset state. The arrival and departure times may not
be stable at start-up, in which case these times change during initial clock cycles, constituting the transient
state. As circuit operation progresses in time, the arrival and departure times converge to their steady-state
values.
There are two major conventions in evaluating the transient errors and determining the steady-state
behavior. The first convention overlooks the transient errors and presumes that the departure times converge
to the opening edge of the driving clock, which is the expected schedule for the steady-state of operation.
The second convention is more strict in that transient state errors are not permitted. The first convention is
more widely accepted within the proposed solution algorithms and leads to a generally acceptable solution
unless the transient state operation of the level-sensitive circuit is decisive to overall circuit operation. Given
that the second convention is adopted, the reset state is preferably extended until the steady state of operation
is reached [24].
The LP model proposed in this work assumes the transient-state operation of a level-sensitive circuit to
be negligible. The aim of the generated model is to solve for the steady-state timing scheduling problem. The
simplex algorithm-based LP solver directs the gradual advancement of parameter values as they are enforced
by the LP model (Table 5.2). Previously offered algorithms are vulnerable to potential fallacies caused by
data path loops due to their iterative nature. However, in the presented procedure, complications posed by
the presence of data path loops are resolved within the mechanics of the LP solver without significantly
affecting the run-time or quality of the solution. If the problem remains feasible, the timing parameters for
the steady state operation of the circuit are calculated.
In order to illustrate the described phenomenon, the steady-state optimal timing schedule for the
ISCAS’89 benchmark circuit s27 is presented in Figure 6.3. The circuit s27 has one input register and a data
path loop consisting of two other registers. The data signal departs from input register $R_3$ and perpetually
propagates on the loop between $R_1$ and $R_2$. The minimum clock period is calculated to be 4.1, where
the calculated propagation delays are indicated on the circuit graph. Note that the propagation delays are
calculated to be constants using the authors’ generic delay calculation procedure. This fact does not affect
the generality of the solution and the inclusion of variable propagation delay in the problem solution is
straightforward.
[Figure 6.3: timing diagram of the clock signals CLK_R1, CLK_R2 and CLK_R3 over the k-th, (k+1)-th and (k+2)-th clock cycles, with the global time axis marked at (k−1)T, (k−1)T + 4.1, (k−1)T + 8.2, (k−1)T + 12.3 and (k−1)T + 16.4, together with the circuit graph of s27. Annotated clock skews: t_skew(R3, R1) = −3.8, t_skew(R3, R2) = −1.3, t_skew(R1, R2) = 2.5. Annotated register timing: R1: a1 = 0.75, d1 = 2.05, A1 = 2.05, D1 = 2.05, t1 = 3.8; R2: a2 = 2.05, d2 = 2.05, A2 = 2.05, D2 = 2.05, t2 = 1.3; R3: a3 = 0, d3 = 2.05, A3 = 0, D3 = 2.05, t3 = 0.]
Figure 6.3 The optimized timing schedule for the benchmark circuit s27 operable with a minimum clock
period of $T = 4.1$. Note that $D^{i,f}_{Pm} = D^{i,f}_{PM}$ for each local data path $R_i \rightarrow R_f$ and $S_i = H_i = D_{CQ} = D_{DQ} = 0$ are considered.
In Figure 6.3, the data propagation occurring on all data paths of the s27 benchmark circuit is analyzed.
The clock signals $CLK_{R1}$, $CLK_{R2}$ and $CLK_{R3}$, where the subscripts indicate the register being synchro-
nized by the clock signal, build the frame for the analysis. The clock signals may not be completely aligned
in time due to the non-identical clock signal delays to the respective registers. The clock signal $CLK_{R3}$
at the input register $R_3$ has no delay in time with respect to the clock signal at the clock source ($t_3 = 0$).
Hence, the origin of the clock signal at the source is aligned with the origin of $CLK_{R3}$. The clock signals
$CLK_{R1}$ and $CLK_{R2}$, however, are shifted in time by $t_1 = 3.8$ and $t_2 = 1.3$ relative to the origin of the clock
signal at the source. The horizontal axis of Figure 6.3 represents the time, where the beginning $(k-1)T$
of the $k$-th clock cycle of $CLK_{R3}$ is defined as the local time reference, with an assigned value of zero.
In Figure 6.3, the numbers associated with the enabling and latching edges of the clock signals label the
times with respect to the local time reference. The arrows illustrate the propagation between the registers
and are drawn to scale. Illustration of the data propagation on three consecutive clock cycles is sufficient to
analyze the behavior of the data path loop of the benchmark circuit s27; the $k$-th, $(k+1)$-th and $(k+2)$-th
clock cycles are selected to illustrate the behavior. The solid arrows represent the data propagation during
the selected clock cycles. For instance, the propagation between $R_3$ and $R_1$ is represented by the arrows
initiating from the $CLK_{R3}$ row at times 2.05 and 6.15, and concluding at the $CLK_{R1}$ row at times 8.65 and
12.75, respectively. Data propagation on the data path loop between the registers $R_1$ and $R_2$ is visible by the
cross-structured arrows initiating and concluding in the corresponding clock signal rows. Note that the cal-
culated nominal arrival and departure times are illustrated on the circuit graph, inside the boxes associated
with each node.
In steady-state of operation, the departure times of the registers that constitute a data path loop converge
to the beginning of their respective clock cycles. The circuit s27 in Figure 6.3 is scrutinized in order to
provide a better insight on how the latest departure times converge to a certain value in the steady-state.
Define a variable $\epsilon$, where $\epsilon$ is a very small period of time. Suppose that a deviation of $\epsilon$ occurs in the
departure time of the data signal from $R_3$. The signal departure from $R_3$ occurs at time $2.05 + \epsilon$, delaying the
arrival times at $R_1$ and $R_2$ by $\epsilon$. The departure from $R_2$ is gradually delayed by $\epsilon$ every turn, which in turn
delays the arrival time at $R_1$. The arrival and departure times cumulatively increase in each turn of the data
signal around the loop. Eventually, the signal arrivals at the latches occur during the non-transparent state
of the latches. At this point, the signal departure times return to their starting values, which are the latching
edges of their respective clock cycles. It is evident that the arrival times will finally be restored to their initial
values when the source of the deviation vanishes. Thus, the assignment of the time-varying departure times
to the enabling edges of the synchronizing clock signals is referred to as the steady-state of operation for the
synchronous circuit.
6.2 Performance Results of the Procedure on the ISCAS’89 Benchmark Circuits
The timing analysis algorithm described in this thesis is applied to a selected suite of ISCAS’89
benchmark circuits in order to derive performance results and illustrate the efficiency of the presented
algorithm. The original ISCAS’89 benchmark circuits are edge-sensitive synchronous circuits, and the
specific timing information of the circuit elements is not defined. A generic delay calculation procedure
is therefore used to provide timing information for each local data path. The data propagation times
($D_P$) on local data paths are calculated using pre-determined delay times for each logic gate type. The
number of fanout branches on each node is also taken into account in the data propagation time.
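A minimal sketch of such a fanout-aware delay calculation is given below. The gate delay table, the per-branch fanout penalty and the function name `path_delay` are illustrative assumptions; they are not the values or the procedure used to produce the reported results.

```python
# Hypothetical per-gate-type delays in time units (assumed values, not from the thesis).
GATE_DELAY = {"NOT": 0.5, "NAND": 1.0, "NOR": 1.5}
# Hypothetical extra delay added per fanout branch of a gate output (assumed value).
FANOUT_PENALTY = 0.25


def path_delay(gates):
    """Sum gate delays along one combinational path.

    `gates` is a sequence of (gate_type, fanout) pairs ordered from the
    initial register to the final register.
    """
    total = 0.0
    for gate_type, fanout in gates:
        total += GATE_DELAY[gate_type] + FANOUT_PENALTY * fanout
    return total


# Example path: NAND driving 2 branches, NOR driving 1 branch, NOT driving 3 branches.
example = [("NAND", 2), ("NOR", 1), ("NOT", 3)]
print(path_delay(example))  # 4.5
```

Running this over every register-to-register path and keeping the smallest and largest totals would give one possible way to obtain minimum and maximum propagation delays per local data path.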
The level-sensitive circuit is generated by replacing each flip-flop in the original benchmark circuit with
a level-sensitive latch, as explained in Chapter 6. Note that this procedure does not affect the operation of the
original circuit and preserves the circuit topology. In the experiments, a 50% duty cycle is selected for
the single-phase clock signal. Without affecting the generality of the solution, the setup and hold times and
the internal delays are assumed to be zero ($\delta_S = \delta_H = D_{CQ} = D_{DQ} = 0$). The consideration of these
numeric constants in an actual problem is straightforward. Edge-sensitive and level-sensitive synchronous
circuit implementations are analyzed for zero and non-zero clock skew scheduling applications. The effects
of time borrowing and clock skew scheduling in circuit implementation are investigated. The results of the
analyses, computed on a 440 MHz Sun Ultra-10 workstation, are presented in Table 6.1. For each circuit,
the following data are listed: the circuit name, the number of registers $r$ and the number of paths $p$, the
clock periods $T_{FF}^{\text{no skew}}$ for a zero skew circuit with flip-flops, $T_{L}^{\text{no skew}}$ for a zero skew circuit with latches,
$T_{FF}^{\text{skewed}}$ for a non-zero skew circuit with flip-flops, $T_{L}^{\text{skewed}}$ for a non-zero skew circuit with latches, and $T_{L}^{R}$
for a non-zero skew circuit where the clock delays to I/O registers are restricted to be equal. The subscripts
$FF$ and $L$ represent circuit topologies for flip-flop based and latch-based circuits, respectively. The superscripts
"no skew" and "skewed" indicate zero or non-zero clock skew scheduling. Also listed are the calculation time
$t_{L}^{\text{skewed}}$ of $T_{L}^{\text{skewed}}$ and the clock period improvements $I_{L}^{TB}$, $I_{FF}^{CSS}$ and $I_{L}^{TB+CSS}$, where the superscripts
$TB$, $CSS$ and $TB+CSS$ stand for time borrowing, clock skew scheduling and both, respectively.
The minimum clock periods calculated for the edge-sensitive synchronous circuits under zero and non-zero
clock skew scheduling ($T_{FF}^{\text{no skew}}$ and $T_{FF}^{\text{skewed}}$, respectively) are taken from [12]. It is reported
in [12] that clock skew scheduling yields an average improvement of 30% in the minimum
clock period for the ISCAS’89 benchmark circuits.
The experimental results shown in Table 6.1 represent significant improvements in the minimum clock
period for synchronous circuits with level-sensitive latches. In digital synchronous circuits, utilizing latches
as memory elements instead of flip-flops may result in up to 33% improvement of the minimum clock
Table 6.1 ISCAS’89 benchmark circuit results showing the number of registers $r$ and paths $p$ (before
modification). Optimal clock periods, improvements and calculation times are denoted by $T$, $I$ and $t$, respectively.
Subscripts $FF$ and $L$ represent circuit topologies for flip-flop based and latch-based circuits, respectively.
Superscripts "no skew", "skewed" and $R$ indicate zero clock skew, non-zero clock skew and the restricted circuit
(for clock periods only), and $TB$, $CSS$ and $TB+CSS$ stand for time borrowing, clock skew scheduling and both, respectively.
Columns (left to right): Circuit, $r$, $p$, $T_{FF}^{\text{no skew}}$, $T_{L}^{\text{no skew}}$, $I_{L}^{TB}$ (%), $T_{FF}^{\text{skewed}}$, $T_{L}^{\text{skewed}}$, $I_{FF}^{CSS}$ (%), $I_{L}^{TB+CSS}$ (%), $I_{L}^{CSS}$ (%), $t_{L}^{\text{skewed}}$ (sec), $T_{L}^{R}$, $I_{L}^{R}$ (%).
s27 3 4 6.6 5.4 18 4.1 4.1 38 38 24 0.02 4.1 38
s208.1 8 28 12.4 8.6 31 4.9 5.2 60 58 40 0.01 7.6 39
s298 14 54 13 10.6 18 9.4 9.4 28 28 11 0.02 10.6 18
s344 15 68 27 18.4 32 18.4 18.4 32 32 0 0.03 18.4 32
s349 15 68 27 18.4 32 18.4 18.4 32 32 0 0.03 18.4 32
s382 21 113 14.2 10.3 27 8.5 8.5 40 40 17 0.04 8.72 39
s386 6 15 17.8 17.3 3 17.3 17.3 3 3 0 0.03 17.3 3
s400 21 113 14.2 10.4 27 8.6 8.6 39 39 17 0.05 8.8 38
s420.1 16 120 16.4 12.6 23 6.8 7.2 59 56 43 0.04 10.27 37
s444 16 113 16.8 12.4 26 9.9 9.9 41 41 20 0.07 9.9 41
s510 6 15 16.8 14.8 12 14.8 14.3 12 15 3 0.02 14.8 12
s526 21 117 13 10.6 18 9.4 9.4 28 28 11 0.05 10.6 18
s526n 21 117 13 10.6 18 9.4 9.4 28 28 11 0.05 10.6 18
s641 19 81 83.6 66.2 21 61.9 61.9 26 26 6 0.05 63.1 25
s713 19 81 89.2 71.2 20 63.8 63.8 28 28 10 0.05 65 27
s820 5 10 18.6 18.3 2 18.3 18.3 2 2 0 0.01 18.3 2
s832 5 10 19 18.8 1 18.8 18.8 1 1 0 0.01 18.8 1
s838.1 32 496 24.4 20.6 16 8.3 9.1 66 63 56 0.28 15.6 36
s938 32 496 24.4 20.6 16 8.3 9.1 66 63 56 0.31 15.6 36
s953 29 135 23.2 21.2 9 18.3 18.3 21 21 14 0.10 21.2 9
s967 29 135 20.6 17.9 13 16.2 16.6 21 19 7 0.08 17.9 13
s991 19 51 96.4 91.6 5 79.4 79.4 18 18 13 0.02 79.4 18
s1196 18 20 20.8 16 23 10.8 7.8 48 63 51 0.03 16 23
s1238 18 20 20.8 16 23 10.8 7.8 48 63 51 0.01 16 23
s1423 74 1471 92.2 86.4 6 77.4 75.8 16 18 12 1.10 75.8 18
s1488 6 15 32.2 29 10 29 29 10 10 0 0.02 29 10
s1494 6 15 32.8 29.6 10 29.6 29.6 10 10 0 0.01 29.6 10
s1512 57 415 39.6 34.8 12 34.8 34.8 12 12 0 0.28 34.8 12
s3271 116 789 40.3 29.8 26 28.6 28.6 29 29 4 0.69 29 28
s3330 132 514 34.8 23.4 33 17.8 17.8 49 49 24 0.49 23.2 33
s3384 183 1759 85.2 77.4 9 67.4 67.4 21 21 13 1.88 76.2 11
s4863 104 620 81.2 75.4 7 69 69 15 15 8 0.64 69 15
s5378 179 1147 28.4 23.2 18 22 22 23 23 5 1.66 22 23
s6669 239 2138 128.6 124.6 3 109.8 109.8 15 15 12 3.62 109.8 15
s9234 228 247 75.8 64.8 15 54.2 54.2 28 28 16 4.59 59.2 22
s9234.1 211 2342 75.8 64.8 15 54.2 54.2 28 28 16 3.88 59.2 22
s13207 669 3068 85.6 67.4 21 57.1 57.1 33 33 15 14.86 57.1 33
s15850 597 14257 116 92.8 20 83.6 83.6 28 28 10 76.96 83.6 28
s15850.1 534 10830 81.2 71.4 12 57.4 57.4 29 29 20 58.89 57.4 29
s35932 1728 4187 34.2 34.1 0 20.4 20.4 40 40 40 80.03 20.4 40
s38417 1636 28082 69 54.8 21 42.2 42.2 39 39 23 603.49 43 39
s38584 1452 15545 94.2 76.4 19 65.2 65.2 31 31 16 321.74 64.8 31
Average 204 2141 44.7 38.1 15 29.6 32.6 30 27 14 28.01 34.29 23.74
period under zero clock skew. On the ISCAS’89 suite of benchmark circuits for instance, an average of 15%
improvement is observed when the flip-flops are replaced by latches (under zero clock skew). The recorded
improvement is solely due to time borrowing.
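The improvement columns of Table 6.1 appear to follow from the listed clock periods as relative reductions. The sketch below recomputes them for the s208.1 row; the function name `improvement` and the rounding to whole percent are assumptions made for illustration.

```python
def improvement(t_ref, t_new):
    """Relative clock period reduction in percent."""
    return 100.0 * (t_ref - t_new) / t_ref


# Clock periods of s208.1 as listed in Table 6.1.
t_ff_zero, t_l_zero = 12.4, 8.6   # zero skew: flip-flops, latches
t_ff_skew, t_l_skew = 4.9, 5.2    # non-zero skew: flip-flops, latches

i_tb = improvement(t_ff_zero, t_l_zero)       # time borrowing only
i_css_ff = improvement(t_ff_zero, t_ff_skew)  # skew scheduling, flip-flops
i_both = improvement(t_ff_zero, t_l_skew)     # time borrowing and skew scheduling
i_css_l = improvement(t_l_zero, t_l_skew)     # skew scheduling, latches

print([round(x) for x in (i_tb, i_css_ff, i_both, i_css_l)])  # [31, 60, 58, 40]
```

The four values match the percentages listed for s208.1, which suggests that the latch-only skew scheduling improvement is measured against the zero skew latch period rather than the flip-flop period.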
Utilizing non-zero clock skew, an even higher improvement is possible: an improvement of up to 63% over
the flip-flop based synchronous circuit with zero clock skew is observed. The average improvement in the
minimum clock period in this case is calculated to be 27%. The recorded improvement is due to the
simultaneous application of clock skew scheduling and consideration of time borrowing. Note that the functional
characteristics of any synchronous circuit must not be affected by replacing the flip-flops with level-sensitive
latches. In addition, the clock skew distribution of a circuit block must be performed considering additional
application-specific constraints such as global layout design restrictions on placement and routing.
As mentioned earlier, the improvement in the minimum clock period for non-zero clock skew level-sensitive
circuits is due to the simultaneous consideration of time borrowing and application of clock skew
scheduling. The average improvement due to time borrowing is 15% and the average improvement due to clock skew
scheduling is 14%. It is interesting to note that the improvements achieved through time borrowing and
clock skew scheduling are not fully additive in defining the overall improvement. Time borrowing and clock
skew scheduling are competing effects in performance improvement, which reduces
the overall improvement. There is a limited amount of slack propagation time on the critical paths, and a
circuit in which time borrowing is abundantly realized cannot benefit as much from clock skew scheduling. It
has been shown, however, that even though time borrowing and clock skew scheduling are competing effects,
dramatically shorter clock periods are achievable through the combination of both effects.
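One arithmetic reason the two figures do not simply add is that each percentage is measured against a different baseline: the skew scheduling improvement of the latch-based circuit appears to be taken relative to the already reduced zero skew latch period. Under that reading, the relative reductions compose multiplicatively, as the following sketch illustrates. The function name `combined` is hypothetical, and applying the identity to average figures is only approximate, since it holds exactly per circuit rather than for averages.

```python
def combined(i_first, i_second):
    """Overall relative reduction when two reductions are applied in sequence.

    Both arguments and the result are fractions (0.15 means 15%).
    """
    return 1.0 - (1.0 - i_first) * (1.0 - i_second)


# Average figures quoted in the text: 15% (time borrowing), 14% (skew scheduling).
overall = combined(0.15, 0.14)
print(round(100 * overall, 1))  # 26.9, close to the reported 27% and below 15% + 14% = 29%
```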
The zero clock skew level-sensitive circuit implementation is analogous to the circuits previously analyzed
in [1, 25], which also solve the clock period minimization problem. Note that, unlike the unit-delay-per-gate
approach used in [1, 25], the combinational logic delay is calculated here by assuming different delay
times for each logic gate type and by considering the effect of fanout on the propagation time. Thus, the obtained
results are not directly comparable to the previously published results. However, presuming the
accuracy and correctness of both procedures, the listed results for $T_{L}^{\text{no skew}}$ represent the level of improvement
achieved through previous work in the literature.
A fair comparison of the run times of the previous and presented procedures is not
feasible due to the differences in the problem formulations. Without formal proof, assuming that the
procedure described in this thesis and the previously published algorithms are equivalent formulations of the
clock period minimization problem, the calculated minimum clock periods $T_{L}^{\text{no skew}}$ are used in order to
compare the improvements achieved through both approaches. Simultaneous
consideration of time borrowing and clock skew scheduling in the proposed procedure results in higher
improvements ($I_{L}^{TB+CSS}$) compared to the consideration of time borrowing and constant clock skew in the
previously published algorithms. Therefore, the procedure presented in this work is superior in terms of circuit
improvement. A comparison of algorithm run times is not meaningful, however, as the problem formulations
are not identical.
6.3 Verification and Interpretation of Results
Certain synchronous circuits are inoperable with level-sensitive latches or fail to satisfy the timing
constraints due to predetermined circuit or clock tree topologies. In such circuits, the minimum clock period
problem is infeasible. The proposed timing analysis procedure detects the infeasibility of a problem
and provides diagnostic messages. The slack and excess values associated with each constraint can be
examined in the sensitivity analysis output provided by the LP solver. Even though the details are not
discussed here, careful interpretation of the sensitivity output leads to the identification of the necessary
modifications to the circuit topology to achieve the desired operating frequency. The sensitivity analysis
output of the LP solver CPLEX for the timing analysis discussed in Appendices A and B is presented in
Appendix C.
The interpretation of the timing schedule for a synchronous circuit presents a model to investigate the
effects of zero and non-zero clock skew scheduling on synchronous circuit operation. In the rest of this
section, the timing schedules generated for the synchronization of the ISCAS’89 benchmark circuit s938
with zero and non-zero clock skew scheduling are analyzed. The analyses include the data distributions for
various parameters, which are presented in Section 6.3.1. The verification of clock skew values is discussed
in Section 6.3.2, where the skew constraints of Section 4.1.4 are used to derive lower and upper
bounds on clock skew.
[Figure: histogram; horizontal axis: propagation delay $D_P$ in time units; vertical axis: number of paths.]
Figure 6.4 Distribution of data propagation times for s938 with $r = 32$ registers and $p = 496$ data paths.
The height of each bar corresponds to the number of paths within a given delay range. For example, there
are nine (9) paths with delays between 4 and 5 time units.
6.3.1 Parameter Data Distributions
In Section 3.2, the data propagation time $D_P$ is defined as the period of time during which the data is processed in the
combinational logic block of a local data path $R_i \rightarrow R_f$. Without loss of generality, an empirical calculation
method is used to calculate the data propagation times of each local data path of a circuit (a simple fanout
delay model is used, as timing data is not included in the ISCAS’89 benchmark circuits). The distribution of
the calculated data propagation times for the ISCAS’89 benchmark circuit s938 is illustrated in Figure 6.4.
Define the effective path delay [12] as the time period between the departure of the data signal from the
initial register and the arrival of the same data signal at the final register. The effective path delay of a local
data path differs from the data propagation delay because of the additional propagation time provided by clock
skew and the time borrowing property of level-sensitive synchronous circuits. Note that in level-sensitive
synchronous circuits, the effective path delay is defined within a permissible range instead of as a fixed value,
as the arrival and departure times are indeterminate. The nominal effective path delay is determined when
the arrival and departure times are realized at run time as certain values in the permissible ranges $[a_f, A_f]$
and $[d_i, D_i]$, respectively. Specifically, the shortest effective path delay occurs when the data signal departs
at its latest time $D_i$ from the initial register $R_i$ and arrives at its earliest arrival time $a_f$ at the final register
$R_f$. The longest effective path delay is realized by the earliest departure $d_i$ of the data signal from $R_i$ and the
latest arrival $A_f$ at $R_f$. Hence, the interval for the effective path delay of level-sensitive synchronous circuits
can be defined as:
\[ a_f - D_i - T_{skew}(i,f) + T \;\leq\; \text{Effective path delay} \;\leq\; A_f - d_i - T_{skew}(i,f) + T. \tag{6-1} \]
[Figure: histogram; horizontal axis: maximum effective path delay in time units; vertical axis: number of paths.]
Figure 6.5 Distribution of the maximum effective path delays in data paths of s938 for zero clock skew.
The target clock period is $T = 20.6$. The height of each bar corresponds to the number of paths with an
effective path delay within a given range.
In this work, the longest effective path delay is investigated in order to illustrate the effects of clock
skew and time borrowing on data propagation. The aim is to observe the increase in the effective path delay
of a circuit, which in turn leads to a higher operating frequency, when flip-flops are replaced with latches
and non-zero clock skew is introduced. Observe that the distribution of the propagation delays for the s938
benchmark circuit presented in Figure 6.4 is exactly the same as the distribution of the effective path delay
of the same benchmark circuit s938 when operated with flip-flops (under zero clock skew). In circuits
with flip-flops, the effective path delays are determinate, $(D_P - T_{skew}(i,f))$, as the data departures occur at
the active transition of the clock signal.
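As a small numerical illustration of the effective path delay interval, the sketch below evaluates both ends for one local data path: the shortest delay pairs the latest departure with the earliest arrival, and the longest delay pairs the earliest departure with the latest arrival. The function name and all sample values are hypothetical.

```python
def effective_path_delay_range(a_f, A_f, d_i, D_i, t_skew, T):
    """Return (shortest, longest) effective path delay of one local data path.

    a_f, A_f: earliest and latest arrival at the final register
    d_i, D_i: earliest and latest departure from the initial register
    t_skew:   clock skew between the initial and the final register
    T:        clock period
    """
    shortest = a_f - D_i - t_skew + T  # latest departure, earliest arrival
    longest = A_f - d_i - t_skew + T   # earliest departure, latest arrival
    return shortest, longest


# Hypothetical values in time units.
print(effective_path_delay_range(a_f=1.0, A_f=3.0, d_i=2.0, D_i=2.5, t_skew=-0.5, T=4.0))
# (3.0, 5.5)
```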
[Figure: histogram; horizontal axis: maximum effective path delay in time units; vertical axis: number of paths.]
Figure 6.6 Distribution of the maximum effective path delays in data paths of s938 for non-zero clock
skew. The target clock period is $T = 9.085714$. The height of each bar corresponds to the number of paths
with an effective path delay within a given range.
The distribution of the maximum effective path delays of the level-sensitive s938 circuit with zero clock
skew scheduling is shown in Figure 6.5. Note that the maximum effective path delay is calculated by
the expression $[A_f - d_i - T_{skew}(i,f) + T_{CP}]$. Comparing Figures 6.4 and 6.5 shows that the
maximum effective path delays are increased in the level-sensitive circuit, which also provides a smaller
minimum clock period ($T_{FF}^{\text{no skew}} = 24.4$ vs. $T_{L}^{\text{no skew}} = 20.6$). The increase in the effective path delays
is due to time borrowing. An accumulation of effective path delay values slightly below or above the minimum
operating clock period $T = 20.6$ is visible. Note that an effective path delay larger than the
minimum clock period is a sufficient but not a necessary condition for time borrowing. Thus, local data
paths where the effective path delay is calculated to be smaller than $T = 20.6$ may still benefit from time
borrowing. Furthermore, it can be observed that certain data paths in the circuit benefit more from time
borrowing, realizing an effective path delay close to the theoretical limit of $(T_{CP} + C_W^L - T_{skew}(i,f))$.
6.3.2 Skew Analysis
As discussed throughout this thesis, non-zero clock skew scheduling in synchronous circuits permits
smaller clock periods. Note that in the presence of non-zero clock skew, the effective path delay for the data
signal over a data path most likely becomes smaller compared to its value observed under zero clock skew
scheduling. This follows from Eq. (6-1), as $T$ becomes smaller. However, as the minimum clock period $T$ becomes
smaller, the percentage of the data paths on which the effective path delay exceeds the minimum clock
period increases significantly (see Figure 6.6). The effect of clock skew on improving the minimum clock
period is visible by comparing the histograms presented in Figures 6.5 and 6.6.
The skew constraints [Eqs. (4-7), (4-8) and (4-9)] introduced in Section 4.1.4 can be included in the LP
model (Table 5.2) in order to ensure the correctness of the solution. The skew constraints not only constitute
an extra measure to check the feasibility of the solution but are also used in collecting statistical data on
clock skew values. Interpretation of Eqs. (4-11) and (4-13) leads to the upper and lower bound definitions for
the clock skew. In order to generate an expression for the upper bound, Eq. (4-11) is rewritten as:
\[ D_i + D_{Pm}^{if} - T_{CP} + T_{skew}(i,f) \leq T - \delta_S. \tag{6-2} \]
In Eq. (6-2), the earliest possible time is assigned to $D_i$ in order to realize the upper bound on clock skew.
The earliest possible time that a data signal departs from a latch is $D_{CQ}$ later than the leading edge of the
clock signal, $(T - C_W^L + D_{CQ})$. Reordering the expression gives the upper bound on clock skew:
\[ T_{skew}(i,f) \leq T_{CP} + C_W^L - D_{Pm}^{if} - D_{CQ} - \delta_S. \tag{6-3} \]
The lower bound on the clock skew is derived similarly from Eq. (4-13), which leads to:
\[ a_i + D_{PM}^{if} \geq T_{CP} - T_{skew}(i,f) + \delta_H. \tag{6-4} \]
In order to derive the lower bound, the data arrival time at $R_i$ must be considered to occur at its latest
possible time. The latest data arrival time is the setup time $\delta_S$ earlier than the trailing edge of the clock
signal, $T - \delta_S$. Thus, the lower bound on the clock skew is:
\[ T_{skew}(i,f) \geq T_{CP} - T - D_{PM}^{if} + \delta_S + \delta_H. \tag{6-5} \]
Combining Eq. (6-3) and Eq. (6-5), the theoretical limits on clock skew are expressed as follows:
\[ T_{CP} - T - D_{PM}^{if} + \delta_S + \delta_H \;\leq\; T_{skew}(i,f) \;\leq\; T_{CP} + C_W^L - D_{Pm}^{if} - D_{CQ} - \delta_S. \tag{6-6} \]
[Figure: histogram; horizontal axis: clock skew $T_{skew}(i,f)$ in time units; vertical axis: number of paths.]
Figure 6.7 Distribution of the clock skew values of the non-zero clock skew case for s938. The target
clock period is $T = 9.085714$. The height of each bar corresponds to the number of paths formed by
sequentially adjacent pairs of registers which have a clock skew within the given range.
Recall that in the experiments, the parameters $\delta_S$, $\delta_H$, $D_{CQ}$ and $D_{DQ}$ are considered zero and a 50% duty
cycle is selected for the single-phase synchronization clock signal. In order to evaluate the upper and lower
bounds on clock skew in this simplified case, the parameters are substituted in Eq. (6-6):
\[ -D_{PM}^{if} \;\leq\; T_{skew}(i,f) \;\leq\; 1.5\,T - D_{Pm}^{if}. \tag{6-7} \]
Specifically on the ISCAS’89 benchmark circuit s938, the clock skew bounds are verified using the
experimental values shown in Figure 6.4. For the benchmark circuit s938 with a minimum clock period of 9.09,
the minimum and maximum propagation delays are calculated to be 5 and 24.4, respectively. Thus, the value
set for the clock skew variable on the data paths of s938 is constrained by $-24.4 \leq T_{skew}(i,f) \leq 8.64$.
The distribution of the clock skew values of s938, when operated with a minimum clock period of
9.09, is presented in Figure 6.7. The calculated clock skew values, most of which are negative, are within
the derived limits. Negative clock skew between registers helps improve the minimum clock period of the
synchronous circuit due to the additional time it provides for data signal propagation. The data paths on
which positive skew is recorded most likely occur for two reasons. The first reason is the presence of
data path loops within the circuit. The second reason is the presence of faster paths, which provide extra time for
neighboring critical paths.
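The simplified bounds can be checked numerically. The sketch below evaluates the lower bound $-D_{PM}$ and the upper bound $1.5T - D_{Pm}$ for the s938 values quoted above (clock period 9.09, propagation delays between 5 and 24.4) and tests a few skew values against them. The function names and the sample skew values are hypothetical.

```python
def skew_bounds(T, d_min, d_max):
    """Simplified skew limits for zero setup/hold/latch delays and a 50% duty cycle."""
    return -d_max, 1.5 * T - d_min


def within_bounds(skew, bounds):
    """Return True if the skew value lies inside the closed interval `bounds`."""
    low, high = bounds
    return low <= skew <= high


bounds = skew_bounds(T=9.09, d_min=5.0, d_max=24.4)
print(bounds)  # approximately (-24.4, 8.635)

# Hypothetical skew values in time units.
print([within_bounds(s, bounds) for s in (-20.0, -3.5, 1.0, 9.0)])  # [True, True, True, False]
```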
[Figure: histogram; horizontal axis: clock delay $t_i$ in time units; vertical axis: number of latches.]
Figure 6.8 Distribution of the clock delay values of the non-zero clock skew case for s938. The target
clock period is $T = 9.085714$. The height of each bar corresponds to the number of latches being driven by
a clock signal with a time delay within the given range.
The distribution of the clock delays to each register is presented in Figure 6.8. The distribution is
widely spread, ranging from 0 to 19 (time units), where the minimum clock period is $T = 9.09$. If
the clock tree network of the synchronous circuit is implemented to accommodate these nominal clock
delays, operation at the target minimum clock period is achieved.
6.4 Further Considerations
The presented LP model formulation is demonstrated to be effective for the static timing analysis of
synchronous circuits. In the analyses, classical circuit implementations are investigated, in which no
timing dependencies between the registers of a synchronous circuit are prescribed other than those leading to
the predefined timing constraints. As application-specific integrated circuit (ASIC) design techniques
become widespread, and with the growing impact of secondary effects on the operation of
sub-micron devices, a timing analysis model able to accommodate application-specific
constraints becomes essential. Unlike the previous work, the presented formulation is highly amenable to
such modifications, constituting a well-defined timing analysis framework.
The following describes a potential problem in the timing analysis of system-on-chip (SOC) designs. In an ASIC/SOC
implementation, the clock signal distribution between different IP blocks (or clock domains) is subject to
consideration, as well as the distribution of the clock signal within an IP block. The clock delays to the
I/O registers of a synchronous IP block are less flexible compared to the clock signal delays to the internal
registers. It is likely that the timing analysis will be performed on individual IP blocks by the vendors,
without a priori information about the application environment. Therefore, timing violations may occur on
outgoing (non-local, inter-block) data paths, as the clock skew on these data paths will be unaccounted for
in the initial computation. A simple solution to avoid timing violations between the IP blocks is to equalize
all I/O register clock delay values. In the presented framework, additional timing constraints enforcing the
equality of these clock signal delays can easily be integrated into the constraint set of the LP model problem
presented in Table 5.2.
Another commonly encountered design constraint is to implement a predetermined (possibly non-optimal)
clock tree network for the synchronous circuit. If the clock tree topology is predetermined, the
minimum clock period problem must be solved with known clock delays to each register. The LP model
presented in Table 5.2 can easily be modified to account for such changes by assigning the given clock
signal delays to the respective clock delay variables.
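Both modifications amount to appending simple rows to the constraint set. The sketch below generates such rows as text. The helper names, the row syntax and the choice of which registers are I/O registers are illustrative assumptions and do not reproduce the format of Table 5.2.

```python
def equal_io_delay_rows(io_registers):
    """Equality rows forcing all I/O register clock delays to a common value."""
    first = io_registers[0]
    return [f"t{first} - t{r} = 0" for r in io_registers[1:]]


def fixed_delay_rows(clock_delays):
    """Rows pinning clock delays to the values of a predetermined clock tree."""
    return [f"t{r} = {delay}" for r, delay in sorted(clock_delays.items())]


# Hypothetical example: registers 1, 2 and 4 are I/O registers.
print(equal_io_delay_rows([1, 2, 4]))      # ['t1 - t2 = 0', 't1 - t4 = 0']
# Hypothetical predetermined clock delays in time units.
print(fixed_delay_rows({1: 0.0, 3: 1.5}))  # ['t1 = 0.0', 't3 = 1.5']
```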
7.0 CONCLUSIONS AND FUTURE WORK
The timing analysis and optimization of synchronous circuits are subject to non-zero clock skew (in-
tentional or not) and other effects of process parameter variations. A novel timing analysis procedure is
introduced, which considers the simultaneous application of time borrowing and clock skew scheduling to
improve the performance of level-sensitive synchronous circuits. The described procedure is the first to
integrate non-zero clock skew scheduling in the timing analysis of level-sensitive circuits. The procedure
is based on a stand-alone LP model formulation (to be solved by any standard LP solver) which constitutes
a novel timing analysis framework for level-sensitive synchronous circuits. The timing framework is for-
mulated for a single-phase synchronization scheme. The optimal clocking and timing schedules for data
propagation between registers are computed as a result of the timing analysis.
In this thesis, the described timing analysis framework is used to automate the clock period minimization
problem of level-sensitive synchronous circuits. Accurate timing analyses of relatively large benchmark circuits
are performed by using the MBM method in order to generate equivalent LP model problems. The generated
LP model formulation is sufficiently general and can be modified to accommodate application-specific
constraints and timing properties. The stand-alone nature of the presented timing framework supports easy
adaptation to various timing optimization problems such as the clock period verification problem and
statistical timing analysis.
The work presented in this thesis has been published in [26–28]. In [26, 27] and in this thesis,
single-phase synchronization of level-sensitive circuits is analyzed. The proposed formulation (Section 4.3) and
solution (Section 5) procedures can be modified so that they apply to multi-phase synchronized
circuits. The formulation of the timing analysis of level-sensitive circuits for multi-phase synchronization is
addressed in [28]. Other enhancements to the formulation of the timing analysis of synchronous circuits for
multi-phase synchronization are among future directions of research.
A potential direction of research is to develop an algorithm to determine the optimal number of clock
phases for any given level-sensitive circuit. It is shown in [28] that the optimal number of clock phases for
any two level-sensitive circuits need not be identical. The optimal number of clock phases in a multi-phase
synchronization scheme is fully dependent on the specific design. In general, as the number of clock phases
increases, the maximum operating frequency of the circuit increases at the expense of circuit area. It may be
possible to formulate an optimization problem that solves for the optimal number of clock phases for any given
level-sensitive circuit by expressing the trade-off between the increase in circuit area and the improvement
in the maximum operating frequency.
Finally, the problem definition can be improved by targeting the statistical timing analysis of
level-sensitive circuits instead of the static timing analysis. The timing analysis procedures offered in this thesis
define static timing variables in order to model circuit timing. The difference between a static timing
variable and a statistical timing variable is that a static timing variable identifies the permissible range for a
variable but does not identify the probability of that variable having a particular value in the given
permissible range. While static timing analysis identifies circuit operation at the operating frequency
margins, it fails to provide a probabilistic profile for different states of circuit operation. The statistical
timing analysis of a synchronous circuit is performed in order to derive the probabilities of a circuit operating
at any feasible frequency (or more generally, at any feasible timing schedule). It may be possible to use
the proposed formulation as a template or a well-defined starting point in order to derive a novel problem
formulation for the statistical timing analysis of level-sensitive circuits. In very deep sub-micron (VDSM)
circuits, the uncertainty in circuit operation, including the uncertainty in the precision of circuit timing,
increases significantly. This leads to growing attention among digital circuit designers to statistical
timing analysis approaches for synchronous circuits. Statistical timing analysis is worth exploring in future
research.
APPENDIX A
NONLINEAR PROBLEM FORMULATION
This appendix demonstrates the Nonlinear Programming (NLP) model problem formulation of the clock
period minimization problem. The circuit network shown in Figure 6.1 is investigated for the clock period
minimization problem and the NLP model problem formulation is demonstrated. The circuit network in
Figure 6.1 is presented in Figure A-1 for convenience.
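[Figure: circuit graph with four registers R1 to R4; the edge labels give the minimum and maximum data propagation delays of the local data paths: R1 to R2: 2.9, 3; R1 to R3: 3, 4; R3 to R2: 5, 7; R3 to R4: 2.5, 5; R4 to R2: 3, 4.]
Figure A-1 A simple synchronous circuit.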
(Obj) $\min\; T$
such that
(i) Latching Constraints - Hold Time
$a_1 \geq 0$, $a_2 \geq 0$, $a_3 \geq 0$, $a_4 \geq 0$
(ii) Latching Constraints - Setup Time
$A_1 - T \leq 0$, $A_2 - T \leq 0$, $A_3 - T \leq 0$, $A_4 - T \leq 0$
(iii) Synchronization Constraints - Earliest Time
$d_1 = \max(a_1, 0.5T)$, $d_2 = \max(a_2, 0.5T)$, $d_3 = \max(a_3, 0.5T)$, $d_4 = \max(a_4, 0.5T)$
(iv) Synchronization Constraints - Latest Time
$D_1 = \max(A_1, 0.5T)$, $D_2 = \max(A_2, 0.5T)$, $D_3 = \max(A_3, 0.5T)$, $D_4 = \max(A_4, 0.5T)$
(v) Propagation Constraints - Earliest Time
$a_2 = \min[(d_1 + 2.9 + t_1 - t_2 - T),\ (d_3 + 5 + t_3 - t_2 - T),\ (d_4 + 3 + t_4 - t_2 - T)]$
$a_3 = d_1 + 3 + t_1 - t_3 - T$
$a_4 = d_3 + 2.5 + t_3 - t_4 - T$
(vi) Propagation Constraints - Latest Time
$A_2 = \max[(D_1 + 3 + t_1 - t_2 - T),\ (D_3 + 7 + t_3 - t_2 - T),\ (D_4 + 4 + t_4 - t_2 - T)]$
$A_3 = D_1 + 4 + t_1 - t_3 - T$
$A_4 = D_3 + 5 + t_3 - t_4 - T$
(vii) Validity Constraints - Arrival Time
$A_1 - a_1 \geq 0$, $A_2 - a_2 \geq 0$, $A_3 - a_3 \geq 0$, $A_4 - a_4 \geq 0$
(viii) Validity Constraints - Departure Time
$D_1 - d_1 \geq 0$, $D_2 - d_2 \geq 0$, $D_3 - d_3 \geq 0$, $D_4 - d_4 \geq 0$
(ix) Initialization Constraints
$A_1 = d_1$
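For a quick sanity check of the max/min equations of this appendix, they can be evaluated directly when all clock delays are equal (zero clock skew), so that the $t_i$ terms cancel. The sketch below relaxes the equations for a trial clock period and scans a grid for the smallest feasible period. It is not the LP procedure of this thesis: the zero skew restriction, the grid scan, the treatment of the source register $R_1$ (arrival fixed at the leading edge) and all function names are assumptions made for illustration.

```python
# Local data paths of Figure A-1: (initial, final) -> (min delay, max delay).
PATHS = {
    (1, 2): (2.9, 3.0),
    (1, 3): (3.0, 4.0),
    (3, 2): (5.0, 7.0),
    (3, 4): (2.5, 5.0),
    (4, 2): (3.0, 4.0),
}
REGISTERS = (1, 2, 3, 4)


def timing_schedule(T, sweeps=10):
    """Relax the max/min equations for a trial period T with zero clock skew."""
    lead = 0.5 * T  # leading edge of the clock for a 50% duty cycle
    a = {r: lead for r in REGISTERS}  # earliest arrivals
    A = {r: lead for r in REGISTERS}  # latest arrivals
    d = {r: lead for r in REGISTERS}  # earliest departures
    D = {r: lead for r in REGISTERS}  # latest departures
    for _ in range(sweeps):
        for f in REGISTERS:
            fan_in = [(i, lo, hi) for (i, g), (lo, hi) in PATHS.items() if g == f]
            if fan_in:  # registers without fan-in keep arrival = leading edge
                a[f] = min(d[i] + lo - T for i, lo, _ in fan_in)
                A[f] = max(D[i] + hi - T for i, _, hi in fan_in)
            d[f] = max(a[f], lead)
            D[f] = max(A[f], lead)
    return a, A


def is_feasible(T, tol=1e-9):
    """Check the hold (a >= 0) and setup (A <= T) latching constraints."""
    a, A = timing_schedule(T)
    return all(a[r] >= -tol and A[r] <= T + tol for r in REGISTERS)


def min_period_zero_skew(step=0.01, t_max=20.0):
    """Return the first feasible period on the grid, or None if there is none."""
    k = 1
    while k * step <= t_max:
        if is_feasible(k * step):
            return k * step
        k += 1
    return None


print(min_period_zero_skew())  # about 4.67 under these assumptions
```

A grid scan is used instead of bisection because the hold constraints also bound the period from above, so feasibility is not monotone in $T$. The fixed number of sweeps is enough here because the example circuit has no data path loops; a circuit with loops would need a convergence test.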
APPENDIX B
LP PROBLEM FORMULATION
This appendix demonstrates the Linear Programming (LP) model problem formulation of the clock
period minimization problem. The circuit network shown in Figure 6.1 is investigated for the clock period
minimization problem and the LP model problem formulation1 is derived. The circuit network in Figure 6.1
is presented in Figure B-1 for convenience.
R1 R2 R3
R	 4


2  9  3 



5  7 

 3 4
  2 5
5




3  4 

Figure B-1 A simple synchronous circuit.
(Obj) Dð Æ B­_flLKLKLKËHg£­_flLKLKLKËi­F_flLKLKLKË¢e­_flLKLKLKËff­_flLKLKLKªg ­F_flLKLKLKi­_flLKLKLKe ­_flLKLKLK&­
_flLKLKL
ì
i£­_flLKLKL
ì
eu­_flLKLKL
ì
¯©ù_flLKLKLKëi ©ù_flLKLKLKëe©ù_flLKLKLKë$
such that
(i) Latching Constraints - Hold Time
*
_,+ë g

L
*
X&+Oëi

L
*
¤&+ëe

L
*
E+Oëff

L
1The constraints are labeled c1–c43 in order to improve the output readability.
57
(ii) Latching Constraints - Setup Time
*
N&+
ì
gt©ïBóùL
*
I&+
ì
i ©÷BFóùL
*
\-+
ì
e ©ïBóùL
*
a&+
ì
 ©÷BFóùL
(iii) Synchronization Constraints - Earliest Time
\[
\begin{array}{ll}
\text{c9:}  & d_1 - a_1 \ge 0 \\
\text{c10:} & d_1 - 0.5T \ge 0 \\
\text{c11:} & d_2 - a_2 \ge 0 \\
\text{c12:} & d_2 - 0.5T \ge 0 \\
\text{c13:} & d_3 - a_3 \ge 0 \\
\text{c14:} & d_3 - 0.5T \ge 0 \\
\text{c15:} & d_4 - a_4 \ge 0 \\
\text{c16:} & d_4 - 0.5T \ge 0
\end{array}
\]
(iv) Synchronization Constraints - Latest Time
\[
\begin{array}{ll}
\text{c17:} & D_1 - A_1 \ge 0 \\
\text{c18:} & D_1 - 0.5T \ge 0 \\
\text{c19:} & D_2 - A_2 \ge 0 \\
\text{c20:} & D_2 - 0.5T \ge 0 \\
\text{c21:} & D_3 - A_3 \ge 0 \\
\text{c22:} & D_3 - 0.5T \ge 0 \\
\text{c23:} & D_4 - A_4 \ge 0 \\
\text{c24:} & D_4 - 0.5T \ge 0
\end{array}
\]
(v) Propagation Constraints - Earliest Time
\[
\begin{array}{ll}
\text{c25:} & a_2 - d_1 - t_1 + t_2 + T \le 2.9 \\
\text{c26:} & a_3 - d_1 - t_1 + t_3 + T \le 3 \\
\text{c27:} & a_2 - d_3 - t_3 + t_2 + T \le 5 \\
\text{c28:} & a_4 - d_3 - t_3 + t_4 + T \le 2.5 \\
\text{c29:} & a_2 - d_4 - t_4 + t_2 + T \le 3
\end{array}
\]
(vi) Propagation Constraints - Latest Time
\[
\begin{array}{ll}
\text{c30:} & A_2 - D_1 - t_1 + t_2 + T \ge 3 \\
\text{c31:} & A_3 - D_1 - t_1 + t_3 + T \ge 4 \\
\text{c32:} & A_2 - D_3 - t_3 + t_2 + T \ge 7 \\
\text{c33:} & A_4 - D_3 - t_3 + t_4 + T \ge 5 \\
\text{c34:} & A_2 - D_4 - t_4 + t_2 + T \ge 4
\end{array}
\]
(vii) Validity Constraints - Arrival Time
\[
\begin{array}{ll}
\text{c35:} & A_1 - a_1 \ge 0 \\
\text{c36:} & A_2 - a_2 \ge 0 \\
\text{c37:} & A_3 - a_3 \ge 0 \\
\text{c38:} & A_4 - a_4 \ge 0
\end{array}
\]
(viii) Validity Constraints - Departure Time
\[
\begin{array}{ll}
\text{c39:} & D_1 - d_1 \ge 0 \\
\text{c40:} & D_2 - d_2 \ge 0 \\
\text{c41:} & D_3 - d_3 \ge 0 \\
\text{c42:} & D_4 - d_4 \ge 0
\end{array}
\]
(ix) Initialization Constraints
\[
\begin{array}{ll}
\text{c43:} & A_1 - d_1 = 0
\end{array}
\]
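The constraint groups (i)–(ix) follow a regular pattern, so the model can be generated from the list of data paths. The following is a minimal sketch of such a generator, not the tool used in this thesis. It assumes zero setup and hold times and the 0.5T term exactly as they appear above, writes the model in the CPLEX LP file format, and borrows the column names from the solver output in Appendix C (A/BA for earliest/latest arrival, D/BD for earliest/latest departure, T1–T4 for clock delays). The names build_lp and PATHS are hypothetical.

```python
# Sketch only: build_lp and PATHS are illustrative names, and the mapping of
# thesis symbols to A/BA/D/BD/Ti column names is an assumption.

# (source latch, destination latch): (minimum delay, maximum delay)
PATHS = {
    (1, 2): (2.9, 3.0),
    (1, 3): (3.0, 4.0),
    (3, 2): (5.0, 7.0),
    (3, 4): (2.5, 5.0),
    (4, 2): (3.0, 4.0),
}


def build_lp(n_latches, paths, big_m=1000):
    """Return the clock period minimization model as LP-format text."""
    latches = list(range(1, n_latches + 1))
    sinks = sorted({dst for _, dst in paths})          # latches with fan-in
    inputs = [i for i in latches if i not in sinks]    # latches without fan-in

    # Objective: T plus weighted departure and arrival time terms.
    terms = ["T"]
    terms += [f"{big_m} D{i}" for i in latches]
    terms += [f"{big_m} BD{i}" for i in latches]
    terms += [f"{big_m} BA{i}" for i in sinks]
    objective = " + ".join(terms) + "".join(f" - {big_m} A{i}" for i in sinks)

    rows = []

    def add(expr):
        # Constraints are numbered c1, c2, ... in the order they are added.
        rows.append(f" c{len(rows) + 1}: {expr}")

    for i in latches:                                  # (i) hold time
        add(f"A{i} >= 0")
    for i in latches:                                  # (ii) setup time
        add(f"BA{i} - T <= 0")
    for i in latches:                                  # (iii) earliest sync
        add(f"D{i} - A{i} >= 0")
        add(f"D{i} - 0.5 T >= 0")
    for i in latches:                                  # (iv) latest sync
        add(f"BD{i} - BA{i} >= 0")
        add(f"BD{i} - 0.5 T >= 0")
    for (src, dst), (dmin, _) in paths.items():        # (v) earliest propagation
        add(f"A{dst} - D{src} - T{src} + T{dst} + T <= {dmin:g}")
    for (src, dst), (_, dmax) in paths.items():        # (vi) latest propagation
        add(f"BA{dst} - BD{src} - T{src} + T{dst} + T >= {dmax:g}")
    for i in latches:                                  # (vii) arrival validity
        add(f"BA{i} - A{i} >= 0")
    for i in latches:                                  # (viii) departure validity
        add(f"BD{i} - D{i} >= 0")
    for i in inputs:                                   # (ix) initialization
        add(f"BA{i} - D{i} = 0")

    return "\n".join(["Minimize", f" obj: {objective}", "Subject To", *rows, "End"])


if __name__ == "__main__":
    print(build_lp(4, PATHS))
```

For the four-latch circuit of Figure B-1 this produces 43 constraints in the same order as c1–c43. Variables are left at the LP-format default lower bound of zero.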
APPENDIX C
LP PROBLEM SOLUTION - CPLEX OUTPUT
This appendix presents the solution of the LP model of the clock period minimization problem for the
circuit network shown in Figure B-1. The LP model given in Appendix B is solved using the industrial
solver CPLEX [10] and the results are shown below. In the results, SECTION 1 - ROWS lists the optimal
value (activity) of each constraint and SECTION 2 - COLUMNS lists the optimal value of each variable.
Note that the optimal objective function value is not itself the quantity of interest, because it also
includes the weighted arrival and departure time terms. The main objective of the clock period
minimization problem is the minimum clock period, which is reported in SECTION 2 (T = 4.05). Likewise,
the optimal values for the data signal arrival and departure times are presented in SECTION 2, constituting
the optimal clocking and timing schedules for the synchronous circuit under investigation. For detailed
information about CPLEX operation and output formatting, see [10].
PROBLEM NAME fig7.lp
DATA NAME
OBJECTIVE VALUE 26254.05
STATUS OPTIMAL SOLN
ITERATION 27
OBJECTIVE obj (MIN)
RHS
RANGES
BOUNDS
SECTION 1 - ROWS
NUMBER .ROW... AT .ACTIVITY... SLACK ACTIVITY .LOWER LIMIT. .UPPER LIMIT. .DUAL ACTIVITY
1 obj BS 26254.05 -26254.05 NONE NONE 1
2 c43 EQ 0 0 0 0 -0
3 c1 BS 0 0 0 NONE 0
4 c5 BS -2.025 2.025 NONE 0 -0
5 c9 BS 2.025 -2.025 0 NONE 0
6 c10 BS 0 -0 0 NONE 0
7 c17 BS 0 -0 0 NONE 0
8 c18 LL 0 0 0 NONE -2000
9 c35 BS 2.025 -2.025 0 NONE 0
10 c39 LL 0 0 0 NONE -2500.5
11 c26 UL 3 0 NONE 3 1000
12 c31 LL 4 0 4 NONE -3500.5
13 c25 UL 2.9 0 NONE 2.9 2500.5
14 c30 BS 6.95 -3.95 3 NONE 0
15 c4 BS 0 0 0 NONE 0
16 c8 BS -1.55 1.55 NONE 0 -0
17 c15 BS 2.025 -2.025 0 NONE 0
18 c16 LL 0 0 0 NONE -1000
19 c23 LL 0 0 0 NONE -1000
20 c24 BS 0.475 -0.475 0 NONE 0
21 c38 BS 2.5 -2.5 0 NONE 0
22 c42 BS 0.475 -0.475 0 NONE 0
23 c29 BS 2.475 0.525 NONE 3 -0
24 c34 BS 6.05 -2.05 4 NONE 0
25 c2 BS 0 0 0 NONE 0
26 c6 UL 0 0 NONE 0 500.5
27 c11 BS 2.025 -2.025 0 NONE 0
28 c12 LL 0 0 0 NONE -1000
29 c19 LL 0 0 0 NONE -1000
30 c20 BS 2.025 -2.025 0 NONE 0
31 c36 BS 4.05 -4.05 0 NONE 0
32 c40 BS 2.025 -2.025 0 NONE 0
33 c3 BS 1.025 -1.025 0 NONE 0
34 c7 BS -2.025 2.025 NONE 0 -0
35 c13 BS 1 -1 0 NONE 0
36 c14 BS 0 -0 0 NONE 0
37 c21 LL 0 0 0 NONE -2500.5
38 c22 LL 0 0 0 NONE -2000
39 c37 BS 1 -1 0 NONE 0
40 c41 LL 0 0 0 NONE -1000
41 c27 BS 2.95 2.05 NONE 5 -0
42 c28 UL 2.5 0 NONE 2.5 2000
43 c32 LL 7 0 7 NONE -2500.5
44 c33 LL 5 0 5 NONE -2000
SECTION 2 - COLUMNS
NUMBER .COLUMN. AT .ACTIVITY... ..INPUT COST.. .LOWER LIMIT. .UPPER LIMIT. .REDUCED COST.
45 T BS 4.05 1 0 NONE 0
46 D1 BS 2.025 1000 0 NONE 0
47 BD1 BS 2.025 1000 0 NONE 0
48 A4 LL 0 -1000 0 NONE 1000
49 BA4 BS 2.5 1000 0 NONE 0
50 D4 BS 2.025 1000 0 NONE 0
51 BD4 BS 2.5 1000 0 NONE 0
52 A2 LL 0 -1000 0 NONE 1500.5
53 BA2 BS 4.05 1000 0 NONE 0
54 D2 BS 2.025 1000 0 NONE 0
55 BD2 BS 4.05 1000 0 NONE 0
56 A3 BS 1.025 -1000 0 NONE 0
57 BA3 BS 2.025 1000 0 NONE 0
58 D3 BS 2.025 1000 0 NONE 0
59 BD3 BS 2.025 1000 0 NONE 0
60 BA1 BS 2.025 0 0 NONE 0
61 A1 LL 0 0 0 NONE 0
62 T1 BS 0.05 0 0 NONE 0
63 T3 LL 0 0 0 NONE 0
64 T2 BS 0.925 0 0 NONE 0
65 T4 BS 0.475 0 0 NONE 0
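As a cross-check, the reported solution can be substituted back into the constraints of Appendix B. The sketch below does this with the values from SECTION 2 - COLUMNS. It assumes that the column names map to the thesis symbols as A = earliest arrival, BA = latest arrival, D = earliest departure, BD = latest departure and T1–T4 = clock delays, an inference from the objective coefficients rather than something stated in the output. The function names are hypothetical.

```python
# Sketch only: the symbol-to-column mapping and the function names are
# assumptions made for illustration.

# Values copied from SECTION 2 - COLUMNS.
SOLUTION = {
    "T": 4.05,
    "A1": 0.0, "A2": 0.0, "A3": 1.025, "A4": 0.0,
    "BA1": 2.025, "BA2": 4.05, "BA3": 2.025, "BA4": 2.5,
    "D1": 2.025, "D2": 2.025, "D3": 2.025, "D4": 2.025,
    "BD1": 2.025, "BD2": 4.05, "BD3": 2.025, "BD4": 2.5,
    "T1": 0.05, "T2": 0.925, "T3": 0.0, "T4": 0.475,
}

# (source latch, destination latch): (minimum delay, maximum delay)
PATHS = {
    (1, 2): (2.9, 3.0),
    (1, 3): (3.0, 4.0),
    (3, 2): (5.0, 7.0),
    (3, 4): (2.5, 5.0),
    (4, 2): (3.0, 4.0),
}

LATCHES = (1, 2, 3, 4)


def constraint_rows(s, paths):
    """Return (label, activity, sense, rhs) for c1..c43 in appendix order."""
    rows = []

    def add(activity, sense, rhs):
        rows.append((f"c{len(rows) + 1}", activity, sense, rhs))

    T = s["T"]
    for i in LATCHES:                                   # c1-c4 hold
        add(s[f"A{i}"], ">=", 0.0)
    for i in LATCHES:                                   # c5-c8 setup
        add(s[f"BA{i}"] - T, "<=", 0.0)
    for i in LATCHES:                                   # c9-c16 earliest sync
        add(s[f"D{i}"] - s[f"A{i}"], ">=", 0.0)
        add(s[f"D{i}"] - 0.5 * T, ">=", 0.0)
    for i in LATCHES:                                   # c17-c24 latest sync
        add(s[f"BD{i}"] - s[f"BA{i}"], ">=", 0.0)
        add(s[f"BD{i}"] - 0.5 * T, ">=", 0.0)
    for (i, f), (dmin, _) in paths.items():             # c25-c29
        add(s[f"A{f}"] - s[f"D{i}"] - s[f"T{i}"] + s[f"T{f}"] + T, "<=", dmin)
    for (i, f), (_, dmax) in paths.items():             # c30-c34
        add(s[f"BA{f}"] - s[f"BD{i}"] - s[f"T{i}"] + s[f"T{f}"] + T, ">=", dmax)
    for i in LATCHES:                                   # c35-c38
        add(s[f"BA{i}"] - s[f"A{i}"], ">=", 0.0)
    for i in LATCHES:                                   # c39-c42
        add(s[f"BD{i}"] - s[f"D{i}"], ">=", 0.0)
    add(s["BA1"] - s["D1"], "=", 0.0)                   # c43
    return rows


def violated(rows, tol=1e-9):
    """Return the labels of constraints that fail within tolerance tol."""
    bad = []
    for label, activity, sense, rhs in rows:
        if sense == ">=":
            ok = activity >= rhs - tol
        elif sense == "<=":
            ok = activity <= rhs + tol
        else:
            ok = abs(activity - rhs) <= tol
        if not ok:
            bad.append(label)
    return bad


def objective(s, big_m=1000):
    """Recompute the objective value from the column activities."""
    departures = sum(s[f"D{i}"] + s[f"BD{i}"] for i in LATCHES)
    arrivals = sum(s[f"BA{i}"] - s[f"A{i}"] for i in (2, 3, 4))
    return s["T"] + big_m * (departures + arrivals)


if __name__ == "__main__":
    rows = constraint_rows(SOLUTION, PATHS)
    print("violated constraints:", violated(rows))
    print("objective value:", round(objective(SOLUTION), 2))
```

The recomputed objective matches the reported OBJECTIVE VALUE, and the activities computed for each constraint can be compared against the ACTIVITY column of SECTION 1 - ROWS.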
BIBLIOGRAPHY
[1] T. M. Burks, K. A. Sakallah, and T. N. Mudge. Critical paths in circuits with level-sensitive latches.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 3(2):273–291, June 1995.
[2] M. R. Dagenais and N. C. Rumin. On the calculation of optimal clocking parameters in synchronous
circuits with level-sensitive latches. IEEE Transactions on Computer-Aided Design, CAD-8(3):268–
278, March 1989.
[3] S.-C. Fang and S. Puthenpura. Linear Optimization and Extensions: Theory and Algorithms. AT&T.
Prentice Hall, 1993.
[4] J. P. Fishburn. Clock skew optimization. IEEE Transactions on Computers, C–39(7):945–951, July
1990.
[5] W. Ford and W. Topp. Data Structures with C++. Prentice Hall, 1996.
[6] E. G. Friedman. Clock Distribution Networks in VLSI Circuits and Systems. IEEE Press, 1995.
[7] P. Gronowski and W. Bowhill. Dynamic logic and latches ii. IEEE VLSI Circuits Workshop, 1996.
[8] D. Harris and M. Horowitz. Skew-tolerant domino circuits. IEEE Journal of Solid-State Circuits,
32(11):1702–1711, November 1997.
[9] http://public.itrs.net/. International Technology Roadmap for Semiconductors (ITRS 2002). Technical
report, 2002.
[10] ILOG. CPLEX 7.1 User’s Manual, 2001.
[11] I. S. Kourtev and E. G. Friedman. A quadratic programming approach to clock skew scheduling
for reduced sensitivity to process parameter variations. In Proceedings of the 1999 IEEE ASIC/SOC
Conference, 1999.
[12] I. S. Kourtev and E. G. Friedman. Timing Optimization Through Clock Skew Scheduling. Kluwer
Academic Publishers, 2000.
[13] J. Lee, D. T. Tang, and C. K. Wong. A timing analysis algorithm for circuits with level-sensitive
latches. IEEE Transactions on Computer-Aided Design, CAD-15(5):535–543, May 1996.
[14] I. Lin, J. A. Ludwig, and K. Eng. Analyzing cycle stealing on synchronous circuits with level-sensitive
latches. Proceedings of the 29th ACM/IEEE Design Automation Conference, pages 393–398, June
1992.
[15] B. Lockyear and C. Ebeling. Optimal retiming of level-clocked circuits using symmetric clock
schedules. IEEE Transactions on Computer-Aided Design, CAD-13(9):1097–1109, Sep 1994.
[16] D. A. Pucknell and K. Eshraghian. Basic VLSI Design. Prentice Hall, 1994.
[17] J. M. Rabaey. Digital Integrated Circuits. Prentice Hall, 1996.
[18] K. A. Sakallah, T. N. Mudge, and O. A. Olukotun. checkTc and minTc: Timing verification and optimal
clocking of synchronous digital circuits. Proceedings of the IEEE/ACM International Conference
on Computer–Aided Design, pages 552–555, November 1990.
[19] K. A. Sakallah, T. N. Mudge, and O. A. Olukotun. Analysis and design of latch-controlled synchronous
digital circuits. IEEE Transactions on Computer-Aided Design, CAD-11(3):322–333, March 1992.
[20] A. S. Sedra and K. C. Smith. Microelectronic Circuits. Oxford University Press, 1998.
[21] N. Shenoy, R. K. Brayton, and A. L. Sangiovanni-Vincentelli. Graph algorithms for clock schedule
optimization. Proceedings of the IEEE/ACM International Conference on Computer–Aided Design,
pages 132–136, November 1992.
[22] B. Stroustrup. The C++ Programming Language. Addison Wesley, 2000.
[23] Synopsys Inc. Synopsys Online Documentation, 2002.
[24] T. G. Szymanski and N. Shenoy. Verifying clock schedules. In Proceedings of the IEEE/ACM International
Conference on Computer–Aided Design, 1992.
[25] T. G. Szymanski. Computing optimal clock schedules. Proceedings of the 29th ACM/IEEE Design
Automation Conference, pages 399–404, June 1992.
[26] B. Taskin and I. S. Kourtev. Linear timing analysis of SOC synchronous circuits with level-sensitive
latches. In Proceedings of the Fifteenth Annual IEEE ASIC/SOC Conference, pages 358–362, 2002.
[27] B. Taskin and I. S. Kourtev. Performance optimization of single-phase level-sensitive circuits using
time borrowing and clock skew scheduling. In ACM/IEEE International Workshop on Timing Issues
in the Specification and Synthesis of Digital Systems, pages 111–118, 2002.
[28] B. Taskin and I. S. Kourtev. Linearization of the timing analysis and optimization of level-sensitive
circuits. In IEEE Transactions on Very Large Scale Integration (VLSI) Systems, in submission.
[29] J. Uyemura. Introduction to VLSI Circuits and Systems. Wiley, 2002.
[30] W. L. Winston. Operations Research: Applications and Algorithms. PWS-Kent Publishing Company,
second edition, 1991.
[31] H. Zhou. Clock schedule verification with crosstalk. In ACM/IEEE International Workshop on Timing
Issues in the Specification and Synthesis of Digital Systems, pages 78–83, 2002.
