Steiner network construction for signal net routing with double-sided timing constraints by Li, Qiuyang
 
STEINER NETWORK CONSTRUCTION FOR SIGNAL NET ROUTING WITH 
DOUBLE-SIDED TIMING CONSTRAINTS 
 
 
 
 
 
 
A Thesis 
by 
QIUYANG LI 
 
 
 
 
Submitted to the Office of Graduate Studies of 
Texas A&M University 
in partial fulfillment of the requirements for the degree of 
 
MASTER OF SCIENCE 
 
 
 
 
 
 
August 2006 
 
 
 
 
 
Major Subject: Computer Engineering 
 
  
 
Approved by: 
 
Chair of Committee,   Jiang Hu 
Committee Members,  Gwan Choi 
Donald K. Friesen 
Head of Department,   Costas N. Georghiades 
 
 
 
 
ABSTRACT 
 
 
Steiner Network Construction for Signal Net Routing with  
 
Double-sided Timing Constraints. 
 
 (August 2006) 
 
Qiuyang Li, B.S., Nankai University; 
M.S., Nankai University 
Chair of Advisory Committee: Dr. Jiang Hu 
 
     Compared to conventional Steiner tree signal net routing, non-tree topology is often
superior in many aspects, including timing performance and tolerance to open faults and
variations. In nano-scale VLSI designs, interconnect delay is a performance bottleneck and
variation effects are increasingly problematic. Therefore, the advantages of non-tree
topology are particularly appealing for timing-critical net routing in nano-scale VLSI
designs. We propose Steiner network construction heuristics that can generate either tree
or non-tree topologies for a signal net with different slack-wirelength tradeoffs, and that
handle both long path and short path constraints. Extensive experiments in different
scenarios show that our heuristics usually improve timing slack by hundreds of picoseconds
compared to traditional tree approaches while increasing wirelength only slightly. These
results show that our algorithm is a very promising approach for timing-critical net routing.
  
 ACKNOWLEDGEMENTS 
This thesis represents about one and a half years of work at Texas A&M University. 
This work would not have been possible without the boundless assistance of mentors, 
colleagues, and friends. 
It is with the deepest of gratitude that I would like to thank my advisor, Jiang Hu, who 
always gave me great directions. I would also like to express my heartfelt thanks to the 
other professors, who have always kept their doors open, ready to discuss and encourage 
new ideas. Particular thanks go to Gwan Choi and Donald K. Friesen who could always 
find time, despite any other responsibilities or deadlines they may have had. 
In my time at Texas A&M University, I have had the opportunity to collaborate with a 
number of different people without whose help this thesis could not have been completed. I 
would like to thank Weiping Shi, Sunil P. Khatri, Di Wu, Wei Zhuang, Mankang Mai, and 
Chin-Ngai Sze (Cliff) for their time and their efforts. While not a collaborator as such, I 
would also like to thank Tammy Carda and Linda Currin for their regular guidance and 
tireless assistance.  
Finally, I would like to express my sincere thanks to friends and family whose 
unwavering support has been indispensable to me over the last years. Thank you to my 
family: Dad, Mom, my elder brother and sister. And thank you to the friends that I have 
made along the way: Mankang, Wei Zhuang, Dawen, Xiaomin, Ying Li, Jiong Yan and 
Shengquan. 
 
  
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER
 I     INTRODUCTION
       1.1   Previous Work
       1.2   Outline
 II    PRELIMINARY
 III   ALGORITHM
       3.1   Discussion on Topology
       3.2   Constructive Steiner Network Heuristic
 IV    EXPERIMENTAL RESULTS
       4.1   Cases with Single Critical Sink
       4.2   Cases with Multiple Critical Sinks
 V     CONCLUSIONS AND FUTURE WORK
REFERENCES
VITA
 
 
 
  
LIST OF FIGURES
FIGURE
1   Insert a link in an RC network
2   Chain-like topology and star-like topology
3   Root-root merging and shortest merging
4   Slack (AHHK+link vs Steiner network)
5   Wirelength (AHHK+link vs Steiner network)
6   2-connected wire (AHHK+link vs Steiner network)
7   Monte Carlo: standard deviation of slack (AHHK+link vs Steiner network)
 
 
 
 
 
 
 
 
 
 
 
 
  
LIST OF TABLES
TABLE
I     Cases with 1 critical sink, comparison between AHHK and AHHK+link
II    Cases with 1 critical sink, comparison between AHHK+link and Steiner network
III   Monte Carlo results corresponding to Table I
IV    Monte Carlo results corresponding to Table II
V     Cases with multiple critical sinks
 
               
  
CHAPTER I 
INTRODUCTION 
Interconnect delay is a well-known performance bottleneck in VLSI circuit designs.
Therefore, optimizing the interconnect topology, i.e., signal net routing, is very
important for timing optimization.
In practice, because the Steiner tree [1] is cost-effective and its delay is relatively
easy to compute, it is almost always used for signal net routing. However, non-tree topology
has some remarkable advantages compared to trees. Non-tree routing can significantly
improve signal propagation delay, reduce signal skew, and afford increased reliability with
respect to open faults that may be caused by manufacturing defects and electro-migration
[3]. That is, the redundant paths in a non-tree network provide a certain tolerance to open
faults and therefore can improve manufacturing yield and reliability [2]. Moreover, non-tree
topology can sometimes reduce delay variations [2]. Although non-tree delay computation is
more expensive than that of trees, the computation overhead can usually be alleviated by
advances in computation techniques and facilities. Moreover, the design needs often
outweigh the computation overhead as long as the overhead is not prohibitively large.
1.1 Previous Work 
Perhaps the first non-tree routing work is [3]. It starts with a Steiner tree topology.
Then it iteratively searches for a new edge to add, so that the maximum source-sink delay
in the resulting routing graph is minimized. It keeps doing this until no further delay
improvement is possible. The later work of [4] inserts links sequentially between the
source and the sink with the maximum delay in the topology, using the shortest feasible
length. The recent work of [2] focuses on the reliability and manufacturing yield of
non-tree routing. It adds extra edges to an existing tree to increase the percentage of
2-connected wires, which implies tolerance to open faults. The works of [3, 4] on
timing-driven non-tree routing have two main weaknesses. First, since they start from an
existing tree and then add wires to it, the performance of the resulting non-trees depends
on the initial trees, and an arbitrary starting tree cannot guarantee a good non-tree
solution. Second, they [3, 4] optimize only delay without considering timing constraints.
In reality, maximizing slack or minimizing wire cost subject to timing constraints is a
more common and useful problem formulation [5].
___________
This thesis follows the style of IEEE Transactions on Microwave Theory and Techniques.
   The timing constraints in previous works [5] almost always consider only the
upper bounds for sink delays. In fact, there are also delay lower bounds due to the short
path (hold time) constraints in synchronous circuits. Some gate sizing works [6] consider
both the delay upper bound and lower bound at the same time. To the best of our knowledge,
there is no signal net routing work considering double-sided timing constraints yet. This
is perhaps because a delay lower bound can be easily satisfied by padding extra delay.
Delay padding can be implemented by wire detour, adding dummy capacitors or inserting
redundant buffers. The former two approaches may increase the delay along the long path.
The latter approach of redundant buffers may intensify the leakage power problem. All of
them add unnecessary complexity. Thus, we need to handle the short path constraints in a
more careful manner.
 
 
1.2 Outline 
In this thesis, we propose Steiner network construction heuristics which consider 
delay upper bound and lower bound simultaneously for timing critical nets. We will show 
that sometimes a link insertion can simultaneously reduce long path delay and increase 
short path delay. One heuristic is a greedy link insertion in an existing tree or non-tree, 
which is similar to [3] but the solution search is trimmed for the double-sided timing 
constraints. The other is a dynamic programming based constructive algorithm which can 
generate a set of solutions with different slack-wirelength tradeoff and can reach either tree 
or non-tree topology.  
Compared to the traditional AHHK tree results, our extensive experimental results
show that this Steiner network construction usually improves slack by hundreds of
picoseconds. The non-tree approach may bring some wirelength and runtime overhead, but
the experimental results show that this overhead is relatively small. Moreover, because
the approach is applied to only a small number of timing-critical nets, the impact of the
overhead on the overall chip design is very limited. Besides this, we also perform Monte
Carlo simulations with process variations considered. The results show that our method can
greatly improve timing yield through both nominal slack improvement and delay variability
(standard deviation) reduction.
The rest of this thesis is organized as follows. Chapter II discusses a lemma on
link insertion and then gives our problem formulation. Chapter III presents our algorithm.
In Chapter IV, we show our experimental results. We then conclude in Chapter V.
CHAPTER II 
PRELIMINARY 
In this chapter, we will show that proper link insertion in an existing tree or nontree 
can reduce long path delay and increase short path delay simultaneously. That is, link 
insertion may reduce the difference of the maximum path delay and the minimum path 
delay. 
Consider inserting a link between two nodes $i$ and $j$ in an RC network (Fig. 1),
which can be either a tree or a nontree. Let the link resistance be $R$ and the link
capacitance be $C$. According to the $\pi$-model, this link insertion is equivalent to
adding capacitance $C/2$ at node $i$ and node $j$, respectively, and inserting resistance
$R$ between $i$ and $j$.
 
Fig. 1   Insert a link in an RC network.
 
 
The link capacitance always increases the delay, by $t_{i,lc} = \frac{C}{2}(R_{i,i} + R_{i,j})$
at node $i$ and by $t_{j,lc} = \frac{C}{2}(R_{i,j} + R_{j,j})$ at node $j$. Here $R_{i,i}$
($R_{j,j}$) is the path resistance from the source to node $i$ ($j$), and $R_{i,j}$ is the
transfer resistance, which equals the voltage at node $i$ when 1A current is injected into
node $j$ and all the other node capacitances are set to zero [4]. After the link insertion,
the delays to $i$ and $j$ are changed from $t_i$ and $t_j$ to $\hat{t}_i$ and $\hat{t}_j$
according to the following equations [7]:

$$\hat{t}_i = (1 - \alpha)(t_i + t_{i,lc}) + \alpha (t_j + t_{j,lc}),                   (1)$$
$$\hat{t}_j = (1 - \beta)(t_j + t_{j,lc}) + \beta (t_i + t_{i,lc}),                      (2)$$

where $\alpha = \frac{r_i}{R + r_i - r_j}$ and $\beta = \frac{r_j}{R + r_i - r_j}$. In
general, $r_i$ and $r_j$ are equal to the Elmore delay at $i$ and $j$, respectively, when
node capacitance $C_i = 1$, $C_j = -1$ and the other node capacitances are set to zero [4].
The above equations show that the link capacitance always increases signal delay
while the link resistance attempts to average the delay between $i$ and $j$. It is
straightforward to derive the following condition for the simultaneous improvement of both
long path and short path delay.
Lemma: If a link with resistance $R$ and capacitance $C$ is inserted between a node $i$
on a long path and a node $j$ on a short path in a Steiner network, the necessary and
sufficient condition for simultaneously reducing the delay to node $i$ and increasing the
delay to node $j$ is

$$t_i \ge \left(\frac{1}{\alpha} - 1\right) t_{i,lc} + t_j + t_{j,lc}.$$
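As a small numeric illustration of Eqs. (1) and (2) and of the lemma, the following sketch evaluates the updated delays for one hypothetical link. The function names and all numeric values are assumptions made for this example; in particular, the weights alpha and beta are simply taken as given inputs rather than derived from a real RC network.

```python
def delays_after_link(t_i, t_j, t_i_lc, t_j_lc, alpha, beta):
    """Apply Eqs. (1) and (2): delays at nodes i and j after the link is inserted."""
    new_t_i = (1 - alpha) * (t_i + t_i_lc) + alpha * (t_j + t_j_lc)
    new_t_j = (1 - beta) * (t_j + t_j_lc) + beta * (t_i + t_i_lc)
    return new_t_i, new_t_j


def lemma_condition(t_i, t_j, t_i_lc, t_j_lc, alpha):
    """Condition of the lemma: t_i >= (1/alpha - 1) * t_i_lc + t_j + t_j_lc."""
    return t_i >= (1 / alpha - 1) * t_i_lc + t_j + t_j_lc


# Hypothetical values in ps: node i is on a long path, node j on a short path.
t_i, t_j = 100.0, 20.0
t_i_lc, t_j_lc = 5.0, 5.0       # delay added by the link capacitance
alpha, beta = 0.5, 0.25         # assumed averaging weights

new_t_i, new_t_j = delays_after_link(t_i, t_j, t_i_lc, t_j_lc, alpha, beta)
condition = lemma_condition(t_i, t_j, t_i_lc, t_j_lc, alpha)
print(new_t_i, new_t_j, condition)
```

With these assumed numbers the condition holds, the long path delay drops and the short path delay grows, which is the situation the lemma describes.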
When considering double-sided timing constraints, each sink $v_i$ has a delay upper
bound $\bar{q}_i$ and a delay lower bound $\underline{q}_i$. The delay upper bound is the
same as the required arrival time (RAT) in traditional methods. We define the late slack of
a sink $v_i$ as $\bar{s}_i = \bar{q}_i - t_i$ where $t_i$ is the delay. Similarly, the
early slack of a sink $v_i$ is defined as $\underline{s}_i = t_i - \underline{q}_i$. The
slack of a sink $v_i$ is $s_i = \min(\bar{s}_i, \underline{s}_i)$. The late slack, early
slack and slack of a network (or subnetwork) are the minimum late slack, early slack and
slack among all sinks in the network, respectively. For a network (or subnetwork), the sink
having the minimum late (early) slack is called the late (early) critical sink. Here is our
problem formulation:
Timing Driven Steiner Network Construction:
Given a source node $v_0$, a set of sink nodes $\{v_1, v_2, ..., v_n\}$ with each sink
$v_i$ having load capacitance $c_i$, lower delay bound $\underline{q}_i$ and upper delay
bound $\bar{q}_i$, construct a rectilinear Steiner network spanning the source and the
sinks such that the slack of the network is maximized.
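The slack definitions above translate directly into a few lines of code. The sketch below is only an illustration: the function names, the dictionary layout and the sink delays and bounds are hypothetical and are not taken from the experiments.

```python
def sink_slacks(delay, q_lower, q_upper):
    """Return (late slack, early slack, slack) of one sink."""
    late = q_upper - delay       # late slack: upper bound minus delay
    early = delay - q_lower      # early slack: delay minus lower bound
    return late, early, min(late, early)


def network_slack(sinks):
    """sinks maps a sink name to (delay, q_lower, q_upper).

    Returns the network slack plus the late and early critical sinks.
    """
    slacks = {name: sink_slacks(*values) for name, values in sinks.items()}
    late_critical = min(slacks, key=lambda name: slacks[name][0])
    early_critical = min(slacks, key=lambda name: slacks[name][1])
    slack = min(s[2] for s in slacks.values())
    return slack, late_critical, early_critical


# Hypothetical sinks: (delay, lower bound, upper bound), all in ps.
sinks = {
    "v1": (120.0, 50.0, 150.0),
    "v2": (60.0, 55.0, 200.0),
    "v3": (180.0, 40.0, 190.0),
}
slack, late_critical, early_critical = network_slack(sinks)
print(slack, late_critical, early_critical)
```

Here the network slack is set by the early slack of v2, so a construction that only looked at delay upper bounds would miss the binding constraint.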
CHAPTER III 
ALGORITHM 
Before we dive into the details of the algorithm, let’s review our problem
formulation.
Given a source node $v_0$, a set of sink nodes $\{v_1, v_2, ..., v_n\}$ with each sink
$v_i$ having load capacitance $c_i$, lower delay bound $\underline{q}_i$ and upper delay
bound $\bar{q}_i$, construct a rectilinear Steiner network spanning the source and the
sinks such that the slack of the network is maximized.
The procedure of our algorithm, the constructive Steiner network heuristic, is as follows.
Step 1. Initialization: A set of subnetworks is initialized with the sink nodes. It is
the first candidate solution.
    $n$ = number of sinks
    $O_0$ = new empty solution
    for $i = 1$ to $n$
        $G_i$ = new empty subnetwork
        add sink node $v_i$ to $G_i$
        add $G_i$ to $O_0$
    add $O_0$ to solution set $O$
 
 
Step 2. Merging selection: In a candidate solution $O_k$, select two subnetworks to
merge.
We distinguish two scenarios: (1) If long path constraints and short path constraints
are almost equally tight, we first choose the subnetwork with the maximum
$(\bar{q} + \underline{q})/2 + t'$, and then merge it with its nearest neighboring
subnetwork. (2) If long path constraints dominate, we choose a pair of subnetworks whose
merging root is farthest from the source among all pairs.
Scenario (1):
    among all subnetworks $G$ in $O_k$
        choose $G_i$ with maximum $(\bar{q} + \underline{q})/2 + t'$
    among all other subnetworks $G$ in $O_k$
        choose $G_j$ with minimum $|x_j - x_i| + |y_j - y_i|$
    call Step 3 to merge $G_i$ and $G_j$
Scenario (2):
    $n_k$ = the number of subnetworks in $O_k$
    max = $-\infty$
    for $i = 1$ to $n_k$
        for $j = i + 1$ to $n_k$
            $(x_r, y_r)$ = (median($x_i, x_j, x_0$), median($y_i, y_j, y_0$))
            if $|x_r - x_0| + |y_r - y_0|$ > max then
                max = $|x_r - x_0| + |y_r - y_0|$; $m = i$; $n = j$
    call Step 3 to merge $G_m$ and $G_n$
 
Step 3. Merging: Merge these two subnetworks.
Two different methods are used to merge the two selected subnetworks $G_i$ and $G_j$ of
solution $O_k$. One is root-root merging. The other is shortest merging, where two nodes
from the two subnetworks with the minimum distance are connected directly. By doing this,
we get two candidate solutions.
Root-root merging:
    $O_m$ = new solution copied from $O_k$
    $G_m$ = new empty subnetwork in $O_m$
    node set of $G_m$ = {node set of $G_i$} ∪ {node set of $G_j$}
    edge set of $G_m$ = {edge set of $G_i$} ∪ {edge set of $G_j$}
    $v_m$ = new node with coordinate {median($x_i, x_j, x_0$), median($y_i, y_j, y_0$)}
    add $v_m$ to $G_m$ and it is the new root
    add edges {$v_i, v_m$} and {$v_j, v_m$} to $G_m$
    delete $G_i$ and $G_j$ in $O_m$
    call Step 4 to insert link in $G_m$

Shortest merging:
    $O_n$ = new solution copied from $O_k$
    $G_n$ = new empty subnetwork in $O_n$
    node set of $G_n$ = {node set of $G_i$} ∪ {node set of $G_j$}
    edge set of $G_n$ = {edge set of $G_i$} ∪ {edge set of $G_j$}
    min = $\infty$
    for every node $v_a$ in $G_i$
        for every node $v_b$ in $G_j$
            if $|x_b - x_a| + |y_b - y_a|$ < min then
                min = $|x_b - x_a| + |y_b - y_a|$; $p = a$; $q = b$
    $v_n$ = new node with coordinate {median($x_p, x_q, x_0$), median($y_p, y_q, y_0$)}
    add $v_n$ to $G_n$ and it is the new root
    add edges {$v_p, v_n$} and {$v_q, v_n$} to $G_n$
    delete $G_i$ and $G_j$ in $O_n$
    call Step 4 to insert link in $G_n$

    delete old solution $O_k$
 
Step 4. Link insertion: Insert a link into the resulting subnetwork $G$ of solution $O_k$.
For the two subnetworks obtained from the mergings, we insert a link in each of them.
Then we get two new candidate solutions.
    $O_m$ = $O_k$    // the following operations are taken in this new solution $O_m$
    $v_e$ is the early critical sink of $G$
    $v_l$ is the late critical sink of $G$
    use Dijkstra’s algorithm to find the shortest path $p_{e,l} \in G$ which connects $v_e$ and $v_l$
    for each node $v_i \in p_{e,l}$
        for each edge $e_j \in p_{e,l}$    // $e_j$: {$(x_j, y_j)$, $(x_k, y_k)$}
            tentatively insert a link between $v_i$ and
            {median($x_i, x_j, x_k$), median($y_i, y_j, y_k$)}
    among all tentatively inserted links
        finally insert the one with the maximum slack improvement
 
Step 5. Candidate solution pruning: For a new candidate solution, compare it with
previously generated candidate solutions for pruning.
    if candidate solution $O_i$ has exactly the same sink sets as $O_k$
        if for each corresponding subnetwork $j$, $C_{i,j} \le C_{k,j}$,
           $\underline{q}_{i,j} \le \underline{q}_{k,j}$ and $\bar{q}_{i,j} \ge \bar{q}_{k,j}$
            prune $O_k$
 
Step 6. Solutions at the source: Choose the best solution at the source.
    among all solutions $O_i$ in the solution set $O$
        choose the one with maximum slack, or the one with minimum capacitance
        without negative slack
 
Each step will be explained in detail later. 
3.1 Discussion on Topology 
The effect of link insertion depends on the initial tree topology. There is an area-radius
tradeoff among different tree topologies. The area refers to the total wirelength and the
radius is the maximum source-sink path length in a tree. The two extreme cases of this
tradeoff are: (1) chain-like topology (Fig. 2(a)), which has small area and large radius,
and is usually derived from minimum spanning tree algorithms; (2) star-like topology
(Fig. 2(b)), which has relatively large area and small radius, and can be obtained from the
shortest path tree or Rectilinear Steiner Arborescence (RSA) algorithms [1].
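The area-radius tradeoff can be made concrete with a tiny example. The sketch below computes the area (total rectilinear wirelength) and the radius (longest source-sink path length) of a tree stored as a child-to-parent map. The four-node layout and the helper names are hypothetical and chosen only to show the two extremes.

```python
def manhattan(a, b):
    """Rectilinear distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def area_and_radius(coords, parent, source):
    """coords: node -> (x, y); parent: node -> its parent (the source has no entry)."""
    area = sum(manhattan(coords[n], coords[p]) for n, p in parent.items())

    def path_length(node):
        length = 0
        while node != source:
            length += manhattan(coords[node], coords[parent[node]])
            node = parent[node]
        return length

    radius = max(path_length(n) for n in parent)
    return area, radius


# Hypothetical net: source s and three sinks on the corners of a square.
coords = {"s": (0, 0), "a": (10, 0), "b": (10, 10), "c": (0, 10)}
chain = {"a": "s", "b": "a", "c": "b"}   # chain-like: s -> a -> b -> c
star = {"a": "s", "b": "s", "c": "s"}    # star-like: every sink wired to s

chain_result = area_and_radius(coords, chain, "s")
star_result = area_and_radius(coords, star, "s")
print(chain_result, star_result)
```

In this toy layout the chain has the smaller area but its last sink sees a path of length 30, while the star spends more wire to cut the radius to 20.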
 
(a) Chain-like topology                 (b) Star-like topology 
Fig. 2   Chain-like topology and star-like topology.  
 
 
The major weakness of a tree with chain-like topology is that the delay of some
sinks may suffer from the long path length. For example, if $v_1$ in Fig. 2(a) is the late
critical sink with a tight delay upper bound, the long detour may cause a large delay
constraint violation. If we take non-tree topologies into consideration, we may reach
different conclusions. If a link (dashed line) is inserted in the chain-like topology as in
Fig. 2(a), the long detour problem is eliminated and the small wirelength is still enjoyed.
However, if the late critical sink is $v_2$ instead, perhaps the star-like topology in
Fig. 2(b) is still better. Thus, it is not clear which tree topology can facilitate a good
non-tree solution in general. Our constructive algorithm probes different topologies so
that the chance of capturing good non-tree solutions is increased.
3.2 Constructive Steiner Network Heuristic 
If we treat a network as a tree plus links, the problem of network construction can 
be accordingly decomposed into finding a proper tree topology and link insertions. We 
combine these two concerns into a dynamic programming based heuristic. This heuristic is 
a bottom-up merging procedure where multiple candidate solutions are generated to probe 
good topologies and link insertions. At the beginning, a set of subnetworks are initialized 
with the sink nodes. In each iteration, a pair of subnetworks is selected to be merged. 
Different merging solutions are generated. For each new subnetwork resulting from a 
merging, another candidate solution is generated by inserting a link in it. These candidate 
solutions are propagated toward the source. 
Solution characterization. A candidate solution $O_i$ is a set of subnetworks $G_{i,j}$.
It can be characterized by the total load capacitance $C_{i,j}$, delay lower bound
$\underline{q}_{i,j}$ and delay upper bound $\bar{q}_{i,j}$ at each root $v_j$. It is easy
to derive that the delay upper bound $\bar{q}_{i,j}$ is the same as the late slack of
$G_{i,j}$. Similarly, the delay lower bound $\underline{q}_{i,j}$ is equal to the negative
of the early slack.
Solution pruning. If there is another candidate solution $O_k$ whose subnetworks have
exactly the same sink sets as those of candidate solution $O_i$, the two solutions can be
compared for pruning. If each pair of corresponding subnetworks $j$ satisfies
$C_{i,j} \le C_{k,j}$, $\underline{q}_{i,j} \le \underline{q}_{k,j}$ and
$\bar{q}_{i,j} \ge \bar{q}_{k,j}$, solution $O_k$ is inferior and can be pruned.
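The pruning rule is a dominance test between two candidate solutions. A minimal sketch is given below, assuming each subnetwork is summarized by its sink set, root capacitance and the two delay bounds. The `Subnetwork` class, the `dominates` function and the numbers are hypothetical names and values introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subnetwork:
    sinks: frozenset    # sinks spanned by this subnetwork
    cap: float          # total load capacitance C seen at the root
    q_lower: float      # delay lower bound at the root
    q_upper: float      # delay upper bound at the root


def dominates(sol_i, sol_k):
    """True if sol_k is inferior to sol_i and can be pruned."""
    by_sinks_i = {g.sinks: g for g in sol_i}
    by_sinks_k = {g.sinks: g for g in sol_k}
    if by_sinks_i.keys() != by_sinks_k.keys():
        return False    # different sink sets: the two solutions are not comparable
    return all(
        by_sinks_i[s].cap <= by_sinks_k[s].cap
        and by_sinks_i[s].q_lower <= by_sinks_k[s].q_lower
        and by_sinks_i[s].q_upper >= by_sinks_k[s].q_upper
        for s in by_sinks_i
    )


# Hypothetical candidates spanning the same two sinks.
sol_i = [Subnetwork(frozenset({"v1", "v2"}), 10.0, 5.0, 100.0)]
sol_k = [Subnetwork(frozenset({"v1", "v2"}), 12.0, 6.0, 90.0)]
sol_other = [Subnetwork(frozenset({"v1", "v3"}), 20.0, 9.0, 50.0)]

print(dominates(sol_i, sol_k), dominates(sol_k, sol_i), dominates(sol_i, sol_other))
```

Keying the comparison on the sink sets mirrors the requirement that only solutions covering exactly the same sinks are compared.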
Merging selection. We propose two merging selection criteria for two different 
scenarios: (1) long path constraints and short path constraints are almost equally tight, and 
(2) long path constraints dominate. 
For the first scenario, we use a merging scheme similar to prescribed skew clock
tree routing [9]. In fact, when the delay upper bound of each sink is equal to its delay
lower bound, i.e., the delay constraints degenerate to a single value target, this problem
is equivalent to prescribed skew clock routing. In prescribed skew clock routing, the
subtree with the maximum delay target is merged first to reduce the chance of wire detour
[9]. Since we have a delay upper and lower bound instead of a single delay target, we use
the average $(\bar{q} + \underline{q})/2 + t'$ as the criterion. Here $t'$ is the
anticipated wire delay from the source node to the root of the subnetwork. This is to
encourage subnetworks with roots far away from the source to be merged early. In each
iteration, we first choose the subnetwork with the maximum
$(\bar{q} + \underline{q})/2 + t'$, and then merge it with its nearest neighboring
subnetwork.
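The first selection criterion can be sketched as follows. The anticipated wire delay t' is treated here as a precomputed input, since the text does not fix how it is estimated; the dictionary keys, the function name and the sample values are assumptions for illustration only.

```python
def manhattan(a, b):
    """Rectilinear distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def select_pair_balanced(subnets):
    """Pick the subnetwork with the largest (q_upper + q_lower)/2 + t_wire,
    then its nearest neighbor by rectilinear distance between roots.

    subnets: list of dicts with keys "root", "q_lower", "q_upper", "t_wire".
    Returns the two list indices (first, nearest).
    """
    def criterion(g):
        return (g["q_upper"] + g["q_lower"]) / 2 + g["t_wire"]

    first = max(range(len(subnets)), key=lambda i: criterion(subnets[i]))
    others = [i for i in range(len(subnets)) if i != first]
    nearest = min(
        others, key=lambda i: manhattan(subnets[i]["root"], subnets[first]["root"])
    )
    return first, nearest


# Hypothetical subnetworks; delays in ps, t_wire stands for the anticipated t'.
subnets = [
    {"root": (10, 10), "q_lower": 20.0, "q_upper": 100.0, "t_wire": 30.0},
    {"root": (2, 3), "q_lower": 10.0, "q_upper": 80.0, "t_wire": 5.0},
    {"root": (8, 12), "q_lower": 30.0, "q_upper": 90.0, "t_wire": 28.0},
]
pair = select_pair_balanced(subnets)
print(pair)
```

In this example the subnetwork rooted at (10, 10) has the largest criterion value and is paired with the one rooted at (8, 12), which is only 4 units away.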
The second scenario is more like traditional signal routing [1]. Therefore, we adopt
a merging criterion similar to that of Rectilinear Steiner Arborescence (RSA) [10]. That
is, we choose a pair of subnetworks whose merging root is farthest from the source among
all pairs. If we consider merging subnetworks rooted at $(x_i, y_i)$ and $(x_j, y_j)$, then
the merging root is at $(x_m, y_m)$ with $x_m = \mathrm{median}(x_i, x_j, x_0)$ and
$y_m = \mathrm{median}(y_i, y_j, y_0)$, where $(x_0, y_0)$ is the location of the source
node. Then, the pair with the maximum value $|x_m - x_0| + |y_m - y_0|$ is selected for
merging. Our method is different from the well-known RSA algorithm [10], which restricts
all sinks to one quadrant if the source is at (0,0). Our merging selection can handle
cases where sinks are distributed in multiple quadrants.
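The second criterion only needs coordinate medians. The sketch below enumerates all pairs of roots, computes the merging root of each pair and keeps the pair whose merging root is farthest from the source. The function names and the sample coordinates, which place roots in more than one quadrant, are hypothetical.

```python
from itertools import combinations
from statistics import median


def merging_root(root_i, root_j, source):
    """Merging root: coordinate-wise median of the two roots and the source."""
    return (
        median([root_i[0], root_j[0], source[0]]),
        median([root_i[1], root_j[1], source[1]]),
    )


def select_pair_farthest(roots, source):
    """Return ((i, j), distance) for the pair whose merging root is farthest
    from the source in rectilinear distance."""
    best_pair, best_dist = None, -1
    for i, j in combinations(range(len(roots)), 2):
        x_m, y_m = merging_root(roots[i], roots[j], source)
        dist = abs(x_m - source[0]) + abs(y_m - source[1])
        if dist > best_dist:
            best_pair, best_dist = (i, j), dist
    return best_pair, best_dist


# Hypothetical subnetwork roots around a source at the origin.
source = (0, 0)
roots = [(4, 6), (5, 3), (-2, 5), (1, 1)]
pair, dist = select_pair_farthest(roots, source)
print(pair, dist, merging_root(roots[0], roots[1], source))
```

Roots on opposite sides of the source, such as (4, 6) and (-2, 5), get a merging root whose x coordinate collapses to that of the source, so such pairs naturally rank lower and are merged later.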
Merging. After a pair of subnetworks is selected, we consider two types of merging
between them. One is the root-root merging as in Fig. 3(a), where subnetworks $G_5$ and
$G_6$ are merged at node $v_7$. The other is the shortest merging, where two nodes from the
two subnetworks with the minimum distance are connected directly. After the merging, the
node closest to the source is selected as the root of the merged network. For example, in
Fig. 3(b), the merging between $G_5$ and $G_6$ is obtained by connecting $v_2$ and $v_3$,
where rerooting occurs. Then $v_5$ is chosen as the root.

                (a) Root-root merging                         (b) Shortest merging
Fig. 3  Root-root merging and shortest merging.
 
 
The root-root merging is very similar to the RSA heuristic [10], which leads to a
star-like topology. The shortest merging is more likely to result in a chain-like topology.
By having these two different types of merging, various topologies can be generated to
compete for the best slack solution.
Link insertion. For the two subnetworks obtained from merging, we insert a link in
each of them such that the slack is maximized considering the double-sided timing
constraints.
For the given subnetwork $G$, which can be either a tree or a non-tree, we first identify
its early critical sink $v_e$ and late critical sink $v_l$ (both defined in Chapter II).
Next we find the shortest path $p_{e,l} \in G$ which connects the two critical sinks. For
each node $v_i \in p_{e,l}$, we tentatively insert a link between $v_i$ and each edge
$e_j \in p_{e,l}$ with the shortest connection. If node $v_i$ is at coordinate
$(x_i, y_i)$, and the two ending nodes of $e_j$ are at $(x_j, y_j)$ and $(x_k, y_k)$,
respectively, the link is inserted between node $v_i$ and location $(x_c, y_c)$ where
$x_c = \mathrm{median}(x_i, x_j, x_k)$ and $y_c = \mathrm{median}(y_i, y_j, y_k)$. For each
link insertion result, we evaluate the slack $S$ of the network. Among all tentatively
inserted links, we finally insert the one that gives the maximum slack improvement.
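The median rule for the link endpoint can be checked with a short example. The sketch assumes the edge is a horizontal or vertical segment, in which case the coordinate-wise median is the point of the edge closest to the node; the function name and coordinates are hypothetical.

```python
from statistics import median


def link_endpoint(node, edge):
    """Connection point (x_c, y_c) on an edge for a link starting at node.

    node: (x_i, y_i); edge: ((x_j, y_j), (x_k, y_k)), assumed axis-aligned.
    """
    (x_j, y_j), (x_k, y_k) = edge
    x_i, y_i = node
    return median([x_i, x_j, x_k]), median([y_i, y_j, y_k])


def link_length(node, edge):
    """Rectilinear length of the tentative link."""
    x_c, y_c = link_endpoint(node, edge)
    return abs(node[0] - x_c) + abs(node[1] - y_c)


# Hypothetical horizontal edge from (0, 2) to (10, 2).
edge = ((0, 2), (10, 2))
above = link_endpoint((3, 7), edge)      # node above the edge: drops straight down
beyond = link_endpoint((12, 5), edge)    # node past the right end: snaps to the end
print(above, beyond, link_length((3, 7), edge), link_length((12, 5), edge))
```

A node whose projection falls inside the edge connects perpendicularly, while a node outside the span connects to the nearer end point.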
Solutions at the source. At the source, there are a set of solutions with different 
capacitance and slack trade-off. We can choose either the maximum slack solution or the 
minimum capacitance solution without negative slack. 
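The final choice at the source can be sketched as a simple selection over (slack, capacitance) pairs. The function name, the mode strings and the candidate values below are hypothetical and only illustrate the two selection rules.

```python
def choose_at_source(solutions, mode="max_slack"):
    """solutions: list of (slack, capacitance) pairs kept at the source.

    mode "max_slack": return the solution with the largest slack.
    mode "min_cap":   return the smallest-capacitance solution with slack >= 0,
                      or None if every solution has negative slack.
    """
    if mode == "max_slack":
        return max(solutions, key=lambda s: s[0])
    feasible = [s for s in solutions if s[0] >= 0]
    if not feasible:
        return None
    return min(feasible, key=lambda s: s[1])


# Hypothetical candidates: (slack in ps, capacitance in pF).
solutions = [(215.0, 3.2), (40.0, 2.5), (-10.0, 2.0)]
best_slack = choose_at_source(solutions, "max_slack")
cheapest_feasible = choose_at_source(solutions, "min_cap")
print(best_slack, cheapest_feasible)
```

The cheapest candidate overall is rejected in the second mode because its slack is negative.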
CHAPTER IV 
EXPERIMENTAL RESULTS 
All algorithms are implemented in C++ and the experiments are performed on a PC
with a 3.2GHz processor and 1GB of memory. We generated different testcases with
the number of sinks ranging from 5 to 25. Without loss of generality, we let the source be
at coordinates (0,0). In some cases, all of the sinks are in one quadrant, while other
cases have sinks distributed in two or four quadrants. For example, in the data tables, the
notation “15s, 2Q” means there are 15 sinks and they are distributed in two quadrants. The
70nm technology parameters reported in [12] are employed. We compare the following methods
in the experiments:
AHHK. This is a Steiner tree heuristic [1] which can achieve different area-radius
tradeoffs by varying a parameter $\alpha \in [0,1]$. When the value of $\alpha$ is shifted
from 0 to 1, the resulting tree gradually changes from chain-like to star-like topology
[1]. Although it is not directly timing driven, we can achieve very good timing performance
by trying different $\alpha$ and choosing the result with the best slack. We tested AHHK
trees with $\alpha$ = 0, 0.5, 1 in the experiments.
AHHK+detour. If there is a short path violation, the edge incident to the early
critical sink is elongated to increase the delay until the early slack is close to the late
slack, so that the overall slack is maximized.
AHHK+link. Links are inserted greedily, similar to the link insertion in Step 4 of
our algorithm. The only difference is that here we keep inserting links until there is no
further timing improvement. This method is similar to [3].
Steiner network. The dynamic programming based Steiner network construction
proposed in Chapter III.
4.1 Cases with Single Critical Sink 
For the testcases, we generated 15 nets with 5, 10, 15, 20, 25 sinks and sinks in 1 
quadrant, 2 quadrants and 4 quadrants, respectively. Each net has a single critical sink 
which is often on the long path. Therefore, wire detour is rarely necessary here.  
 
 
TABLE I
CASES WITH 1 CRITICAL SINK, COMPARISON BETWEEN AHHK AND AHHK+LINK

                     AHHK                  AHHK+link
Case      α      S (ps)  W (μm)    S (ps)  W (μm)   #L    2-C
15s,1Q    0        -494    8751      -196   10700    1    44%
15s,2Q    0.5       -56    9055        -4   11004    1    42%
15s,4Q    0        -709   15333      -455   17799    1    57%
20s,1Q    0        -704    9887      -248   11963    1    41%
20s,2Q    0       -2137   12453     -1377   14415    1    35%
20s,4Q    0.5      -243   17519        77   19491    1    24%
25s,1Q    0        -746    9596      -442   11290    1    39%
25s,2Q    1         -49   18183        -1   20411    1    33%
25s,4Q    0       -3106   19954     -1749   21612    1    20%
Average            -916   13415      -488   15409    1    37%

Notes:  Comparison on slack S (ps), total wirelength W (μm), the number of inserted links
#L and percentage of 2-connected wires 2-C.
 
In Table I above, we compare AHHK and AHHK+link on 9 cases among the 15 nets 
where links are indeed inserted. The average results in the last row show that link insertion 
can improve slack by about 428ps with about 15% increase on wirelength. The link 
insertion can also achieve about 37% 2-connected wires, which means about 37% of the 
wires are tolerant to open faults. 
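The averages quoted above can be re-derived from the per-case rows of Table I. The short check below uses only the slack and wirelength columns of that table; the variable and function names are ours.

```python
# Slack (ps) and wirelength (um) columns of Table I, in row order.
ahhk_slack = [-494, -56, -709, -704, -2137, -243, -746, -49, -3106]
link_slack = [-196, -4, -455, -248, -1377, 77, -442, -1, -1749]
ahhk_wl = [8751, 9055, 15333, 9887, 12453, 17519, 9596, 18183, 19954]
link_wl = [10700, 11004, 17799, 11963, 14415, 19491, 11290, 20411, 21612]


def average(values):
    return sum(values) / len(values)


avg_ahhk_slack = round(average(ahhk_slack))    # average slack of AHHK
avg_link_slack = round(average(link_slack))    # average slack of AHHK+link
slack_gain = avg_link_slack - avg_ahhk_slack   # slack improvement in ps
wl_increase = average(link_wl) / average(ahhk_wl) - 1

print(avg_ahhk_slack, avg_link_slack, slack_gain, round(100 * wl_increase))
```

The rounded averages match the last row of the table, and their difference and the wirelength ratio give the quoted improvement of about 428ps at roughly 15% more wire.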
 
 
 
 
TABLE II
CASES WITH 1 CRITICAL SINK, COMPARISON BETWEEN AHHK+LINK AND STEINER NETWORK

                          AHHK+link                           Steiner network
Case     α    S (ps)  W (μm)   #L   2-C  CPU (s)    S (ps)  W (μm)   #L    2-C  CPU (s)
5s,1Q    0        -3    5122    0     0     0.01       120    7544    1    58%     0.01
5s,2Q    1       -15    7442    0     0     0.01        82    8641    1    30%     0.01
5s,4Q    0        14    9804    0     0     0.01       123   11184    1    25%     0.02
10s,1Q   1        26    7409    0     0     0.01       124    7743    1    15%     0.05
10s,2Q   1        54   12831    0     0     0.01       188   14763    1    42%     0.09
10s,4Q   0.5     112   10120    0     0     0.01       199   13468    3    53%     0.49
15s,1Q   1       102    9566    0     0     0.01       320    8892    0      0     0.25
15s,2Q   1       -13   12135    0     0     0.01       283   13195    1    20%     0.38
15s,4Q   1       -12   18345    0     0     0.01       130   18836    1    22%     0.72
20s,1Q   1         7   12372    0     0     0.02       177   12429    1    19%     2.28
20s,2Q   0.5      -3   14214    0     0     0.02       177   17355    2    39%     4.95
20s,4Q   0.5      77   19491    1   24%     0.02       242   18639    1    19%     0.53
25s,1Q   1        -5   13869    0     0     0.02       534   14493    2    34%     1.41
25s,2Q   1        -1   20411    1   33%     0.02       308   17019    1    18%    10.48
25s,4Q   1       -18   23144    0     0     0.02       211   24128    1    17%     0.66
Average           21   13085       3.8%     0.01       215   13889       27.4%     1.49

Notes:  Comparison on slack S (ps), total wirelength W (μm), the number of inserted links
#L, percentage of 2-connected wires 2-C and running time CPU (s).
 
In Table II, we compare our constructive Steiner network heuristic and AHHK+link
on all 15 nets. For AHHK+link, we pick the results of the α with the best timing slack.
Among the multiple solutions generated by the constructive heuristic, we report the
solution with the best slack and largest wirelength. According to the last row of Table II,
the constructive heuristic improves the slack from 21ps to 215ps on average. The wirelength
increase of our Steiner network heuristic is only 6% over the AHHK+link results. The
following figures (Fig. 4, Fig. 5 and Fig. 6) show these results.
 
 
Fig.4   Slack (AHHK+link vs Steiner network). 
 
Notes:  With our constructive Steiner network heuristic, we can get better slack result. 
 
 
 
Fig.5   Wirelength (AHHK+link vs Steiner network). 
 
Notes:  Comparing with AHHK+link, our Steiner network heuristic has only a little 
wirelength increase. 
 
 
Fig.6   2-connected wire (AHHK+link vs Steiner network). 
 
Notes:  In some of our testcases, no links were inserted for AHHK, so there are no
2-connected wires. Our constructive Steiner network heuristic, in contrast, produces more
2-connected wires, which makes the network more tolerant to open faults.
 
The dynamic programming based Steiner network construction can generate a set of 
solutions with different slack-wirelength tradeoff.  
We also run Monte Carlo simulations (5000 runs for each result) to observe the 
behavior of these algorithms under process variations. We consider variations in wire 
width, sink capacitance and driver resistance, each assumed to follow a Gaussian 
distribution with a standard deviation equal to 5% of the nominal value.  
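
The following is a minimal sketch of this kind of experiment, not the setup used in the thesis. It assumes a single driver, one wire and one sink evaluated with the Elmore delay, whereas the thesis evaluates complete routed nets. All nominal values, the required time and the names `slack_ps` and `monte_carlo` are hypothetical. It reports the mean slack, the standard deviation of slack and the timing yield (the fraction of runs with non-negative slack).

```python
import random
import statistics

# Hypothetical nominal values (not from the thesis): ohms and fF.
R_DRIVER = 100.0      # driver resistance
R_WIRE = 500.0        # total wire resistance at nominal width
C_WIRE = 1000.0       # total wire capacitance at nominal width
C_SINK = 20.0         # sink load capacitance
REQUIRED_PS = 400.0   # required arrival time at the sink
SIGMA = 0.05          # standard deviation as a fraction of nominal


def slack_ps(width_scale, r_driver, c_sink):
    """Elmore slack of a driver + single wire + sink, in ps."""
    r_wire = R_WIRE / width_scale   # wider wire: less resistance
    c_wire = C_WIRE * width_scale   # wider wire: more capacitance
    delay_fs = r_driver * (c_wire + c_sink) + r_wire * (c_wire / 2 + c_sink)
    return REQUIRED_PS - delay_fs / 1000.0  # ohm * fF = fs


def monte_carlo(runs=5000, seed=0):
    """Sample the three varying parameters and collect slack statistics."""
    rng = random.Random(seed)
    slacks = []
    for _ in range(runs):
        slacks.append(slack_ps(
            rng.gauss(1.0, SIGMA),                   # wire width scale
            rng.gauss(R_DRIVER, SIGMA * R_DRIVER),   # driver resistance
            rng.gauss(C_SINK, SIGMA * C_SINK),       # sink capacitance
        ))
    mean = statistics.mean(slacks)
    std = statistics.pstdev(slacks)
    timing_yield = sum(s >= 0 for s in slacks) / runs
    return mean, std, timing_yield


mean_slack, std_slack, timing_yield = monte_carlo()
```

With these made-up nominal values the deterministic slack is 38 ps, so the sampled mean should stay close to that value.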
 
 
 
 
 
 
 
 
TABLE III 
MONTE CARLO RESULTS CORRESPONDING TO TABLE I 
 
                  AHHK                    AHHK+link
Case     α        μ_s     σ_s    Y        μ_s     σ_s    Y
15s,1Q   0        -497    34     0%       -197    27     0%
15s,2Q   0.5      -57     27     2%       -4      27     44%
15s,4Q   0        -711    50     0%       -456    45     0%
20s,1Q   0        -707    35     0%       -249    29     0%
20s,2Q   0        -2144   69     0%       -1382   58     0%
20s,4Q   0.5      -245    42     0%       77      42     97%
25s,1Q   0        -751    40     0%       -447    41     0%
25s,2Q   1        -51     43     12%      -1      44     49%
25s,4Q   0        -3117   91     0%       -1754   69     0%
Average           -920    47.9   1.6%     -490    42.4   21.1%
 
Notes:  Mean slack μ_s (ps), standard deviation of slack σ_s (ps) and timing yield Y (the 
probability of non-negative slack). 
 
 
Table III above compares AHHK trees with the AHHK+link results. 
Compared with the deterministic results in Table I, the mean slack values μ_s are about 
the same. On average, AHHK+link reduces the standard deviation σ_s of the slack by 
about 10% and increases the timing yield from 1.6% to 21.1%.  
The data in Table IV indicate that our constructive method reduces the standard 
deviation by about a further 10% (Fig.7) and improves the timing yield from about 
61% to 100% compared to AHHK+link. 
 
 
 
 
 
 
 
 
TABLE IV 
MONTE CARLO RESULTS CORRESPONDING TO TABLE II 
 
                  AHHK+link               Steiner network
Case     α        μ_s     σ_s    Y        μ_s     σ_s    Y
5s,1Q    0        -4      20     43%      119     17     100%
5s,2Q    1        -15     23     25%      82      19     100%
5s,4Q    0        14      27     70%      123     26     100%
10s,1Q   1        24      23     86%      125     22     100%
10s,2Q   1        52      41     89%      189     34     100%
10s,4Q   0.5      111     26     100%     199     29     100%
15s,1Q   1        102     26     100%     318     22     100%
15s,2Q   1        -14     36     35%      283     28     100%
15s,4Q   1        -12     44     39%      130     40     100%
20s,1Q   1        7       29     60%      176     27     100%
20s,2Q   0.5      -5      37     44%      176     39     100%
20s,4Q   0.5      77      42     97%      241     39     100%
25s,1Q   1        -8      38     42%      534     32     100%
25s,2Q   1        -1      44     49%      308     36     100%
25s,4Q   1        -20     54     35%      210     50     100%
Average           21      34.0   60.9%    214     30.7   100%
 
Notes:  Mean slack μ_s (ps), standard deviation of slack σ_s (ps) and timing yield Y. 
 
 
Fig.7   Monte Carlo: standard deviation of slack (AHHK+link vs Steiner network). 
 
Notes:  Compared with the AHHK+link results, our Steiner network heuristic reduces 
the standard deviation of slack. 
 
 
4.2 Cases with Multiple Critical Sinks 
We also tested the algorithms on cases with multiple critical sinks, that is, nets in 
which several sinks have similar timing criticality. To show the effect on fixing short 
path delay constraint violations, these testcases usually have tighter constraints on the 
short path than on the long path.  
The wire detour method can increase the delay to an early critical sink, but at the 
cost of increasing the long path delay, whereas our approach can increase the short path 
delay and reduce the long path delay simultaneously. Moreover, unlike a non-tree 
topology, wire detour provides no tolerance to open faults.  
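
The contrast between a detour and an inserted link can be illustrated with a small numerical sketch. It assumes an Elmore-style (first moment) delay, which for a general RC network with conductance matrix G and node capacitances c is the solution d of G d = c, and it assumes a π-model for each wire. The unit resistance and capacitance, the driver and sink values, the three-node net and all function names are hypothetical; the delay computation in the thesis may differ.

```python
R_PER_UM = 0.1    # hypothetical wire resistance, ohm/um
C_PER_UM = 0.2    # hypothetical wire capacitance, fF/um
R_DRIVER = 100.0  # hypothetical driver resistance, ohm
C_SINK = 20.0     # hypothetical sink capacitance, fF


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def elmore_delays_ps(num_nodes, wires, sinks):
    """Elmore delay (ps) at every node; node 0 is driven through R_DRIVER.

    wires is a list of (u, v, length_um); loops are allowed.
    """
    g = [[0.0] * num_nodes for _ in range(num_nodes)]
    c = [0.0] * num_nodes
    g[0][0] += 1.0 / R_DRIVER
    for u, v, length in wires:
        cond = 1.0 / (R_PER_UM * length)
        g[u][u] += cond
        g[v][v] += cond
        g[u][v] -= cond
        g[v][u] -= cond
        # pi-model: half of the wire capacitance at each end
        c[u] += C_PER_UM * length / 2
        c[v] += C_PER_UM * length / 2
    for s in sinks:
        c[s] += C_SINK
    return [d / 1000.0 for d in solve(g, c)]  # ohm * fF = fs


# Node 1 is the near (early) sink, node 2 the far (late) sink.
tree = elmore_delays_ps(3, [(0, 1, 1000), (0, 2, 6000)], sinks=[1, 2])
detour = elmore_delays_ps(3, [(0, 1, 3000), (0, 2, 6000)], sinks=[1, 2])
link = elmore_delays_ps(3, [(0, 1, 1000), (0, 2, 6000), (1, 2, 2000)], sinks=[1, 2])
```

With these made-up values, the detour and the link both add 2000 μm of wire. The detour raises the delay at the near sink from 156 ps to 280 ps but also raises the far sink from 516 ps to 556 ps. The link raises the near sink to about 267 ps while lowering the far sink to about 369 ps.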
The average results in Table V show that our Steiner network heuristic can improve 
the slack by about 80ps on average when compared to performing wire detour on existing 
trees. The wirelength increase due to our method is about 4% with respect to the wire 
detour results. 
TABLE V 
CASES WITH MULTIPLE CRITICAL SINKS 
 
AHHK      AHHK+detour      AHHK+link      Steiner network 
S   W     S   W            S   W          S   W
-74 14 3 2 7 7 2 1 44 18 1667 43 1564 97 1729
 
Notes:  Comparison on average results (10 nets with 5-25 sinks) of slack S (ps), total 
wirelength W (μm). 
 
 
CHAPTER V 
CONCLUSIONS AND FUTURE WORK 
This work investigates timing driven routing using non-tree topologies. We 
propose a constructive Steiner network heuristic for signal net routing that can greatly 
improve timing performance. The heuristic considers double-sided timing constraints, 
adopts dynamic programming to construct the non-tree solution, and uses greedy link 
insertion to insert links in the subnetworks. It also handles cases whose sinks are 
distributed over multiple quadrants. Experimental results show that this is a very 
promising approach even when both long path and short path constraints are considered. 
In future work, non-tree routing methods based on more accurate delay models and 
buffered non-tree routing can be studied. 
 
 
 
 
 
 
REFERENCES 
[1] A. B. Kahng and G. Robins. On Optimal Interconnections for VLSI. Kluwer Academic 
Publishers, Boston, MA, 1995. 
 
[2] A. B. Kahng, B. Liu, and I. I. Mandoiu. “Non-tree routing for reliability and yield 
improvement”. International Conference on Computer Aided Design, pp. 260-266, 2002. 
 
[3] B. A. McCoy and G. Robins. “Non-tree routing”. Design Automation and Test in 
Europe, pp. 430-434, 1994. 
 
[4] T. Xue and E. S. Kuh. “Post routing performance optimization via tapered link insertion 
and wiresizing”. Design Automation and Test in Europe, pp. 74-79, 1995. 
 
[5] J. Lillis, C. K. Cheng, T. T. Lin, and C. Y. Ho. “New performance driven routing 
techniques with explicit area/delay tradeoff and simultaneous wire sizing”. Design 
Automation Conference, pp. 395-400, 1996. 
 
[6] W. Chuang, S. S. Sapatnekar, and I. N. Hajj. “Delay and area optimization for discrete 
gate sizes under double-sided timing constraints”. Custom Integrated Circuits Conference, 
pp. 9.4.1-9.4.4, 1993.  
 
[7] P. K. Chan and K. Karplus. “Computing signal delay in general RC networks by 
tree/link partitioning”. Transactions on Computer-Aided Design of Integrated Circuits and 
Systems, vol. 9, pp. 898-902, August 1990. 
 
[8] D. Lam, C.-K. Koh, Y. Chen, J. Jain, and V. Balakrishnan. “Statistical based link 
insertion for robust clock network design”. International Conference on Computer Aided 
Design, pp. 588-591, 2005. 
 
[9] R. Chaturvedi and J. Hu. “An efficient merging scheme for prescribed skew clock 
routing”. Transactions on Very Large Scale Integration Systems, vol. 13, pp. 750-754, June 
2005. 
 
[10] S. K. Rao, P. Sadayappan, F. K. Hwang, and P. W. Shor. “The rectilinear Steiner 
arborescence problem”. Algorithmica, vol. 7, pp. 277-288, 1992. 
 
[11] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University 
Press, Baltimore, MD, 1996. 
 
[12] A. B. Kahng and B. Liu. “Q-Tree: a new iterative improvement approach for buffered 
interconnect optimization”. IEEE Computer Society Annual Symposium on VLSI, 
pp. 183-188, 2003. 
 
 
VITA 
Name:   Qiuyang Li 
Address:  Department of Electrical and Computer Engineering  
C/O  Dr. Jiang Hu 
Texas A&M University  
College Station, TX 77843-3259 
Email Address:   qiuyang@tamu.edu, qiuyang.li@gmail.com 
Education:   B.S., Computer Science, Nankai University, 1999 
                          M.S., Computer Science, Nankai University, 2002 
M.S., Computer Engineering, Texas A&M University, 2006 
 
 
