Interconnect tree optimization algorithm in nanometer very large scale integration designs by Eh Kan, Chessda Uttraphan
  
INTERCONNECT TREE OPTIMIZATION ALGORITHM IN NANOMETER 
VERY LARGE SCALE INTEGRATION DESIGNS 
 
 
 
 
 
 
 
 
CHESSDA UTTRAPHAN EH KAN 
 
 
 
 
 
 
 
 
 
 
UNIVERSITI TEKNOLOGI MALAYSIA 
 
 
 
A thesis submitted in fulfilment of the 
requirement for the award of the degree of 
Doctor of Philosophy (Electrical Engineering) 
 
 
 
 
 
 
Faculty of Electrical Engineering 
Universiti Teknologi Malaysia 
 
 
 
 
 
 
MARCH 2016 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
To Nukkul, Sukkrith, Su Naas, Father and Mother 
 
ACKNOWLEDGEMENT 
 
I would like to express my deepest appreciation to all those who made it possible 
for me to complete this thesis. I owe special gratitude to my supervisor, Dr. 
Nasir Shaikh Husin, whose stimulating suggestions and encouragement helped me to 
coordinate my research, especially in writing this thesis. I am also very thankful to 
Prof. Dr. Mohamed Khalil Mohd. Hani, who gave countless ideas and support in 
completing this thesis.  
Furthermore, I would like to acknowledge with much appreciation the crucial 
role of the staff members of the VLSI research lab, especially En. Zulkifli Md. Yusof, 
who contributed many ideas during discussions of this work. Many thanks to 
Dr. Usman Ullah Sheikh for helping me debug the code. I am also indebted to 
Universiti Tun Hussein Onn (UTHM) and the Ministry of Education, Malaysia, for 
funding my Ph.D. study. 
 My thanks, love and appreciation also go to my family. My wife, Su Naas, 
understands and has always provided care and support throughout this challenging 
work. To my sons, Nukkul and Sukkrith, who suffer from autism, I sincerely apologize 
if during this work I sometimes spent less time with both of you.  
 
  
 
ABSTRACT 
 
 This thesis proposes a graph-based maze routing and buffer insertion algorithm 
for nanometer Very Large Scale Integration (VLSI) layout designs. The algorithm is 
called Hybrid Routing Tree and Buffer insertion with Look-Ahead (HRTB-LA). In 
recent VLSI designs, interconnect delay has become the dominant factor compared to 
gate delay. The well-known technique to minimize interconnect delay is to insert 
buffers along the interconnect wires. In conventional buffer insertion algorithms, the 
buffers are inserted on fixed routing paths. However, modern designs contain macro 
blocks that prohibit buffer insertion within their area, and most conventional buffer 
insertion algorithms do not consider these obstacles. In the presence of buffer 
obstacles, a post-routing algorithm may produce a poor solution. On the other hand, a 
simultaneous routing and buffer insertion algorithm offers a better solution, but the 
problem has been proven to be NP-complete. Besides timing performance, the power 
dissipation of the inserted buffers is another metric that needs to be optimized. 
Research has shown that the power dissipation overhead due to buffer insertion is 
significantly high. In other words, interconnect delay and power dissipation move in 
opposite directions. Although many methodologies to optimize timing performance 
under a power constraint have been proposed, none is based on the grid graph 
technique. Hence, the main contribution of this thesis is an efficient algorithm that 
uses a hybrid approach for multi-constraint optimization in multi-terminal nets. The 
algorithm uses dynamic programming to compute the interconnect delay and the power 
dissipation of the inserted buffers incrementally, while an effective runtime is achieved 
with the aid of novel look-ahead and graph pruning schemes. Experimental results 
show that HRTB-LA is able to handle multi-constraint optimization and produces 
solutions up to 47% better than those of a post-routing buffer insertion algorithm in 
comparable runtime. 
 
ABSTRAK 
 
 This thesis proposes a graph-based maze routing and buffer insertion algorithm 
for Very Large Scale Integration (VLSI) layout designs. The algorithm is called 
Hybrid Routing Tree and Buffer insertion with Look-Ahead (HRTB-LA). In recent 
VLSI designs, interconnect delay has become more dominant than gate delay. The 
well-known technique to minimize interconnect delay is to insert buffers along the 
interconnect wires. In conventional buffer insertion algorithms, buffers are inserted 
on fixed interconnect paths. However, modern designs contain macro blocks that 
prohibit any buffer insertion within their respective areas. Most conventional buffer 
insertion algorithms do not take these obstacles into account. In the presence of 
buffer obstacles, such post-routing buffer insertion algorithms are likely to produce 
poor solutions. In contrast, a simultaneous routing and buffer insertion algorithm can 
produce a better solution, but the problem has been proven to be NP-complete. 
Besides timing performance, the power dissipated by the buffers is another metric 
that needs to be optimized. Studies have shown that the power dissipation overhead 
of optimal buffer insertion is quite high. In other words, interconnect delay and 
power dissipation move in opposite directions. Although many methodologies to 
optimize timing performance under a power constraint have been proposed, none is 
based on the grid graph technique. Therefore, the main contribution of this thesis is 
an algorithm that uses a hybrid approach for multi-constraint optimization in 
multi-terminal nets. The algorithm uses dynamic programming to compute the 
interconnect delay and the buffer power dissipation incrementally, while an effective 
runtime is achieved with the aid of look-ahead and graph pruning schemes. 
Experimental results show that HRTB-LA is able to handle multi-constraint 
optimization and produces solutions up to 47% better than those of a post-routing 
buffer insertion algorithm in comparable runtime.  
 
TABLE OF CONTENTS 

CHAPTER TITLE 

 DECLARATION 
 DEDICATION 
 ACKNOWLEDGEMENTS 
 ABSTRACT 
 ABSTRAK 
 TABLE OF CONTENTS 
 LIST OF TABLES 
 LIST OF FIGURES 
 LIST OF ABBREVIATIONS 
 LIST OF APPENDICES 

1 INTRODUCTION 
 1.1 Overview 
 1.2 Problem statement 
 1.3 Research objectives 
 1.4 Problem formulation 
 1.5 Scope of works 
 1.6 Research contributions 
 1.7 Thesis outline 

2 LITERATURE REVIEW 
 2.1 Interconnect routing 
  2.1.1 Two-terminal nets routing 
  2.1.2 Multi-terminal nets routing and topology optimization 
 2.2 Post routing optimization with buffer insertion 
  2.2.1 Closed-form solution 
  2.2.2 Dynamic programming 
 2.3 Simultaneous routing and buffer insertion 
  2.3.1 Two-terminal net 
  2.3.2 Multi-terminal net 
 2.4 Tree adjustment technique 
 2.5 Multi-constraint optimization techniques 
 2.6 Other delay models 
 2.7 Summary 

3 FUNDAMENTAL THEORY AND MODELLING 
 3.1 Algorithm and complexity analysis 
 3.2 Dijkstra’s shortest path algorithm 
  3.2.1 Implementation of Dijkstra’s algorithm using priority queue 
  3.2.2 Time complexity of Dijkstra’s algorithm 
 3.3 Interconnect delay model 
 3.4 Power dissipation in buffered path interconnect 
 3.5 van Ginneken algorithm 
 3.6 Simultaneous routing and buffer insertion 
  3.6.1 Modelling VLSI routing with buffer insertion as a shortest-path problem 
  3.6.2 Simultaneous routing and buffer insertion for two-terminal nets 
  3.6.3 Simultaneous routing and buffer insertion for multi-terminal nets 
 3.7 Delay and power model formulation for HRTB-LA 
  3.7.1 Delay and power computation for upstream path expansion 
  3.7.2 Delay and power computation for downstream path expansion 
 3.8 Multi-constraint routing 
 3.9 Multi-constraint routing with look-ahead scheme 
 3.10 Summary 

4 DESIGN AND DESCRIPTION OF HRTB-LA ALGORITHM 
 4.1 HRTB-LA overview 
 4.2 Tree adjustment 
 4.3 Graph pruning 
 4.4 Path expansion and look-ahead scheme 
  4.4.1 Path expansion in HRTB 
  4.4.2 Path expansion in HRTB-LA 
 4.5 Time complexity of HRTB-LA 
 4.6 Numerical illustration of HRTB and HRTB-LA 
  4.6.1 Numerical illustration of HRTB 
  4.6.2 Numerical illustration of HRTB-LA 
 4.7 Summary 

5 SOFTWARE DESIGN OF HRTB-LA 
 5.1 HRTB-LA data structure 
  5.1.1 Data structure of the pre-processing data 
  5.1.2 Data structure of the candidate solutions 
 5.2 Linked list functions 
  5.2.1 Inserting data into a linked list 
  5.2.2 Retrieving data from a linked list 
 5.3 Priority queue in HRTB-LA 
 5.4 Pseudo-code of HRTB-LA’s main function 
 5.5 Summary 

6 VERIFICATION AND PERFORMANCE TEST OF HRTB AND HRTB-LA 
 6.1 Overview 
 6.2 Wire and buffer parameters 
 6.3 Verification of the proposed algorithm 
  6.3.1 Verification for timing optimization 
  6.3.2 Verification of the iterative power computation scheme 
 6.4 Performance test 1 – solution quality 
 6.5 Performance test 2 – solution quality, runtime and the number of candidate solutions 
 6.6 Performance test 3 – Delay-power optimization 
 6.7 Summary 

7 CONCLUSIONS AND FUTURE WORKS 
 7.1 Conclusions 
 7.2 Future works 

REFERENCES 

Appendices A – C 
 
 
LIST OF TABLES 

TABLE NO. TITLE 
3.1 Time complexity for heap operations for binary, binomial and Fibonacci heap (Cormen et al. 2009) 
4.1 Previ and Prevj for all vertices for the graph in Figure 4.9 
4.2 List of the predicted end-to-end delay, end-to-end power and predicted end-to-end path length lp(P) at vertex 8 in Figure 4.10 
4.3 The values in the list and priority queue for path expansion from sink1 to Steiner node in HRTB for graph in Figure 4.12. The key in the grey box indicates the lowest key value that will be extracted in the next EXTRACT_MIN(Q) (a) initial values of the list and priority queue (b) to (l) the values in the list and priority queue after 1st extraction to 11th extraction respectively (m) the values in the list and priority queue after the path expansions were completed (the text in red colour indicates that the candidate solution is dominated) 
4.4 The values in the list and priority queue for path expansions from sink2 to Steiner node in HRTB for graph in Figure 4.12. The key in the grey box indicates the lowest key value and it will be extracted in the next EXTRACT_MIN(Q) (a) initial values of the list and priority queue (b) and (c) the values after the 1st and 2nd extractions respectively (d) the final values after the queue is empty 
4.5 The values in the list and priority queue for path expansions from the Steiner node to the source in HRTB for graph in Figure 4.12. The key in the grey box indicates the lowest key value and it will be extracted in the next EXTRACT_MIN(Q) (a) initial values of the list and priority queue (b) to (e) the values after 1st extraction to 4th extraction respectively (f) the final values after the queue is empty 
4.6 The values in the list and priority queue for path expansions from the Steiner node to the source in HRTB for graph in Figure 4.12. The key in the grey box indicates the lowest key value and it will be extracted in the next EXTRACT_MIN(Q) (a) initial values of the list and priority queue (b) to (e) the values after 1st extraction to 4th extraction respectively (f) the final values after the queue is empty 
4.7 The values in the list and priority queue for the path expansions from sink2 to Steiner node in HRTB-LA for graph in Figure 4.12. The key in the grey box indicates the lowest key value and it will be extracted in the next EXTRACT_MIN(Q) (a) initial values of the list and priority queue (b) and (c) the values after the 1st and 2nd extraction respectively 
4.8 The values in the list and priority queue for path expansions from the Steiner node to the source in HRTB-LA for graph in Figure 4.12. The key in the grey box indicates the lowest key value and it will be extracted in the next EXTRACT_MIN(Q) (a) initial values of the list and priority queue (b) to (d) the values after the 1st extraction to 3rd extraction respectively 
5.1 Attributes of a candidate solution 
6.1 Wire dimension and parameters 
6.2 Buffer library 
6.3 Characteristics of the test nets and graphs 
6.4 Solution from HRTB and FBI running on test nets 
6.5 Delay at source comparison between FBI, RIATA, HRTB and HRTB-LA 
6.6 Solution quality, runtime and number of candidate solutions for net N5 
6.7 Solution quality, runtime and number of candidate solutions for net N10 
6.8 Solution quality, runtime and number of candidate solutions for net N25 
6.9 Performance comparison between HRTB and HRTB-LA for net N5 
6.10 Performance comparison between HRTB and HRTB-LA for net N25 
 
 
 
 
LIST OF FIGURES 

FIGURE NO. TITLE 
1.1 (a) Buffer insertion on fixed routing tree that ignores buffer obstacles (b) buffer insertion on fixed routing tree that avoids obstacles (c) buffer insertion on the fixed routing tree with tree adjustment (RIATA) and (d) simultaneous routing tree and buffer insertion on the adjusted tree 
1.2 A tree on uniform grid graph G = (V, E) 
2.1 (a) Rectilinear minimum spanning tree (R-MST) (b) rectilinear Steiner minimal tree (R-SMT). Hollow dots indicate net terminals while solid dots are the Steiner nodes 
2.2 (a) A wire of length L and (b) Corresponding π-model RC circuit 
2.3 A wire inserted with N-1 number of buffers 
3.1 O-notation gives an upper bound for a function to within a constant factor (Cormen et al. 2009) 
3.2 Runtime trend of algorithms 
3.3 Relaxation on edge (u, v) with weight w(u, v) = 3. (a) The relaxation step when v.d > u.d + w(u, v) and (b) The relaxation step when v.d < u.d + w(u, v) 
3.4 A 5 × 4 grid graph 
3.5 Illustration of Dijkstra’s algorithm on general graph, with a as the source (a) initialization (b) path expansion from a→b, a→c (c) path expansion from c→d, c→e (d) path expansion from b→d (e) path expansion from d→e, d→f (f) final expansion from e→f gives shortest path from a→c→d→e→f with cost = 7 
3.6 Illustration of Dijkstra’s algorithm on uniform grid graph (a) Initial graph v.d = ∞ (b) expansion 1→2 and 4 (c) expansion 2→3 and 5 (d) expansion 4→5 and 7 (e) expansion 3→6 (f) expansion 5→6 and 8 (g) expansion 7→8 (h) expansion 6→9 (i) expansion 8→9 
3.7 A simple RC tree illustrating the process of calculating Elmore delay 
3.8 Buffered path interconnect 
3.9 Fixed routing tree connecting source node to the Steiner node and all sink nodes 
3.10 Candidate solutions at each node. The red colour is the best solution for the given path 
3.11 Path expansion in 2-terminal net simultaneous routing and buffer insertion (a) expansion from Sink node to vertices 5, 9 and 15 (b) expansion from vertex 5 to vertex 4 
3.12 The 2D grid graph representing the interconnect tree in Figure 3.9 
3.13 Simultaneous routing and buffer insertion in multi-terminal net. The arrows show the direction of the path expansions 
3.14 Wire expansion from vertex v to vertex u for upstream path expansion 
3.15 Wire expansion from vertex v to vertex u and insert buffer at v 
3.16 Wire expansion from vertex u to vertex v for downstream path expansion 
3.17 Wire expansion from vertex u to vertex v and insert buffer at v 
3.18 Path expansion in multi-constraint graph (a) Initialization (b) first path expansion (c) expansion for path c→d, c→e (d) expansion for path b→d (e) expansion for path e→f (f) expansion for path d→e, d→f extracted from 0.6 (g) expansion for path d→e, d→f extracted from 0.7 (h) expansion for path e→f 
3.19 Path expansion in multi-constraint graph with look-ahead scheme (a) initialization (b) first path expansion (c) expansion for path c→d (d) expansion for path d→f (e) expansion from path b→d 
4.1 Main stages in HRTB-LA 
4.2 Tree adjustment technique (a) a Steiner node m is inside the obstacle (b) an alternative Steiner node m’ is generated (Hu et al. 2003) 
4.3 Flowchart of the tree adjustment 
4.4 Sample tree for illustrating tree adjustment 
4.5 Flowchart of the graph pruning 
4.6 Vertex v is pruned as L_ToEnd[v] + L_ToStart[v] > L_StartEnd 
4.7 Graph pruning in HRTB-LA (a) initial graph (b) graph pruning for sink1 → Steiner node expansions (c) graph pruning for sink2 → Steiner node expansions (d) graph pruning for Steiner node → Source expansions 
4.8 Flowchart of the path expansion in the proposed algorithm 
4.9 Example of candidate solutions at each vertex 
4.10 Sample routing graph and path expansion of the proposed algorithm 
4.11 The look-ahead weight vectors for the graph in Figure 4.10 
4.12 Sample grid graph 
4.13 Routing solution 
4.14 Look-ahead weight vectors for (a) first set (b) second set 
5.1 Node labelling for a net with two sinks in HRTB-LA 
5.2 Illustration of Previ and Prevj attributes 
5.3 Linked list for vertex v with three candidate solutions 
6.1 Sample tree with 5 sinks 
6.2 Solution from FBI algorithm 
6.3 Solution from HRTB algorithm 
6.4 Illustration of the iterative power computation (a) sample net (b) upstream computation (c) downstream computation 
6.5 Illustration of the iterative power computation for multi buffer types (a) sample net (b) upstream computation (c) downstream computation 
6.6 Plot of net N5 test results (a) slack at source (b) runtime (c) number of candidate solutions 
6.7 Plot of net N10 test results (a) slack at source (b) runtime (c) number of candidate solutions 
6.8 Plot of net N25 test results (a) slack at source (b) runtime (c) number of candidate solutions 
6.9 Plot of net N5 test results for delay-power constraint optimization (a) runtime (b) number of candidate solutions 
6.10 Plot of net N25 test results for delay-power constraint optimization (a) runtime (b) number of candidate solutions 
6.11 Routing solutions for 4-sink net for different power constraints (a) routing solution for maximum slack with no power constraint (b) routing solution when power was constrained at 30 mW (c) routing solution when power was constrained at 20 mW 
7.1 Sample net where a pair of nodes are on the same horizontal line (a) before pruning (b) effective search space for path expansions between sink1 and Steiner node 
A1 Fibonacci heap structure 
A2 Inserting a node into a Fibonacci heap (a) a Fibonacci heap H (b) Fibonacci heap H after inserting the node with key 21 
A3 The process of EXTRACT_MIN(H) (a) meld the children into root list (b) label the rank (c) to (e) mark current node and updating rank list from left to right (f) link 23 into 17 (g) to (h) link 24 into 7 (i) to (k) link 41 into 18 (l) final heap 
A4 DECREASE-KEY(H, x, k), if the heap order is not violated (a) initial heap structure (b) the key of x is decreased from 46 to 29 
A5 DECREASE-KEY(H, x, k), if the heap order is violated (a) decrease the key (b) cut the tree rooted at x, meld into the root list and former parent node is marked 
 
 
LIST OF ABBREVIATIONS 
 
A-Tree – Arborescence-Tree 
BPRIM – Bounded Prim 
BRBC – Bounded Radius Bounded Cost 
BR-MRT – Bounded Radius - Minimum Routing Tree 
CMOS – Complementary Metal Oxide Semiconductor 
C-Tree – Clustered-Tree 
DP – Dynamic Programming 
ED – Elmore Delay 
FBI – Fast Buffer Insertion 
HRTB – Hybrid Routing Tree and Buffer insertion 
HRTB-LA – Hybrid Routing Tree and Buffer insertion with Look-Ahead 
ITRS – International Technology Roadmap for Semiconductors 
L-RST – L-shaped Rectilinear Steiner Tree 
MCOP – Multi-Constraint Optimal Path 
MCP – Multi-Constraint Path 
MOS – Metal Oxide Semiconductor 
MRSA – Minimum Rectilinear Steiner Arborescence 
MST – Minimum Spanning Tree 
NP – Non-deterministic Polynomial time 
PDF – Probability Density Function 
QoS – Quality-of-Service 
RAT – Required Arrival Time 
RC – Resistor-Capacitor 
RIATA – Repeater Insertion with Adaptive Tree Adjustment 
RLC – Resistor-Inductor-Capacitor 
RMP – Recursive Merging and Pruning 
R-MST – Rectilinear Minimum Spanning Tree 
R-SMT – Rectilinear Steiner Minimal Tree 
RTBW – Routing Tree with Buffer insertion and Wire sizing 
SAMCRA – Self-Adaptive Multi-Constrained Routing Algorithm 
SMT – Steiner Minimal Tree 
S-RABILA – Simultaneous Routing and Buffer Insertion with Look-Ahead 
VLSI – Very Large Scale Integration 
 
 
LIST OF APPENDICES 
 
APPENDIX TITLE 
A Fibonacci heap operations 
B C Code for HRTB-LA algorithm 
C Output sample 
 
 
 
CHAPTER 1 
INTRODUCTION 
1.1 Overview 
The demand for high speed and low power consumption in today’s applications has 
forced dramatic changes in the design and manufacturing methodologies for very large 
scale integration (VLSI) circuits (Celik et al. 2002; Ekekwe 2010; ITRS 2012; 2013). 
To meet this demand, the number of devices (i.e. transistors) on a single chip must 
increase, which requires smaller devices and also a larger layout area to accommodate 
the huge number of devices. 
 As device size decreases and devices operate at higher speeds, the 
interconnect delay becomes much more significant than the device delay. Most 
of the delay in integrated circuits is due to the time it takes to charge and discharge the 
capacitance of the wires and the gates of the transistors. The resistance R = rl of a wire 
increases linearly with its length l, and so does its capacitance C = cl, where r and c 
are the unit resistance and unit capacitance respectively. Hence, the RC delay of the 
wire is D = ½RC = ½rcl² (van Ginneken 1990). Clearly, the delay increases 
quadratically with the length of the wire (Saxena et al. 2004; ITRS 2012).  
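The quadratic dependence can be checked numerically. The following is a minimal sketch of the formula D = ½rcl²; the function name `wire_delay` and the unit resistance and capacitance values are hypothetical and chosen only for illustration.

```python
def wire_delay(r, c, length):
    """RC delay of an unbuffered wire: D = 0.5 * R * C = 0.5 * r * c * length**2."""
    resistance = r * length    # R = r * l
    capacitance = c * length   # C = c * l
    return 0.5 * resistance * capacitance


# Hypothetical unit values (not taken from the thesis).
r = 0.1       # unit resistance, ohm per micrometre
c = 0.2e-15   # unit capacitance, farad per micrometre

d_short = wire_delay(r, c, 1000.0)   # 1 mm wire
d_long = wire_delay(r, c, 2000.0)    # 2 mm wire
ratio = d_long / d_short             # doubling the length quadruples the delay
print(d_short, d_long, ratio)
```

Doubling the wire length multiplies the delay by four, which is why long unbuffered wires dominate the timing.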
One effective technique to reduce the interconnect delay is to insert 
buffers that restore the signal strength along the interconnect tree. As design 
dimensions continue to shrink, more and more buffers are needed to improve the 
performance. However, a buffer itself consumes power, and it has been shown that the 
power dissipation overhead due to optimal buffer insertion is significantly high 
(Ekekwe 2010). 
 
 According to Saxena et al. (2004), the critical inter-buffer length (the minimum 
wire segment length at which a buffer is required) decreases at a rate of 68% when 
VLSI technology migrates from 90 nm to 45 nm. This inter-buffer length scaling 
significantly outpaces the VLSI technology scaling, which is roughly 0.5 times for 
every two generations. Buffers are projected to make up 35% of the total block cell 
count at the 45-nm technology node and 70% at the 32-nm node.  
 The dramatic buffer scaling undoubtedly has a large and profound impact 
on VLSI circuit design. With millions of buffers required per chip, almost nobody can 
afford to neglect the importance of optimal buffer insertion, in contrast to a decade 
ago when only a few thousand buffers were needed per chip (Cong 1997). Because 
of this importance, buffer insertion algorithms and methodologies need to be studied 
in depth from several aspects. First, a buffer insertion algorithm should deliver 
solutions of high quality because interconnect and circuit performance largely depend 
on the way that buffers are placed. Second, a buffer insertion algorithm needs to be 
sufficiently fast so that millions of nets can be optimized in reasonable time. Third, 
accurate delay models are necessary to ensure that buffer insertion solutions are 
reliable. Fourth, buffer insertion techniques are expected to handle multiple 
objectives simultaneously, such as timing, power and signal integrity (Alpert et al. 2009).  
 
1.2 Problem statement 
Interconnect is a wiring system that distributes clock and other signals to the various 
functional blocks of a CMOS integrated circuit. When the VLSI technology is scaled 
down, gate delay and interconnect delay change in opposite directions. Smaller devices 
lead to less gate switching delay. In contrast, thinner wire leads to increased wire 
resistance and greater signal propagation delay along wires. As a result, interconnect 
delay has become a dominating factor for VLSI circuit performance (ITRS 2012; 
2013).  
 
 Among the available techniques, buffer insertion has been proven to be one of 
the best techniques to reduce the interconnect delay of a long wire. The main 
challenge in interconnect buffer insertion is how to determine the optimal number of 
buffers and their placement in the given interconnect tree. The most influential and 
systematic technique was proposed by van Ginneken (1990). Given the possible buffer 
locations, this algorithm finds the optimum buffering solution for a fixed signal 
routing tree that maximizes the timing slack at the source according to the Elmore 
delay model (Elmore 1948). As the number of buffers inserted in circuits increases 
dramatically, a fast and efficient algorithm is essential for design 
automation tools. van Ginneken’s algorithm uses dynamic programming, which 
finds an optimal solution to a problem by first finding optimal solutions to 
subproblems and then merging them into an optimal solution to the larger problem.  
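To make the dynamic programming idea concrete, the following is a minimal sketch of a van Ginneken-style procedure restricted to a single source-to-sink wire. It is an illustration under stated assumptions, not the thesis implementation: the wire is cut into equal segments, each modelled as a lumped π-section under the Elmore delay model; there is a single buffer type; candidate buffer positions are the internal segment boundaries; and all names (`add_wire`, `add_buffer`, `prune`, `best_slack`) and numeric values are hypothetical. Each candidate is a pair (downstream capacitance, required arrival time).

```python
def add_wire(cands, rw, cw):
    """Propagate candidates upstream across one wire segment (pi-model)."""
    return [(c + cw, q - rw * (cw / 2.0 + c)) for c, q in cands]


def add_buffer(cands, rb, cb, kb):
    """Keep the unbuffered candidates and add the best buffered one.

    rb: buffer output resistance, cb: input capacitance, kb: intrinsic delay.
    """
    best_q = max(q - kb - rb * c for c, q in cands)
    return cands + [(cb, best_q)]


def prune(cands):
    """Drop dominated candidates: a larger load must buy a strictly better time."""
    kept, best_q = [], float("-inf")
    for c, q in sorted(cands, key=lambda t: (t[0], -t[1])):
        if q > best_q:
            kept.append((c, q))
            best_q = q
    return kept


def best_slack(n_segments, rw, cw, sink_cap, rat, driver_r, buf=None):
    """Maximum slack at the source over all buffer assignments."""
    cands = [(sink_cap, rat)]            # subproblem at the sink
    for i in range(n_segments):
        cands = add_wire(cands, rw, cw)  # extend subproblem solutions upstream
        if buf is not None and i < n_segments - 1:
            cands = add_buffer(cands, *buf)
        cands = prune(cands)
    return max(q - driver_r * c for c, q in cands)


# Hypothetical values in arbitrary consistent units.
unbuffered = best_slack(10, 100.0, 100.0, 10.0, 0.0, 100.0)
buffered = best_slack(10, 100.0, 100.0, 10.0, 0.0, 100.0,
                      buf=(100.0, 10.0, 1000.0))
print(unbuffered, buffered)
```

Pruning is safe because a candidate with no smaller load and no better required arrival time can never lead to a better slack further upstream, so the list of candidates stays short while the optimum is preserved.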
 Recently, many techniques to speed up van Ginneken’s algorithm and its 
extensions have been proposed, such as in (Lillis et al. 1996), (Shi and Li 2003), (Shi 
and Li 2005), (Li and Shi 2006) and (Li et al. 2012). However, van Ginneken’s 
algorithm and its extensions can only operate on a fixed routing tree. They give an 
optimal solution when the best routing tree is given but produce a poor solution when 
a poor routing tree is provided, especially when there are obstacles in the design. In 
today’s VLSI designs, some regions may be occupied by predesigned libraries such as 
IP blocks and memory arrays. Some of these regions do not allow buffers or wires to 
pass through, and some regions allow wires to go through but prohibit any buffer 
insertion. Therefore, buffer insertion has to be performed with consideration of these 
buffer and wire obstacles (Alpert et al. 2009; Khalil-Hani and Shaikh-Husin 2009). 
The best way to handle the obstacles is to perform the routing and buffer insertion 
simultaneously using a grid graph technique. However, research has shown that 
simultaneous routing and buffer insertion is NP-complete (Hu et al. 2009). The 
techniques known today either use dynamic programming to compute an optimal 
solution in worst-case exponential time or rely on efficient heuristics without a 
performance guarantee.  
 A dynamic programming algorithm such as the RMP (recursive merging and 
pruning) algorithm can find an optimal buffering solution for multi-terminal nets 
(Cong and Yuan 2000), but it is not efficient when the number of sinks and the number 
of possible buffer locations are large, because the search space is very large. Indeed, 
Hu et al. (2003) show that the search in RMP is NP-complete. They also proposed a 
heuristic algorithm that solves the multi-pin net buffer insertion problem by 
constructing a performance-driven Steiner tree and creating an alternative Steiner 
node if the original Steiner node is inside an obstacle. The algorithm is called RIATA, 
for Repeater Insertion with Adaptive Tree Adjustment. RIATA is very fast because it 
operates on a fixed tree. However, the quality of the solution may not be good enough 
if many paths of the adjusted tree still overlap with the buffer obstacles, as illustrated 
in Figure 1.1.  
 Figure 1.1 shows examples of possible solutions for a net with a tree structure 
(multi-terminal), where the grey areas represent buffer obstacles. The net has three 
sinks s1, s2 and s3, with S0 as the source. In this illustration, appropriate parameters 
for wires and buffers are applied (discussed in detail in Chapter 2). Figure 1.1(a) 
shows the solution from van Ginneken’s algorithm, where the slack at the source is 
-899.74 ps (the slack is the required arrival time at the sink minus the accumulated 
delay). The timing is not met because most of the routing paths are inside the buffer 
obstacles, where buffer insertion is not allowed. One can reroute the tree so that all 
the paths avoid the buffer obstacles, as shown in Figure 1.1(b). The slack improves to 
-44.39 ps but still violates the timing requirement due to the increased wire length. 
The tree adjustment technique of RIATA produces the solution shown in Figure 
1.1(c). Now the timing requirement is met, with a slack at the source of 11.64 ps. 
RIATA is efficient in terms of runtime, but its solution quality still depends on its 
newly generated tree. If most of the paths are inside the buffer obstacles, the room for 
timing improvement is still limited. 
  Instead of fully constructing the routing path simultaneously with buffer 
insertion as in the RMP algorithm, one can apply the simultaneous approach to the 
adjusted tree. Figure 1.1(d) illustrates the routing path generated by this approach. The 
slack obtained at the source improves to 217.65 ps. Clearly, this hybrid technique 
produces the best result compared to the techniques that perform buffer insertion on 
a fixed routing path, such as van Ginneken’s algorithm (and its extensions) and RIATA. 
The runtime of this hybrid technique can be improved by adopting the look-ahead 
technique proposed by Shaikh-Husin (2008) and Khalil-Hani and Shaikh-Husin (2009) 
to solve the simultaneous routing and buffer insertion problem for two-terminal 
(single-sink) nets. 
  
Figure 1.1 (a) Buffer insertion on fixed routing tree that ignores buffer obstacles (b) 
buffer insertion on fixed routing tree that avoids obstacles (c) buffer insertion on the 
fixed routing tree with tree adjustment (RIATA) and (d) simultaneous routing tree and 
buffer insertion on the adjusted tree 
 Another issue that previous dynamic programming algorithms did not take 
into consideration is the power consumed by the buffers inserted along the 
interconnect tree. It has been found that the power dissipation overhead due to 
optimal buffer insertion is significantly high and can be as much as 20% of total chip 
power dissipation (Nalamalpu and Burleson 2001). Hence, in addition to timing 
performance, a power dissipation constraint should also be integrated into the buffer 
insertion algorithm (Nalamalpu and Burleson 2001; Ekekwe 2010). Many 
methodologies to optimize propagation delay under a power constraint have been 
proposed (Nalamalpu and Burleson 2001; Banerjee and Mehrotra 2002; Wason and 
Banerjee 2005; Li et al. 2005; Narasimhan and Sridhar 2010), but none of them can be 
integrated into a buffer insertion algorithm that is based on dynamic programming on 
a grid graph. The grid graph technique is used because simultaneous routing and 
buffer insertion utilizes a maze search algorithm, which is best implemented as a 
graph search algorithm (Cormen et al. 2009). Furthermore, the uniform grid graph 
allows buffers to be inserted anywhere (except in buffer obstacle areas) and hence 
improves the solution quality. Meanwhile, the advantage of dynamic programming is 
that it allows the use of multiple buffer types.  
 From the discussion above, the problem can be summarized as follows. 
Buffering in a multi-terminal net is known to be NP-complete; the existing algorithms 
that give an optimum solution are too slow, while heuristic algorithms are fast but 
produce poor solutions. Even though buffer insertion is one of the most studied 
problems in VLSI physical design, finding an efficient algorithm with provably good 
performance remains an active research area. Also, as design dimensions continue to 
shrink, more and more buffers are needed to improve the performance (i.e. speed and 
signal integrity) of the designs, but the buffers themselves consume power. 
Therefore, a new algorithm that can handle these constraints efficiently is needed. 
 
1.3 Research objectives  
The objectives of this research are as follows: 
(1) To propose an efficient graph-based maze routing and buffer insertion 
algorithm for nanometer VLSI layout designs. The algorithm is designed for 
multi-terminal nets and multi-constraint optimization. The constraints are 
routing obstacles, timing performance and power dissipation. 
(2) To propose a power computation scheme for the proposed algorithm that can 
be computed iteratively within a dynamic programming framework. 
 
1.4 Problem formulation 
The simultaneous routing and buffer insertion problem in VLSI layout design is 
essentially a buffered routing path search problem. In this work, it is formulated as a 
shortest-path problem in a weighted graph, specified as follows. We are given a 
routing grid graph G = (V, E) corresponding to the VLSI layout, where V is the set of 
internal vertices and E is the set of internal edges, together with a source vertex 
S0 ∈ V, n sink vertices s1, s2, …, sn ∈ V, n – 1 Steiner vertices m1, m2, …, mn-1 ∈ V, 
required arrival times RAT(s1), RAT(s2), …, RAT(sn), a power constraint Pc, a buffer 
library B, and a wire parameter W. The goal is to find a routing tree simultaneously 
with buffer insertion such that the slack at the source and the power dissipation of 
the buffers satisfy the given constraints. A vertex vi ∈ V may belong to the set of 
buffer obstacle vertices, denoted VOB, or to the set of wire obstacle vertices, denoted 
VOW. The buffer library B contains different types of buffer. For each edge 
e = u → v, the signal travels from u to v, where u is the upstream vertex, v is the 
downstream vertex, and u, v ∉ VOW. A uniform grid graph illustrating some of the 
parameters of the problem formulation is shown in Figure 1.2. 
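The grid-graph part of this formulation can be illustrated with a minimal Python sketch. It is not the data structure used by HRTB-LA; the class name RoutingGrid, its methods and the 4 x 4 example are assumptions made for illustration only. It shows how the obstacle sets VOW and VOB restrict which edges may carry a wire and which vertices may hold a buffer.

```python
from dataclasses import dataclass, field

Vertex = tuple[int, int]  # (column, row) on the uniform grid


@dataclass
class RoutingGrid:
    """Hypothetical container for a uniform routing grid G = (V, E)."""
    width: int
    height: int
    wire_obstacles: set[Vertex] = field(default_factory=set)    # V_OW
    buffer_obstacles: set[Vertex] = field(default_factory=set)  # V_OB

    def in_bounds(self, v: Vertex) -> bool:
        x, y = v
        return 0 <= x < self.width and 0 <= y < self.height

    def neighbors(self, u: Vertex) -> list[Vertex]:
        """Vertices v adjacent to u such that neither u nor v is in V_OW."""
        if not self.in_bounds(u) or u in self.wire_obstacles:
            return []
        x, y = u
        candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        return [v for v in candidates
                if self.in_bounds(v) and v not in self.wire_obstacles]

    def can_buffer(self, v: Vertex) -> bool:
        """A buffer may sit on any routable vertex outside V_OB."""
        return (self.in_bounds(v)
                and v not in self.wire_obstacles
                and v not in self.buffer_obstacles)


# Made-up 4 x 4 layout with one wire obstacle and one buffer obstacle.
grid = RoutingGrid(4, 4, wire_obstacles={(1, 1)}, buffer_obstacles={(2, 2)})
print(grid.neighbors((1, 0)))   # the blocked vertex (1, 1) is not offered
print(grid.can_buffer((2, 2)))  # wire may pass here, but no buffer
```

In this sketch a buffer obstacle vertex remains routable for wires, while a wire obstacle vertex is excluded from every edge, which matches the condition u, v ∉ VOW above.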
 The proposed algorithm is called HRTB-LA, which stands for Hybrid Routing 
Tree and Buffer insertion with Look-Ahead. Instead of a fixed routing tree as in van 
Ginneken’s algorithm and RIATA, we use maze routing to find the solution. However, 
HRTB-LA does not explore the entire 2D graph as RMP does, because it uses an initial 
tree as a reference for determining the Steiner nodes, as in RIATA. We also incorporate 
graph pruning and look-ahead to reduce the runtime of the algorithm. 
 
 
Figure 1.2 A tree on uniform grid graph G = (V, E) 
 
1.5 Scope of works 
The scope of this research is as follows: 
(a) The Elmore delay metric is used to calculate the interconnect delay because of 
its high fidelity and speed (Alpert et al. 2007; Li et al. 2012; ITRS 2012; 2013). 
(b) A uniform grid graph is used to represent the VLSI layout, and maze routing 
(Zhou et al. 2000; Khalil-Hani and Shaikh-Husin 2009) is used for path search. 
(c) There are many algorithms for tree construction in VLSI routing (e.g. the Steiner 
minimal tree), and Steiner tree construction is itself a hard problem. Therefore, 
we assume that the initial tree is available from a pre-processing step. 
(d) The performance of the proposed algorithm is benchmarked against available 
similar algorithms. 
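To make item (a) concrete, the following Python sketch applies the standard Elmore delay expressions to a single buffered path, propagating a (load capacitance, required arrival time) pair from the sink towards the source in the style of van Ginneken's algorithm. The function names and all numerical values are made up for illustration, the units are arbitrary, and the sketch is not taken from the HRTB-LA implementation.

```python
def add_wire(cap, q, length, r_unit, c_unit):
    """Move a (cap, q) pair upstream across a wire segment.

    Elmore delay of a wire with total resistance R and capacitance C
    driving a load cap is R * (C / 2 + cap).
    """
    r_wire = r_unit * length
    c_wire = c_unit * length
    delay = r_wire * (c_wire / 2.0 + cap)
    return cap + c_wire, q - delay


def add_buffer(cap, q, r_buf, c_buf, d_buf):
    """Insert a buffer: delay is intrinsic delay plus r_buf * load."""
    delay = d_buf + r_buf * cap
    return c_buf, q - delay  # upstream now sees only the buffer input cap


def slack_at_source(cap, q, r_driver):
    """Slack once the driver resistance has charged the remaining load."""
    return q - r_driver * cap


# Hypothetical sink: load capacitance 1.0, required arrival time 100.0.
cap, q = 1.0, 100.0
cap, q = add_wire(cap, q, length=2, r_unit=1.0, c_unit=2.0)
cap, q = add_buffer(cap, q, r_buf=3.0, c_buf=0.5, d_buf=10.0)
cap, q = add_wire(cap, q, length=2, r_unit=1.0, c_unit=2.0)
slack = slack_at_source(cap, q, r_driver=1.0)
print(slack)  # 59.5
```

Because each step needs only the current pair (cap, q), the delay can be updated edge by edge during path expansion, which is what makes the Elmore metric convenient for maze routing.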
 
 
 
1.6 Research contributions 
We propose a new algorithm for simultaneous tree construction and buffer insertion 
with multi-constraint optimization. The contributions of this research are as follows: 
(a) The look-ahead scheme (Khalil-Hani and Shaikh-Husin 2009), which has been 
shown to be efficient for two-terminal (single-sink) nets, is extended in this 
work so that it can handle multi-terminal nets. 
(b) The algorithm is designed so that it can optimize under multiple constraints, 
namely obstacles, timing and power dissipation of the buffered interconnect tree. 
(c) An iterative computation of power dissipation within the dynamic programming 
framework is proposed. 
(d) In algorithm design, time complexity is used to measure the efficiency of an 
algorithm. Therefore, the time complexity of HRTB-LA is analysed and 
presented. 
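Contributions (b) and (c) can be pictured with a short Python sketch of candidate pruning in a dynamic programming framework. It assumes, purely for illustration, that each candidate solution carries a load capacitance, a slack and an accumulated buffer power, and that buffer power simply adds up as buffers are inserted; the actual power formulation of HRTB-LA is developed later in the thesis and may differ. The names Candidate, dominates and prune are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical partial solution at a vertex."""
    cap: float    # downstream load capacitance
    slack: float  # slack (required arrival time) seen at this vertex
    power: float  # accumulated buffer power (assumed additive)


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse in every field and not identical."""
    return (a != b and a.cap <= b.cap
            and a.slack >= b.slack and a.power <= b.power)


def prune(cands, power_limit):
    """Drop candidates that violate the power constraint or are dominated."""
    feasible = [c for c in set(cands) if c.power <= power_limit]
    return sorted(c for c in feasible
                  if not any(dominates(o, c) for o in feasible))


cands = [
    Candidate(1.0, 50.0, 2.0),
    Candidate(2.0, 40.0, 3.0),  # worse than the first in every field
    Candidate(0.5, 45.0, 4.0),  # kept: smaller load capacitance
    Candidate(0.4, 60.0, 9.0),  # exceeds the power limit used below
]
kept = prune(cands, power_limit=5.0)
print(kept)
```

With three fields instead of the usual two, fewer candidates dominate one another, so discarding candidates that already exceed the power constraint is one plausible way to keep the candidate lists short.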
 
 
1.7 Thesis outline 
This thesis consists of seven chapters. Chapter 1 presents the research background, 
problem formulation and objectives of the research. The literature review is presented 
in Chapter 2, which discusses the evolution of interconnect optimization techniques 
ranging from two-terminal to multi-terminal nets. Next, post-routing optimization 
(focusing on buffer insertion) is discussed. In that chapter, we review buffer 
insertion algorithms on a fixed tree, followed by simultaneous routing and buffer 
insertion algorithms. Lastly, buffer insertion algorithms with multi-constraint 
optimization and other delay models are discussed. 
 
 Chapter 3 presents the research background and the theories associated with this 
research. First, the concept of an algorithm and its complexity analysis is presented, 
followed by Dijkstra’s shortest path algorithm. Next, the Elmore delay and power 
dissipation in buffered interconnect are discussed. The details of buffering algorithms 
are also discussed in this chapter. Lastly, the delay and power formulation for the 
proposed algorithm is presented, followed by the fundamental concepts of 
multi-constraint routing and the look-ahead scheme. 
 Chapter 4 presents the design description of the proposed algorithm, HRTB-LA. 
The main stages of HRTB-LA are discussed in detail. The path expansion process, 
which is the core of the algorithm, is presented with the aid of numerical examples. 
We present two types of path expansion: (1) normal path expansion without the 
look-ahead scheme and (2) path expansion with the look-ahead scheme. The 
numerical examples demonstrate the advantages of the novel look-ahead scheme in 
HRTB-LA. 
 Chapter 5 gives detailed descriptions of the software design of HRTB-LA. It 
focuses on the data structures used by the algorithm, which are the array, the linked 
list and the priority queue implemented using a heap. The pseudo-code of 
HRTB-LA’s main functions is also presented in this chapter. 
 Chapter 6 presents the verification and performance tests of the proposed 
algorithm. HRTB-LA is benchmarked against other similar algorithms and the results 
are presented. Finally, Chapter 7 concludes the research and gives recommendations 
for future work. 
 
REFERENCES 
 
 
 
Alpert, C.J. and Devgan, A. (1997). Wire segmenting for improved buffer insertion. 
Proceedings of the 34th annual Design Automation: 588–593. 
Alpert, C.J. et al. (2002). Buffered Steiner trees for difficult instances. IEEE Trans. 
Computer-Aided Design of Integrated Circuits and Systems. 21(1): 3–14. 
Alpert, C.J. et al. (2004). Closed-Form Delay and Slew Metrics Made Easy. IEEE 
Trans. Computer-Aided Design of Integrated Circuits and Systems. 23(12): 
1661–1669. 
Alpert, C.J. et al. (2007). Techniques for fast physical synthesis. IEEE invited paper. 
95(3): 573–599. 
Alpert, C.J., Devgan, A. and Quay, S.T. (1999). Buffer insertion for noise and delay 
optimization. IEEE Trans. Computer-Aided Design of Integrated Circuits and 
Systems. 18(11): 1633–1645. 
Alpert, C.J., Mehta, D.P. and Sapatnekar, S.S. (2009). Handbook of algorithms for 
physical design automation: Auerbach Publications. 
Bakoglu, H.B. (1990). Circuits, interconnections, and packaging for VLSI. Addison-
Wesley. 
Banerjee, K. and Mehrotra, A. (2002). A power-optimal repeater insertion 
methodology for global interconnects in nanometer designs. IEEE Trans. 
Electron Devices. 49(11): 2001–2007. 
Boese, K.D. et al. (1995). Near-optimal critical sink routing tree constructions. IEEE 
Trans. Computer-Aided Design of Integrated Circuits and Systems. 14(12): 
1417–1436. 
Bogdan, T., Florentin, D. and Pileggi, L. (1996). An explicit RC-circuit delay 
approximation based on the first three moments. In ACM Design Automation 
Conf. 611–616. 
Borah, M., Owens, R. and Irwin, M. (1994). An edge-based heuristic for Steiner 
routing. IEEE Trans. Computer-Aided Design of Integrated Circuits and 
Systems. 13(12): 1563–1568. 
Cadence (2008). Global synthesis for design closure: The impact on physical quality 
of silicon. Cadence Inc. 
Cai, Y. et al. (2014). Obstacle-avoiding and slew-constrained clock tree synthesis with 
efficient buffer insertion. IEEE Trans. Very Large Scale Integration (VLSI) 
Systems. 23(1): 142–155. 
Caignet, F., Delmas-Bendhia, S. and Sicard, E. (2001). The challenge of signal 
integrity in deep-submicrometer CMOS technology. Proceedings of the IEEE. 
89(4): 556–573. 
Celik, M., Pileggi, L. and Odabasioglu, A. (2002). IC interconnect analysis, 
Massachusetts: Kluwer Academic Publishers. 
Chandrakasan, A.P., Sheng, S. and Brodersen, R.W. (1992). Low-power CMOS 
digital design. IEEE Journal of Solid-state Circuits. 27(4): 473–484. 
Cong, J. and Yuan, X. (2000). Routing tree construction under fixed buffer locations. 
In Proc. 37th Annual Design Automation Conference. ACM: 379–384. 
Cong, J. et al. (1992). Provably Good Performance-Driven Global Routing. IEEE 
Trans. Computer-Aided Design. 11(6): 739–752. 
Cong, J. (1997). Challenges and opportunities for design innovations in nanometer 
technologies. In SRC Design Sciences Concept Paper. 
Cong, J., Leung, K.-S. and Zhou, D. (1993). Performance-driven interconnect design 
based on distributed RC delay model. In Proc. 30th Design Automation Conf. 
New York, New York, USA: 606–611. 
Cormen, T.H., Leiserson, C.E. and Rivest, R.L. (2009). Introduction to Algorithms 3rd 
ed., Boston, MA: McGraw-Hill. 
Dechu, S., Shen, C. and Chu, C. (2005). An Efficient Routing Tree Construction 
Algorithm Obstacle Considerations. IEEE Trans. Computer-Aided Design of 
Integrated Circuits and Systems. 24(4): 600–608. 
Devgan, A. (1997). Efficient Coupled Noise Estimation for On-chip Interconnects. 
IEEE International Conference on Computer-Aided Design: 147–151. 
Dexter, K. (1992). The Design and Analysis of Algorithms, Springer US. 
Dijkstra, E.W. (1959). A note on two problems in connexion with graphs. Numerische 
Mathematik: 269–271. 
Ekekwe, N. (2010). Power dissipation and interconnect noise challenges in nanometer 
CMOS technologies. IEEE Potentials. 29(3): 26–31. 
Elgamel, M.A. and Bayoumi, M.A. (2003). Interconnect noise analysis and 
optimization in deep submicron technology. IEEE Circuits and Systems 
Magazine. 6(4): 6–17. 
Elmore, W. (1948). The transient response of damped linear networks with particular 
regard to wideband amplifiers. Journal of Applied Physics. 19(1): 55–63. 
Gupta, R., Tutuianu, B. and Pileggi, L.T. (1997). The Elmore Delay as a Bound for 
RC Trees with Generalized Input Signals. IEEE Trans. Integrated Circuits and 
Systems. 16(1): 95–104. 
Hasani, F. and Masoumi, N. (2008). Interconnect sizing and spacing with consideration 
of buffer insertion for simultaneous crosstalk-delay optimization. In Design and 
Technology of Integrated Systems in Nanoscale Era. IEEE: 1–6. 
Hentschke, R.F. et al. (2007). Maze routing Steiner trees with effective critical sink 
optimization. In Proceedings of International Symposium on Physical Design:  
135–142. 
Ho, J., Vijayan, G. and Wong, C.K. (1989). A new approach to the rectilinear Steiner 
tree problem. In ACM Design Automation Conf: 161–166. 
Ho, J., Vijayan, G. and Wong, C.K. (1990). New algorithms for the rectilinear Steiner 
tree problem. IEEE Trans. Computer-Aided Design of Integrated Circuits and 
Systems. 9(2): 185–193. 
Hu, J. et al. (2003). Buffer insertion with adaptive blockage avoidance. IEEE Trans. 
Computer-Aided Design of Integrated Circuits and Systems. 22(4): 492–498. 
Hu, S., Li, Z. and Alpert, C.J. (2009). A fully polynomial time approximation scheme 
for timing driven minimum cost buffer insertion. In Proceedings of the 46th 
Annual Design Automation Conference DAC ’09. New York, New York, USA: 
424–429. 
Hwang, F.K. (1976). On Steiner minimal trees with rectilinear distance. SIAM Journal 
of Applied Math. 30(1): 104–114. 
Hwang, F.K., Richards, D.S. and Winter, P. (1992). The Steiner tree problem, 
Amsterdam: Elsevier Science Publishers. 
Ismail, Y.I., Friedman, E.G. and Neves, J.L. (2000). Equivalent Elmore delay for RLC 
trees. IEEE Transactions on Computer-Aided Design of Integrated Circuits and 
Systems. 19(1): 83–97. 
ITRS, (2012). International Technology Roadmap for Semiconductors (ITRS), 
Available at: http://www.itrs.net/. 
ITRS, (2013). International Technology Roadmap for Semiconductors (ITRS), 
Available at: http://www.itrs.net/. 
Ivanov, A.O. and Tuzhilin, A.A. (1994). Minimal networks: the Steiner problem and 
its generalizations, Boca Raton, Florida: CRC Press. 
Jiang, Z. et al. (2006). A New RLC Buffer Insertion Algorithm. In International 
Conference on Computer-Aided Design (ICCAD): 553–557. 
Kahng, A.B. and Robins, G. (1992). A new class of iterative Steiner tree heuristics 
with good performance. IEEE Trans. Computer-Aided Design of Integrated 
Circuits and Systems. 11(7): 893–902. 
Kahng, A.B. and Robins, G. (1995). On Optimal Interconnections for VLSI, Boston, 
MA: Springer US. 
Khalil-Hani, M. and Shaikh-Husin, N. (2008). Simultaneous Routing and Buffer 
Insertion Algorithm for Interconnect Delay Optimization in VLSI Layout 
Design. In 20th International Conference on Microelectronics ICM: 175–178. 
Khalil-Hani, M. and Shaikh-Husin, N. (2009). An optimization algorithm based on 
grid-graphs for minimizing interconnect delay in VLSI layout design. Malaysian 
Journal of Computer Science. 22(1): 19–33. 
Kruse, R.L., Tondo, C.L. and Leung, B.P. (1997). Data structures and program design 
in C Second Edi., Prentice-Hall. 
Lai, M. and Wong, D.F. (2002). Maze routing with buffer insertion and wiresizing. 
IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems. 
21(10): 1205–1209. 
Li, R. et al. (2005). Power-optimal simultaneous buffer insertion/sizing and wire sizing 
for two-pin nets. IEEE Trans. Computer-Aided Design of Integrated Circuits 
and Systems. 24(12): 1915–1924. 
Li, Z. and Shi, W. (2006). An O(bn^2) time algorithm for optimal buffer insertion with 
b buffer types. In Proceedings on Design, Automation and Test in Europe. IEEE: 
484–489. 
Li, Z., Zhou, Y. and Shi, W. (2012). An O(mn) time algorithm for optimal buffer 
insertion of nets with m sinks. IEEE Trans. Computer-Aided Design of 
Integrated Circuits and Systems. 31(3): 437–441. 
Lillis, J., Cheng, C.K. and Lin, T.T.Y. (1996). Optimal wire sizing and buffer insertion 
for low power and a generalized delay model. IEEE Journal of Solid-State 
Circuits. 31(3): 437–447. 
Lin, S. (1965). Computer Solutions of the Traveling Salesman Problem. Bell System 
Technology. 44(10): 2245–2269. 
Maheshwari, V. et al. (2012). Delay Model for VLSI RLCG Global Interconnects. In 
Asia Pacific Conference on Postgraduate Research in Microelectronics and 
Electronics:  201–204. 
Md-Yusof, Z. et al. (2008). Iterative RLC Models For Interconnect Delay 
Optimization in VLSI Routing Algorithms. In Proceedings of Student 
Conference on Research and Development (SCOReD): 83–93. 
Md-Yusof, Z. et al. (2009). Optimizing multi-constraint VLSI interconnect routing. 
Proc. 12th International Symposium on Integrated Circuits: 655–658. 
Md-Yusof, Z. et al. (2012). s.RABILA2: An optimal VLSI routing algorithm with 
buffer insertion using iterative RLC model. 2012 IEEE International Conference 
on Circuits and Systems (ICCAS): 48–53. 
Fredman, M.L. and Tarjan, R.E. (1987). Fibonacci heaps and their uses in improved 
network optimization algorithms. Journal of the ACM. 34(3): 596–615. 
Mieghem, P. van and Kuipers, F. (2004). Concepts of exact QoS routing algorithms. 
IEEE Trans. Networking. 12(5): 851–864. 
Mikami, K. and Tabuchi, K. (1968). A computer program for optimal routing of 
printed circuit connectors. In IFIPS Proceedings: 1475–1478. 
Murgan, T. et al. (2006). Simultaneous Placement and Buffer Planning for Reduction 
of Power Consumption in Interconnects and Repeaters. 2006 IFIP International 
Conference on Very Large Scale Integration: 302–307.  
Nalamalpu, A. and Burleson, W. (2001). A practical approach to DSM repeater 
insertion: satisfying delay constraints while minimizing area and power. IEEE 
Con. on ASIC/SOC: 152–156. 
Narasimhan, A. and Sridhar, R. (2010). Variability aware low-power delay optimal 
buffer insertion for global interconnects. IEEE Trans. Circuits and Systems I. 
57(12): 3055–3063. 
Newell, A. and Ernst, G. (1965). Search for generality. In IFIP congress: 17–24. 
Prim, R. (1957). Shortest connection networks and some generalizations. Bell system 
technical journal. 36: 1389–1401.  
Rubinstein, J., Penfield, P. and Horowitz, M.A. (1983). Signal delay in RC tree 
networks. IEEE Transactions on Computer-Aided Design of Integrated Circuits 
and Systems. 2(3): 202–211. 
Saxena, P. et al. (2004). Repeater scaling and its impact on CAD. IEEE Trans. 
Computer-Aided Design of Integrated Circuits and Systems. 23(4): 451–463. 
Shaikh-Husin, N. (2008). Optimization of Routing Algorithm for Interconnect Delay. 
Universiti Teknologi Malaysia. 
Shi, W. and Li, Z. (2003). An O(n log n) time algorithm for optimal buffer insertion. 
Design Automation Conference, IEEE: 580–585. 
Shi, W. and Li, Z. (2005). A fast algorithm for optimal buffer insertion. IEEE Trans. 
Computer-Aided Design of Integrated Circuits and Systems. 24(6): 879–891. 
Sinha, S.M. (2006). Mathematical Programming, Elsevier Science.  
Tang, X. et al. (2001). A new algorithm for routing tree construction with buffer 
insertion and wire sizing under obstacle constraints. IEEE Con. Computer Aided 
Design: 49–56. 
van Ginneken, L.P.P.P. (1990). Buffer placement in distributed RC-tree networks for 
minimal Elmore delay. In Proc. Int. Symp. Circuits and Systems. IEEE: 
865–868. 
Wang, Y., Cai, Y. and Hong, X. (2005). A fast buffered routing tree construction 
algorithm under accurate delay model. In IEEE Con. on VLSI Design. IEEE: 
91–96. 
Wason, V. and Banerjee, K. (2005). A probabilistic framework for power-optimal 
repeater insertion in global interconnects under parameter variations. Proceedings 
of the 2005 international symposium: 131–136. 
Zhang, T. and Sapatnekar, S.S. (2007). Simultaneous Shield and Buffer Insertion for 
Crosstalk Noise Reduction in Global Routing. IEEE Transactions on Very Large 
Scale Integration (VLSI) Systems. 15(6): 624–636. 
Zhou, H. et al. (1999). Simultaneous routing and buffer insertion with restrictions on 
buffer locations. Journal of the 36th annual ACM/IEEE, 2: 96–99. 
Zhou, H. et al. (2000). Simultaneous routing and buffer insertion with restrictions on 
buffer locations. IEEE Trans. Computer-Aided Design of Integrated Circuits 
and Systems. 19(7): 819–824. 
 
 
