STAIRoute: Early Global Routing using Monotone Staircases for Congestion
  Reduction by Kar, Bapi et al.
arXiv:1810.10412v1 [cs.OH] 24 Oct 2018
STAIRoute: Early Global Routing using
Monotone Staircases for Congestion Reduction
Bapi Kar, Susmita Sur-Kolay and Chittaranjan Mandal
Abstract With aggressively shrinking process technologies, physical design faces
severe challenges and early detection of failures is mandated. Late detection may
otherwise lead to many iterations and thus impact time-to-market. This has
encouraged the development of feedback mechanisms from lower abstraction levels of
the design flow towards the higher levels. Some of these efforts include placement
driven synthesis, routability (timing) driven placement, etc. Motivated by this
philosophy, we propose a novel global routing method using monotone staircase
routing regions (channels), defined at the floorplanning stage. The intent is to
identify the feasibility of a floorplan topology of the given design netlist by
estimating routability, routed net length and the number of vias while taking into
account the global congestion scenario across the layout. This framework works on
both the unreserved and the HV reserved layer models for M (≥ 2) metal layers and
accommodates different capacity profiles of the routing resources, arising from
uniform or varying metal pitch across the metal layers, akin to the latest
technologies. The algorithm takes $O(n^2 kt)$ time for a given design with n blocks
and k nets having at most t terminals. Experimental results on MCNC/GSRC
floorplanning benchmarks show 100% routability while congestion in the routing
regions is restricted to 100%. The routed net length for all t-terminal (t ≥ 2) nets
is comparable with the Steiner length computed by FLUTE. An estimate of the number
of vias for different capacity profiles is also obtained.
Key words: VLSI global routing, routing region definition, floorplan, monotone
staircase routing, congestion
Bapi Kar
Indian Institute of Technology, Kharagpur, India, e-mail: bapi.kar@gmail.com ,
Susmita Sur-Kolay
Indian Statistical Institute, Kolkata, India, e-mail: ssk@isical.ac.in,
Chittaranjan Mandal
Indian Institute of Technology, Kharagpur, India, e-mail: chitta@iitkgp.ac.in
1 Introduction
In the IC design flow, global routing (GR) is indispensable, particularly as an aid
to detailed routing (DR) of the wires through different metal layers. Shrinking
feature dimensions with technological advances in the IC fabrication process pose
more challenges for the physical design phase. There has been a tremendous increase
in routing constraints arising not only from stringent layout design rules but also
from process variations and sub-wavelength effects of optical lithography.
Successful routing completion of the nets without too many iterations or sacrifice
in the performance of the designs is thus mandated.
Fig. 1 Physical Design Flow: (a) Conventional, and (b) Proposed
In grid based routing methods, multi-terminal nets are decomposed into two-terminal
segments through Steiner tree decomposition, using a Rectilinear minimum spanning
tree (RMST) [18] or a Rectilinear Steiner minimal tree (RSMT) [16] topology as an
initial solution with minimum length. Subsequently, congestion driven routing of
each two-terminal net segment is performed through the routing regions. The
congestion models in those methods have been formulated based on the capacity of
the grids and the routing demands through them, along with a penalty function. For
any unsuccessful routing due to over-congestion (≥ 100%), Rip-up and Re-route
(RRR) techniques using maze routing [11, 19] have been applied for possible routing
completion, at the cost of increased net length due to detours. The major challenge
in the event of unsuccessful routing is the need to go back to the placement stage
in order to generate a new placement solution, with no guarantee of successful
routing completion (vide Figure 1 (a)). This may lead to several iterations until
the goal is achieved and thus prove to be very costly if the entire design
implementation is not completed within a stipulated time frame. In other words,
this may have a severe impact on the time-to-market of the intended design.
The possibility of recurring iterations at the placement stage (vide Figure 1 (a))
due to failure at global routing stage may however be avoided if we can predict
the feasibility of global routing as early as at the floorplanning stage, as depicted
in Figure 1 (b). This comprises the identification of monotone staircase channels as
the routing resources while estimating their capacity and formulating the congestion
model. These types of routing resources are known to have advantages of acyclic
routing order for successful routing completion [20, 21] and avoidance of switch
box routing [19]. They also allow easy channel resizability [21] to mitigate heavy
congestion (≥ 100%).
In the recent past, routing patterns with a single bend (L shaped) [10], two bends
(Z shaped) [10, 16], or even more bends, such as monotone staircase patterns [2],
have gained significant importance in grid-based global routing. With an increasing
number of bends, they yield more flexibility in finding a possible routing path,
but at the cost of more vias. It is also shown that pattern based routing [10] is
much faster than maze routing, and that monotone staircase pattern routing [2] has
the same time complexity as routing with Z shaped patterns. A thoughtful trade-off
between routability (also net length) and the number of vias has to be made while
keeping in mind that the routing resources must not be over-congested. A recent
monotone staircase bipartitioning method [9] attempted to minimize the number of
vias along a monotone staircase routing path by minimizing the number of bends in
it [2]. Additionally, pattern based routing is shown to help in crosstalk
minimization [10].
2 Our Contribution
2.1 Outline of the proposed Global Routing technique
In this paper, we propose a new global routing method following the floorplanning
stage using monotone staircase channels for routing completion with no congestion
in the routing regions. It is important to note that, unlike [2, 10, 14, 16, 18],
this global routing framework is not grid based. The outline of the proposed global
router STAIRoute (vide Figure 2) is as follows:
1. Identification of the routing regions as monotone staircase channels derived from
a given floorplan topology by using monotone staircase bipartitioning algorithm;
a graph theoretic formulation with these staircase channels is used to determine
a feasible routing path for all the nets;
Fig. 2 Outline of the proposed global router
2. Net ordering based on half perimeter wire length (HPWL) and the number of
terminals (Netdegree);
3. Decomposing multi-terminal nets into an equivalent set of two-terminal net seg-
ments using minimum spanning tree algorithm and defining a new Steiner tree
topology;
4. Routing solution for a given number of metal layers using a shortest path
algorithm to find the best possible routing path while respecting the prevailing
congestion scenario (of < 100% utilization) across the layers;
5. Ensuring congestion in the routing regions is restricted to 100% across a given
number of metal layers.
2.2 Salient Features
The salient features of the proposed global routing method presented in this paper
are as follows:
1. monotone staircase channels as the routing resources for improved flexibility in
identifying a routing path of a net
2. routing the nets through the monotone staircase channels only
3. over-congestion free global routing model
4. new Steiner tree topology for multi-terminal net decomposition
5. compatible with both unreserved and reserved layer models for a given number of
metal layers
6. estimation of the number of vias
This paper is organized as follows: Section 3 revisits the preliminaries of the
monotone staircase bipartitioning paradigm, followed by Section 4, which covers
related topics and the proposed global routing method using monotone staircases as
the routing resources. Results are presented in Section 5, and concluding remarks
in Section 6.
3 Preliminaries
In this section, we briefly review monotone staircase channels and the
bipartitioning framework used to obtain them immediately after the floorplanning
stage. Methods for top-down hierarchical monotone staircase bipartitioning of VLSI
floorplans, in both Area-balanced and Number-balanced modes, appear in
[5, 7, 9, 12, 13]. Area-balanced bipartition is employed when the areas of the
blocks in a given floorplan have significant variance, whereas Number-balanced
bipartition is applicable when the variance in block area is negligible. In
[12, 13], the balanced bipartitioner used an iterative max-flow based [23] min-cut
algorithm and thereby incurred higher time complexity at each level of the
hierarchy. In [5], emphasis has been given to hierarchical number-balanced monotone
staircase bipartitioning using a depth-first traversal method that runs in linear
time at a given level of the hierarchy.
Recently, a faster yet more accurate top-down hierarchical monotone staircase
bipartitioning method [7] has been proposed to generate monotone staircase cuts,
abbreviated as ms-cuts in the rest of this paper. The algorithm takes
$O(nk \log n)$ time and also ensures ms-cuts of increasing (decreasing) orientation
at alternate levels of the resulting hierarchy, namely the MSC tree (vide Figure 4
(a) and (b)).
In order to identify monotone increasing (decreasing) staircase channels $C_I$
($C_D$), abbreviated as MIS (MDS) and depicted in Figure 3 (a) ((b)), for a given
planar embedding of a floorplan topology with n blocks, an unweighted directed
graph [7, 12] called the block adjacency graph (BAG) $G(V_b, E_b)$ is formulated.
The graph is defined as follows: $V_b = \{b_i \mid \forall$ blocks $b_i$ in the
floorplan$\}$ and $E_b = \{\langle b_i, b_j \rangle \mid$ block $b_i$ is either on
the left of or above (below) its adjacent block $b_j\}$. Note that $|V_b| = n$ and
$|E_b| = 3(n-1)$ (vide Lemma 2).
Fig. 3 A floorplan with staircase channels and the corresponding ms-cuts in its block adjacency
graph (BAG): (a) monotone increasing staircase (MIS), (b) monotone decreasing staircase (MDS)
[7]
Lemma 1. Given a floorplan with n blocks, its MSC tree $T_m = G(V_m, E_m)$
corresponding to the set C of monotone staircase channels has n − 1 ms-cuts
(internal nodes).
Proof. The MSC tree, shown in Figure 4 (b), is a full binary tree. In a full binary
tree, an internal node has two children (out degree = 2), whereas an external
(leaf) node has no children (out degree = 0). In our case, the internal nodes
correspond to the ms-cuts in the MSC tree, and the external nodes are the blocks in
the given floorplan.
Hence $\sum_i OutDeg(v_i) = |E_m| = |V_m| - 1$
$\Rightarrow 2 \cdot |C| + 0 \cdot n = (|C| + n) - 1$, where |C| and n are the
number of ms-cuts and blocks respectively
$\Rightarrow |C| = n - 1$. □
4 Global Routing using monotone staircase channels
4.1 Routing Region Definition
Using the recent hierarchical monotone staircase bipartition framework [7], we
obtain a set of MIS (MDS) channels $C = \{C_i\}$ at alternate levels of the
hierarchy in the MSC tree. These channels are used as the routing resources for the
proposed global routing framework. Each monotone staircase channel consists of one
or more rectilinear segments, called channel segments, each bounded by a distinct
pair of blocks. For each channel and its segment(s), the number of nets to be
routed through it, denoted as its reference capacity rCap, is computed from the
number of cut nets in the respective ms-cut node in the MSC tree. During routing,
its capacity usage uCap, initialized to 0, gives the channel utilization. For the
rest of the paper, we use channel and segment to refer to a monotone staircase
channel and its rectilinear segment respectively.
Fig. 4 A floorplan: (a) its hierarchy of monotone staircase channels, (b) the corresponding MSC
tree, (c) T-junctions at which staircases intersect, and (d) the corresponding junction graph.
As in Figure 3 (a), the highlighted ms-cut $C_I$ on the BAG contains seven cut
edges {GH, DH, EH, EJ, EF, BF and BC} and corresponds to the MIS channel $C_I$
having seven segments. Additionally, it has two more horizontal segments: one along
the bottom side of the block G at the bottom-left corner, on the boundary of the
floorplan, and the other along the top side of the block C at the top-right corner
of the floorplan. Figure 3 (b) illustrates a case of an MDS channel $C_D$ having
six segments. Figure 4 (a) shows a floorplan along with a set of MIS/MDS channels
$C_0$ to $C_7$. Channels with one segment, having either vertical or horizontal
orientation, are termed degenerate monotone staircase channels. The channel $C_7$
in Figure 4 (a) is an example of such a channel.
However, there exist a few more isolated segments along the boundary of the
floorplan that are not identified as part of the MSC tree generation, and can be
termed non-MS channels. In Figure 4 (a), $C_{n1}$ to $C_{n4}$ are examples of such
channels. Their capacity rCap is computed based on the number of terminals on them,
and those with nonzero rCap contribute to global routing as valid routing resources.
4.2 Junction Graph and the Congestion model
In this section, we present our global routing framework using monotone staircase
channels and their intersection points, the T-junctions (vide Figure 4 (c)),
henceforth referred to as junctions. It is evident that there exists a segment
between each pair of adjacent junctions.
Lemma 2. Given a floorplan with n blocks, the number of T-junctions in it is 2n − 2.
Proof. Every internal face in the BAG corresponds to a T-junction and is bounded by
3 edges. Thus, excluding the exterior face, we have $3(f - 1) = 2m$, where f and m
are the number of faces and edges in the BAG respectively. Using Euler's formula
for planar graphs, $n - m + f = 2$, and replacing f by $2m/3 + 1$, we get
$m = 3(n - 1)$. Hence, the number of T-junctions in the floorplan is
$f - 1 = 2m/3 = 2n - 2$. □
Using the notion of T-junctions, we construct a weighted undirected graph (vide
Figure 4 (d)), called the junction graph, $G_j = (V_j, E_j)$, where $V_j = \{J_p\}$
corresponds to the set of junctions, and $E_j = \{\{J_p, J_q\} \mid$ a pair of
adjacent junctions $\{J_p, J_q\}$ with a segment $s_k$ of a channel $C_m \in C$
between them$\}$. As depicted in Figure 4 (c), all the junctions, except those near
the corners of the floorplan with degree two, have a degree of three in $G_j$,
i.e., have edges with three adjacent junctions. Using Lemma 2, it can be shown that
$|E_j| = 3n - 7$.
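As a quick sanity check of the counts in Lemmas 1 and 2, the following minimal sketch evaluates the closed-form expressions for a given number of blocks. The function name `floorplan_counts` is hypothetical and only restates the formulas proved above; it does not inspect an actual floorplan.

```python
def floorplan_counts(n_blocks: int) -> dict:
    """Closed-form counts from Lemmas 1 and 2 for a floorplan with n blocks.

    Only evaluates the formulas; it does not build the BAG or the MSC tree.
    """
    if n_blocks < 2:
        raise ValueError("the lemmas assume at least two blocks")
    return {
        "ms_cuts": n_blocks - 1,          # Lemma 1: internal nodes of the MSC tree
        "bag_edges": 3 * (n_blocks - 1),  # m = 3(n - 1), from the proof of Lemma 2
        "t_junctions": 2 * n_blocks - 2,  # Lemma 2
    }


# The example floorplan of Figures 3 and 4 has nine blocks (A to H, and J).
counts = floorplan_counts(9)
```

For the nine-block example, this gives 16 T-junctions, which agrees with the junction labels 0 to 15 visible in Figure 4 (c).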
The weight of each edge $e_{pq} \in E_j$ is computed as
$wt(e_{pq}) = length(s_k) / (1 - p_{s_k})$    (1)
where $p_{s_k}$, the normalized usage through the segment $s_k$, is defined as:
$p_{s_k} = uCap_{s_k} / rCap_{s_k}$    (2)
We define $(1 - p_{s_k})$ as the usage penalty on the edge weight for routing a net
through the corresponding segment $s_k$. In Figure 5, we illustrate the variation
of edge weight with respect to the normalized usage $p_{s_k}$.
Fig. 5 Junction graph edge weight ($wt(e_{pq})$) vs. normalized usage ($p_{s_k}$). [Plot: edge
weight, ranging from −Infinity to +Infinity, against $p_{s_k}$, with the Under-Congestion region
up to 1.0 and the Over-Congestion region beyond it.]
In the proposed global routing framework, congestion is avoided in all the segments
by restricting $p_{s_k}$ to be no more than 1.0. This is achieved by setting the
edge weight to Infinity whenever $p_{s_k} = 1.0$. The corresponding edge is
virtually removed from $E_j$. This ensures that the case of $p_{s_k} > 1.0$ does
not occur. In Figure 5, we mark the regions $p_{s_k} \le 1.0$ and $p_{s_k} > 1.0$
as the Under-Congestion and Over-Congestion regions respectively. Therefore, we
restrict ourselves to Under-Congestion while formulating the global routing graph
such that there is no congestion in any of the routing resources. However, it may
be noted that routing may fail for some of the nets due to insufficient capacity of
some of the routing resources for a specified number of metal layers.
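The congestion-aware edge weight of Equations (1) and (2) can be sketched as follows. This is a minimal illustration rather than the authors' C implementation; the function names and the handling of a zero reference capacity are assumptions.

```python
import math


def normalized_usage(u_cap: int, r_cap: int) -> float:
    """Equation (2): p = uCap / rCap.

    Assumption: a segment with no reference capacity is treated as full.
    """
    if r_cap <= 0:
        return 1.0
    return u_cap / r_cap


def edge_weight(length: float, u_cap: int, r_cap: int) -> float:
    """Equation (1): length / (1 - p), with Infinity once the segment is full."""
    p = normalized_usage(u_cap, r_cap)
    if p >= 1.0:
        return math.inf  # the edge is virtually removed from the junction graph
    return length / (1.0 - p)
```

For a segment of length 10 with rCap = 10, the weight is 10 when unused, 20 at half usage, and Infinity once all ten tracks are taken, so a shortest path search is steered away from heavily used segments well before they saturate.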
Lemma 3. The construction of the junction graph takes O(n) time.
Proof. By Lemma 2, we know that there are O(n) edges in the BAG, where each edge
corresponds to a channel segment. Therefore, for each segment $s_k$ having a pair
of junctions $\{J_p, J_q\}$ as its endpoints, an edge is inserted in $G_j$. Hence,
the construction of the junction graph $G_j$ takes O(n) time. □
4.2.1 Global Staircase Routing Graph
In this section, we present our proposed global routing framework by extending the
junction graph $G_j$ for each net.
Let N be the set of nets for a given floorplan. For each t-terminal (t ≥ 2) net
$n_i \in N$, we use $G_j$ as the backbone to derive the corresponding Global
Staircase Routing Graph (GSRG), as depicted in Figure 6 (a) and 6 (b). The GSRG is
defined as $G_{r_i} = (V_{r_i}, E_{r_i})$, where
$V_{r_i} = V_j \cup \{t_l \mid t_l \in n_i\}$ and $E_{r_i} = E_j \cup E_{lp}$. The
set of pin-junction edges is defined as
$E_{lp} = \{e_{lp} = \{t_l, J_p\} \mid \forall t_l \in n_i$ and
$\exists J_p \in V_j$ such that the pin $t_l$ resides on a segment $s_k$ associated
with the junction $J_p\}$. As before, we calculate the weight of a pin-junction
edge $e_{lp}$ as:
$wt(e_{lp}) = distance(t_l, J_p) / (1 - p_{s_k})$    (3)
and define $(1 - p_{s_k})$ as the usage penalty on the edge weight for routing a
net through the corresponding segment $s_k$.
Lemma 4. For a t-terminal net, the construction of its GSRG takes O(t) time.
Proof. As defined, the GSRG $G_{r_i} = (V_{r_i}, E_{r_i})$ for a given net $n_i$
with t terminals is obtained by augmenting the junction graph $G_j = (V_j, E_j)$.
In other words, $V_j$ is extended by the t terminals of $n_i$ in order to obtain
$V_{r_i}$. It is also to be noted that each terminal resides on a segment $s_k$
having a pair of junctions $(J_p, J_q)$ at its two ends. Therefore, each terminal
(pin) contributes 2 pin-junction edges, and thus all t terminals contribute a total
of 2t edges to $G_{r_i}$.
Hence, the construction of $G_{r_i}$ takes O(t) time for each net. □
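The GSRG definition and Equation (3) can be illustrated with the following sketch, which augments a junction graph with two pin-junction edges per terminal. The `Segment` record, the dictionary-based graph and the pin description (segment identifier plus offset from one junction) are hypothetical choices made for illustration. For clarity the sketch rebuilds the junction edges on every call, whereas the O(t) bound of Lemma 4 applies only to the augmentation step.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    jp: str          # junction at one end of the segment
    jq: str          # junction at the other end
    length: float
    u_cap: int = 0   # current usage (uCap)
    r_cap: int = 1   # reference capacity (rCap)


def penalized(dist: float, seg: Segment) -> float:
    """Distance divided by the usage penalty (1 - p), as in Equations (1) and (3)."""
    p = seg.u_cap / seg.r_cap
    return math.inf if p >= 1.0 else dist / (1.0 - p)


def build_gsrg(segments: dict, pins: dict) -> dict:
    """Return {node: {neighbour: weight}} for one net.

    segments: {segment_id: Segment}
    pins:     {pin_name: (segment_id, offset of the pin from seg.jp)}
    """
    graph: dict = {}

    def add_edge(u, v, w):
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w

    for seg in segments.values():                 # junction graph edges (E_j)
        add_edge(seg.jp, seg.jq, penalized(seg.length, seg))
    for pin, (seg_id, offset) in pins.items():    # two pin-junction edges per pin
        seg = segments[seg_id]
        add_edge(pin, seg.jp, penalized(offset, seg))
        add_edge(pin, seg.jq, penalized(seg.length - offset, seg))
    return graph


segments = {"s0": Segment("J0", "J1", 10.0, 0, 4),
            "s1": Segment("J1", "J2", 6.0, 2, 4)}
gsrg = build_gsrg(segments, {"ta": ("s0", 3.0), "tb": ("s1", 2.0)})
```

In this toy instance, the pin `ta` on the unused segment `s0` gets weights 3 and 7 towards its two junctions, while the pin `tb` on the half-used segment `s1` sees its distances doubled.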
After routing a net $n_i$ successfully, we update $uCap_{s_k}$ for all segments
$s_k$ through which $n_i$ is routed. Subsequently, the weights of the edges in
$G_j$ are updated before we route the subsequent net $n_{i+1}$. When congestion is
about to occur in a given segment ($p_{s_k} = 1$), the weight of the corresponding
edge in $G_{r_i}$ becomes Infinity. No routing is possible through such segments
and the relevant edges virtually disappear, making $G_{r_i}$ sparser after each
iteration of routing. To summarize, the normalized usage $p_{s_k}$ in this
framework is constrained to a maximum of 100%, thus restricting the number of
routed nets (uCap) through a given segment to be no more than its capacity (rCap).
Fig. 6 Illustration of the steps for constructing the Global staircase routing graph (GSRG) from
the junction graph for (a) a 3-terminal net $n_i = \{t_a, t_b, t_h\}$, and (b) a 2-terminal net
$n_j = \{t_c, t_g\}$, along with the corresponding routed paths in the GSRG and the final routed
nets in the floorplan topology.
In order to extend this model to M (≥ 1) metal layers, we keep a parameter called
currLayer($s_k$) associated with each segment $s_k$; it is initialized to 1 and can
go up to a maximum of M. When congestion is about to occur in $s_k$
($p_{s_k} = 1$), we increment currLayer($s_k$) to the subsequent metal layer. The
subsequent metal layer has a different meaning in the unreserved and reserved layer
models: it is either the layer immediately above currLayer($s_k$), or the next
permitted layer based on the (horizontal/vertical) orientation of $s_k$ in the
reserved layer model. This means that the resource $s_k$ has exhausted its entire
capacity (i.e., $uCap_{s_k} = rCap_{s_k}$) on the current metal layer and is now
ready for routing nets on the next metal layer, up to the limit M. In this regard,
the variation of $rCap_{s_k}$ across the metal layers (up to M) plays a significant
role and thus directly impacts the routing completion of all the nets.
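The layer advance rule described above can be sketched as follows. The names `SegmentState`, `next_layer` and `occupy` are hypothetical. The sketch further assumes that, in the HV reserved model, orientations alternate layer by layer so that the next permitted layer for a segment lies two layers above, and that the same rCap is available on every layer (a uniform profile).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SegmentState:
    r_cap: int            # capacity available on each layer (uniform profile assumed)
    curr_layer: int = 1   # currLayer(s_k), initialized to 1
    u_cap: int = 0        # usage on the current layer


def next_layer(curr_layer: int, max_layers: int, reserved: bool) -> Optional[int]:
    """Subsequent layer for a full segment, or None when the limit M is exceeded."""
    step = 2 if reserved else 1   # assumption: H and V layers alternate when reserved
    nxt = curr_layer + step
    return nxt if nxt <= max_layers else None


def occupy(seg: SegmentState, max_layers: int, reserved: bool = False) -> Optional[int]:
    """Route one net through seg; return the layer used, or None if exhausted."""
    if seg.u_cap >= seg.r_cap:
        return None
    layer = seg.curr_layer
    seg.u_cap += 1
    if seg.u_cap == seg.r_cap:    # congestion is about to occur (p = 1)
        nxt = next_layer(seg.curr_layer, max_layers, reserved)
        if nxt is not None:
            seg.curr_layer = nxt
            seg.u_cap = 0         # fresh capacity on the new layer
    return layer


unreserved = SegmentState(r_cap=2)
unreserved_layers = [occupy(unreserved, max_layers=2) for _ in range(5)]
hv = SegmentState(r_cap=1)
hv_layers = [occupy(hv, max_layers=4, reserved=True) for _ in range(3)]
```

With two tracks per layer and M = 2, the first four nets use layers 1, 1, 2, 2 and the fifth finds the segment exhausted. In the reserved case with one track per layer and M = 4, the segment is used on layers 1 and 3 only.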
Fig. 7 (a) Normalized Capacity profile of a routing resource ($s_k$) vs number of metal layers (M),
and (b) Metal layer variation across process nodes [15]. [Panel (a): normalized capacity (rCap),
with levels 1.0, 1/2, 1/4 and 1/8, against the number of metal layers from 0 to 8, for the
Constant, Hyperbolic and Ladder profiles.]
In Figure 7 (a), we study different scenarios of uniform as well as varying
capacity profiles for all the routing resources $s_k$ across the metal layers. In
the case of the Uniform profile, $rCap_{s_k}$ carries the same value across the
metal layers. We consider two different varying capacity profiles: one is a
Hyperbolic (1/M) pattern, while the other is a Ladder pattern. In the former,
$rCap_{s_k}$ is scaled more aggressively across the metal layers, whereas the
latter is a more realistic scenario that captures the latest trend of metal
pitch/width variation across the metal layers in recent nanometer technologies
(vide Figure 7 (b) [15]).
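The three capacity profiles can be sketched as simple functions of the layer index. The Uniform and Hyperbolic (1/M) forms follow the description above; the Ladder form below, which halves the capacity after every two layers, is only an assumption read off the step levels (1/2, 1/4, 1/8) in Figure 7 (a), and the function names are hypothetical.

```python
def uniform_profile(layer: int) -> float:
    """Same normalized capacity on every metal layer."""
    return 1.0


def hyperbolic_profile(layer: int) -> float:
    """Normalized capacity scaled as 1/layer."""
    return 1.0 / layer


def ladder_profile(layer: int, layers_per_step: int = 2) -> float:
    """Assumed ladder: capacity halves after every `layers_per_step` layers."""
    return 0.5 ** ((layer - 1) // layers_per_step)


profiles = {m: (uniform_profile(m), hyperbolic_profile(m), ladder_profile(m))
            for m in range(1, 9)}
```

Under these assumptions the Hyperbolic profile already drops to 1/4 at layer 4, while the Ladder profile is still at 1/2 there, which reflects the remark that the former scales capacity more aggressively.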
4.3 Multi-terminal Net Decomposition
In a global routing framework, routing a t(> 2)-terminal net is crucial and obtaining
an efficient solution for minimal length is a hard problem. Several works have been
done so far to obtain the best possible t−1 net segments for a t-terminal net such as
Rectilinear Steiner Minimal Tree (RSMT) topology proposed in FLUTE [3] based
on a well defined grid structure known as Hanan grid [6, 19]. Since the proposed
work is based on a gridless framework and the routing regions are aligned with
the MIS/MDS channels, we cannot adopt any grid-based RSMT framework such as
FLUTE [3].
Therefore, we propose a new method for multi-terminal net decomposition suitable
for the proposed global routing framework. We construct a complete undirected graph
$G_{c_i} = (V_{c_i}, E_{c_i})$ for a given t-terminal (t > 2) net $n_i \in N$, such
that $V_{c_i} = \{t_k \mid \forall t_k \in n_i\}$ and
$E_{c_i} = \{\{t_j, t_k\} \mid \forall t_j, t_k \in n_i$ and $t_j \ne t_k\}$. The
weight of each edge $e_{jk} = \{t_j, t_k\} \in E_{c_i}$ is computed as the half
perimeter wire length (HPWL) of the bounding box of the terminal pair $(t_j, t_k)$
(vide Figure 8 (a)). It is evident that $|V_{c_i}| = O(t)$ and
$|E_{c_i}| = O(t^2)$. By employing Prim's Minimum Spanning Tree (MST) algorithm
[4], whose implementation here is quadratic in the number of vertices, we obtain a
minimum spanning tree $T_{c_i}$ for $G_{c_i}$ having t − 1 edges, i.e., t − 1 valid
2-terminal pairs. For each edge $e_{jk} = \{t_j, t_k\} \in T_{c_i}$, we perform
2-terminal net routing by applying Dijkstra's single source shortest path algorithm
[4]. Once we obtain the routing for all such terminal pairs, we obtain the Steiner
points by identifying the common routing segments, as illustrated by an example in
Figure 8.
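The decomposition step can be sketched with an array-based Prim's algorithm over the implicit terminal clique, with HPWL edge weights. This is an illustrative sketch with hypothetical names (`hpwl`, `prim_mst`); terminals are assumed to be given as (x, y) coordinates, and the subsequent routing and Steiner point identification are not covered.

```python
import math


def hpwl(a, b) -> float:
    """Half perimeter of the bounding box of two terminals (x, y)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def prim_mst(terminals):
    """Quadratic-time Prim on the complete graph; returns t - 1 index pairs."""
    t = len(terminals)
    in_tree = [False] * t
    best = [math.inf] * t     # cheapest known connection of each terminal to the tree
    parent = [-1] * t
    best[0] = 0.0
    edges = []
    for _ in range(t):
        # pick the cheapest terminal not yet in the tree
        u = min((i for i in range(t) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(t):    # relax the clique edges leaving u
            if not in_tree[v]:
                w = hpwl(terminals[u], terminals[v])
                if w < best[v]:
                    best[v] = w
                    parent[v] = u
    return edges


pairs = prim_mst([(0, 0), (4, 0), (4, 3)])
```

For the three terminals above, the clique weights are 4, 3 and 7, so the tree keeps the two cheaper pairs and drops the pair with HPWL 7. Each returned pair is then routed as a 2-terminal net segment.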
Let us consider an example of a 3-terminal net $n_1$ with terminals
$\{t_a, t_b, t_c\}$ to illustrate the proposed 2-terminal net decomposition, as
shown in Figure 8. In this case, $G_{c_1}$, a 3-clique, has 3 vertices
$\{t_a, t_b, t_c\}$ and 3 edges, namely $\{t_a, t_b\}$, $\{t_b, t_c\}$ and
$\{t_a, t_c\}$, along with their corresponding edge weights (vide Figure 8 (a)). As
shown in Figures 8 (b)-(i) and (b)-(ii), only one of the instances of the minimum
spanning tree $T_{c_1}$ is greedily obtained by the said MST algorithm as the final
solution. Depending on the specific $T_{c_1}$ thus obtained, the proposed
2-terminal net segment routing, presented in the next section, is applied to each
valid terminal pair. Once the routing for all the designated terminal pairs is
obtained, we identify the Steiner points in a manner similar to the
state-of-the-art grid-based multi-terminal net decomposition methods (FLUTE [3]),
as illustrated in Figure 8 (b). The main difference is that this work is based on a
gridless framework using monotone staircase channels as the routing resources. This
topology may be termed the Staircase Minimal Steiner Tree (SMST).
4.4 STAIRoute: the proposed global routing algorithm
We present the proposed global routing algorithm STAIRoute using monotone staircase
channels in Algorithm 1. This algorithm takes two inputs, namely an ordered set of
nets N and the junction graph $G_j$ as defined in Section 4.2. For each net
$n_i \in N$, the GSRG $G_{r_i}$ is constructed and a routing path for the net $n_i$
is obtained by applying a shortest path algorithm on $G_{r_i}$. We have implemented
the $O(n^2)$ version of Dijkstra's shortest path algorithm [4], namely
DijkstraSSP(), used in Algorithm 1.
For each 2-terminal net (segment), we consider two cases of identifying the
source vertex between a pair of terminals before we apply the shortest path algo-
rithm as:
1. the minimum x coordinate (or the minimum y coordinate in case both the termi-
nals have the same x coordinate)
2. the maximum x coordinate (or the maximum y coordinate in case both have the
same x coordinate)
and the procedure IdentifySource() in Algorithm 1 is used for that purpose.
We term them as Forward (FWD) and Backward (BACK) search respectively. In
Figure 9 (a) and (b), we illustrate the respective cases for a 2-terminal net {tg, tc}
and show that both search procedures can potentially give different routing paths.
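The source selection rule for the two search directions can be sketched as below. The function name mirrors IdentifySource() but its signature is an assumption, with terminals given as (x, y) tuples.

```python
def identify_source(t1, t2, backward: bool = False):
    """Return (source, sink) for a terminal pair.

    Forward search starts from the terminal with the minimum x (minimum y on a
    tie); backward search starts from the one with the maximum x (maximum y on
    a tie). Tuple comparison orders by x first and then by y.
    """
    lo, hi = sorted([t1, t2])
    return (hi, lo) if backward else (lo, hi)


fwd = identify_source((5, 1), (2, 9))
back = identify_source((5, 1), (2, 9), backward=True)
tie = identify_source((3, 7), (3, 2))
```

Because the two directions simply swap source and sink, any difference between the resulting paths comes from how the shortest path search breaks ties and reacts to the prevailing usage, not from the terminal pair itself.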
Fig. 8 Illustrating the proposed multi-terminal net decomposition
Fig. 9 Exploring a routing path based on: (a) Forward Search, and (b) Backward Search
One may yield a potentially better solution than the other in terms of routability,
congestion, net length, and finally via count. The variation in net length between
the FWD and BACK searches arises when certain resource(s) along the respective
paths are fully utilized in a given metal layer. Switching to the next available
metal layer, if permitted, increases the via count. Otherwise, the routing path is
detoured beyond the bounding box of the terminals, leading to an increase in
length. As long as the alternative paths remain confined within the bounding box of
the terminals, there is no variation among the respective net lengths.
Algorithm 1 STAIRoute
Inputs: G_j(V_j, E_j), ordered nets N
Outputs: Global routing for each t-terminal (t ≥ 2) net (n_i ∈ N) with 100% routability and usage
≤ 100%
for all sorted nets n_i ∈ N do
    G_ri = ConstructGSRG(G_j, n_i)
    if Netdegree(n_i) == 2 then
        /* Netdegree(n_i) = number of terminals in n_i */
        Source = IdentifySource(n_i.terminals) /* for Forward or Backward search (vide Fig. 9) */
        Path(Source, Sink) = DijkstraSSP(G_ri, Source)
        if there exists a routing path from Source to Sink then
            n_i is routed.
            Update uCap for the respective channel segments.
            NetLength(n_i)
            ViaCount(n_i)
        else
            Routing n_i is a failure; continue with n_{i+1}
        end if
    else
        G_ci = ConstructNodeClique(n_i.terminals)
        T_ci = ObtainMST(G_ci) /* described in Section 4.3 */
        for all edges (t_j, t_k) ∈ T_ci do
            Source = IdentifySource(t_j, t_k) /* for Forward or Backward search (vide Fig. 9) */
            Path(Source, Sink) = DijkstraSSP(G_ri, Source)
            if there exists a routing path from Source to Sink then
                2-terminal net segment is routed; calculate the segment length.
                Update uCap for the respective channel segments.
            else
                Routing n_i is a failure; continue with n_{i+1}
            end if
        end for
        Identify the Steiner Point(s) /* vide Fig. 8 */
        NetLength(n_i)
        ViaCount(n_i)
    end if
end for
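The 2-terminal branch of Algorithm 1 can be sketched as follows for nets whose terminals coincide with junctions. This is a simplified illustration, not the authors' implementation: it uses a binary heap based Dijkstra instead of the $O(n^2)$ version, ignores pins, metal layers and vias, and the names `dijkstra_path` and `route_two_terminal_nets` as well as the edge dictionary layout are hypothetical.

```python
import heapq
import math


def dijkstra_path(graph, source, sink):
    """Shortest path on {u: {v: weight}}; returns a node list or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == sink:
            break
        for v, w in graph.get(u, {}).items():
            if math.isinf(w):          # fully used segment: edge virtually removed
                continue
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if sink not in dist:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def route_two_terminal_nets(edges, nets):
    """edges: {(u, v): [length, uCap, rCap]}; nets: ordered list of (source, sink)."""
    routed = []
    for source, sink in nets:
        graph = {}
        for (u, v), (length, u_cap, r_cap) in edges.items():
            p = u_cap / r_cap
            w = math.inf if p >= 1.0 else length / (1.0 - p)   # Equation (1)
            graph.setdefault(u, {})[v] = w
            graph.setdefault(v, {})[u] = w
        path = dijkstra_path(graph, source, sink)
        if path is not None:           # update uCap along the routed path
            for a, b in zip(path, path[1:]):
                key = (a, b) if (a, b) in edges else (b, a)
                edges[key][1] += 1
        routed.append(path)            # None marks a routing failure
    return routed


edges = {("A", "B"): [1.0, 0, 1], ("B", "C"): [1.0, 0, 1], ("A", "C"): [5.0, 0, 2]}
result = route_two_terminal_nets(edges, [("A", "C")] * 4)
```

In this toy graph, the first net takes the short path through B and saturates it, the next two nets fall back to the longer direct segment, and the fourth net fails because every segment has reached 100% usage.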
In the unreserved layer model, routing a net may incur a number of vias due to
differences in the metal layers used to route through the corresponding routing
resources. It does not depend on their vertical/horizontal orientation. In the case
of the reserved layer model, the number of vias along a routing path depends on the
number of bends in it, i.e., the alternating (horizontal/vertical) orientation of
the contiguous routing resources, for a minimum change of one metal layer among the
resources along that path [9, 18]. Congestion in channels may also contribute to
the number of vias along a routing path in both cases. From the example shown in
Figure 10 (a) and (b), we notice that the routing path for a given net
$(t_g, t_c)$ needs 3 and 5 vias for the FWD and BACK searches respectively.
Therefore, depending on the netlist and the floorplan topology of a given circuit,
one method may dominate the other. This method can be extended to t (> 2)-terminal
nets, since we decompose those nets into 2-terminal net segments using the method
stated earlier, and a better routing path for each of the resulting net segments
can be obtained by employing either of the search procedures at a time.
Fig. 10 Impact on via count in a routing path based on: (a) Forward Search with 3 vias, and (b)
Backward Search with 5 vias
Before the routing procedure starts, the nets ($n_i \in N$) are ordered based on
their half perimeter wire length (HPWL) and their number of terminals (Netdegree).
The net ordering (priority) is determined by the non-decreasing order of HPWL first
and then of Netdegree, i.e., a net with a smaller HPWL, and then a smaller
Netdegree, has precedence over the other nets. The aim is to ensure that the
shorter (local) nets are routed before the longer ones so as to avoid congestion in
the routing resources as well as to have a uniform routing distribution across the
layout of the design.
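The net ordering rule amounts to a sort on the key (HPWL, Netdegree). A minimal sketch, assuming each net is given as a list of (x, y) terminal coordinates and using the hypothetical names `net_hpwl` and `order_nets`:

```python
def net_hpwl(terminals) -> float:
    """Half perimeter of the bounding box of all terminals of a net."""
    xs = [x for x, _ in terminals]
    ys = [y for _, y in terminals]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def order_nets(nets: dict) -> list:
    """Net names in non-decreasing order of HPWL, then of Netdegree."""
    return sorted(nets, key=lambda name: (net_hpwl(nets[name]), len(nets[name])))


nets = {
    "n1": [(0, 0), (10, 10)],          # HPWL 20, degree 2
    "n2": [(0, 0), (2, 1)],            # HPWL 3, degree 2
    "n3": [(0, 0), (2, 1), (1, 1)],    # HPWL 3, degree 3
}
ordered = order_nets(nets)
```

Here `n2` and `n3` share the same HPWL, so the lower-degree net `n2` is routed first, and the long net `n1` comes last.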
We illustrate the working of this algorithm for t (≥ 2)-terminal nets in Figure 6.
Theorem 1. Given a floorplan with n blocks and k nets, each having at most t terminals
(t ≥ 2), the algorithm STAIRoute takes $O(n^2kt)$ time.
Proof. From Lemma 4, GSRG construction takes $O(t)$ time. For each
2-terminal net, finding the Source vertex takes $O(n)$ and our implementation
of Dijkstra's single source shortest path algorithm (DijkstraSSP) takes $O(n^2)$.
For t-terminal (t > 2) nets, computing Gci takes $O(t^2)$ and our implementation
of Prim's algorithm takes $O(t^2)$. For each terminal pair $(t_i, t_j)$, we obtain the
shortest path using DijkstraSSP in $O(n^2)$ time. Thus, for each t (≥ 2)-terminal net,
the time complexity is $O(t + t^2 + n^2t)$, i.e., $O(n^2t)$: a net may be connected to
all n blocks in the worst case, so t ≤ n and the $t^2$ term is dominated by $n^2t$.
Therefore, the overall worst case time complexity for all k nets is $O(n^2kt)$. □
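The $O(n^2)$ bound quoted for DijkstraSSP is what an array-based Dijkstra (linear scan instead of a priority queue) gives on an n-vertex graph. A minimal Python sketch follows, assuming a dense cost matrix as input; the function name and the matrix representation are illustrative assumptions, since the authors' implementation over the GSRG is not shown here.

```python
import math


def dijkstra_dense(weights, source):
    """Array-based Dijkstra, O(n^2) for an n-vertex graph.

    weights[u][v] is a non-negative edge cost, or math.inf if no edge.
    Returns (dist, parent) lists indexed by vertex.
    """
    n = len(weights)
    dist = [math.inf] * n
    parent = [None] * n
    done = [False] * n
    dist[source] = 0
    for _ in range(n):
        # Linear scan for the closest unfinished vertex.
        u = None
        for v in range(n):
            if not done[v] and (u is None or dist[v] < dist[u]):
                u = v
        if u is None or dist[u] == math.inf:
            break  # the remaining vertices are unreachable
        done[u] = True
        for v in range(n):
            if not done[v] and dist[u] + weights[u][v] < dist[v]:
                dist[v] = dist[u] + weights[u][v]
                parent[v] = u
    return dist, parent
```

The outer loop runs n times and each iteration scans n vertices twice, which gives the quadratic term used in the proof.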
5 Experimental Results
We implemented the proposed algorithm STAIRoute in C and ran it on a 64-bit
Linux platform with an Intel Core2 Duo (1.86 GHz) and 2 GB RAM. We used the
source code of the top-down hierarchical monotone staircase bipartitioning algorithm
of [7] to obtain the BAG and MSC tree data structures. We used the
MCNC/GSRC hard floorplanning benchmark circuits listed in Table 1. In order
to test our algorithm, four different instances of each benchmark were
generated with random seeds using the Parquet [1, 17] tool. For a given circuit, the best
case (BC) and the worst case (WC) instances among this set of floorplan topologies
are designated solely by the total half-perimeter wire length (HPWL) of
all the nets: BC has the smallest total HPWL and WC the largest.
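The BC/WC designation is a minimum and a maximum over the generated instances. A minimal Python sketch, assuming the total HPWL of each instance has already been computed; the function name and the instance labels are illustrative.

```python
def pick_bc_wc(total_hpwl_by_instance):
    """Return (BC, WC): the instances with smallest and largest total HPWL."""
    bc = min(total_hpwl_by_instance, key=total_hpwl_by_instance.get)
    wc = max(total_hpwl_by_instance, key=total_hpwl_by_instance.get)
    return bc, wc


# Made-up totals for four Parquet runs of one circuit (not from the paper).
example = {"seed1": 420.0, "seed2": 385.5, "seed3": 510.25, "seed4": 401.0}
bc, wc = pick_bc_wc(example)  # bc == "seed2", wc == "seed3"
```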
Suite  Circuit  #Blocks  #Nets       #Nets       Avg. Net
                         (original)  (modified)  degree
MCNC   apte     9        97          44          3.500
       hp       11       83          44          3.545
       xerox    10       203         183         2.508
       ami33    33       123         84          4.154
       ami49    49       408         377         2.337
GSRC   n10      10       118         54          2.129
       n30      30       349         147         2.102
       n50      50       485         320         2.112
       n100     100      885         576         2.135
       n200     200      1585        1274        2.138
       n300     300      1893        1632        2.161
Table 1 MCNC and GSRC benchmark circuits.
In our experiments, we focus on signal nets only: we consider the internal nets
without IO PAD connectivity and hence modify the given netlist. Many nets have
IO PAD connectivity. Those with at least 3 terminals are converted to signal nets
by removing the IO PAD connectivity, while those with fewer terminals are
discarded, because removing the pad would leave a floating net with only one
terminal connected to it. Due to the lack of pin location information in the
GSRC benchmarks, we assume their pins to be situated at the centers of the blocks,
unlike the circuits in the MCNC benchmarks.
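The netlist modification can be sketched as a simple filter. The Python below assumes a netlist given as a mapping from net name to terminal names and a caller-supplied `is_pad` predicate; both are illustrative. The extra check that at least two block terminals remain is an assumption added here for nets with several pads, a case the text does not spell out.

```python
def strip_io_pads(nets, is_pad):
    """Remove IO-pad terminals; discard pad nets with fewer than 3 terminals.

    nets: dict mapping net name -> list of terminal names.
    is_pad: predicate telling whether a terminal is an IO pad.
    """
    kept = {}
    for name, terms in nets.items():
        if not any(is_pad(t) for t in terms):
            kept[name] = list(terms)      # internal net, kept unchanged
        elif len(terms) >= 3:
            blocks = [t for t in terms if not is_pad(t)]
            if len(blocks) >= 2:          # assumption: must still connect 2 blocks
                kept[name] = blocks
        # pad nets with fewer than 3 terminals would float, so they are dropped
    return kept
```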
5.1 Results using Unreserved Layer Model
To the best of our knowledge, our global routing method using monotone staircase
channels is novel. Therefore, a comparison of our results with those of existing
global routers is not meaningful. Instead, we compare the length of each of the
t-terminal (t > 2) nets given by our algorithm with that computed by FLUTE [3], which
does not consider any congestion scenario. We obtained all the results given in this
subsection using the unreserved layer model with up to 2 metal layers, ensuring that no
congestion takes place as per the framework.
In Table 2, we summarize the runtime, net length and related statistics for each
of the circuits listed in Table 1. As reported, there is no congestion in any of the
routing resources. The ratios of routed length to HPWL, FLUTE length to HPWL, and
routed length to FLUTE length are captured in the R1, R2 and R3 columns of Table 2
respectively. The routed net length given by our method is comparable to both HPWL
and FLUTE length: R2 stays close to 1 for the GSRC circuits, whose average
Netdegree is only slightly higher than 2, while the MCNC circuits have an average
Netdegree of up to 4.15. Our net length to Steiner length ratio (R3), with the Steiner
length computed by FLUTE [3], lies between 1.055 and 1.158, with an average of 1.108.
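As a worked check of how the three ratios are formed, the short Python sketch below recomputes R1, R2 and R3 from the apte lengths reported in Table 2; the function name is illustrative.

```python
def length_ratios(routed, hpwl, flute):
    """Return (R1, R2, R3) = (L/H, F/H, L/F), rounded to 3 decimals."""
    return (round(routed / hpwl, 3),
            round(flute / hpwl, 3),
            round(routed / flute, 3))


# apte row of Table 2 (lengths in µm)
apte = length_ratios(397447.031, 340975.188, 376652.000)
print(apte)  # (1.166, 1.105, 1.055)
```

The result matches the R1, R2 and R3 entries reported for apte.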
Circuit        Runtime  Routing  Routed (L)   HPWL (H)     FLUTE [3] (F)  R1     R2     R3     #Via
               (sec)    (%)      (µm)         (µm)         (µm)           (L/H)  (F/H)  (L/F)
apte           0.192    100      397447.031   340975.188   376652.000     1.166  1.105  1.055  0
hp             0.200    100      284601.844   243290.828   245801.000     1.169  1.010  1.158  2
xerox          0.316    100      688107.625   616963.000   633533.000     1.115  1.027  1.086  0
ami33          0.744    100      161636.359   132371.625   142748.000     1.221  1.078  1.132  4
ami49          2.012    100      1794979.375  1601862.500  1629255.000    1.120  1.017  1.101  0
n10            0.164    100      18505.500    16635.500    16626.000      1.112  0.999  1.113  0
n30            0.444    100      56475.500    49394.500    49370.000      1.143  0.999  1.143  0
n50            1.372    100      144451.000   124991.500   125018.000     1.155  1.001  1.155  4
n100           6.748    100      237653.500   214426.000   214578.000     1.108  1.001  1.107  6
n200           43.991   100      410849.000   380510.500   381021.000     1.080  1.002  1.078  12
n300           104.382  100      744416.000   698348.000   699006.000     1.065  1.001  1.064  2
Average ratio                                                             1.132  1.022  1.108  -
Table 2 Summary of global routing results using unreserved layer model for up to 2 metal layers
[8]
The last column contains the estimated number of vias given by our algorithm
for each of the circuits. It is important to note that all the above results were
obtained with at most 2 metal layers. Some of the circuits return a zero via count,
meaning that all their nets are routed in the first metal layer only. A nonzero
via count in the other instances indicates routing through two metal layers,
caused by congestion (psk = 1) in certain routing resources of the
bottom metal layer.
5.2 Results using Reserved Layer Model
We further extend our experiments by running the proposed global routing method
using the HV reserved layer model with up to 8 metal layers. We consider the BC and
WC floorplan topologies for each of the circuits. In these experiments, we refer to
Figure 7 (a) for 3 different capacity scaling profiles, and to Figure 9 for the
two possible search directions used to explore routing paths. We reiterate that all
the results presented here correspond to 100% routability and are restricted to the
Under-Congestion region of Figure 5, which ensures no congestion in any routing resource.
We consider the following configurations in the experiments on the benchmark
circuits given in Table 1:
1. Forward search method with No Capacity Scaling (FCN)
2. Forward search method with Hyperbolic Capacity Scaling (FCH)
3. Forward search method with Ladder type Capacity Scaling (FCL)
4. Backward search method with No Capacity Scaling (BCN)
5. Backward search method with Hyperbolic Capacity Scaling (BCH)
6. Backward search method with Ladder type Capacity Scaling (BCL)
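The six labels are the cross product of two search directions and three capacity scaling profiles. A minimal Python sketch that enumerates them in the order listed above; the dictionary names are illustrative.

```python
from itertools import product

SEARCH = {"F": "Forward", "B": "Backward"}
SCALING = {"N": "No", "H": "Hyperbolic", "L": "Ladder type"}


def run_configurations():
    """Enumerate the labels: search direction x capacity scaling profile."""
    return [f"{s}C{c}" for s, c in product(SEARCH, SCALING)]


print(run_configurations())
# ['FCN', 'FCH', 'FCL', 'BCN', 'BCH', 'BCL']
```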
In Figure 11, we present the global routing results for n300 for all six run
configurations and the two floorplan instances, BC and WC. From these plots,
we notice that the forward (backward) search with hyperbolic
scaling, FCH (BCH), gives the worst results compared to the other two
configurations {FCN,FCL} ({BCN,BCL}), in terms of both routed net length and via
count, for both BC and WC. This is because the hyperbolic profile is the most
stringent of the profiles depicted in Figure 7 (a).
Fig. 11 Plot for best case (BC) and worst case (WC) floorplan instances of n300 vs. different run
configurations: (a) Net length (µm), and (b) Via Count
Next, we focus on the results obtained for the remaining configurations
{FCN,FCL} ({BCN,BCL}) for both BC and WC topologies. Figure 11 (a)
shows that FCN (BCN) gives better net length than FCL (BCL) in both BC and
WC. Although the via count follows a similar trend for the WC topology,
FCL (BCL) has a better via count than FCN (BCN) in BC (Figure 11
(b)). We also compare net length and via count between
FCN and BCN (FCL and BCL) for both BC and WC. Although backward search
produces better net length than forward search, it incurs
more vias to route the same set of nets. This clearly shows that a global
routing solution depends not only on the search direction and the capacity profile,
but also on the floorplan instance of the same circuit.
Circuit FCN (µm) FCH (µm) FCL (µm) BCN (µm) BCH (µm) BCL (µm) FLUTE F (µm)
(FCN/F) (FCH/F) (FCL/F) (BCN/F) (BCH/F) (BCL/F)
apte 398137.031 398411.250 398137.031 396034.313 396308.563 396034.313 338628.000
(1.1757) (1.1765) (1.1757) (1.1695) (1.1703) (1.1695)
hp 201996.672 203060.359 201996.672 211536.609 210086.344 211536.609 123716.000
(1.6327) (1.6413) (1.6327) (1.7099) (1.6981) (1.7099)
xerox 716916.688 717291.625 716916.688 710575.375 710575.375 710575.375 633533.000
(1.1316) (1.1322) (1.1316) (1.1216) (1.1216) (1.1216)
ami33 111126.102 111563.070 111126.102 111481.008 111896.906 111481.008 92330.000
(1.2036) (1.2083) (1.2036) (1.2074) (1.2119) (1.2074)
ami49 1925760.875 2009223.625 1925760.875 1926176.750 2010630.125 1926176.750 1608746.000
(1.1971) (1.2489) (1.1971) (1.1973) (1.2498) (1.1973)
n10 19837.000 19837.000 19837.000 18497.500 18497.500 18497.500 16626.000
(1.1931) (1.1931) (1.1931) (1.1126) (1.1126) (1.1126)
n30 59585.000 59585.000 59585.000 58761.500 58761.500 58761.500 49370.000
(1.2069) (1.2069) (1.2069) (1.1902) (1.1902) (1.1902)
n50 151604.000 152119.000 151851.000 150741.000 151256.000 150988.000 125018.000
(1.2127) (1.2168) (1.2146) (1.2058) (1.2099) (1.2077)
n100 251455.500 252647.500 251476.500 250771.000 251917.000 250792.000 212112.000
(1.1855) (1.1911) (1.1856) (1.1823) (1.1877) (1.1824)
n200 429854.500 430043.500 429923.500 428987.000 429128.000 429056.000 381021.000
(1.1282) (1.1287) (1.1283) (1.1259) (1.1263) (1.1261)
n300 792135.000 792323.000 792229.000 791707.500 791895.500 791801.500 699006.000
(1.1332) (1.1335) (1.1334) (1.1326) (1.1329) (1.1328)
Table 3 Net length comparison for the run configurations vs. FLUTE [3] length: for best case
(BC) floorplan instance
In Table 3 (4), we summarize the net length obtained for all the circuits in all the
configurations for the respective BC (WC) topologies and compare it with the
corresponding FLUTE [3] length. For a given configuration, the net length is
accompanied, in brackets below it, by its ratio to the FLUTE length, e.g., FCN/F,
and the best ratio(s) are highlighted. It is evident from these
results that, except for the circuits with 100 or more blocks, the length ratio pairs
FCN/F and FCL/F (BCN/F and BCL/F) show little variation. The results also show that
BCN yields the best net length with respect to the corresponding
FLUTE length for most of the circuits.
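The bracketed ratios and the choice of the best configuration(s) for a circuit can be reproduced as follows. This is a minimal Python sketch using the n10 row of Table 3; the function name is illustrative, and ties are all reported, mirroring the "best ratio(s)" wording above.

```python
def best_configs(lengths, flute):
    """Return ({config: ratio}, [configs with the smallest ratio])."""
    ratios = {cfg: round(length / flute, 4) for cfg, length in lengths.items()}
    best = min(ratios.values())
    return ratios, [cfg for cfg, r in ratios.items() if r == best]


# n10 row of Table 3 (best case instance, lengths in µm)
n10 = {"FCN": 19837.0, "FCH": 19837.0, "FCL": 19837.0,
       "BCN": 18497.5, "BCH": 18497.5, "BCL": 18497.5}
ratios, best = best_configs(n10, 16626.0)
# ratios["FCN"] == 1.1931, ratios["BCN"] == 1.1126
# best == ["BCN", "BCH", "BCL"]
```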
In Table 5, we present the via count for all the configurations for each circuit
in BC (WC results in brackets). The results clearly point out the consequence of
forward and backward search on via count. It is also evident that the via count of
FCH (BCH) is the worst compared to the other two configurations, namely
{FCN,FCL} ({BCN,BCL}), as we have seen for net length. The variation in via
count between {FCN,FCL} ({BCN,BCL}) is small for the relatively smaller circuits
and becomes significant for the relatively larger ones. The best via
count for each circuit in both BC and WC (in brackets) is mostly obtained in case
of {FCN,FCL}.
Circuit FCN (µm) FCH (µm) FCL (µm) BCN (µm) BCH (µm) BCL (µm) FLUTE F (µm)
(FCN/F) (FCH/F) (FCL/F) (BCN/F) (BCH/F) (BCL/F)
apte 450376.469 450376.469 450376.469 438912.781 439198.781 438912.781 389806.000
(1.1554) (1.1554) (1.1554) (1.1260) (1.1267) (1.1260)
hp 232782.313 232782.313 232782.313 230255.938 230255.938 230255.938 144993.000
(1.6055) (1.6055) (1.6055) (1.5880) (1.5880) (1.5880)
xerox 1542729.875 1590995.125 1542729.875 1511243.250 1558185.250 1511243.250 1391401.000
(1.1088) (1.1434) (1.1088) (1.0861) (1.1199) (1.0861)
ami33 120903.578 120903.578 120903.578 118746.414 118969.086 118746.414 105025.000
(1.1512) (1.1512) (1.1512) (1.1306) (1.1328) (1.1306)
ami49 1914369.375 1914369.375 1914369.375 1898528.125 1898528.125 1898528.125 1684114.000
(1.1367) (1.1367) (1.1367) (1.1273) (1.1273) (1.1273)
n10 24526.500 25116.500 24526.500 23707.500 24297.500 23707.500 20012.000
(1.2256) (1.2551) (1.2256) (1.1847) (1.2141) (1.1847)
n30 74743.500 75105.500 74838.500 74151.500 74513.500 74246.500 59879.000
(1.2482) (1.2543) (1.2498) (1.2384) (1.2444) (1.2399)
n50 187971.000 189752.000 188476.000 187197.000 188994.000 187702.000 158173.000
(1.1884) (1.1996) (1.1916) (1.1835) (1.1949) (1.1867)
n100 277304.500 277950.500 277583.500 276545.000 277251.000 276824.000 238841.000
(1.1610) (1.1637) (1.1622) (1.1579) (1.1608) (1.1590)
n200 836136.000 865590.000 837493.000 835535.500 864329.500 836928.500 749479.000
(1.1156) (1.1549) (1.1174) (1.1148) (1.1532) (1.1167)
n300 946039.500 949547.500 947076.500 945688.500 949165.500 946675.500 830035.000
(1.1398) (1.1440) (1.1410) (1.1393) (1.1435) (1.1405)
Table 4 Net length comparison for the run configurations vs. FLUTE [3] length: for worst case
(WC) floorplan instance
In our congestion analysis, we use the method prescribed in [22] to analyze
the congestion scenario of a given floorplan instance of a circuit and thereby
estimate its routability. The authors of [22] proposed a metric called Average
Congestion per Edge, denoted as ACE(x%), which is computed over the x% most
congested global routing edges. They subsequently
defined another parameter, denoted as wACE4, which is a weighted average of ACE(x%)
for four different values of x = 0.5, 1, 2, 5. In our case, we
consider the monotone staircase channels as routing resources and their normalized
usage (vide Equation 2) is the measure of congestion.
In Table 6, we report wACE4 for each circuit for all the configurations in BC
and WC (in brackets), to show the corresponding congestion scenario with 100%
routability of the nets and to validate that our global routing framework conforms to
that depicted in Figure 5. In these experiments, wACE4 is taken as the maximum
of the respective wACE4 values over the 8 metal layers.
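The congestion metric can be sketched in a few lines of Python. The sketch assumes that each layer is described by a list of normalized channel usages, that the worst x% is rounded up to at least one resource, and that the four ACE(x%) values are weighted equally; these choices and all function names are assumptions made here for illustration, not details confirmed by the text or by [22].

```python
import math


def ace(usages, x):
    """Average usage over the worst x% of routing resources."""
    ranked = sorted(usages, reverse=True)
    count = max(1, math.ceil(len(ranked) * x / 100))
    return sum(ranked[:count]) / count


def wace4(usages, xs=(0.5, 1, 2, 5)):
    """Assumption: equal weights over the four ACE(x%) values."""
    return sum(ace(usages, x) for x in xs) / len(xs)


def wace4_over_layers(usages_by_layer):
    """Maximum wACE4 across metal layers, as described in the text."""
    return max(wace4(u) for u in usages_by_layer)
```

A value of at most 1 returned by `wace4_over_layers` corresponds to the situation reported in Table 6, where no routing resource is used beyond its capacity.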
The plot in Figure 12 shows the impact of the different run configurations on the
runtime for n300. Clearly, the runtimes of FCH (BCH) are the worst among the
respective FWD (BACK) configurations, while FCL gives the best runtime for both
BC and WC among all the configurations.
Circuit FCN FCH FCL BCN BCH BCL
apte 404 508 404 412 504 412
(452) (660) (452) (460) (656) (460)
hp 502 720 502 536 730 536
(430) (630) (430) (430) (626) (430)
xerox 1190 1238 1190 1220 1272 1220
(1401) (2261) (1401) (1527) (2369) (1547)
ami33 1156 1500 1156 1162 1530 1162
(1240) (1528) (1240) (1234) (1586) (1234)
ami49 3290 4466 3290 3406 4676 3406
(3629) (3789) (3629) (3877) (4137) (3877)
n10 176 176 176 176 176 176
(220) (278) (220) (222) (260) (222)
n30 937 941 937 956 960 956
(1047) (1100) (1014) (1059) (1108) (1026)
n50 3194 3609 3184 3191 3598 3181
(3252) (5089) (3296) (3402) (5078) (3442)
n100 6748 7553 6748 6795 7623 6799
(7742) (10050) (7746) (7846) (10146) (7850)
n200 18016 18040 18008 17977 17993 17969
(16905) (26828) (17610) (17253) (27041) (17571)
n300 29639 29785 29627 29687 29841 29675
(32814) (35955) (33093) (33235) (36624) (33461)
Table 5 Via count comparison for the run configurations: for best case (worst case) floorplan
instance
Circuit FCN FCH FCL BCN BCH BCL
apte 0.9011 0.8384 0.9011 0.9449 0.9063 0.9449
(0.8738) (0.9861) (0.8738) (0.7971) (0.8750) (0.7971)
hp 0.9914 0.9871 0.9914 0.9871 0.9853 0.9871
(0.9906) (0.8135) (0.9906) (0.9906) (0.8021) (0.9906)
xerox 0.6229 0.6139 0.6229 0.6229 0.6090 0.6229
(0.9940) (0.9375) (0.9940) (0.9583) (0.9375) (0.9583)
ami33 0.9673 0.6874 0.9673 0.9712 0.6920 0.9712
(0.9388) (0.9944) (0.9388) (0.9120) (0.8102) (0.9120)
ami49 0.9750 0.9931 0.9750 0.9750 0.9976 0.9750
(0.7129) (0.7215) (0.7129) (0.7115) (0.7068) (0.7115)
n10 0.6435 0.6435 0.6435 0.6435 0.7670 0.6435
(0.9625) (0.9500) (0.9625) (0.9625) (0.9500) (0.9625)
n30 0.6585 0.9826 0.6585 0.6570 0.9788 0.6570
(0.9547) (0.9667) (0.9547) (0.9546) (0.9500) (0.9546)
n50 0.8179 0.6627 0.8209 0.8163 0.6343 0.8182
(0.8399) (0.8246) (0.8488) (0.8644) (0.8351) (0.8733)
n100 0.9904 0.9772 0.9904 0.9871 0.9883 0.9871
(0.9187) (0.9974) (0.9187) (0.9282) (0.9984) (0.9282)
n200 0.4892 0.8269 0.4892 0.4831 0.8152 0.4831
(0.8750) (0.6793) (0.8555) (0.8923) (0.6463) (0.8328)
n300 0.6561 0.9863 0.6561 0.6681 0.9843 0.6681
(0.9813) (0.9964) (0.9811) (0.9774) (0.9903) (0.9770)
Table 6 Congestion (wACE4) comparison for the run configurations: for best case (worst case)
floorplan instance
Fig. 12 Runtime (sec): best case (BC) and worst case (WC) floorplan instances of n300 vs.
different run configurations
In Table 7, we report the runtime in seconds for all the circuits under the said run
configurations, for both BC and WC (in brackets) floorplan
instances. The best (highlighted) runtime for a given circuit, in both BC
and WC, is given by either FCL
or BCL in most of the cases.
6 Conclusion
In this paper, we proposed a novel global routing framework based on monotone
staircase channels obtained by hierarchical bipartitioning of the floorplan instances
of a given circuit. It thus immediately follows the floorplanning stage and hence
requires no detailed placement. Unlike in the existing global routers, the monotone
staircase channels act as the routing resources and the nets are routed strictly through
them for a given number of metal layers, using the unreserved or the reserved layer model.
Congestion is modeled in the global routing framework in such a way that the
utilization is no more than 100% in any of the routing resources, as supported by the
values of the wACE4 parameter. To the best of our knowledge, the multi-terminal net
decomposition using the proposed Steiner tree method is unique, and no other existing
method is known to work in this regard. Our experimental results show that 100%
routing completion is possible without any congestion, even for different capacity
profiles that include constrained metal pitch/width variation due to recent fabrication
processes. Additionally, employing different search directions shows that improvement
in routed net length and via count may be achieved.
Circuit FCN FCH FCL BCN BCH BCL
apte 0.114 0.109 0.103 0.108 0.109 0.113
(0.112) (0.106) (0.102) (0.103) (0.104) (0.103)
hp 0.115 0.107 0.105 0.107 0.102 0.103
(0.114) (0.107) (0.100) (0.107) (0.102) (0.101)
xerox 0.126 0.124 0.122 0.122 0.121 0.121
(0.140) (0.125) (0.116) (0.130) (0.125) (0.120)
ami33 0.172 0.167 0.151 0.161 0.159 0.156
(0.186) (0.178) (0.158) (0.158) (0.166) (0.170)
ami49 0.464 0.453 0.438 0.441 0.458 0.442
(0.483) (0.465) (0.446) (0.463) (0.465) (0.460)
n10 0.122 0.099 0.101 0.101 0.101 0.102
(0.109) (0.101) (0.105) (0.103) (0.105) (0.102)
n30 0.181 0.164 0.166 0.164 0.160 0.164
(0.162) (0.161) (0.153) (0.163) (0.164) (0.157)
n50 0.423 0.410 0.398 0.414 0.399 0.396
(0.449) (0.437) (0.440) (0.444) (0.441) (0.432)
n100 2.480 2.408 2.412 2.399 2.445 2.386
(2.076) (2.096) (2.064) (2.073) (2.070) (2.054)
n200 18.342 18.741 17.718 18.201 18.175 17.533
(21.472) (20.885) (20.909) (21.452) (21.315) (20.625)
n300 53.151 54.596 50.929 52.739 53.656 51.730
(57.993) (58.719) (55.090) (55.856) (58.742) (57.069)
Table 7 Runtime (sec) Comparison for the run configurations: for best case (worst case) floorplan
instance
Therefore, the proposed global routing method has a twofold advantage: (a) it
evaluates the feasibility of global routing at the floorplanning stage, and (b) it estimates
the global routing metrics and related information essential for the subsequent stages
of the design flow. This technique therefore opens up a new, versatile option in
the traditional design flow. This work may be extended to incorporate design for
manufacturability (DFM) issues by suitably modeling them in this framework.
References
1. S. N. Adya and I. L. Markov, “Fixed-outline Floorplanning : Enabling Hierarchical Design”,
IEEE Transactions on VLSI Systems, Vol. 11, No. 6, pp. 1120-1135, December 2003.
2. Z. Cao et al., “Fashion: A Fast and Accurate Solution to Global Routing Problem”, IEEE
Transactions on Computer Aided Design of Integrated Circuits and Systems, Vol. 27, No. 4,
pp. 726-737, April 2008.
3. C. Chu, “FLUTE: Fast lookup table based wire length estimation technique”, Proc. of
International Conference on Computer-Aided Design, pp. 696-701, 2004.
4. T.H. Cormen, C.E. Leiserson, R.L. Rivest and C. Stein, “Introduction to Algorithms”, 3rd
Edition, MIT Press, 2009.
5. P. Dasgupta, P. Pan, S.C. Nandy and B.B. Bhattacharya, “Monotone Bipartitioning Problem
in a Planar Point Set with Applications to VLSI”, ACM Transactions on Design Automation
of Electronic Systems, Vol. 7, No. 2, pp. 231-248, 2002.
6. M. Hanan, “On Steiner’s Problem with Rectilinear Distance”, SIAM Journal of Applied Math-
ematics, Vol. 30, No. 1, pp. 104-114, January 1976.
7. B. Kar, S. Sur-Kolay, S.H. Rangarajan and C. Mandal, “A Faster Hierarchical Balanced
Bipartitioner for VLSI Floorplans using Monotone Staircase Cuts”, Proc. of International Symposium
on VLSI Design and Test (VDAT), LNCS Vol. 7373, pp. 327-336, 2012.
8. B. Kar, S. Sur-Kolay and C. Mandal, “STAIRoute: Global Routing using Monotone Staircase
Channels”, Proc. of International Symposium on VLSI (ISVLSI), pp. 90-95, 2013.
9. B. Kar, S. Sur-Kolay and C. Mandal, “Global Routing using Monotone Staircases with Minimal
Bends”, Proc. of International Conference on VLSI Design (VLSID), pp. 369-374, 2014.
10. R. Kastner, E. Bozorgzadeh and M. Sarrafzadeh, “Pattern Routing: Use and Theory for In-
creasing Predictability and Avoiding Coupling”, IEEE Transactions on Computer Aided De-
sign of Integrated Circuits and Systems, Vol. 21, No. 7, pp. 777-790, July 2002.
11. C.Y. Lee, “An Algorithm for Path Connections and Its Applications”, IRE Transactions on
Electronic Computers, pp.346-365, September 1961.
12. S. Majumder, S. C. Nandy and B. B. Bhattacharya, “On Finding a Staircase Channel with
Minimum Crossing Nets in a VLSI Floorplan”, Journal of Circuits, Systems and Computers,
Vol. 13, No. 5, pp. 1019-1038, 2004.
13. S. Majumder, S. Sur-Kolay, B. B. Bhattacharya and S. Das, “Hierarchical Partitioning of VLSI
Floorplans by Staircases”, ACM Transactions on Design Automation of Electronic Systems,
Vol. 12, No. 1, Article 7, pp. 141-159, 2007.
14. M. Cho, K. Lu, K. Yuan and D. Z. Pan, “BoxRouter 2.0: A Hybrid and Robust Global Router
with Layer Assignment for Routability”, ACM Transactions on Design Automation of Elec-
tronic Systems, Vol. 14, No. 2, Article 32, pp. 1-21, March 2009.
15. http://www.eetimes.com/document.asp?doc_id=1279842
16. M. Pan and C. Chu, “FastRoute: A Step to Integrate Global Routing into Placement”, Proc.
of IEEE International Conference on Computer Aided Design (ICCAD), November 2006.
17. “Parquet Floorplanner, Rev-4.5”, http://vlsicad.eecs.umich.edu/BK/parquet,
University of Michigan, 2006.
18. J. Roy and I. Markov, “High-performance Routing at the Nanometer Scale”, IEEE Trans-
actions on Computer Aided Design of Integrated Circuits and Systems, Vol. 27, No. 6, pp.
1066-1077, June 2008.
19. N. Sherwani, “Algorithms for VLSI Physical Design Automation”, Kluwer Academic Pub-
lishers, 1993.
20. S. Sur-Kolay and B. B. Bhattacharya, “The cycle structure of channel graphs in non-sliceable
floorplans and a unified algorithm for feasible routing order”, Proc. of IEEE International
Conference on Computer Design (ICCD), pp. 524-529, 1991.
21. D. Wong and M. Guruswamy, “Channel Ordering for VLSI Layout with Rectilinear Modules”,
Transactions on Computer Aided Design of Integrated Circuits and Systems, Vol. 10, No. 11,
pp. 1425-1431, November 1991.
22. Y. Wei et al., “GLARE: Global and local wiring aware routability evaluation”, Proc. of
IEEE/ACM Design Automation Conference (DAC), pp. 768-773, June 2012.
23. H. Yang and F. Wong, “Efficient network flow based min-cut balanced partitioning”, IEEE
Transactions on Computer Aided Design of Integrated Circuits and Systems, Vol. 15, No. 12,
pp. 1533-1540, December 1996.
