Abstract With aggressively shrinking process technologies, physical design faces severe challenges, and early detection of failures is mandated; late detection may otherwise lead to many iterations and thus impact time-to-market. This has encouraged the development of feedback mechanisms from lower abstraction levels of the design flow towards the higher levels. Some of these efforts include placement driven synthesis and routability (timing) driven placement. Motivated by this philosophy, we propose a novel global routing method using monotone staircase routing regions (channels), defined at the floorplanning stage. The intent is to identify the feasibility of a floorplan topology of the given design netlist by estimating routability, routed net length and the number of vias while taking into account the global congestion scenario across the layout. This framework works on both the unreserved and the HV reserved layer models for M (≥ 2) metal layers and accommodates different capacity profiles of the routing resources, arising from uniform or varying metal pitch across the metal layers, akin to the latest technologies. The algorithm takes $O(n^2 kt)$ time for a given design with n blocks and k nets having at most t terminals. Experimental results on MCNC/GSRC floorplanning benchmarks show 100% routability while congestion in the routing regions is restricted to 100%. The routed net length for all t-terminal (t ≥ 2) nets is comparable with the Steiner length computed by FLUTE. An estimate of the number of vias for different capacity profiles is also obtained.
Introduction
In the IC design flow, global routing (GR) is indispensable, particularly as an aid to detailed routing (DR) of the wires through different metal layers. Shrinking feature dimensions with technological advances in the IC fabrication process pose more challenges to the physical design phase. There has been a tremendous increase in routing constraints arising from not only stringent layout design rules but also process variations and sub-wavelength effects of optical lithography. Successful routing completion of the nets without too many iterations or sacrifice in the performance of the designs is thus mandated. In grid based routing methods, multi-terminal nets are decomposed into two-terminal segments using a Rectilinear minimum spanning tree (RMST) [18] or Rectilinear Steiner minimal tree (RSMT) [16] topology as an initial solution with minimum length. Subsequently, congestion driven routing for each two-terminal net segment is adopted through the routing regions. The congestion models in those methods have been formulated based on the capacity of the grids and the routing demands through them, along with a penalty function. For any unsuccessful routing due to over-congestion (≥ 100%), Rip-up and Re-route (RRR) techniques using maze routing [11, 19] have been applied for possible routing completion, at the cost of increased net length due to detours. The major challenge in the event of unsuccessful routing is to get back to the placement stage in order to generate a new placement solution, but with no guarantee of successful routing completion (vide Figure 1 (a)). This may lead to several iterations until the goal is achieved and thus prove to be very costly if the entire design implementation is not completed within a stipulated time frame. In other words, this may have a severe impact on the time-to-market of the intended design.
The possibility of recurring iterations at the placement stage (vide Figure 1 (a)) due to failure at global routing stage may however be avoided if we can predict the feasibility of global routing as early as at the floorplanning stage, as depicted in Figure 1 (b) . This comprises the identification of monotone staircase channels as the routing resources while estimating their capacity and formulating the congestion model. These types of routing resources are known to have advantages of acyclic routing order for successful routing completion [20, 21] and avoidance of switch box routing [19] . They also allow easy channel resizability [21] to mitigate heavy congestion (≥ 100%).
In the recent past, routing patterns with a single bend (L shaped) [10], two bends (Z shaped) [10, 16], or even more bends, such as monotone staircase patterns [2], have gained significant importance in grid-based global routing. With an increasing number of bends, they yield more flexibility in finding a possible routing path, but at the cost of more vias. It is also shown that pattern based routing [10] is much faster than maze routing, and monotone staircase pattern routing [2] has the same time complexity as routing with Z shaped patterns. A thoughtful trade-off between routability (also net length) and the number of vias has to be made while keeping in mind that the routing resources are not over-congested. A recent monotone staircase bipartitioning method [9] attempted to minimize the number of vias along a monotone staircase routing path by minimizing the number of bends in it [2]. Additionally, pattern based routing is shown to help in crosstalk minimization [10].
Our Contribution

Outline of the proposed Global Routing technique
In this paper, we propose a new global routing method following the floorplanning stage using monotone staircase channels for routing completion with no congestion in the routing regions. It is important to note that this global routing framework is not grid based, unlike [2, 10, 14, 16, 18]. The outline of the proposed global router STAIRoute (vide Figure 2) is as follows:
1. Identification of the routing regions as monotone staircase channels derived from a given floorplan topology by using monotone staircase bipartitioning algorithm; a graph theoretic formulation with these staircase channels is used to determine a feasible routing path for all the nets;
2. Net ordering based on half perimeter wire length (HPWL) and the number of terminals (Netdegree);
3. Decomposing multi-terminal nets into an equivalent set of two-terminal net segments using minimum spanning tree algorithm and defining a new Steiner tree topology;
4. Routing solution for a given number of metal layers using a shortest path algorithm to find the best possible routing path while respecting the prevailing congestion scenario (of < 100% utilization) across the layers;
5. Ensuring congestion in the routing regions is restricted to 100% across a given number of metal layers.
Fig. 2 Outline of the proposed global router
Salient Features
The salient features of the proposed global routing method presented in this paper are as follows:
1. monotone staircase channels as the routing resources for improved flexibility in identifying a routing path of a net
2. routing the nets through the monotone staircase channels only
3. over-congestion free global routing model
4. new Steiner tree topology for multi-terminal net decomposition
5. compatible with both unreserved and reserved layer models for a given number of metal layers
6. estimation of the number of vias
This paper is organized as follows: Section 3 revisits the preliminaries of the monotone staircase bipartitioning paradigm, followed by Section 4, which covers related topics and the proposed global routing method using monotone staircases as the routing resources. Results are presented in Section 5, and concluding remarks in Section 6.
Preliminaries
In this section, we briefly review monotone staircase channels and the bipartitioning framework used to obtain them immediately after the floorplanning stage. Methods for top-down hierarchical monotone staircase bipartitioning of VLSI floorplans, both Area-balanced and Number-balanced, appear in [5, 7, 9, 12, 13]. Area-balanced bipartition is employed when the areas of the blocks in a given floorplan have significant variance, whereas Number-balanced bipartition is applicable when the variance in block area is negligible. In [12, 13], the balanced bipartitioner used an iterative max-flow based [23] min-cut algorithm and thereby incurred higher time complexity at each level of the hierarchy. In [5], emphasis has been given to hierarchical number balanced monotone staircase bipartitioning using a depth-first traversal method that runs in linear time at a given level of the hierarchy.
Recently, a faster yet more accurate top-down hierarchical monotone staircase bipartition [7] has been proposed to generate monotone staircase cuts, subsequently abbreviated as ms-cuts. Their algorithm takes $O(nk \log n)$ time and also ensures ms-cuts of increasing (decreasing) orientation at alternate levels of the resulting hierarchy, called the MSC tree (vide Figure 4 (a) and (b)).
In order to identify monotone increasing (decreasing) staircase channels $C_I$ ($C_D$), as depicted in Figure 3 (a) ((b)) and abbreviated as MIS (MDS), for a given planar embedding of a floorplan topology with n blocks, an unweighted directed graph [7, 12]
Lemma 1. Given a floorplan with n blocks, its MSC tree $(V_m, E_m)$ corresponding to the set C of monotone staircase channels has n − 1 ms-cuts (internal nodes).
Proof. In a full binary tree, an internal node has two children (out degree = 2), whereas an external (leaf) node has no children (out degree = 0). In our case, the internal nodes correspond to the ms-cuts in the MSC tree, and the external nodes are the blocks in the given floorplan; the resulting MSC tree is shown in Figure 4 (b). In any tree, the sum of the out degrees of all the nodes equals the number of edges, which is one less than the number of nodes. ⇒ $2 \cdot |C| + 0 \cdot n = (|C| + n) - 1$, where |C| and n are the number of ms-cuts and blocks respectively. Hence $|C| = n - 1$.
Global Routing using monotone staircase channels
Routing Region Definition
Using the recent hierarchical monotone staircase bipartition framework [7], we obtain a set of MIS (MDS) channels $C = \{C_i\}$ at alternate levels of the hierarchy in the MSC tree. These channels are used as the routing resources for the proposed global routing framework. Each monotone staircase channel consists of one or more rectilinear segments, called channel segments, each bounded by a distinct pair of blocks. For each channel and its segment(s), the number of nets to be routed through it, denoted as its reference capacity rCap, is computed from the number of cut nets in the respective ms-cut node in the MSC tree. During routing, its capacity usage uCap gives the channel utilization; it is initialized to 0. For the rest of the paper, we use channel and segment to refer to a monotone staircase channel and its rectilinear segment respectively. As in Figure 3 (a), the highlighted ms-cut $C_I$ on BAG contains seven cut edges {GH, DH, EH, EJ, EF, BF and BC} and corresponds to the MIS channel $C_I$ having seven segments. Additionally, it has two more horizontal segments: one on the bottom side of the block G at the bottom-left corner, along the boundary of the floorplan, and the other on the top side of the block C at the top-right corner of the floorplan. Figure 3 (b) illustrates a case of an MDS channel $C_D$ having six segments. Figure 4 (a) shows a floorplan along with a set of MIS/MDS channels $C_0$ to $C_7$. A channel with one segment having either vertical or horizontal orientation is termed a degenerate monotone staircase channel. The channel $C_7$ in Figure 4 (a) is an example of such a channel.
However, there exist a few more isolated segments along the boundary of the floorplan that are not identified as part of the MSC tree generation, and can be termed non-MS channels. In Figure 4 (a), $C_{n1}$ to $C_{n4}$ are examples of such channels. Their capacity rCap is computed based on the number of terminals on them, and those with nonzero rCap contribute to global routing as valid routing resources.
Junction Graph and the Congestion model
In this section, we present our global routing framework using monotone staircase channels and their intersection points, the T-junctions (vide Figure 4 (c)), henceforth referred to as junctions. It is evident that there exists a segment between each pair of adjacent junctions.
Lemma 2. Given a floorplan with n blocks, the number of T junctions in it is 2n − 2.
Proof. Every internal face in BAG corresponds to a T-junction, and is bounded by 3 edges. Thus we have $3(f - 1) = 2m$ excluding the exterior face, where f and m are the number of faces and edges in BAG respectively. Using Euler's formula for planar graphs, $n - m + f = 2$, and replacing f by $2m/3 + 1$, we get $m = 3(n - 1)$. Hence, the number of T-junctions in the floorplan is $f - 1 = 2m/3 = 2n - 2$.
⊓ ⊔
Using the notion of T-junctions, we construct a weighted undirected graph (vide Figure 4 (d)), called junction graph, $G_j = (V_j, E_j)$, where $V_j = \{J_p\}$ corresponds to a set of junctions, and $E_j = \{\{J_p, J_q\} \mid$ a pair of adjacent junctions $\{J_p, J_q\}$ with a segment $s_k$ of a channel $C_m \in C$ between them$\}$. As depicted in Figure 4 (c), all the junctions, except those near the corners of the floorplan with degree two, have degree three in $G_j$, i.e., have edges with three adjacent junctions. Using Lemma 2, it can be shown that
The weight of each edge e pq ∈ E j is computed as
where $p_{s_k}$, the normalized usage through the segment $s_k$, is defined as:
$$p_{s_k} = \frac{uCap_{s_k}}{rCap_{s_k}} \qquad (2)$$
And, we define (1 − p s k ) as the usage penalty on the edge weight for routing a net through the corresponding segment s k . In Figure 5 , we illustrate the variation of edge weight with respect to the normalized usage p s k . In the proposed global routing framework, congestion is avoided in all the segments by restricting p s k to be no more than 1.0. This is achieved by setting the edge weight to Infinity whenever p s k = 1.0. The corresponding edge is virtually removed from E j . This ensures that the case of p s k > 1.0 does not occur. In Figure 5 , we mark the regions p s k ≤ 1.0 and p s k > 1.0 as Under-Congestion and Over-Congestion regions respectively. Therefore, we restrict to Under-Congestion while formulating the global routing graph such that there is no congestion in any of the routing resources. However, it may be noted that routing may fail for some of the nets due to insufficient capacity of some of the routing resources for a specified number of metal layers.
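A minimal sketch of this congestion model follows. The text fixes only the behaviour of the weight (a usage penalty of (1 − p) and a weight of Infinity at p = 1); the concrete form `length / (1 - p)`, the `length` parameter, and the function names are assumptions made for illustration, not the formula of the framework.

```python
import math

def normalized_usage(u_cap: int, r_cap: int) -> float:
    """p = uCap / rCap for one segment on its current metal layer."""
    if r_cap <= 0:
        return 1.0  # assumption: a segment without capacity counts as full
    return u_cap / r_cap

def edge_weight(length: float, u_cap: int, r_cap: int) -> float:
    """Assumed weight: segment length divided by the usage penalty (1 - p)."""
    p = normalized_usage(u_cap, r_cap)
    if p >= 1.0:
        return math.inf  # the edge is virtually removed from the graph
    return length / (1.0 - p)
```

With this form, an empty segment costs exactly its length, a half-full segment costs twice as much, and a full segment can never lie on a shortest path.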
Lemma 3. The construction of the junction graph takes O(n) time.
Proof. By Lemma 2, we know that there are O(n) edges in the BAG, where each edge corresponds to a channel segment. Therefore, for each segment s k having a pair of junctions {J p ,J q } as its endpoints, an edge is inserted in the G j . Hence, the construction of the junction graph G j takes O(n) time.
⊓ ⊔
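The linear-time construction in Lemma 3 can be sketched as a single pass over the segments. The input tuple layout and the dictionary-based representation below are hypothetical choices for illustration; the text does not prescribe a data structure.

```python
from collections import defaultdict

def build_junction_graph(segments):
    """Build the junction graph from a list of channel segments.

    segments: iterable of (seg_id, junction_p, junction_q, r_cap) tuples
    (a hypothetical input format). Returns an adjacency map
    {junction: {neighbor: seg_id}} and a per-segment capacity table.
    One dictionary update per segment, so the pass is linear in the
    number of segments.
    """
    adj = defaultdict(dict)
    cap = {}
    for seg_id, jp, jq, r_cap in segments:
        adj[jp][jq] = seg_id  # undirected edge, stored in both directions
        adj[jq][jp] = seg_id
        cap[seg_id] = {"rCap": r_cap, "uCap": 0}  # uCap starts at 0
    return adj, cap
```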
Global Staircase Routing Graph
In this section, we present our proposed global routing framework by extending the junction graph $G_j$ for each net. Let N be a set of nets for a given floorplan. For each t-terminal (t ≥ 2) net $n_i \in N$, we use $G_j$ as the backbone to derive the corresponding Global Staircase Routing Graph (GSRG) as depicted in Figure 6 (a) and 6 (b). The GSRG is defined as $G_{ri} = (V_{ri}, E_{ri})$, where $V_{ri} = V_j \cup \{t_l \mid t_l \in n_i\}$, and $E_{ri} = E_j \cup E_{lp}$. The set of pin-junction edges is $E_{lp} = \{\{t_l, J_p\} \mid \forall t_l \in n_i$ and $\exists J_p \in V_j$ such that the pin $t_l$ resides on a segment $s_k$ associated with the junction $J_p\}$. As before, we calculate the weight of a pin-junction edge $e_{lp} \in E_{lp}$ as:
and define (1− p s k ) as the usage penalty on the edge weight for routing a net through the corresponding segment s k .
Lemma 4. For a t-terminal net, the construction of its GSRG takes O(t) time.
Proof. As defined, the GSRG $G_{ri} = (V_{ri}, E_{ri})$ for a given net $n_i$ with t terminals is obtained by augmenting the junction graph $G_j = (V_j, E_j)$. In other words, $V_j$ is extended by the t terminals connected to $n_i$ in order to obtain $V_{ri}$. It is also to be noted that each terminal resides on a segment $s_k$, having a pair of junctions $(J_p, J_q)$ at either end. Therefore, each terminal (pin) contributes 2 pin-junction edges, for a total of 2t edges added to $G_{ri}$ for all t terminals. Hence, the construction of $G_{ri}$ takes O(t) time for each net. ⊓ ⊔
After routing a net $n_i$ successfully, we update $uCap_{s_k}$ for all the segments $s_k$ through which $n_i$ is routed. Subsequently, the weights of the edges in $G_j$ are updated before we route the subsequent net $n_{i+1}$. When congestion is about to occur in a given segment ($p_{s_k} = 1$), the weight of the corresponding edge in $G_{ri}$ becomes Infinity. No routing is possible through such segments, and the relevant edges virtually disappear, making $G_{ri}$ sparser after each iteration of routing. To summarize, the normalized usage $p_{s_k}$ in this framework is constrained to a maximum of 100%, thus restricting the number of routed nets (uCap) through a given segment to be no more than its capacity (rCap).
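The augmentation described in Lemma 4 can be sketched as follows, assuming the junction graph is stored as a nested dictionary `{junction: {neighbor: seg_id}}` (a hypothetical representation). The `pins` format is likewise an assumption for illustration.

```python
def construct_gsrg(junction_adj, pins):
    """Augment a junction graph with the pins of one net.

    junction_adj: {junction: {neighbor: seg_id}}.
    pins: {terminal: (seg_id, junction_p, junction_q)}, i.e. the segment
    each pin lies on and the two junctions at its ends.
    Each pin adds exactly two pin-junction edges, 2t edges in total.
    Note: this sketch copies the junction graph first, which costs O(n);
    adding and later removing the 2t edges in place would keep the
    per-net cost at O(t).
    """
    gsrg = {v: dict(nbrs) for v, nbrs in junction_adj.items()}
    for t, (seg_id, jp, jq) in pins.items():
        gsrg[t] = {jp: seg_id, jq: seg_id}
        gsrg[jp][t] = seg_id
        gsrg[jq][t] = seg_id
    return gsrg
```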
In order to extend this model for M (≥ 1) metal layers, we keep a parameter called currLayer($s_k$) associated with each segment $s_k$, initialized to 1, which can go up to a maximum of M metal layers. When congestion is about to occur in $s_k$ ($p_{s_k} = 1$), we increment currLayer($s_k$) to the subsequent metal layer. Here the subsequent metal layer has a different implication in the unreserved and reserved layer models; the subsequent layer can either be one layer above currLayer($s_k$) or the next permitted layer based on the particular (horizontal/vertical) orientation of $s_k$ in the corresponding reserved layer model. This means that the resource $s_k$ has exhausted its entire capacity (i.e. $uCap_{s_k} = rCap_{s_k}$) for the current metal layer and is now ready for routing the nets through it on the next metal layer, restricted by M. In this regard, the variation of $rCap_{s_k}$ across the metal layers (up to M) plays a significant role and thus directly impacts the routing completion of all the nets. In Figure 7 (a), we study different scenarios of uniform as well as varying capacity profiles for all the routing resources $s_k$ across the metal layers. In the case of the Uniform profile, $rCap_{s_k}$ carries the same value across the metal layers. We consider two different varying capacity profiles, one being a Hyperbolic (1/M) pattern and the other a Ladder pattern. In the former, $rCap_{s_k}$ is more aggressively scaled across the metal layers; the latter is a more realistic scenario that captures the latest trend of metal pitch/width variation across the metal layers in recent nanometer technologies (vide Figure 7 (b) [15]).
Fig. 6 Illustration of the steps for constructing the Global staircase routing graph (GSRG) from the Junction graph for (a) a 3-terminal net $n_i = \{t_a, t_b, t_h\}$, and (b) a 2-terminal net $n_j = \{t_c, t_g\}$, along with corresponding routed paths in GSRG and finally routed nets in the floorplan topology.
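The layer switching and capacity scaling described above can be sketched as below. Two points are assumptions made for illustration: in the HV reserved model, layers are taken to alternate between horizontal and vertical, so the next permitted layer for a segment is two layers up; and the Hyperbolic profile is taken to be the base capacity divided by the layer index, rounded down. The Ladder profile is omitted because its step values are not given in the text.

```python
def layer_capacity(base_cap: int, layer: int, profile: str = "uniform") -> int:
    """Capacity of a segment on a given metal layer (layers start at 1)."""
    if profile == "uniform":
        return base_cap
    if profile == "hyperbolic":
        return base_cap // layer  # assumed 1/M-style scaling with integer floor
    raise ValueError(f"unknown capacity profile: {profile}")

def next_layer(curr_layer: int, max_layers: int, reserved: bool = False):
    """Return the next usable layer for a full segment, or None if exhausted.

    Unreserved model: the layer directly above.
    Reserved model (assumed alternating H/V layers): two layers above,
    so the segment stays on a layer matching its orientation.
    """
    nxt = curr_layer + (2 if reserved else 1)
    return nxt if nxt <= max_layers else None
```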
Multi-terminal Net Decomposition
In a global routing framework, routing a t(> 2)-terminal net is crucial and obtaining an efficient solution for minimal length is a hard problem. Several works have been done so far to obtain the best possible t − 1 net segments for a t-terminal net such as Rectilinear Steiner Minimal Tree (RSMT) topology proposed in FLUTE [3] based on a well defined grid structure known as Hanan grid [6, 19] . Since the proposed work is based on a gridless framework and the routing regions are aligned with the MIS/MDS channels, we cannot adopt any grid-based RSMT framework such as FLUTE [3] .
Therefore, we propose a new method for multi-terminal net decomposition suitable for the proposed global routing framework. We construct a complete undirected graph for a given t-terminal (t > 2) net $n_i \in N$, $G_{ci} = (V_{ci}, E_{ci})$, such that $V_{ci} = \{t_k\}, \forall t_k \in n_i$, and $E_{ci} = \{\{t_j, t_k\} \mid \forall t_j, t_k \in n_i$ and $t_j \neq t_k\}$. The weight of each edge $e_{jk} = \{t_j, t_k\} \in E_{ci}$ is computed as the half perimeter length (HPWL) of the bounding box of the terminal pair $(t_j, t_k)$ (vide Figure 8 (a)). It is evident that $|V_{ci}| = O(t)$ and $|E_{ci}| = O(t^2)$. By employing Prim's Minimum Spanning Tree (MST) algorithm [4], which takes $O(t^2)$ time on $G_{ci}$, we obtain a minimum spanning tree $T_{ci}$ for $G_{ci}$ having t − 1 edges, i.e., t − 1 valid 2-terminal pairs. For each edge $e_{jk} = \{t_j, t_k\} \in T_{ci}$, we perform 2-terminal net routing by applying Dijkstra's single source shortest path algorithm [4]. Once we obtain the routing for all such terminal pairs, we obtain the Steiner points by identifying the common routing segments, as illustrated by an example in Figure 8.
Let us consider an example of a 3-terminal net $n_1$ with terminals $\{t_a, t_b, t_c\}$ to illustrate the proposed 2-terminal net decomposition as shown in Figure 8; the terminal pairs and their bounding boxes appear in Figure 8 (a). As shown in Figures 8 (b)-(i) and (b)-(ii), only one of the instances of the minimum spanning tree $T_{c1}$ is greedily obtained by the said MST algorithm as the final solution.
Depending on the specific $T_{c1}$ thus obtained, the proposed 2-terminal net segment routing, presented in the next section, is applied to each valid terminal pair. Once the routing for all the designated terminal pairs is obtained, we identify the Steiner points in a manner similar to the state-of-the-art grid-based multi-terminal net decomposition methods (FLUTE [3]), as illustrated in Figure 8 (b). The main difference is that this work is based on a gridless framework using monotone staircase channels as the routing resources. This topology may be termed the Staircase Minimal Steiner Tree (SMST).
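The decomposition step can be sketched with an array-style Prim's algorithm over the implicit complete graph, using the HPWL of each terminal pair as the edge weight. The function names and the `{name: (x, y)}` input format are hypothetical; the Steiner point identification that follows routing is not covered here.

```python
def hpwl(a, b):
    """Half perimeter of the bounding box of two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def decompose_net(terminals):
    """Split a t-terminal net into t-1 terminal pairs via Prim's MST.

    terminals: {name: (x, y)}. The complete graph is never stored; the
    best known connection of every outside terminal is kept in `best`,
    giving O(t^2) time overall.
    """
    names = list(terminals)
    if len(names) < 2:
        return []
    root = names[0]
    best = {v: (hpwl(terminals[root], terminals[v]), root) for v in names[1:]}
    pairs = []
    while best:
        v = min(best, key=lambda u: best[u][0])  # cheapest terminal to attach
        _, parent = best.pop(v)
        pairs.append((parent, v))
        for u in best:  # relax remaining terminals against the new tree node
            d = hpwl(terminals[v], terminals[u])
            if d < best[u][0]:
                best[u] = (d, v)
    return pairs
```

For a 3-terminal net such as `{"ta": (0, 0), "tb": (4, 0), "tc": (4, 3)}`, the sketch pairs `ta` with `tb` and `tb` with `tc`, since the direct `ta`-`tc` connection has the largest HPWL.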
STAIRoute: the proposed global routing algorithm
We present the proposed global routing algorithm STAIRoute using monotone staircase channels in Algorithm 1. This algorithm takes two inputs, namely an ordered set of nets N and the junction graph $G_j$ as defined in Section 4.2. For each net $n_i \in N$, the GSRG $G_{ri}$ is constructed and a routing path for the net $n_i$ is obtained by applying a shortest path algorithm on $G_{ri}$. We have implemented an $O(n^2)$ version of Dijkstra's shortest path algorithm [4], namely DijkstraSSP(), used in Algorithm 1.
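An array-based Dijkstra search with quadratic running time, in the spirit of DijkstraSSP(), might look like the sketch below. The graph representation, the `weight` callback, and the helper `extract_path` are assumptions for illustration and are not taken from the authors' C implementation. Edges whose weight is Infinity are never relaxed, which is how fully used segments drop out of the search.

```python
import math

def dijkstra_ssp(graph, weight, source):
    """Single source shortest paths in O(|V|^2) time.

    graph: {vertex: {neighbor: seg_id}}; weight(seg_id) returns a float,
    math.inf for a segment that is already full.
    """
    dist = {v: math.inf for v in graph}
    prev = {}
    dist[source] = 0.0
    unvisited = set(graph)
    while unvisited:
        v = min(unvisited, key=dist.__getitem__)  # linear scan, no heap
        if dist[v] == math.inf:
            break  # the remaining vertices are unreachable
        unvisited.remove(v)
        for u, seg in graph[v].items():
            cand = dist[v] + weight(seg)
            if u in unvisited and cand < dist[u]:
                dist[u] = cand
                prev[u] = v
    return dist, prev

def extract_path(prev, source, sink):
    """Walk the predecessor map back from sink; None if sink is unreachable."""
    if sink != source and sink not in prev:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]
```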
For each 2-terminal net (segment), we consider two ways of choosing the source vertex between a pair of terminals before we apply the shortest path algorithm:
1. the terminal with the minimum x coordinate (or the minimum y coordinate in case both the terminals have the same x coordinate)
2. the terminal with the maximum x coordinate (or the maximum y coordinate in case both have the same x coordinate)
The procedure IdentifySource() in Algorithm 1 is used for that purpose. We term them Forward (FWD) and Backward (BACK) search respectively. In Figure 9 (a) and (b), we illustrate the respective cases for a 2-terminal net $\{t_g, t_c\}$ and show that the two search procedures can potentially give different routing paths. One may have a potentially better solution than the other in terms of routability, congestion scenario along with net length, and finally via count. The variation in net length due to FWD (BACK) search arises when certain resource(s) along the respective paths are fully utilized in a given metal layer; switching to the next available metal layer, if permitted, leads to an increase in the via count. Otherwise, the routing path is detoured beyond the bounding box of the terminals, leading to an increase in length. As long as the alternative paths remain confined within the bounding box of the terminals, there is no variation among the respective net lengths.
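The two source selection rules behind IdentifySource() reduce to a lexicographic comparison of the terminal coordinates. A minimal sketch, with a hypothetical signature that takes two `(x, y)` tuples and a mode string:

```python
def identify_source(t1, t2, mode="FWD"):
    """Return (source, sink) for a terminal pair.

    FWD: the source is the terminal with the smaller x (smaller y on a tie).
    BACK: the source is the terminal with the larger x (larger y on a tie).
    Tuples compare lexicographically, which matches both tie-breaking rules.
    """
    lo, hi = sorted([t1, t2])
    return (lo, hi) if mode == "FWD" else (hi, lo)
```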
Algorithm 1 STAIRoute
Inputs: G_j(V_j, E_j), Ordered nets N
Outputs: Global routing for each t-terminal (t ≥ 2) net (n_i ∈ N) with 100% routability and usage ≤ 100%
for all sorted nets n_i ∈ N do
    G_ri = ConstructGSRG(G_j, n_i)
    if Netdegree(n_i) == 2 then /* Netdegree(n_i) = Number of terminals in n_i */
        Source = IdentifySource(n_i.terminals) /* for Forward or Backward search (vide Fig. 9) */
        Path(Source, Sink) = DijkstraSSP(G_ri, Source)
        if there exists a routing path from Source to Sink then
            n_i is routed. Update uCap for the respective channel segments.
        else
            Routing n_i is a failure and continue for n_{i+1}
        end if
    else
        G_ci = ConstructNodeClique(n_i.terminals)
        T_ci = ObtainMST(G_ci) /* described in Section 4.3 */
        for all edges (t_j, t_k) ∈ T_ci do
            Source = IdentifySource(t_j, t_k) /* for Forward or Backward search (vide Fig. 9) */
In the unreserved layer model, routing a net may incur a number of vias due to the difference in the metal layers used to route through the corresponding routing resources. It does not depend on their vertical/horizontal orientation. In the case of the reserved layer model, the number of vias along a routing path depends on the number of bends in it, i.e., the alternating (horizontal/vertical) orientation of the contiguous routing resources, for a minimum change of one metal layer among the resources along that path [9, 18]. Congestion in channels may also contribute to the number of vias along a routing path in both cases. From the example shown in Figure 10 (a) and (b), we notice that the routing path for a given net $(t_g, t_c)$ needs 3 and 5 vias for FWD and BACK searches respectively. Therefore, depending on the netlist and the floorplan topology of a given circuit, one method may dominate the other. This method can be extended to t (> 2)-terminal nets, since we decompose those nets into 2-terminal net segments using the method stated earlier, and a better routing path for each of the resulting net segments can be obtained by employing either of the search procedures at a time.
Before the routing procedure starts, the nets ($n_i \in N$) are ordered based on their half perimeter wire length (HPWL) and the number of terminals (Netdegree). The net ordering (priority) is determined based on the non-decreasing order of HPWL first and then Netdegree. A net with smaller HPWL, and then smaller Netdegree, has precedence over other nets.
The aim is to ensure that the shorter (local) nets are routed before the longer ones so as to avoid congestion in the routing resources as well as have a uniform routing distribution across the layout of the design.
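The net ordering can be sketched as a sort on the key (HPWL, Netdegree), both in non-decreasing order. The `{name: [(x, y), ...]}` netlist format and the function names are hypothetical.

```python
def net_hpwl(pins):
    """Half perimeter of the bounding box enclosing all pins of a net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def order_nets(nets):
    """Return net names sorted by HPWL first, then by number of terminals."""
    return sorted(nets, key=lambda name: (net_hpwl(nets[name]), len(nets[name])))
```

For example, two nets with equal HPWL are ordered so that the one with fewer terminals is routed first.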
We illustrate the working of this algorithm for t (≥ 2)-terminal nets in Figure 6.
Theorem 1. Given a floorplan having n blocks and k nets having at most t terminals (t ≥ 2), the algorithm STAIRoute takes $O(n^2 kt)$ time.
Proof. From Lemma 4, GSRG construction takes $O(t)$ time. For each 2-terminal net routing, finding the Source vertex takes $O(n)$ and our implementation of Dijkstra's single source shortest path algorithm (DijkstraSSP) takes $O(n^2)$. Again, for t-terminal (t > 2) nets, computing $G_{ci}$ takes $O(t^2)$ and our implementation of Prim's algorithm takes $O(t^2)$. For each terminal pair $(t_i, t_j)$, we obtain the shortest path using DijkstraSSP in $O(n^2)$ time. Thus, for each t (≥ 2)-terminal net, the time complexity is $O(t + t^2 + n^2 t)$, i.e., $O(n^2 t)$, since a given net may be connected to all n blocks, resulting in t = n in the worst case. Therefore, the overall worst case time complexity for all k nets is $O(n^2 kt)$.
Experimental Results
We have implemented the proposed algorithm STAIRoute in C and ran it on a 64-bit Linux platform powered by an Intel Core2 Duo (1.86GHz) with 2GB RAM. We used the source code of the top-down hierarchical monotone staircase bipartitioning algorithm implemented by [7] to obtain the BAG and MSC tree data structures. We used the MCNC/GSRC hard floorplanning benchmark circuits as given in Table 1. In order to test our algorithm, four different instances for each of the benchmarks were generated with a random seed using the Parquet [1, 17] tool. For a given circuit, the best case (BC) and the worst case (WC) instances among a set of floorplan topologies are designated solely in the context of the total half perimeter wire length (HPWL) of all the nets, as the ones with the smallest and the largest HPWL respectively. In this work, the focus is on the signal nets only, so we consider only the internal nets without IO PAD connectivity and modify the given netlist accordingly. Since many nets have IO PAD connectivity, we convert those with at least 3 terminals to signal nets by removing the IO PAD connectivity, and discard those having fewer terminals, because modifying them would result in a floating net with only one terminal connected to it. Due to the lack of pin location information in the GSRC benchmarks, we assumed those pins to be situated at the center of the blocks, unlike the circuits in the MCNC benchmark.
Results using Unreserved Layer Model
To the best of our knowledge, our global routing method, using monotone staircase channels, is novel. Therefore, comparison of our results with those of existing global routers is not meaningful. Instead, we compare the length of each of the t-terminal (t > 2) nets given by our algorithm with that computed by FLUTE [3], which does not consider any congestion scenario. We obtained all the results given in this subsection by using the unreserved layer model with up to 2 metal layers, ensuring that no congestion takes place as per the framework.
In Table 2, we summarize the results obtained for runtime, net length and related statistics for each of the circuits given in Table 1. As reported, there is no congestion in any of the routing resources. These results show that the routed net length given by our method is comparable to both HPWL and FLUTE length, as the average Netdegree for GSRC benchmark circuits is slightly higher than 2, while the same is 4.15 for MCNC benchmark circuits. These are captured in the R1, R2 and R3 columns of Table 2 respectively. It shows that our net length to Steiner length ratio (R3) computed by FLUTE [3]
Table 2 Summary of global routing results using unreserved layer model for up to 2 metal layers [8]
The last column contains the estimated number of vias given by our algorithm for each of the circuits. It is important to note that we obtained all the above results by restricting ourselves to at most 2 metal layers. While some of the circuits return a zero via count, confining the routing of all the nets to the first metal layer only, the other instances, with a nonzero via count, indicate routing through two metal layers due to congestion ($p_{s_k} = 1$) in certain routing resources in the bottom metal layer.
Results using Reserved Layer Model
We further extend our experiments by running the proposed global routing method using the HV reserved layer model restricted to at most 8 metal layers. We consider BC and WC floorplan topologies for each of the circuits. In these experiments, we refer to Figure 7 (a) for 3 different capacity scaling profiles, and to Figure 9 for the two possible search directions used to explore possible routing paths. We reiterate that all the results presented here correspond to 100% routability and are restricted to the Under-Congestion region of Figure 5, which ensures no congestion in any routing resource.
Following are the configurations we consider while conducting the experiments on the benchmark circuits given in
In Figure 11, we present the global routing results for n300 for all the six run configurations and two different floorplan instances, namely BC and WC. While studying these plots, we notice that the forward (backward) search with hyperbolic scaling, FCH (BCH), gives the worst results as compared to the other two configurations {FCN, FCL} ({BCN, BCL}) in terms of both routed net length and via count, in both BC and WC. This is due to the fact that the hyperbolic profile is the most stringent among the profiles depicted in Figure 7 (a). Next, we focus on the corresponding results obtained for the remaining configurations {FCN, FCL} ({BCN, BCL}) for both BC and WC topologies. Figure 11 (a) shows that FCN (BCN) gives better net length than FCL (BCL) in both BC and WC. Although the via count reflects a similar trend for the WC topology, FCL (BCL) has a better via count than FCN (BCN) in BC (Figure 11 (b)). We conduct another set of comparisons for net length and via count between FCN and BCN (FCL and BCL) for both BC and WC. Although backward search produces better net length than forward search, it incurs more vias to route a set of nets than its counterpart. This clearly shows that a global routing solution depends not only on the search direction and the capacity profiles, but also on different floorplan instances of the same circuit.
In Table 3 (4), we summarize the net length obtained for all the circuits in all the configurations for the respective BC (WC) topologies and compare them with the corresponding FLUTE [3] length. For a given configuration, the corresponding net length is accompanied by the ratio of net length to FLUTE length, e.g., FCN/F, in brackets below it, and the best ratio(s) are highlighted.
It is evident from these results that, except for the circuits with 100 or more blocks, the length ratio pairs FCN/F and FCL/F (BCN/F and BCL/F) show little variation. The results also show that, for most of the circuits, BCN yields the best net length relative to the corresponding FLUTE length.
In Table 5, we present the via count for all the configurations for each circuit in BC (WC results in brackets). The results clearly point out the consequence of forward and backward search on via count. It is also evident that the via count for FCH (BCH) is the worst compared to the other two configurations, namely {FCN, FCL} ({BCN, BCL}), as we observed for net length. The variation in via count between {FCN, FCL} ({BCN, BCL}) is small for relatively smaller circuits and becomes significant for relatively larger circuits.
In our congestion analysis, we use the method prescribed in [22] to analyze the congestion scenario of the given floorplan instance of a circuit and thereby estimate its routability. The authors of [22] proposed a new metric called Average Congestion per Edge, denoted ACE(x%), which is the average congestion over the worst congested x% of all global routing edges. They subsequently computed another parameter, denoted wACE4, which is a weighted average of ACE(x%) for four different values of x = 0.5, 1, 2, 5. In our case, we consider the monotone staircase channels as the routing resources, and their normalized usage (vide Equation 2) is the measure of congestion.
In Table 6, we capture wACE4 for each circuit for all the configurations in BC and WC (in brackets) to showcase the corresponding congestion scenario with 100% routability of the nets, and to validate that our global routing framework conforms to the model depicted in Figure 5. In these experiments, wACE4 is chosen as the maximum of the respective wACE4 values for each of the 8 metal layers.
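The congestion metrics described above can be illustrated with a minimal Python sketch. It is not the implementation used in our experiments: the function names `ace`, `wace4` and `worst_layer_wace4` are hypothetical, the weights of the weighted average are not stated here and are assumed equal by default, and rounding the number of worst resources up to at least one is likewise an assumption. Each input value stands for the normalized usage of one routing resource (a monotone staircase channel) on one metal layer.

```python
import math
from typing import Optional, Sequence


def ace(usages: Sequence[float], x_percent: float) -> float:
    """Average usage over the worst x% most congested routing resources."""
    if not usages:
        raise ValueError("need at least one routing resource")
    ranked = sorted(usages, reverse=True)
    # Assumption: round up and keep at least one resource, so that small
    # values of x remain defined for layouts with few channels.
    count = max(1, math.ceil(len(ranked) * x_percent / 100.0))
    worst = ranked[:count]
    return sum(worst) / len(worst)


def wace4(
    usages: Sequence[float],
    xs: Sequence[float] = (0.5, 1.0, 2.0, 5.0),
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted average of ACE(x%) over the four values of x."""
    if weights is None:
        # Assumption: equal weights, since the actual weights are not given.
        weights = [1.0] * len(xs)
    if len(weights) != len(xs):
        raise ValueError("one weight is required per value of x")
    total = sum(weights)
    return sum(w * ace(usages, x) for w, x in zip(weights, xs)) / total


def worst_layer_wace4(usage_per_layer: Sequence[Sequence[float]]) -> float:
    """Maximum of the per-layer wACE4 values, as reported per circuit."""
    return max(wace4(layer) for layer in usage_per_layer)
```

Under these assumptions, a value of `worst_layer_wace4` not exceeding 1.0 indicates that even the most congested resources on the worst metal layer stay, on average, within 100% usage.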
The plot in Figure 12 shows the impact of the different run configurations on runtime for n300. Clearly, the runtimes of FCH (BCH) are the worst among the respective FWD (BACK) configurations, while FCL gives the best runtime for both BC and WC among the remaining configurations. In Table 7, we report the runtime in seconds for all the circuits under each run configuration, for both BC and WC (in brackets) floorplan instances. For most circuits, the best runtime (highlighted) is given by either FCL or BCL, in both BC and WC.
Conclusion
In this paper, we proposed a novel global routing framework based on monotone staircase channels obtained by hierarchical bipartitioning of the floorplan instances of a given circuit. It thus immediately follows the floorplanning stage and hence requires no detailed placement. Unlike in the existing global routers, the monotone staircase channels act as the routing resources, and the nets are routed strictly through them for a given number of metal layers, using the unreserved or the reserved layer model. The congestion scenario is modeled in such a way that the utilization is no more than 100% in any of the routing resources, as supported by the values of the wACE4 parameter. Multi-terminal net decomposition using the proposed Steiner tree method is unique, and no other existing methods are known to work in this regard. Our experimental results show that 100% routing completion is possible without any over-congestion, even for different capacity profiles that include constrained metal pitch/width variation due to recent fabrication processes. Additionally, employing different search directions shows that improvements in routed net length and via count may be achieved. Therefore, the proposed global routing method has a twofold advantage: (a) it evaluates the feasibility of global routing at the floorplanning stage, and (b) it estimates the global routing metrics and related information essential for the subsequent stages of the design flow. This technique therefore opens up a new, versatile option in the traditional design flow. This work may be extended to incorporate design for manufacturability (DFM) issues by suitably modeling them in this framework.
