Abstract-This work is the first to present global routing models for capturing the impact of local congestion caused by varying-size vias. The models are then incorporated to dynamically drive a proposed layer assignment algorithm. A typical characteristic of advanced technology nodes is the significantly high variation in wire sizes that may exist between adjacent metal layers. Routing from a global cell (g-cell) to the metal layer above it requires a via which may be up to twice the size of a unit wire track within that g-cell. This significantly decreases the number of available routing tracks that can pass the boundaries of the g-cell. Ignoring this issue hampers the effectiveness of traditional global routing algorithms because it has now become a significant source of mismatch with the detailed routing stage. Based on these observations, we propose "via-aware edge overflow" and "edge-aware via overflow" models for capturing the impact of both unstacked and stacked vias of arbitrary sizes during global routing. Our models can be used to drive any layer assignment algorithm and replace the traditional edge overflow and via overflow metrics. To show the impact of our models, we also incorporate them in a proposed two-stage layer assignment algorithm and compare with a competitive layer assignment technique. This is also the first work to actually evaluate the impact of global routing solutions using a commercial detailed router. In our experiments we report fewer DRC violations, obtained only by changing the layer assignment at global routing, with detailed routing performed by the Olympus-SoC tool of Mentor Graphics.
I. INTRODUCTION
Routability has turned into a serious hurdle in modern technology nodes. On one hand, placement constraints such as irregular placeable areas and narrow channels between blocks impact routability [3]. On the other hand, there is significant mismatch between the global and detailed routing stages, which diminishes the effectiveness of global routing and significantly increases the complexity of the detailed routing stage. This mismatch is caused by issues such as the increasing complexity of design rules in advanced technology nodes, pin accessibility, wide variation in wire size and spacing rules, virtual pins at higher metal layers, and various types of routing blockages [2, 3].
Apart from many recent, significant efforts to improve routability-driven placement algorithms [3, 16, 17, 19], some efforts have also been made to improve the accuracy of global routing to reduce its mismatch with detailed routing. In [13, 15], local congestion in a g-cell was incorporated in the global routing grid-graph by adjusting the edge capacities, and by introducing a vertex capacity, respectively. In [18], various techniques were proposed to model usage of local routes within a g-cell to feed global routing with more accurate models.
The above-mentioned prior work on global routing has the limitation that it models sources of local congestion only once, in a static manner. While static modeling at the global routing stage may be fine for sources of congestion such as local nets, other sources of local congestion such as vias of signal nets change dynamically during global routing, specifically during the layer assignment phase when a 3D routing solution is generated from a finalized 2D solution. Such dynamic sources of congestion can become quite significant in new technology nodes.
In fact a typical characteristic of advanced technology nodes is the significantly high variation, up to a factor of two, in wire sizes that may exist between adjacent metal layers. Routing from a metal layer to the layer above it requires a via which is typically as wide as the wire size on the upper layer [1, 2]. The via then consumes wire tracks from the lower metal layer, so it acts as a source of local congestion within the (lower layer) g-cell. This source of local congestion is a dynamic one which varies during the global routing stage as the routing solution evolves. Moreover, as pointed out in [2], similar to pins in library cells, vias do not scale as well as devices with each technology node, so the issue is expected to worsen with further technology scaling.
Figure 1 shows an example illustrating the issue. Consider case 2 or 3. They each show a wire that gets connected to a lower metal layer. Here, the used via does not block additional routing tracks, so from that aspect it can be considered similar to case 1 except for the portion of a routing track that is utilized by the wire. This is because the lower layer always has the same or a lower track width. However, consider case 4 or 5. Here connecting to the top layer with a wider track causes the via to block available routing tracks in the g-cell. In case 4, the blockage affects only the left boundary of the g-cell. Therefore unstacked vias can act as local congestion and reduce the number of available tracks that can pass one or more boundaries of the g-cell. This issue has not been modeled in prior literature.
Prior work on layer assignment has made only a limited study of local congestion caused by vias [7, 8, 12]. Specifically, prior work such as [5] introduced a model for via overflow (VOF) of a g-cell which took into account the area taken by the used wire tracks, considering the size and spacing requirements, as well as area and spacing requirements for existing vias, and other types of routing blockages inside a g-cell, to compute how many new stacked vias could pass through the g-cell. However, prior work typically assumed that stacked and unstacked vias in a g-cell have the same size, which can make the computed via overflow inaccurate.
Moreover, unstacked vias can impact stacked vias beyond what was considered in prior work. As an example consider Figure 1 again. In case 5, a rectangular region as wide as the via size (and spacing requirement) and as long as the tile's side is blocked by the unstacked via. This region would have been smaller if the via connection was to a lower layer. As a result, the tile can support fewer stacked vias before having a via overflow, and a stacked via such as case 6 cannot be placed in this entire blocked region.
978-1-4673-9569-4/16/$31.00 ©2016 IEEE

4C-2
In this work we propose new models for local congestion caused by stacked and unstacked vias of varying sizes in a g-cell which can be incorporated in any layer assignment algorithm. Specifically, we propose a via-aware edge overflow (VA-EOF) metric to model the impact of unstacked vias on the available g-cell boundary. We also extend the existing via overflow model from prior work to an edge-aware via overflow (EA-VOF) metric to account for the impact of unstacked vias to better control the stacked ones.
We then propose a fast two-step layer assignment procedure which incorporates our models. The first stage derives from a commonly used dynamic programming (DP) framework. It aims at minimizing VA-EOF, EA-VOF, and via count but can also be set to, additionally, minimize the traditional edge overflow (EOF) metric. The second stage is based on a novel linear programming (LP) method and uses the solution of DP as input and explicitly minimizes via count with minimal degradation to the EA-VOF and VA-EOF metrics. The LP benefits from the property of totally unimodular polyhedron in optimization theory.
In our simulation results using recent placement benchmarks from the ISPD 2011 contest [17] , we first show it is possible to achieve no increase in the (traditional) EOF, with on average only 1% increase (and sometimes decrease) in via count, but a reduction of on average 61% and 32% in VA-EOF and EA-VOF, respectively.
In addition, to the best of our knowledge, this is the first work which evaluates the quality of global routing using a commercial detailed router. We created an infrastructure to allow feeding a global routing solution to the Olympus-SoC detailed router [14] , all compatible with standard LEF/DEF format. To show the actual impact of our metrics, we report a lower number of DRC violations at the detailed routing stage based on our generated global routing solutions, compared to detailed routing a global routing solution obtained from a competitive algorithm that only optimizes EOF and via count.
We also compare with another variation of our algorithm in which we do not optimize the traditional EOF in our procedure, and instead optimize only our proposed overflow metrics. Interestingly, we show that this case results in an even larger decrease in DRC violations in detailed routing, suggesting the effectiveness of our metrics over traditional EOF.
Our contributions are summarized below:
• This work presents a novel via model as a dynamic source of local congestion. It is the first work to study the impact of vias of various sizes on the g-cell boundaries as a via-aware edge overflow model. It also extends a popular via overflow model of [5] to an edge-aware via overflow model.
• With the novel via modeling, we present a two-stage layer assignment algorithm. The first stage of our algorithm derives from a commonly used dynamic programming-based algorithm by introducing the impact of our proposed metrics. The second stage is based on a novel linear programming method, which benefits from the property of totally unimodular polyhedron in optimization theory.
• We tested our layer assignment results with Olympus SoC tool from Mentor Graphics. To the best of our knowledge, this is the first global routing work to use commercial detailed routing tool for evaluation. The solutions generated by our algorithm have better detailed routing results, which implies that our metrics are reasonable and practical.
In the remainder of the paper, Section II presents our models. In Section III, we present our two-stage algorithm. Simulation results are presented in Section IV followed by conclusions.
II. VIA SIZE-AWARE OVERFLOW MODELS
A. Preliminaries
A global routing instance is defined using a grid-graph G = (V, E). Each vertex v ∈ V represents a global cell (g-cell) and each edge e ∈ E represents the common boundary of two adjacent g-cells.
For edge e ∈ E, a capacity $c_e$ represents the available length of the g-cell's boundary that can be used for routing after accounting for static routing blockage. Traditionally, when a route t crosses the boundary of a g-cell, it was assumed that it utilizes routing resource equal to the summation of its width $w_t$ and spacing $s_t$ [17]. Utilization of an edge is expressed by $u_e = \sum_{t \text{ passing } e} (w_t + s_t)$. The overflow of edge e is given by $o_e = \max(0, u_e - c_e)$. The total edge overflow metric is given by $\text{EOF} = \sum_{e \in E} o_e$.
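The traditional metric defined above can be sketched in a few lines. This is a minimal illustration only: the function name `total_edge_overflow`, the edge labels, and the numeric widths, spacings, and capacities are hypothetical and are not taken from the benchmarks.

```python
from collections import defaultdict

def total_edge_overflow(routes, capacity):
    """routes: iterable of (edge, width, spacing), one entry per wire crossing.
    capacity: dict mapping each edge to its capacity c_e."""
    usage = defaultdict(float)
    for edge, width, spacing in routes:
        usage[edge] += width + spacing      # u_e accumulates (w_t + s_t)
    # o_e = max(0, u_e - c_e) for every edge of the grid-graph
    overflow = {e: max(0.0, usage[e] - capacity[e]) for e in capacity}
    return sum(overflow.values()), overflow

# Hypothetical instance: three wires cross e1, one wire crosses e2.
routes = [("e1", 1, 1), ("e1", 1, 1), ("e1", 2, 2), ("e2", 1, 1)]
capacity = {"e1": 6, "e2": 4}
eof, per_edge = total_edge_overflow(routes, capacity)
```

Here edge e1 carries a utilization of 8 against a capacity of 6, so it contributes an overflow of 2, while e2 stays within capacity.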
B. Via-Aware Edge Overflow (VA-EOF)
Consider an edge e and a wire t. In this work we model the various cases, shown in Figure 1, in which wire t may partially or fully utilize a track. A core idea behind our overflow metrics is modeling how a wire utilizes routing resources on the common boundary of two g-cells as well as within one g-cell.
First, consider one boundary e between two g-cells. We define $r^{L|R}_{te}$ to be the amount of routing resource taken by wire t from the left (right) hand side of the boundary between two g-cells. Note, as shown in Figure 2, that a wire may need different amounts of routing resource from the two sides of the common boundary of two g-cells when vias are considered. Therefore we differentiate between the two sides of a g-cell boundary.
In Table I, column 2, we first calculate an expression for the above-defined $r^{L|R}_{te}$, for each of the cases shown in Figure 1.
Here $w_t$ and $s_t$ are the width and spacing of wire t, and $w_v$ and $s_v$ are the width and spacing of the corresponding via. For cases 1 and 2 in Figure 1, $r^{L|R}_{te}$ is equal to the sum of the width and spacing of t, and edge e could be either of the two boundaries of the g-cell (for passing a horizontal wire). $r^{L|R}_{te}$ is expressed the same way in case 3, but here edge e corresponds to the right boundary of g as shown in Figure 1. For cases 4 and 5, the given expressions in the table are for the right and left boundaries of the g-cell, respectively. They are determined based on the via width and spacing, $w_v$ and $s_v$, respectively, because the via size is wider than the wire size in these two cases. Note, modeling the impact of a stacked via on blocking a g-cell boundary (case 6) is outside the scope of this work.
We now define the via-aware utilization of an edge e representing the common boundary of two g-cells $g_1$ and $g_2$ as follows.
$$u^{VA}_e = \max\Big(\sum_{t \text{ passing } e} r^{L}_{te},\ \sum_{t \text{ passing } e} r^{R}_{te}\Big)$$
The above equation computes an edge utilization for each side of a common boundary. The via-aware utilization of the edge is then defined as the maximum of the utilization of the two sides. The via-aware overflow of an edge is denoted by $o^{VA}_e$. When via sizes are not considered, the utilizations of a wire from each side of a common boundary of two g-cells are the same and equal to the summation of wire width and spacing, so the via-aware utilization reduces to the traditional utilization $u_e$.
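The per-side bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions: each wire's demand on the two sides of the boundary is supplied directly as a pair, the overflow is assumed to mirror the traditional form max(0, utilization − capacity), and all names and numbers are hypothetical.

```python
def via_aware_utilization(demands):
    """demands: list of (r_left, r_right) pairs, one per wire crossing the edge."""
    left = sum(r_l for r_l, _ in demands)
    right = sum(r_r for _, r_r in demands)
    return max(left, right)     # the more heavily loaded side decides

def via_aware_overflow(demands, capacity):
    # Assumption: same form as the traditional o_e = max(0, u_e - c_e).
    return max(0, via_aware_utilization(demands) - capacity)

w_t, s_t = 1, 1     # wire width and spacing (illustrative units)
w_v, s_v = 2, 2     # width and spacing of a wider via to the layer above

demands = [
    (w_t + s_t, w_t + s_t),   # plain wire: same demand on both sides
    (w_t + s_t, w_v + s_v),   # wire ending in a wide via on the right side
    (w_t + s_t, w_t + s_t),
]
u_va = via_aware_utilization(demands)             # right side: 2 + 4 + 2
u_traditional = sum(w_t + s_t for _ in demands)   # ignores via sizes
```

With a boundary capacity of 7, the traditional utilization of 6 reports no overflow, whereas the via-aware utilization of 8 reports an overflow of 1.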
C. Edge-Aware Via Overflow (EA-VOF)
We first derive expressions for resource usage of a wire that is routed inside a g-cell, which is essentially the area used by the wire. We denote this area by $a_{tg}$ for wire t and g-cell g. Table I, column 3, gives the expression for $a_{tg}$ for each of the cases in the example of Figure 1. In the table, the new parameter W is the g-cell's width.
The expressions are similar to the boundary expressions (column 2 of the table) except that they require an estimation of the portion of the track that a wire uses inside the g-cell to compute the area. For cases 1, 2, and 5, the expressions are exact because the wires take the entire track. For cases 3 and 4, the expressions are approximate and assume the wire uses half of the track (W/2). We used this assumption from prior work [5] and find it to be a reasonable assumption if the actual wire lengths inside a g-cell have a uniform distribution. For case 6, the expression is exact and is the area of the stacked via after considering its spacing requirement. Note, it assumes the stacked via has a square shape in order to keep the notation simple, but in case it is not square, the expression can be easily modified to reflect that which is actually considered in our experiments for some benchmarks in which the g-cells are not square.
Next we can derive the edge-aware utilization of a g-cell by a group of wires passing through it as $u^{EA}_g = \sum_{t \text{ passing } g} a_{tg}$. We also denote the capacity of a g-cell by $c_g$, which is the available area inside the g-cell after accounting for static routing blockage including an approximation of resources needed for the local nets, for example as described in [18]. The edge-aware overflow of a g-cell is denoted by $o^{EA}_g$ and expressed as below.
The total edge-aware via overflow is given by $\text{EA-VOF} = \sum_{\forall g} o^{EA}_g$. The main difference between our edge-aware via overflow model and prior work is how $a_{tg}$ is computed for each wire t that passes g. Specifically, situations that are more accurately handled by our model are cases 4 and 5 in Figure 1, when a wire that crosses a g-cell edge connects to a larger unstacked via inside the g-cell, thereby taking more area. This is why the via overflow model is "edge-aware".
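The area bookkeeping can be illustrated with a short sketch. The per-case expressions below are an assumption inferred from the prose (full track for cases 1, 2, and 5, half a track for cases 3 and 4, a square footprint for case 6, with the via width and spacing used in cases 4 to 6); they are not copied from Table I, and the function names and numbers are hypothetical.

```python
def wire_area(case, W, w_t, s_t, w_v, s_v):
    """Approximate area a_tg used by one wire inside a g-cell of width W."""
    wire = w_t + s_t            # track demand of the wire itself
    via = w_v + s_v             # track demand of the wider via
    if case in (1, 2):
        return wire * W         # wire occupies the entire track
    if case == 3:
        return wire * W / 2     # wire assumed to end halfway through the g-cell
    if case == 4:
        return via * W / 2      # half a track, blocked at via width
    if case == 5:
        return via * W          # entire track, blocked at via width
    if case == 6:
        return via * via        # stacked via, assumed square
    raise ValueError("unknown case")

def edge_aware_utilization(wires, W):
    # u_g^EA: sum of a_tg over all wires passing the g-cell
    return sum(wire_area(case, W, *dims) for case, dims in wires)

W = 10
wires = [
    (1, (1, 1, 2, 2)),   # pass-through wire
    (4, (1, 1, 2, 2)),   # wire ending in a wide unstacked via
    (5, (1, 1, 2, 2)),   # full track blocked at via width
    (6, (1, 1, 2, 2)),   # stacked via
]
u_ea = edge_aware_utilization(wires, W)
```

In this instance the case-5 wire alone accounts for 40 of the 96 area units, twice what a model that ignores the wider via would charge for the same track.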
Moreover, as an optimization metric, EA-VOF correlates most with controlling the number of stacked vias if minimized in a layer assignment procedure. Therefore EA-VOF can also be thought as capturing the resource overhead caused by unstacked vias on determining the available area in a g-cell, which in turn influences the allocation of stacked vias in a layer assignment procedure.
III. LAYER ASSIGNMENT FRAMEWORK
Figure 3 gives an overview of our layer assignment framework. The input is a projected 2D routing solution. The output is a 3D solution which has the same 2D projection as the input. Our framework has two stages. The first stage is a dynamic programming (DP) algorithm which minimizes a cost function that can be a combination of traditional metrics EOF and via count, as well as our proposed metrics VA-EOF and EA-VOF. It can also be set to remove the traditional EOF metric and only use our proposed metric. We show the impact of both variations when reporting our simulation results. The DP algorithm operates on a single net and sequentially processes the nets similar to [11]. The algorithm results in slight increase in via count because minimizing VA-EOF tends to increase the via count during layer assignment, compared to only minimizing EOF and via count.
The second stage of our framework is solving a simple but elegant linear programming (LP) formulation. It aims to minimize the via count of the solution generated by DP without increase in EOF and with minimal compromise on VA-EOF and implicitly optimize EA-VOF. The LP formulation operates on an "edge-set" which is the set of 3D edges that have the same 2D projection. So it considers all the edges in the same set concurrently. It processes the edge-sets using a novel "Nautilus-shaped" ordering. The LP benefits from the property of totally unimodular polyhedron in optimization theory.
A. (DP): Compromising Via Count with Overflow Metrics
At the first stage, we visit the nets sequentially according to a net ordering. For each individual net, a dynamic programming (DP) formulation is solved which determines the layer assignment for that net. Our DP formulation extends other prior work such as [7, 11] to account for our proposed metrics. We determine our net ordering by evaluating the following "score" for each net n given by score(n) =
+C2 ×Deg(n), where C1 and C2 are two constants, W L(n) is the wirelength of the 2D projection of net n and Deg(n) is the number of pins of the net. We set C1 = 1000 and C2 = 0.4 in all our experiments. Nets with higher score are processed first.
To solve the DP formulation for one net n, we receive as input the 2D routing tree of the net, which we denote by tn. The output of DP is the layer for each edge e ∈ tn. We first introduce some notation before discussing the DP algorithm.
• τv: subtree rooted at v for a node v ∈ tn.
• Cv: the set of all the children nodes of v. The edge between v and a child node cv ∈ Cv is called a "child-edge".
• CLA(v): an array [l1, . . . , lm] representing an assigned layer for each child-edge of v with m children. The i-th entry of CLA(v) is the layer assigned to the i-th child-edge.
Algorithm 1 shows the procedure called SUBTREELA(v) which determines the layer assignment for all the child-edges of v. It returns a quadruple of scores directly related to EOF, via count (denoted by VC), VA-EOF and EA-VOF, respectively. We start by calling SUBTREELA for the root node of tn and the procedure recursively calls itself to ensure each node is only processed after its children are processed.
When the procedure SUBTREELA starts processing node v, it considers the edge between v and child c v to be assigned to any layer, and further considers all combinations of layer assignments among all the child-edges of v, ∀c v ∈ Cv.
Algorithm 1 DP-based algorithm for a sub-tree τv
    procedure SUBTREELA(v)
        if v is leaf then return {0,0,0,0}
        bestCLA ← unknown; bestScore ← ∞
        bestQuad ← {EOF=∞, VC=∞, VA-EOF=∞, EA-VOF=∞}
        for each possible combination of CLA(v) do
            st = GETSCORE(v, CLA(v))
            score = st.EOF + st.VC + st.VA-EOF + st.EA-VOF
            if score < bestScore then
                bestCLA ← CLA(v); bestQuad ← st
                bestScore ← score
        assign layers to all the child-edges according to bestCLA
        return bestQuad

Algorithm 2 Compute DP score of sub-problem
    procedure GETSCORE(v, CLA(v))
        score ← {0,0,0,0}
        for each child node cv of v do
            add elements in SUBTREELA(cv) to score
        for each layer li assigned to the i-th child-edge do
            add EOF of edge (cv, v) to score.EOF
            update score.VC so subtrees of cv and v connect
            update score.VA-EOF for edge (cv, v)
        for each g-cell with same x,y coordinate as v do
            update score.EA-VOF for vertex v
        return score

For each combination, the function GETSCORE is then called to compute a corresponding quadruple and score, based on the already-computed CLA(cv) of the child, and the layer for edge (v, cv), for all cv ∈ Cv. Note, the computed quadruple of one combination represents the EOF, VC, VA-EOF, and EA-VOF of τcv. Among the combinations, the one with the smallest score will be selected as the best one which determines the layer assignment for all the child-edges. In case the best score is tied among multiple assignments, the tie is broken by selecting the assignment with lower EOF, else lower VC, else lower VA-EOF, and else lower EA-VOF.
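The selection rule, including the tie-breaking order, can be expressed as a lexicographic comparison. The sketch below is illustrative only: the `Quad` type, the helper names, and the candidate values are hypothetical, and the quadruples are supplied directly rather than computed by a GETSCORE routine.

```python
from collections import namedtuple

# One score per metric, in the tie-breaking order given in the text.
Quad = namedtuple("Quad", ["eof", "vc", "va_eof", "ea_vof"])

def selection_key(quad):
    # Total score first; ties fall through to EOF, VC, VA-EOF, EA-VOF.
    return (sum(quad),) + tuple(quad)

def pick_best(candidates):
    """candidates: dict mapping a CLA tuple to the Quad of that combination."""
    return min(candidates.items(), key=lambda item: selection_key(item[1]))

# Hypothetical combinations for a node with two child-edges.
candidates = {
    (1, 3): Quad(0, 4, 2, 1),   # total 7
    (3, 3): Quad(1, 2, 2, 2),   # total 7, but higher EOF
    (1, 5): Quad(0, 3, 3, 1),   # total 7, EOF 0, lower VC than (1, 3)
    (5, 5): Quad(2, 3, 2, 1),   # total 8
}
best_cla, best_quad = pick_best(candidates)
```

Three combinations tie at a total of 7; the one with EOF 0 and the lower via count, (1, 5), is selected. Building the key as a tuple keeps the comparison in a single `min` call instead of a chain of conditionals.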
B. (LP): Minimizing Via Count with Minimal Overflow Degradation
Adding VA-EOF to the DP's cost function causes the DP to slightly increase via count because a g-cell experiences overflow earlier when via sizes are considered. However, minimizing via count is also a very important objective of layer assignment. Therefore, at the second stage, LP tries to minimize the via count as its only objective with minimal or no degradation in already-optimized VA-EOF and EOF. Figure 4 shows an overview to explain the LP with a simple example. We first group all the 3D edges with the same 2D projection in the same edge-set. Let G2D = (V2D, E2D) be the 2D projection of the 3D global routing grid-graph. We denote the edge-set defined by a 2D edge e ∈ E2D as Se. The edge-sets are then sorted according to a novel Nautilus-shaped sorting procedure which we discuss later. Using the generated sorting, the edge-sets are processed sequentially. To process an edge-set Se, we further identify, from the 2D routing solution, all the nets whose routes contain e ∈ E2D. We refer to this related set of nets for edge-set Se as Ne. Next an LP is formed which considers all the nets in Ne concurrently.
1) LP Formulation:
For each net n ∈ Ne, the LP changes the layer assignment for its 2D edge. The LP minimizes via count as its objective. When evaluating the assignment of a 2D edge of a net to a specific layer, the number of vias is estimated by looking at the assigned layer for the continuation of the net in neighboring edge-sets. Each individual LP also has a constraint which explicitly guarantees that EOF will not increase compared to DP. At the same time, it is quite effective in controlling the increase in VA-EOF in that edge-set. Consider net n ∈ Ne and edge-set Se. Based on our prior definitions, this means the 2D projected route of net n passes through edge e ∈ E2D and the edge-set Se is made of all the 3D edges that map to e. During LP, we decide whether we assign (this edge of) net n to layer $\ell$. So we define variable $x_{n\ell}$ which falls between 0 and 1, and is assigned to 1 if and only if net n is assigned to layer $\ell$ in this edge-set. Here $\ell$ refers to any edge (or alternatively the corresponding layer) that belongs to the edge-set.
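Our formulation is given below.
$$\min \sum_{n \in N_e} \sum_{\ell \in S_e} w_{n\ell}\, x_{n\ell}$$
subject to
$$\sum_{\ell \in S_e} x_{n\ell} = 1 \quad \forall n \in N_e \tag{3}$$
$$\sum_{n \in N_e} x_{n\ell} \le u_{\ell} \quad \forall \ell \in S_e \tag{4}$$
$$0 \le x_{n\ell} \le 1 \quad \forall n \in N_e,\ \forall \ell \in S_e$$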
First we note that the above LP formulation is totally unimodular because there exists a bi-coloring row partition [6] in its constraint matrix. This implies that every basic feasible solution of this LP problem is integral by the Hoffman-Kruskal theorem [4]. This property ensures that the solution obtained for each $x_{n\ell}$ variable is either 0 or 1. Therefore, Equation 3 ensures each net is only assigned to one layer in the edge-set. Inequality 4 ensures the number of nets assigned to (a 3D edge in) layer $\ell$ of edge-set Se is bounded by quantity $u_{\ell}$. Here $u_{\ell}$ is a constant parameter. It is equal to the number of routed nets passing that 3D edge in the layer assignment solution generated by DP.
Inequality 4 explicitly guarantees that the utilization of each edge does not increase compared to DP. Thus it ensures that traditional EOF will not increase and is effective in limiting any increase in VA-EOF.
The objective of our formulation is minimizing via count for the nets in the edge-set. For each variable $x_{n\ell}$, define weight $w_{n\ell}$ indicating the number of vias if net n is assigned to layer $\ell$ in the edge-set as it connects to its other fragments in neighboring edge-sets.
For the example in Figure 4, assigning net n to layer 6 results in 2 vias with the left and 1 via with the right edge-set, so the value of $w_{n6}$ is equal to 3. We assume a solution for the neighboring edge-set as follows. If the edge-set is already processed, we take the solution generated by LP. Otherwise, the solution is taken from the DP stage.
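To make the roles of the weights and the two constraints concrete, the sketch below solves one tiny edge-set by brute-force enumeration of the integral assignments. It deliberately does not use an LP solver; it only searches the same 0/1 solution space. The via weight is assumed to be one via per layer crossed toward each neighboring fragment, and the nets, layers, and capacities are hypothetical.

```python
from itertools import product

def via_weight(layer, neighbor_layers):
    # Assumption: one via per layer crossed toward each neighboring fragment.
    return sum(abs(layer - nb) for nb in neighbor_layers)

def solve_edge_set(neighbors, layers, cap):
    """neighbors: dict net -> layers of the net's fragments in adjacent edge-sets.
    layers: candidate layers of this edge-set; cap: dict layer -> bound u_l."""
    nets = sorted(neighbors)
    best_cost, best_assign = None, None
    # Each tuple gives every net exactly one layer (Equation 3).
    for choice in product(layers, repeat=len(nets)):
        # Skip assignments that exceed a layer's bound (Inequality 4).
        if any(choice.count(l) > cap[l] for l in layers):
            continue
        cost = sum(via_weight(l, neighbors[n]) for n, l in zip(nets, choice))
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, dict(zip(nets, choice))
    return best_cost, best_assign

# Hypothetical edge-set with three nets and three same-direction layers.
neighbors = {"a": [4, 5], "b": [2, 2], "c": [4, 4]}
layers = [2, 4, 6]
cap = {2: 1, 4: 1, 6: 1}
cost, assign = solve_edge_set(neighbors, layers, cap)
```

Net a would prefer layer 4 on its own (weight 1), but the bound of one net per layer makes it cheaper overall to place a on layer 6 (weight 3, matching the Figure 4 example of 2 + 1 vias) and give layer 4 to net c. Enumeration grows exponentially with the number of nets, which is why it serves here only as an illustration of the solution space.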
Finally, our LP formulation resembles a network flow formulation. The work [9] proposes a network flow graph (not formulation) for generic layer assignment. Our LP formulation is unique because of the way it is written for an edge-set, and because it is set up to ensure the edge utilizations do not exceed those of the first (DP) stage. It also has the totally unimodular property, which allows the LP to be solved quite efficiently as we show in our simulations.
2) Nautilus Edge Set Ordering:
Edge-sets are processed using an ordering procedure which we call Nautilus ordering, as illustrated in Figure 5. The figure shows the 2D projected grid-graph, so each 2D edge represents an edge-set. The main idea behind the Nautilus ordering is as follows. First, note that each edge-set has 6 adjacent edge-sets. For example, the edge labeled e1 in the figure has neighboring edge-sets labeled e2 to e7. Second, recall that in the LP, in order to count the number of vias when assigning a net n to layer $\ell$, we need to know how the net continues in the adjacent edge-sets. In other words, during LP for an edge-set we assume the solutions for the neighboring edge-sets are already fixed.
This assumption is not very accurate when we just start processing the very first edge-set. But as more edge-sets get processed, it becomes more likely that the neighboring edge-sets have been processed by LP to a finalized solution. Therefore, the goal of our ordering is to solve neighboring edge-sets consecutively, as much as possible, which we propose to do in a nautilus-shaped order. (See Figure 5.)
We start by selecting the first edge-set (indexed e1) in the figure, which is the edge with the highest via count based on the DP solution generated at the first stage. The via count for an edge-set is computed similar to the example shown in Figure 4. We store the two vertices corresponding to the selected edge-set (vertices v1 and v2 in the figure) in a queue. The ordering proceeds as follows. We select the vertex from the head of the queue, and process the edge-sets connected to it (if they are not already processed) in a counter-clockwise fashion. We then enqueue the neighboring vertices of this vertex, again in a counter-clockwise fashion. The process continues until the queue does not contain any more vertices.
In the example of Figure 5, (after processing edge-set e1) we first enqueue v1 and v2. We dequeue v1 and process the edge-sets e2 to e4. We then enqueue the vertices adjacent to v1 in counter-clockwise order if they have not been enqueued before, which are v3, v4, v5. We then dequeue v2 and so on. The figure shows the ordering of the edge-sets and the enqueueing of the vertices through the edge and vertex labels.
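The queue-based procedure above can be sketched on a rectangular 2D grid. The sketch makes two assumptions that the text does not fix: the counter-clockwise sweep around a vertex starts from the east direction, and the starting edge-set (the one with the highest via count) is passed in by the caller. The function and variable names are hypothetical.

```python
from collections import deque

# Assumption: the counter-clockwise sweep starts at east.
CCW = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # east, north, west, south

def nautilus_order(nx, ny, start_edge):
    """Return the edge-sets of an nx-by-ny grid in processing order.
    start_edge: pair of adjacent vertices (x, y), e.g. the highest-via edge-set."""
    def neighbors(v):
        for dx, dy in CCW:
            u = (v[0] + dx, v[1] + dy)
            if 0 <= u[0] < nx and 0 <= u[1] < ny:
                yield u

    order = [tuple(sorted(start_edge))]      # e1 is processed first
    done = {order[0]}
    queue = deque(start_edge)                # enqueue v1 and v2
    enqueued = set(start_edge)
    while queue:
        v = queue.popleft()
        for u in neighbors(v):               # process unprocessed edge-sets at v
            e = tuple(sorted((v, u)))
            if e not in done:
                done.add(e)
                order.append(e)
        for u in neighbors(v):               # then enqueue unseen neighbors
            if u not in enqueued:
                enqueued.add(u)
                queue.append(u)
    return order

order = nautilus_order(3, 3, ((1, 1), (2, 1)))
```

On this 3×3 grid the starting edge-set is followed by the three remaining edge-sets around its first vertex, mirroring the e2 to e4 step of the example, and every one of the 12 edge-sets appears exactly once in the resulting order.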
IV. EXPERIMENTAL RESULTS
We implemented our two-stage layer assignment algorithm in C++. We refer to our program as VALA (for Via size-Aware Layer Assignment). We used CPLEX 12.6 to solve the linear programs. All experiments ran on a Linux machine with a 2.8GHz Intel CPU and 12GB of memory. The input solution to VALA is a 2D projected global routing solution. We first used the NCTU-GR 2.0 [11] router to generate a 3D global routing solution and then created the 2D-projected version of that solution to feed as input to VALA. We used the ISPD 2011 [17] benchmarks. Specifically, for each benchmark we chose the best placement solution among the contestants (which are posted on the 2011 contest website).
In our experiments we made comparison with the following cases.
• Base: We ran a variation of the first stage of our framework based on dynamic programming (DP) in which our proposed metrics (EA-VOF and VA-EOF) were completely eliminated. In other words, to compute the score of a solution for a subproblem the only considered metrics were via count and EOF.
• VALA (DP): We only ran the first stage of our framework (DP) and considered all the metrics including EOF, via count (VC), VA-EOF and EA-VOF.
• VALA (DP+LP): We ran the two-stage VALA framework completely by applying LP after VALA (DP).
• VALA (DP(No-EOF)+LP): We eliminated EOF from the DP cost function so only considered via count, VA-EOF, and EA-VOF in DP. We then applied LP to the generated DP solution. This case allows measuring the impact of our proposed overflow metric as a replacement to the traditional overflow (EOF) metric.
We verified the 3D global routing results of Base were very similar to the one generated by NCTU-GR [11] in EOF and via count. The Base case allows us to make a fair comparison to evaluate VALA because it is a variation of our existing implementation and, during dynamic programming, we want to keep a controllable behavior to precisely measure the effects of enabling or disabling VA-EOF and EA-VOF, which are primary goals of this work.
A. Global Routing Comparisons
Here we compare the following metrics at the global routing stage: traditional total edge overflow (EOF), via count, as well as our proposed metrics which are total via-aware edge overflow (VA-EOF) and total edge-aware via overflow (EA-VOF). Table II shows the results. For Base, we report the actual values of these quantities. For the remaining cases we report the percentage increase compared to Base. We also report the runtime of each approach in seconds. (For DP(No-EOF)+LP, we do not report any runtime because it is very similar to DP+LP.) The runtime of DP+LP includes the runtime of DP. We also only report the EOF for Base because the EOF of all 3 variations of VALA is identical to Base except for benchmark s2, in which all 3 variations of VALA have slightly lower EOF compared to Base. So, overall, EOF is unchanged or improved in all VALA variations.
As can be seen, DP decreases VA-EOF and EA-VOF by 65.88% and 19.27% on average, respectively, but it increases the via count by 2.35% on average. After applying LP, the increase in via count falls to 0.92% over Base on average. This is expected because the objective of LP is to explicitly minimize the via count. It comes with only a small degradation in the VA-EOF improvement, from 65.88% to 61.19%. The EA-VOF improvement rises to 32.35% because of the reduction in the number of vias. For DP(No-EOF)+LP, the improvements in VA-EOF and EA-VOF are 59.38% and 23.09%, respectively. Note that this variation has the lowest via count among the VALA variations (better than DP+LP), only 0.23% above Base on average. The average runtimes of the Base, DP, and LP stages were 213.6, 219.3, and 303.0 seconds, respectively. As expected, the runtimes of Base and DP are both very fast and similar to each other. The runtime of LP is fast even though it operates on many edge-sets because each LP has a very small size. The runtimes of DP(No-EOF)+LP are almost identical to DP+LP.
Overall, at the global routing stage, we conclude there is significant room for improving our proposed metrics during layer assignment with no or minimal degradation in traditional EOF and via count.
B. Detailed Routing Comparisons
In this experiment we compare Base, DP+LP, and DP(No-EOF)+LP at the detailed routing stage. We created a setup to feed our global routing solutions to the Olympus-SoC detailed router [14]. Specifically, global routes were loaded into Olympus using a custom Tcl script. The script parses a routing file generated by us in the standard format used in the ISPD 2011 contest. Horizontal and vertical wire segments on metal layers are added using the create_wire command and vias are inserted one at a time using the create_via command. We imitate the Olympus global router's operation by placing end points of global wire segments and vias at the center of global cells.
Furthermore, to convert the placement info from the Bookshelf format to LEF/DEF, we wrote a perl script similar to the "convert" function in [10] . This differs from the NCTU-GR converter in a few ways. First, it adds rectilinear-shaped cells to the OVERLAP layer, to properly represent their shapes in Olympus and remove placement violations. Second, the NCTU-GR converter scales via size on a metal layer based on the minimum width of only that layer. Our converter allowed vias to be additionally scaled when changing layers with differing minimum widths. Third, our script outputs a Verilog file to specify each net's pin connections. Finally, pins on cells with blockage are at their default/actual locations which is metal1 according to the Bookshelf guidelines for such cells that are not declared NI (non-interfering). This causes issues with pin access, so blockage is removed in order to allow these pins to be routed. The router still is forced to route around these blockages as the track router strictly follows the global routing solution, and only violates blockages in order to access these pins.
After importing our global routing solution into Olympus, we then ran the track_route command which creates an initial detailed routing solution. In order to respect routing blockages, we instructed the track_route to strictly follow the global routing solution during track assignment. Track routing ran in congestion mode (as opposed to timing mode). To save time, the track router ran with "medium" effort, which runs two passes.
The results are shown in Table III . For each benchmark we report the number of DRC violations. For Base we report the actual number of violations. For the two VALA variations we report a percentage improvement compared to Base. As can be seen both VALA variations improve the number of DRC violations. The improvements for DP+LP and DP(No-EOF)+LP are on-average 3.6% and 9.1%, respectively. Note, these improvements are made only by changing the layer assignment of the same 2D global routing solution. In DP(No-EOF)+LP, when we completely replace optimizing EOF with our proposed metrics, the reduction in the number of DRC violations is higher which suggests our proposed overflow metrics may act as a good replacement for traditional EOF during global routing.
V. CONCLUSIONS AND FUTURE WORK
This work is the first to introduce novel overflow metrics to consider the impact of varying via sizes. The models are generic and may be used by any layer assignment algorithm. We also introduced a fast two-stage layer assignment algorithm which is driven by our proposed metrics. In our experiments, we showed that significant improvement in our proposed metrics is possible at the global routing stage without compromising the traditional global routing metrics. We also showed that optimizing our metrics during global routing helps reduce the number of DRC violations at the detailed routing stage.
