Abstract: As VLSI technology nodes scale down, via defects are becoming a major yield concern. Thus, via estimation modeling is becoming more important for yield analysis. In this paper, the recent via distribution model of [15] is revisited and analyzed, and possible inaccuracies and deficiencies are pointed out and experimentally verified. Then, a new taxonomy of via modeling approaches is presented, including analytical, netlist-based, and placement-based approaches. We focus on placement-based via estimation, and propose and validate a new model using real industry chips and public-domain testcases. Experimental results show that our via modeling approach is more accurate than the previous via distribution model.
I. INTRODUCTION
As VLSI technology advances, via opens are emerging as a major defect type that degrades metal integrity and reliability and hinders further metal shrinking [2, 9]. A complete via failure results in a broken net, which causes circuit malfunction, while a partial via failure increases the resistance of the via, which increases signal delay and degrades circuit performance. To improve manufacturing yield as well as the integrity and reliability of integrated circuits, via defects must be considered as early as possible during circuit design.
In 0.13-micron technology and below, redundant via insertion is strongly encouraged to improve robustness against random via failures, and thus to enhance manufacturing yield and circuit reliability [12]. A redundant via is an additional via placed alongside the existing via at a given connection point to reduce the failure rate of that connection. Much research has been performed on redundant via insertion, encompassing redundant via-aware routing algorithms [3, 16] and post-routing redundant via placement algorithms [1, 8, 11]. Although insertion rates of redundant vias have reportedly reached over 90% in recent years [16], it is still difficult to obtain a near-100% insertion rate, especially for congested designs. Via analysis and estimation are thus of great importance for yield prediction, even when redundant vias are used.
Uezono et al. [15] propose the first via distribution model, which estimates the number of vias according to the wirelength and track utilization. The wirelength is estimated using the wirelength distribution (WLD) model of [7]. As discussed in Section II, there are inaccuracies in the model of [15] due to its heavy dependence on wirelength and track utilization. In the present work, we propose a new taxonomy of via modeling approaches, including analytical, netlist-based, and placement-based methods. We focus on the placement-based method and present a new placement-based via estimation model, along with validations on industry designs at 90nm, 65nm and 45nm technology nodes. Our contributions are as follows.
• We revisit the previous via distribution model of [15] and point out possible inaccuracies and gaps through experimental testing of the model.
• We present a taxonomy of via modeling approaches (analytical, netlist-based, and placement-based) organized by design stage and input parameters.
• We present a new placement-based via estimation model which estimates a lower bound on the number of vias, plus the vias caused by wiring detours, from Steiner trees computed for the placed netlist and from routing congestion.
The remainder of this paper is organized as follows. In Section II, the previous via distribution model of [15] is revisited and analyzed with specially designed experiments that reveal potential inaccuracies. Section III discusses a new taxonomy of analytical, netlist-based, and placement-based via estimation methods. A new placement-based via estimation model is presented in Section IV. Section V gives experimental results on real industry chips and public-domain testcases. Finally, Section VI presents conclusions and future research directions.
* Research at UCSD was supported in part by the Semiconductor Technology Academic Research Center (STARC).
II. REVIEW OF VIA DISTRIBUTION MODEL

A. Uezono et al.'s Via Distribution Model
The number of vias depends strongly on wirelength: in general, a longer wire has more vias, so the number of vias should be a function of wirelength. Uezono et al. [15] propose a statistical via distribution model for yield estimation, based on the wirelength distribution model of Davis et al. [7] and a simple track utilization ut, defined as

$$u_t = \frac{l_{used}}{l_{track}} \qquad (1)$$

where l_used is the total wirelength actually used, and l_track is the total length of all wiring tracks (determined by wiring pitch, chip size and the number of available metal layers). Using ut, Uezono et al. model the via distribution as in Eq. (2).
The coefficients α1, α2, β1 and β2 of the via distribution model are obtained statistically from sample designs, and the number of interconnects with wirelength l follows the wirelength distribution model of [7], given in Eq. (3) and Eq. (4).
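Region I ($1 \le l < \sqrt{N}$):

$$i(l) = \frac{a k}{2}\,\Gamma \left( \frac{l^3}{3} - 2\sqrt{N}\,l^2 + 2 N l \right) l^{2p-4} \qquad (3)$$

Region II ($\sqrt{N} \le l < 2\sqrt{N}$):

$$i(l) = \frac{a k}{6}\,\Gamma \left( 2\sqrt{N} - l \right)^3 l^{2p-4} \qquad (4)$$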
Here, a is used to account for multiple fanouts, k is the average number of terminals of a gate in the design, p is Rent's parameter, and Γ is a normalization factor used to reduce the modeling error.
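As a rough illustration of how such a two-region wirelength distribution behaves, the sketch below evaluates the commonly cited form of the Davis et al. model [7]. The exact expressions are an assumption here, since they are not reproduced in this text; the function name `wld` is hypothetical, and Γ is passed in as a plain parameter (defaulting to 1.0) rather than computed from its closed form.

```python
import math


def wld(l, n_gates, k, p, a, gamma=1.0):
    """Number of interconnects of length l (in gate pitches).

    Assumed two-region form of the Davis et al. distribution:
      Region I  (1 <= l < sqrt(N)):
          a*k/2 * gamma * (l^3/3 - 2*sqrt(N)*l^2 + 2*N*l) * l^(2p-4)
      Region II (sqrt(N) <= l < 2*sqrt(N)):
          a*k/6 * gamma * (2*sqrt(N) - l)^3 * l^(2p-4)
    gamma is treated as a user-supplied normalization factor (assumption).
    """
    root_n = math.sqrt(n_gates)
    if l < 1 or l >= 2 * root_n:
        return 0.0  # outside the modeled range of wirelengths
    power_term = l ** (2 * p - 4)
    if l < root_n:
        shape = l ** 3 / 3.0 - 2.0 * root_n * l ** 2 + 2.0 * n_gates * l
        return a * k / 2.0 * gamma * shape * power_term
    return a * k / 6.0 * gamma * (2.0 * root_n - l) ** 3 * power_term


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for length in (1, 10, 100, 150):
        print(length, wld(length, n_gates=10_000, k=4, p=0.6, a=0.75))
```

Under this assumed form, the two regions meet continuously at l = √N and the distribution falls to zero at l = 2√N, which is a quick sanity check on any implementation.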
As track utilization increases, congestion forces wiring detours, which increase the number of vias. Therefore, if the wirelength distribution model is correct and the coefficients are representative of general designs, Uezono et al.'s via distribution model might appear reasonable, as it achieves an average error of 11% on the reported testcases. However, the model has strange implications for the relationship between Rent's parameter and the number of vias. In Fig. 8 and Fig. 9 of [15], the number of vias increases when Rent's parameter decreases below 0.5 and increases again when Rent's parameter increases above 0.6. In the paper, the authors explain that "since a smaller-p circuit has a larger number of shorter wires, the number of vias can increase when p gets smaller, and since a larger-p circuit has greater total wirelength, the number of vias can increase when p gets larger".
We evaluated this statement by collecting data from testcases generated using the Rentian circuit generator gnl [14], which can generate circuits with a user-specified Rent's parameter. Fig. 1 shows the dependency of the number of vias on Rent's parameter p and the number of terminals k. These samples contain 10,000 instances. We can find a root cause of the error of Uezono et al.'s model in its use of the Davis et al. [7] wirelength distribution (WLD) model. This WLD model is generated by analytical calculation from Rent's rule. However, as pointed out by Christie et al. [5], in calculating the number of nearest neighbor nets (l = 1), Davis
with N being the number of components in the design. T_internal is the number of internal terminals that are connected within the design, and T_external is the number of external terminals that are connected to the outside of the design. The first part of the right-hand side of Eq. (5) counts all the terminals in the design. Since this total is fixed when k and N are fixed, and since the number of external terminals decreases exponentially as Rent's parameter decreases according to Eq. (6), Eq. (5) implies that the number of internal terminals increases exponentially as Rent's parameter decreases. However, in actual designs, due to the existence of a Region II Rent's parameter [10], the number of external terminals does not change as much as Eq. (6) predicts. In other words, the number of internal terminals does not increase significantly, even for designs with a small Rent's parameter. This overestimated number of internal terminals for small p makes Uezono et al.'s via distribution model show a larger via count for small-p designs.
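The terminal-count argument above can be illustrated numerically. The sketch below assumes that Eq. (5) has the form T_internal = k·N − T_external and that Eq. (6) is Rent's rule, T_external = k·N^p. Both forms are inferred from the surrounding definitions rather than taken from the equations themselves, and the function name `terminal_counts` is hypothetical.

```python
def terminal_counts(n_gates, k, p):
    """Return (total, external, internal) terminal counts.

    Assumed forms (not confirmed by the text):
      total    = k * N
      external = k * N**p          # Rent's rule
      internal = total - external  # conservation of terminals
    """
    total = k * n_gates
    external = k * n_gates ** p
    internal = total - external
    return total, external, internal


if __name__ == "__main__":
    # Hypothetical sweep: k and N fixed, Rent's parameter varied.
    for p in (0.3, 0.4, 0.5, 0.6, 0.7):
        total, external, internal = terminal_counts(10_000, 4, p)
        print(f"p={p:.1f} external={external:10.1f} internal={internal:10.1f}")
```

With k and N fixed, the total stays constant while the external count shrinks as p decreases, so the internal count predicted by these assumed relations grows, which is the behavior the paragraph above argues is unrealistic for actual designs.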
B. Key Parameters to Estimate Via Count
Via count may depend on various circuit and implementation parameters, such as the metal layers used, wiring pitch, utilization, average terminals per gate, and Rent's parameter. In this subsection, we analyze the dependence of via count on such parameters. We have executed a comprehensive design of experiments (DOE) over the number of gates (N), the average number of terminals (k), Rent's parameter (p), target placement utilization (U), and top metal layer (M). We generate a netlist for each case using the gnl software, and then perform placement and routing in each of two technologies (T), 90nm and 65nm. Table I shows the parameter values used in the DOE. From Fig. 1, we observe that k has a strong relationship with via count, and that larger p leads to larger via count. Fig. 2 shows the dependencies on U, M and T for the N = 10,000 and N = 20,000 netlists. We observe that the number of vias is proportional to N and that high utilization increases via count, while T and M have little impact on via count.
In addition to the DOE with Rentian circuits, we assess the impact of average terminals per gate (k), wiring pitch and clock frequency using the AES core from opencores.org [13]. To control the average terminals per gate, we restrict the usable set of library cells for the implementation. For a small-k design, we use only small-fanin library cells, such as INV, BUF, NAND2 and DFF, while all cells in the library are available for a large-k design. Results are summarized in Table II. To see the dependency on clock frequency, we implement AES at two extreme clock frequencies, i.e., 50MHz and 400MHz. Results are again summarized in Table II. To see the impact of wiring pitch, we prepare an artificial technology that has half of the original pitch. Results are summarized in Table III. From the tables, we observe again that via count is strongly dependent on the number of instances or nets.
However, comparing small-k and large-k designs, we observe that the number of instances increases by about a factor of 3, but the number of vias increases by 50%. This implies that, in addition to N, k is another major factor for via count. We also observe that clock frequency does not strongly affect the number of instances. Even if the clock frequency increases by a factor of 8, the number of instances increases by only 2% for the small-k design, and the number of vias increases by only 4% to 11%. In addition, reduced wiring pitch reduces via count by about 17%. We can attribute the via count difference between original and half pitches to the impact of detour routing due to the via blockage effect [4]. Table IV summarizes the impact of each parameter on via count. From the results, we conclude that the major factors for via count are N and k, which directly affect via count by attaching pins of gates to nets; other major factors are p, U, M and wiring pitch, which define the available routing resources and routing congestion.
III. TAXONOMY OF VIA MODELING APPROACHES
Based on the different design stages and available information at each stage, we now present a new taxonomy of via modeling approaches, i.e., analytical, netlist-based, and placement-based methods.
Analytical methods.
For analytical methods, the netlist is not known. From the observations in Section II, the input may consist of the major factors, i.e., the total number of gates (N) and the average number of pins per gate (k). To estimate the impact of congestion, Rent's parameter p may be required for wirelength estimation as in [15]. With the above input parameters, an analytical via estimation model roughly estimates the total number of vias in a design with the given characteristics. As an example, we could make a rough via count estimator, e.g., 2.5 · k · N, which gives a rough lower bound on via count in our testcases, and we can further adopt Rent's parameter to include the impact of congestion. Uezono et al.'s via estimation model [15] can also be classified as an analytical via estimation method.
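The rough estimator mentioned above is simple enough to state as a one-line function. The sketch below is only an illustration: the function and parameter names are hypothetical, and the coefficient 2.5 is the value quoted for the testcases in this paper, not a general constant.

```python
def rough_via_count(n_gates, avg_pins_per_gate, coefficient=2.5):
    """Rough analytical via count estimate: coefficient * k * N.

    k * N is the total pin count of the design; the default coefficient
    of 2.5 is the example value from the text and would need to be
    recalibrated for other design sets (assumption).
    """
    return coefficient * avg_pins_per_gate * n_gates


if __name__ == "__main__":
    # Hypothetical design: 10,000 gates with 4 pins per gate on average.
    print(rough_via_count(10_000, 4))
```

Such an estimator uses no connectivity or placement information, so it cannot reflect congestion unless a further term, for example one driven by Rent's parameter, is added.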
Netlist-based methods.
When the netlist is given, the total number of gates, total number of nets, total number of pins, and fanout distribution are easily obtained. Since we know exact values of the major parameters, i.e., the number of pins and gates in a design, the total number of vias can be predicted more accurately than with analytical methods. In addition, since all connectivity information is given, we can compute Rent's parameter p and use it to model congestion.
Placement-based methods.
After the placement stage, all the pins in the netlist are placed and the available routing resources, including number of routing layers, available routing tracks, etc. are known. At this stage, better estimation accuracy is obtained using constructive methods. For example, a fast Steiner-tree based net decomposition process can be adopted to obtain a lower bound on the total number of vias. Even a fast heuristic layer and/or track assignment process can be used to further improve the accuracy of the via estimator.
Among the above three via estimation methods, analytical methods are the least accurate and placement-based methods are the most accurate, due to the difference in design information available at the corresponding design stages. In the following, we focus on the placement-based approach and present a new via estimation model.
IV. PLACEMENT-BASED VIA ESTIMATION
In placement-based via estimation, the input consists of the placed netlist along with the design and technology information. Available routing tracks on each routing layer can be computed considering the routing blockages and P/G pre-routes. Given the placed netlist, multi-pin nets are first decomposed into horizontal and vertical two-terminal wire segments using the FLUTE Steiner tree heuristic [6]. With the available routing tracks and the decomposed wire segments, we compute the average routing congestion as in Eq. (7),
where Cavg represents the average routing congestion, Ch and Cv respectively represent the average horizontal and vertical routing congestion, Sh and Sv respectively represent the total length of the horizontal and vertical wire segments, and Th and Tv respectively represent the total length of the available horizontal and vertical routing tracks.
We present a lower-bound based via estimation model, where the total number of vias is estimated as the sum of the lower bound and the vias introduced by wiring detours. To calculate a lower bound on the total number of vias, we make two assumptions: (1) the Metal 1 routing layer is not used for routing, and (2) the routing pitches of Metal 2 and Metal 3 are zero, so that all wire segments can be routed on those two routing layers without causing design rule violations. Assumption (1) is based on the fact that current designs use Metal 1 for placement of cell instances and very few nets are completely routed on Metal 1. Assumption (2) is used to establish a lower bound on the number of vias, where the minimum number of layers is used for routing all wire segments without any detours.
With the above assumptions, a trivial layer assignment process is performed, where we assume that Metal 2 is used for vertical interconnects and Metal 3 is used for horizontal interconnects, and then assign all horizontal and vertical wire segments accordingly. Based on the layer assignment result, we establish a lower bound VL on the total number of vias. We then estimate the total number of vias, with consideration of average congestion, as in Eq. (8), where Ve represents the total number of estimated vias, and VS represents the sum of the pins, the Steiner points, and the bend points of the constructed Steiner trees.
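The trivial layer assignment and the congestion terms can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: pins are taken to sit on Metal 1, the via count at each point is taken to be the number of layer transitions between the lowest and highest layer present there, Ch = Sh/Th and Cv = Sv/Tv, and Cavg is taken as their simple mean. All function names are hypothetical, and the Steiner tree is supplied by hand rather than computed by FLUTE.

```python
from collections import defaultdict

# Assumed layer roles: pins on Metal 1, vertical wires on Metal 2,
# horizontal wires on Metal 3.
M1, M2, M3 = 1, 2, 3


def analyze_tree(pins, edges):
    """Return (via lower bound, horizontal length, vertical length) for one net.

    pins  : list of (x, y) pin locations
    edges : list of ((x1, y1), (x2, y2)) rectilinear segments, already
            split at every pin, Steiner point and bend point
    """
    layers_at = defaultdict(set)
    for pin in pins:
        layers_at[pin].add(M1)
    s_h = s_v = 0
    for (x1, y1), (x2, y2) in edges:
        if x1 != x2 and y1 != y2:
            raise ValueError("segments must be horizontal or vertical")
        if x1 == x2:
            layer = M2
            s_v += abs(y2 - y1)
        else:
            layer = M3
            s_h += abs(x2 - x1)
        layers_at[(x1, y1)].add(layer)
        layers_at[(x2, y2)].add(layer)
    # Assumed counting rule: a stacked via spans lowest to highest layer.
    vias = sum(max(ls) - min(ls) for ls in layers_at.values())
    return vias, s_h, s_v


def congestion(s_h, s_v, t_h, t_v):
    """Return (Ch, Cv, Cavg); Cavg as a simple mean is an assumption."""
    c_h = s_h / t_h
    c_v = s_v / t_v
    return c_h, c_v, (c_h + c_v) / 2.0


if __name__ == "__main__":
    # Hypothetical two-pin net routed as an L shape with one bend at (3, 0).
    pins = [(0, 0), (3, 2)]
    edges = [((0, 0), (3, 0)), ((3, 0), (3, 2))]
    print(analyze_tree(pins, edges))
    print(congestion(s_h=3, s_v=2, t_h=30, t_v=20))
```

In this example the pin attached to the horizontal segment needs a stacked via from Metal 1 to Metal 3, the bend needs one via between Metal 2 and Metal 3, and the pin attached to the vertical segment needs one via from Metal 1 to Metal 2. For the same net, VS as defined above would be 3 (two pins, no Steiner points, one bend point).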
V. EXPERIMENTS
We have integrated the FLUTE Steiner tree heuristic [6] and implemented layer assignment using only one tier of routing layers, i.e., Metal 2 and Metal 3, for placement-based via estimation. Two different sets of designs are needed to evaluate the model. The first set of designs (training designs) is used for training, i.e., for calibrating the model parameters α and β. The second set of designs (testing designs) is used to evaluate the accuracy of the model with fitted model parameters. Table V shows our experimental results, where "Design" gives the different designs (prefixes "45 ", "65 " and "90 " in design names denote 45nm, 65nm and 90nm designs, respectively), Va gives the actual number of vias after routing, Ve gives the estimated number of vias using our new placement-based via estimation model, VL gives the computed lower bound of the total number of vias, and "Error" gives the percentage error of the (fitted) new model.
The first 14 designs (rows 2-15) are training designs used for fitting the model, and result in fitted parameter values α = 6.64 and β = 1.63. The remaining 16 designs (rows 16-31) are testing designs used to evaluate the accuracy of the fitted model. From Table V, our model underestimates the number of vias for all 45nm designs. This may be because those 45nm designs have poor routing quality, so that more detours are generated than are required. The average percentage error for all training designs is 3.4% and that for all testing designs is less than 7.9%.
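For readers reproducing the accuracy figures, the sketch below shows one way to compute per-design and average percentage error from pairs of estimated and actual via counts. It assumes the error is the absolute difference relative to the actual count, which the text does not state explicitly; the function names are hypothetical and the numbers in the example are made up, not taken from Table V.

```python
def percentage_error(estimated, actual):
    """Absolute percentage error of an estimate (assumed definition)."""
    return 100.0 * abs(estimated - actual) / actual


def average_percentage_error(pairs):
    """Mean percentage error over (estimated, actual) pairs."""
    errors = [percentage_error(est, act) for est, act in pairs]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical (Ve, Va) pairs, for illustration only.
    designs = [(95_000, 100_000), (210_000, 200_000), (48_000, 50_000)]
    print(average_percentage_error(designs))
```

Using the absolute value keeps underestimates and overestimates from cancelling in the average; a signed error would instead be needed to show a systematic bias such as the underestimation noted for the 45nm designs.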
VI. CONCLUSIONS AND ONGOING WORK
In this paper, we revisit and analyze the recent via distribution model of [15] and point out possible inaccuracies and deficiencies. We present a new taxonomy of via modeling approaches, including analytical, netlist-based, and placement-based approaches. Details of the placement-based via estimation model are presented and validated with promising experimental results.
Our ongoing work includes improving the accuracy of our placement-based via estimation model by adopting fast heuristic track/layer assignment algorithms to achieve a more constructive via estimator; we are also investigating new analytical and netlist-based via estimation models.
